Efficient Realtime Signal Processing C++ Templates: Part II – Signals and Computational Operators

Introduction

Part I introduced a C++ template strategy for efficiently and effectively implementing general real-time signal processing computational pipelines or networks. These pipelines can form a hierarchy of different operator types connected in a network, operating on signals of various types. As an initial example, we focused on computational pipelines with multi-channel linear digital filters. We demonstrated how to construct these filters with relative ease, without the need for complex software machinery or heap allocation.

In this installment, we turn our attention to the two main components of a digital signal processing computational pipeline: signals and computational operators. We discuss their properties and our implementation models for them in more detail, demonstrating both concrete and abstract tactics and strategies for their efficient realization.

Signals, in the context of digital processing by a computer, are 1D, 2D, 3D, or nD data arrays or other application-specific data structures or types, sampled or updated at discrete time points. These signals typically begin as sensor measurements of physical phenomena, such as voltages, sounds, or images, and may then progress through multiple computational stages to yield higher-level or derived properties from the initial measurements, such as waveform features or classifications. For example, in an electrocardiogram (ECG), signals might include the location and classification of each QRS complex as a normal beat or as indicative of a specific type of arrhythmia. Data may be acquired at different sample rates or processed so that it is spaced at non-uniform time intervals. In such cases, a time dimension or data structure field explicitly contains the non-uniformly spaced time values of the array data samples.

In most cases, a 1D mathematical vector:

\mathbf{u}[k] = \left[u_0[k], u_1[k], \ldots, u_{M-1}[k] \right]^T

is sufficient to represent M data samples (e.g., from M sensors) taken at the same point in time, where integer k identifies the time and

k = 0, 1, \ldots,

distinguishes each vector sample collected sequentially in time. Given the data sampling frequency f_s in units of hertz (Hz), the true time of the data sample relative to the first sample at k=0 would be

t_k = k / f_s

The highest value of k reached so far thus reflects the total number of vector data samples collected (k + 1 samples, since counting starts at zero). The time-varying vector \mathbf{u}[k] is defined as a discrete-time mathematical signal. Specifically, in our context, it is an input signal for a computational operator \mathbf{G} that processes it to produce the output signal \mathbf{y}[k]:

\mathbf{y}[k] = \mathbf{G}(\mathbf{u}[k])

More complex signals may consist of sample vectors that are multidimensional or non-uniformly spaced in time, requiring additional indices or a more intricate organization to represent, such as explicit time values for varying time intervals between samples. Our templates are designed to handle these complexities effectively, and we will explore these aspects in detail when necessary. For simplicity, however, we can often conceptualize higher-dimensional data types as flattened into vectors, because the fundamental processing implementation and strategy remain largely unchanged. When dealing with higher-order tensors, for example, the main implementation difference is the use of a tensor sample type instead of a vector type in the computational operator template.
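
As a concrete illustration of the data types involved, the sketch below shows one plausible way to model a uniformly sampled M-channel vector sample and a time-stamped sample for non-uniform spacing. The type names are illustrative assumptions for this example, not part of the template library.

```cpp
#include <array>
#include <cstddef>

// A uniformly sampled M-channel vector sample u[k]; its time is implicit in k and f_s.
template<typename T, std::size_t M>
using VectorSample = std::array<T, M>;

// A non-uniformly spaced sample carries its own time value explicitly.
template<typename T, std::size_t M>
struct TimedSample {
    double t;              // sample time in seconds
    VectorSample<T, M> u;  // the M-channel data vector
};
```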

Time Windows and State

We can observe real-time signal samples from the present and past but not from the future. The history of these observations is referred to as a time “window,” which, in theory, could extend infinitely into the past. However, in practice, it starts when data acquisition begins in a physical instrument. Real-time computation typically requires processing data within a fixed-size time window of N data samples. At any given current time k, the data samples available for computation are

\mathbf{u}[k-N+1], \mathbf{u}[k-N+2], \ldots, \mathbf{u}[k-1], \mathbf{u}[k] .

The current data window accessible to an operator can be represented as an N \times M 2D array or matrix, where each row is a vector data sample.

Saving and processing an ever-growing data buffer, rather than a fixed-size window, not only risks exhausting memory but also increases computational latency and overhead the longer the system runs, potentially leading to unexpected real-time failures.

To account for past data outside the current time window, a common strategy is to use a state variable vector,

\mathbf{x}[k] = \left[x_0[k], x_1[k], \ldots, x_{P-1}[k] \right]^T

to represent and summarize the net effect of the past via a state update operator:

\mathbf{x}[k+1] = \mathbf{F}(\mathbf{x}[k], \mathbf{u}[k])

The operator \mathbf{F} computes the next state \mathbf{x}[k+1] from the current state \mathbf{x}[k] and the input \mathbf{u}[k]. When summarizing past data, \mathbf{F} often assigns greater weight to recent data compared to data from the more distant past, as is common in linear filters. The state also serves as an additional input for the output operator, which now computes the output signal \mathbf{y}[k] from both \mathbf{u}[k] and \mathbf{x}[k]:

\mathbf{y}[k] = \mathbf{G}(\mathbf{x}[k], \mathbf{u}[k])
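
As a small, concrete instance of this state-space pattern (illustrative only, not a component of the template library), consider a scalar exponential smoother in which \mathbf{F} blends the current state with the new input, weighting recent data more heavily, and \mathbf{G} simply reports that blend as the output:

```cpp
// Scalar exponential smoother as an instance of
// x[k+1] = F(x[k], u[k]) and y[k] = G(x[k], u[k]).
struct ExpSmoother {
    double a;        // weight on the past state, 0 < a < 1
    double x = 0.0;  // state x[k]: a geometrically weighted summary of past inputs

    double step(double u) {
        double y = a * x + (1.0 - a) * u;  // y[k] = G(x[k], u[k])
        x = y;                             // x[k+1] = F(x[k], u[k]) (here F equals G)
        return y;
    }
};
```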

Ring Buffers

To avoid unnecessary data copying, a common software tool for managing the N \times M time window matrix is the “ring buffer”. In its simplest form, the ring buffer maintains a row index, n \in \{0, 1, \ldots, N-1\}, which points to the oldest sample vector in the matrix. The boxcar filter code shown in Part I illustrates a simple ring buffer of this kind: pState is initialized with maxN rows and M columns, while idx (the index n above) identifies the oldest sample vector in the ring.

The operator uses the current n to (1) initiate an iterative computation cycle with the oldest sample, (2) replace the oldest sample with the newly acquired input sample vector, and (3) increment n modulo N so that it continues to point to the oldest sample when the operator begins again at step (1) in the next iteration. Many of the basic filter (operator) examples in Part I followed this pattern.
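
A minimal sketch of this pattern, written in the spirit of the Part I boxcar filter (the actual Part I listing may differ in detail), might look as follows, using the pState and idx names mentioned above:

```cpp
#include <array>
#include <cstddef>

// Boxcar (moving-average) filter over the last maxN M-channel samples.
// pState is the maxN x M ring-buffer matrix; idx is the ring index n of the oldest row.
template<typename T, std::size_t maxN, std::size_t M>
class Boxcar {
public:
    std::array<T, M> step(const std::array<T, M>& u) {
        for (std::size_t m = 0; m < M; ++m)
            sum[m] += u[m] - pState[idx][m];  // (1) remove the oldest sample from the sums
        pState[idx] = u;                      // (2) overwrite the oldest row with u[k]
        idx = (idx + 1) % maxN;               // (3) idx again points at the oldest row

        std::array<T, M> y{};
        for (std::size_t m = 0; m < M; ++m)
            y[m] = sum[m] / static_cast<T>(maxN);
        return y;
    }

private:
    std::array<std::array<T, M>, maxN> pState{};  // N x M time-window matrix
    std::array<T, M> sum{};                       // running per-channel sums
    std::size_t idx = 0;                          // index of the oldest sample vector
};
```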

The use of a ring buffer eliminates the need to shift matrix rows to keep the current time window in the same row locations, thereby avoiding a potentially significant overhead that can accumulate from inner computation loops all the way up through the computational pipeline. However, using a ring buffer requires that processing be designed to accommodate the circular buffer index n, which accesses data starting at a different row on each iteration and wraps back to the first matrix row after reaching the last row.

Signal Class

In our implementation model, processing any time sequence of any data type using a ring buffer is so central that we support it as a general implementation design pattern via the Signal class, a portion of which is shown below.
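
The sketch below conveys the essential shape of such a class; the member names and signatures are illustrative assumptions rather than the exact interface:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Fixed-capacity ring buffer over any sample type; no heap allocation.
template<typename SampleT, std::size_t N>
class Signal {
public:
    // (b) move a new sample in, replacing the oldest slot
    void put(const SampleT& s) {
        samples[index] = s;
        index = (index + 1) % N;  // index now identifies the new oldest sample
        ++total;                  // (d) running count of samples processed
    }
    // (a) newest and oldest samples in the current time window
    const SampleT& newest() const { return samples[(index + N - 1) % N]; }
    const SampleT& oldest() const { return samples[index]; }
    // (c) number of contiguous samples from the oldest slot to the end of the buffer
    std::size_t contiguous() const { return N - index; }
    // (d) map a running sample number onto a buffer slot, accounting for wraps
    std::size_t wrappedIndex(std::uint64_t sampleNumber) const {
        return static_cast<std::size_t>(sampleNumber % N);
    }
    std::uint64_t count() const { return total; }

private:
    std::array<SampleT, N> samples{};  // fixed time window of N samples
    std::size_t index = 0;             // slot holding the oldest sample
    std::uint64_t total = 0;           // total samples ever written
};
```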

This class provides support for circular data processing to reduce the need for data copies. The Signal is viewed as having a fixed number of samples representing a finite window of time. A new sample generally replaces the oldest sample, identified by the index field. Initially, the samples may contain random data, leading to a transient period until the buffer is full. After the buffer is filled, replacing old samples with new ones represents steady-state operation. Class methods include support for (a) retrieving the oldest or newest data samples, (b) moving samples into and out of the Signal, (c) determining how far the current sample is from the end of the Signal (i.e., the number of samples that are contiguous before wrapping), and (d) providing a count of the total samples processed, as well as finding a sample index that accounts for Signal buffer wraps or multiple overflows.

Note that the linear filter operator examples presented previously did not use the Signal class internally to implement their signal ring buffers. Although this could have been done, we generally choose to employ abstract machinery only when simple and direct methods are insufficient. The objective of any abstract software machinery is to make complex implementations easier, not to make simple implementations harder. For this reason, abstract code is not used for simple tasks where it may technically apply but tends to obscure rather than assist.

General Computational Operator Model

Figure 1 illustrates this real-time computational operator model. In addition to maintaining a finite N \times M signal time window for the input \mathbf{u}[k], it also keeps a finite L \times P signal time window for the state vector \mathbf{x}[k] by storing the last L state vectors.

Figure 1: General model for a real-time computational operator

Although the general multi-input, multi-output functions \mathbf{F}(\mathbf{x}[k],\mathbf{u}[k]) and \mathbf{G}(\mathbf{x}[k],\mathbf{u}[k]) are depicted as functions of the entire input and state signals, they should be understood as only having access to the signal and state values stored in the respective signal ring buffers connected to their inputs. This general computational model is applicable to many signal-processing functions, including linear filters, Kalman filters, nonlinear regression, support vector machines, convolutional neural networks, etc.
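
One way this model can be expressed in template form is sketched below: the operator owns one signal window for its inputs and one for its states, and delegates the state update and output computation to user-supplied callables for \mathbf{F} and \mathbf{G}. The class layout and names are illustrative assumptions, not the actual library interface; Signal refers to the ring-buffer sketch above.

```cpp
#include <cstddef>

// Sketch of the Figure 1 operator: an N-sample input window and an L-sample state
// window, with F producing x[k+1] and G producing y[k] from those windows only.
template<typename InT, typename StateT, typename OutT,
         std::size_t N, std::size_t L, typename F, typename G>
class Operator {
public:
    Operator(F f, G g) : f_(f), g_(g) {}

    OutT step(const InT& u) {
        uWin.put(u);               // store u[k] in the input window
        OutT y = g_(xWin, uWin);   // y[k]   = G(state window, input window)
        xWin.put(f_(xWin, uWin));  // x[k+1] = F(state window, input window)
        return y;
    }

private:
    Signal<InT, N>    uWin;  // last N input sample vectors
    Signal<StateT, L> xWin;  // last L state vectors
    F f_;                    // callable returning the next state (a StateT)
    G g_;                    // callable returning the output (an OutT)
};
```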

There is a transient operation period during which the signal ring buffers fill up, after which the operation enters a steady state. If the designs of \mathbf{F} and \mathbf{G} employ algorithms with deterministic computational characteristics and latencies, and if the implementations exclude sources of highly variable latency, such as heap-based dynamic memory allocation, then the computational performance statistics will be more stable in steady-state operation, with reduced variability around the mean.