Given the glowfi.sh team’s recent experiences with sensor data from our Photons, and previous posts on the forum (e.g., Why does network signal strength go high erratically?), we thought it might be useful to demonstrate signal despiking.
As the name suggests, despiking is an automated process for detecting and eliminating short-term outlier values in a time series. For glowfi.sh, we are interested in removing spikes (i.e., noise) so we can more accurately solve machine learning problems like anomaly detection.
Despiking consists of identifying spurious data in a series and replacing these data with suitable estimates. Most methods involve filtering or statistical separation of a signal into two groups (e.g., normal and outlier). The algorithm outlined by Feuerstein et al. (http://www.ncbi.nlm.nih.gov/pubmed/19449858) is a good general approach that uses both local polynomial fitting (Savitzky-Golay filtering) and statistical thresholding (Otsu’s method) to (1) detect the position and width of each spike and (2) remove the spikes by excision, linear interpolation, and smoothing of the identified regions.
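To make the detect-then-replace idea concrete, here is a minimal Python sketch. It is not the glowfi.sh implementation and not a faithful reproduction of Feuerstein et al.; it only assumes NumPy and SciPy are available. The function names (`otsu_threshold`, `despike`) and the defaults for `window`, `polyorder`, and `pad` are hypothetical choices made for illustration, and the final smoothing of the repaired regions is left out.

```python
import numpy as np
from scipy.signal import savgol_filter


def otsu_threshold(values, bins=64):
    """Pick the split that maximizes between-class variance (Otsu's method)."""
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist).astype(float)      # samples at or below each split
    w1 = w0[-1] - w0                        # samples above each split
    s0 = np.cumsum(hist * centers)
    valid = (w0 > 0) & (w1 > 0)
    if not valid.any():
        # Degenerate case: everything falls in one bin, so flag nothing.
        return float(np.max(values))
    m0 = s0 / np.where(w0 > 0, w0, 1.0)
    m1 = (s0[-1] - s0) / np.where(w1 > 0, w1, 1.0)
    between = np.where(valid, w0 * w1 * (m0 - m1) ** 2, -1.0)
    k = int(np.argmax(between))
    return edges[k + 1]


def despike(signal, window=11, polyorder=2, pad=1):
    """Return (cleaned, mask) for a uniformly sampled 1-D signal.

    window, polyorder and pad are illustrative defaults (assumptions).
    The signal must contain at least `window` samples.
    """
    x = np.asarray(signal, dtype=float)
    # 1. Savitzky-Golay fit gives a smooth estimate of the signal.
    smooth = savgol_filter(x, window_length=window, polyorder=polyorder)
    # 2. Otsu's method splits the residuals into "normal" and "spike".
    residual = np.abs(x - smooth)
    mask = residual > otsu_threshold(residual)
    # 3. Widen each detection by `pad` samples to cover the spike flanks.
    if pad > 0:
        kernel = np.ones(2 * pad + 1, dtype=int)
        mask = np.convolve(mask.astype(int), kernel, mode="same") > 0
    # 4. Excise flagged samples and bridge the gaps by linear interpolation.
    idx = np.arange(x.size)
    cleaned = x.copy()
    cleaned[mask] = np.interp(idx[mask], idx[~mask], x[~mask])
    return cleaned, mask
```

Calling `cleaned, mask = despike(raw)` returns the repaired series and a boolean array marking the samples that were replaced. Because Otsu's method always splits the residuals into two groups, this sketch flags something on every call, so repeated passes need a separate stopping rule.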
Enough about the details; let's look at some Photon data. Below we have graphed two sensor time series from the Photon that we posted to the glowfi.sh API via a Particle webhook. The top plot is raw temperature data from the DS18B20 sensor. The third plot is wifi signal strength in dB. In the raw sensor series, we see severe spikes in both the temperature signal (-196 deg F) and the wifi signal (2 dB) due to time-out errors.
Although these spikes are obvious and could be removed by simple thresholding, spike amplitude and frequency are often unknown in advance (see the intermittent “Pass 1” spikes in the bottom plot, which shows the wifi signal after the 2 dB spikes are removed), so an automated removal method is quite useful in practice. The second and fourth plots show the temperature and wifi signals after applying the despiking algorithm described above. For the wifi signal, we ran two additional passes of the algorithm to remove non-obvious spikes (shown as Pass 2 and Pass 3 in the bottom plot).
A note about sampling: this method assumes uniformly spaced samples with monotonically increasing timestamps, so a pre-processing resampling step may be useful in practice.
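For the sampling note, one possible pre-processing step is to sort the samples by time, drop repeated timestamps, and linearly interpolate onto an evenly spaced grid. The sketch below assumes timestamps are plain numbers (for example, seconds); `resample_uniform` and its `step` parameter are hypothetical names used only for illustration, and linear interpolation is just one of several reasonable resampling choices.

```python
import numpy as np


def resample_uniform(timestamps, values, step):
    """Resample irregular (timestamp, value) pairs onto a uniform grid.

    `step` is the desired spacing, in the same units as `timestamps`.
    """
    t = np.asarray(timestamps, dtype=float)
    v = np.asarray(values, dtype=float)
    # Enforce monotonically increasing time.
    order = np.argsort(t, kind="stable")
    t, v = t[order], v[order]
    # Keep the first reading for any repeated timestamp.
    t, first = np.unique(t, return_index=True)
    v = v[first]
    # Build an evenly spaced grid that stays inside the observed time range.
    n = int(np.floor((t[-1] - t[0]) / step)) + 1
    grid = t[0] + step * np.arange(n)
    # Linear interpolation between neighboring readings.
    return grid, np.interp(grid, t, v)
```

The returned `grid` and resampled values can then be passed to a despiking routine that expects uniform spacing.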
If there is interest in the community, we can publish a Python implementation in the glowfi.sh GitHub repo for people to play with. Just reply to this post to let us know if you are interested. We hope this was useful to some of you.
The glowfi.sh Team
For free access to the glowfi.sh API, just sign up here.