# Digital Signal Processing performance improvement help

I created a function which apply weighted moving average and Loglikehood ratio test and Binary Integration to process my XBand Radar signal.

I flashed and ran this function on particle Photon with wifi on and using micro() function to get time before and after execute this function. The result I got is ~12600 microseconds. I know many ppl have better way to optimize this code, so here it is. thanks for all your help.

`````` int Detection::Detector(volatile float *XBsignal,volatile float *PIRsignal, int lenXBsignal, int lenPIRsignal, float noise_Var, float Vt, float signal_Var,int Fs,float meanXBVal,int *PIRsum)
{
int Fo = 100;  //default for stage 2, 10 observation
int Fo1 = 10;  // default stage 1, 10 observation
int *stage1Detection = NULL;
stage1Detection = (int*) malloc(Fo*sizeof(int));
int window = 7;  //moving average window, using Matlab to calculate window
float *binomialCoeff = NULL;
int lenbinomialCoeff;
float h = {0.5,0.5};
binomialCoeff = (float*) malloc(window*sizeof(float));

conv(h,h,2,2,binomialCoeff,&lenbinomialCoeff);

for (int n = 1;n<=window-3;n++)
{
conv(binomialCoeff,h,lenbinomialCoeff,2,binomialCoeff,&lenbinomialCoeff);
}

float lrt =(2*pow(noise_Var,2)*pow(signal_Var,2)/(pow(signal_Var,2) - pow(noise_Var,2)))*log(Vt) - (Fs/Fo)*log(noise_Var/signal_Var);

// Detection::SetValues(XBsignal, lenXBsignal);
mean = meanXBVal;
// XFiltered = (float*) malloc(Fs*sizeof(float));
// Detection::WMA(XBsignal,XFiltered,lenXBsignal,7);
// do not need mean value anymore free memory
//120 usecond up to here
int sumXB = 0;
int sumPIR = 0;
int stage2Observation = 0;
int sumXBandstage2 = 0;
int sumPIRstage2 = 0;
int k;
for (int observation = 0;observation < Fo; observation++)
{
float sumxobssq = 0;
// calculate weighted moving average for each element
for (int i = 0; i < Fs/Fo; i++)
{
float WMA_k = 0;
k  = i + (Fs/Fo)*observation;

if (k < window - int(window/2))
{
for (int n = (window/2)-i;n < window;n++)
{
WMA_k = WMA_k+binomialCoeff[n]*(XBsignal[n-window/2 + k]-mean);
}
// wsignal[k] = WMA_k;  //should be devide by sum of weight, but sum of weight = 1 in this case
}
else if (k < Fs - int(round(window/2)))
{
for (int n = 0;n<window;n++)
{
WMA_k = WMA_k+binomialCoeff[n]*(XBsignal[n+k-int(window/2)]-mean);
}
// wsignal[k] = WMA_k;  //should be devide by sum of weight, but sum of weight = 1 in this case
}
else
{
WMA_k = XBsignal[k] - mean;
}
//end weighted moving average
sumxobssq = sumxobssq + pow(WMA_k,2);  // calculate N*R^2 for log likelihood ratio test
}
if (sumxobssq > lrt)
{
sumXB = sumXB + 1;
}
sumPIR = sumPIR + PIRsignal[observation];
stage2Observation++;
if (stage2Observation == Fo1)
{
if (sumXB > 7)
sumXBandstage2 =  sumXBandstage2 + 1;
if (sumPIR > 7)
sumPIRstage2 = sumPIRstage2 + 1;

stage2Observation = 0;
sumPIR = 0;
sumXB = 0;
}
}

(*PIRsum) = sumPIRstage2;
free(stage1Detection);
//  free(XFiltered);
stage1Detection = NULL;
//  XFiltered = NULL;
return (sumPIRstage2 + sumXBandstage2);
}
``````

Don’t use floats! ( Look into fixed point math )

If you need to do large amounts of DSP you might want to just dump the raw data to a server and have it crunch numbers. Otherwise you need to know how microcontrollers handle mathematical operations.

Basically :
floats = evil
multiplying ints = sometimes evil
dividing ints = almost always evil
multiplying floats = very evil
dividing floats = very very very evil

Thanks @jakeypoo, that help. One of my objective is doing all DSP on this Photon, so I try to optimize as much as possible without going use a server to do all the processes.

I will look into data type and dividing, I understand all math instruction function does not have equally run time. Faster to do multiply than divide.

None of the above quoted code depends on the inputs so I would either calculate the binomial coefficients once in setup() and store them in a global array or more likely just write out the precomputed values, probably as integers as @jakeypoo said. Try scaling by 64 for your current window length the see the integer values.

I would try to find out how much actual precision your RADAR return signal has and use integers as @jakeypoo said. I am not sure if pow(x,2) is faster or slower than x*x but I would guess slower. Similarly log(x) can be approximated in a variety of ways that might be faster than what you are using now.

1 Like

Thanks @bko. I plan to make an Initialized() function to get those common, input independence initialized. Just curious, how do you know 64 is good value to scale? I varied it in Matlab and 64 is exactly what I need to make them integer.

look like it is = (2^window)/2?

1 Like

I guess I would think of this as (1/2)^(window-1), giving you 1/64, but that’s the right idea!