Spark Core Execution Speed

I’ve been wondering exactly how fast the Spark Core runs sketches. The STM32 runs at 72MHz, so I thought it should be blazing away… I was hoping it would be at least as fast as an Arduino Uno at executing a sketch.

Just as a simple test, I’m running this sketch:

void setup() {
  pinMode(D0, OUTPUT); // any GPIO works for this test
}

void loop() {
  for(int i = 0; i < 4; i++) { // matches the 4 HIGH/LOW pairs seen on the scope
    digitalWrite(D0, HIGH);
    digitalWrite(D0, LOW);
  }
}
This just bangs away as fast as it can… and what I’m seeing is:

High for 2.05 microseconds.
Low for 2.05 microseconds.
(repeats 4 times)
But then there is a long delay, 5 to 6.5 milliseconds, from the end of loop() back to the start of loop(). This number bounces around quite randomly, I’m assuming because background cloud tasks are interrupting for more or less time.

This basically means the fastest loop execution is about 200 times per second, which is pretty slow for such a beefcake micro like the STM32.

I wanted to perform some direct port manipulation from the sketch as well, like you can on an Arduino, to really test the clock speed, but I wasn’t sure how to do this on the Core. Is there a way to bit-bang pins directly from the sketch, like you can with the Arduino?

Could you guys please tell me if what I’m seeing is correct, and if so is this as fast as we can expect sketches to run? I know various peripherals that are handled via STM32 hardware (SPI, I2C and networking) would probably not suffer from this loop() delay, but a real time executive of 1ms tick is pretty standard for low end micros and it seems like we’d only reliably be able to create a 7ms tick without running out of real-time.


I see the same behaviour as you. It seems the system loop takes some time and can be an issue for real-time applications.

I read in another thread about the regular cloud ping, roughly every 10 seconds, which takes place in the system loop.
It would be great if it were possible to wrap some system actions in this system loop into IF clauses to enable/disable these actions temporarily or permanently from within user code,
e.g.

System code:

if (CLOUD_PING) {
  // ... do the periodic cloud ping ...
}


User code:

CLOUD_PING = !bTimeCritical; // during time critical actions, don’t do CLOUD_PINGing


@BDub I don’t believe we’ve done any tests on this but if that’s what you’re seeing in real-life tests then it’s hard to argue with. However there are plenty of opportunities for optimization; we’ve spent little time thinking about how to speed things up and lots of time thinking about how to make things easier for the user.

One of the major tasks we have on the horizon is integrating an RTOS like FreeRTOS which will do a better job juggling the system loop (handling the connectivity stuff) and the user loop. However we’re already using a decent amount of RAM for our RSA handshake, so we’re trying to figure out how to achieve this within the limits of our chip.

@BDub @lemouchon @ScruffR If you’re finding that our firmware stack isn’t meeting your needs for real-time applications, PLEASE hack our firmware! This is exactly why it’s open source, and if you’re comfortable in embedded land, I’m sure you will find plenty of opportunities for improvement without looking too hard, since speed optimization hasn’t been high on our list.

Hi @zach,
I hope you do not take my comments as criticism. I only want to be sure I understand which applications your fabulous device and concept can be used for, and to help other users clearly understand some behaviour that seems strange at first approach (but is logical given the way the firmware works).

And as you say this is an open source project and we are free to change it for our own usage. :slight_smile:

@zach There’s no doubt you have made it easy for the user, that work is clearly visible :wink:

Perhaps set some background goals to at least match the performance of the Arduino Uno as far as speed goes. I’m pretty sure the digitalWrite() execution time is already faster; I remember the Arduino taking 3 to 5 microseconds to set a pin high or low. It might not be possible to whittle that loop() delay down to nothing, but certainly there is room to optimize it like you said. Giving the user a way to bit-bang pins would be sweet with the speed of the STM32.

I will agree with @ScruffR as well that it would be good to know when the long background tasks are about to occur. A way to temporarily disable them would be nice too, like disabling interrupts for time-critical routines.

Can you tell me what the longest blocking delay might be for the background tasks?

I haven’t used FreeRTOS in any of my embedded designs, so I can’t speak too much for it… but my gut tells me it will suck up a lot of RAM and real-time in trade for convenience. In the Arduino it’s already pretty easy to roll your own state machine or timed processes without much effort. You just need to provide some decent examples like this one for timed processes.

// Temperature filtering with a 1/16th Dilution Filter
// BDub 12-21-2013 tested and working on Spark Core
// 1/16th of the new reading gets added to the ongoing 
// running total of 16 virtual readings, with a little 
// correction for the truncation process.  Very fast 
// filter for slow 8-bit uC's that don't have multiply 
// or divide instructions.
// avg = (new + (avg * 16) - avg +/- offset) / 16;
// avg = (new + (avg * 15) +/- offset) / 16;

uint8_t TEMP_PIN = A0;
uint8_t LEDPIN = D7;
uint8_t DEBUG = false;
uint16_t rawTemp = 0;
uint16_t avgTemp = 0;
int16_t offset = 0;
uint32_t lastTime = 0;
uint8_t msCounter = 0;
uint16_t threshTEMP = 2048; // approx. 3.3V/2

// The larger the update interval, the heavier the filter will be.
uint32_t UPDATE_INTERVAL = 10; // in milliseconds

void setup() {
  // for debug
  if(DEBUG) Serial.begin(115200);                   
  pinMode(TEMP_PIN, INPUT);
  pinMode(LEDPIN, OUTPUT);
  // seed the average reading
  avgTemp = analogRead(TEMP_PIN);
}

void loop() {
  // Update the filter every 10ms (default)
  if(millis() - lastTime > UPDATE_INTERVAL) {
    // Set a new last time
    lastTime = millis();
    // Read the temperature input
    rawTemp = analogRead(TEMP_PIN);
    // Add or subtract the offset based on new reading
    if(rawTemp >= avgTemp)
      offset = 15;
    else
      offset = -15;
    // Filter the ADC every 10 ms (will resolve in approx. 740ms worst case 0-5V)
    avgTemp = (uint16_t)((rawTemp + (avgTemp << 4) - avgTemp + offset) >> 4);
    // You can see (avgTemp << 4) - avgTemp is a fast way to multiply by 15.
    // Debug
    if(DEBUG) {
      Serial.print("RAW: ");
      if((rawTemp > 99) && (rawTemp < 1000))
        Serial.print(" ");
      else if((rawTemp > 9) && (rawTemp < 100))
        Serial.print("  ");
      else if(rawTemp < 10)
        Serial.print("   ");
      Serial.print(rawTemp);
      Serial.print(" AVG: ");
      Serial.println(avgTemp);
    }
    // Process Temperature Reading every 100ms
    // Every time through is 10ms, 10 x 10ms = 100ms.
    if(++msCounter > 10) {
      msCounter = 0;

      // If temperature is above thresh, light the LED
      if(avgTemp > threshTEMP)
        digitalWrite(LEDPIN, HIGH);
      // else keep the LED off
      else
        digitalWrite(LEDPIN, LOW);
    } // End inner timing loop (100 ms) 
  } // End outer timing loop (10 ms)
} // End main loop (currently runs every 5-6 ms)

One other large hit to performance, which cannot be predicted, is the delay when connecting to the cloud. If your internet has a high ping it could be a huge delay. I’ve seen wireless pings higher than 50ms cause a huge offset.

Thanks for the feedback guys. This is very high on our priority list (specifically decoupling our code and the user code). Here’s a quick note from @satishgn, who’s working on this:

The plan is to have 2 super loops instead of one main loop. The Spark
main loop will be part of the kernel process, and the Arduino-style
setup() and loop() will run as a user task independent from the main
loop. There’s some assembly programming involved to achieve this.

Hopefully by next weekend, the firmware will be suited for real-time
user applications.


My internet connection at home can have ping times of 700–900 ms on bad days. Even on good days, 150–200 ms is not uncommon. I would consider 50 ms to be fast!

My home is in a rural area, and the phone line will not support any variation of DSL. One day I hope we will get fibre to the roadside cabinet, when I hope the last couple of miles of copper will allow something, but until then I am forced to rely on the 3G network for my connection.

That could cause dropouts leading to massive delays. As you said, almost 1 second of delay just to connect to the cloud, and even the risk of losing the connection and stopping entirely. The option to enable and disable the server checks would be nice. Even if it came to me running a local cloud server to do the cloud stuff, at least the Core should always be connected through the LAN even if it has no internet.

The one concern I have is that my lights are LEDs, so if my power goes out I can still run them. But once a Core is connected to these, even with a battery to power it, it won’t run unless I have internet. Or even demonstrating my project at university: no WiFi means no demonstration. It could be a huge drawback, especially with a lot of universities using Arduinos. These would be much better to use but have the huge flaw of needing the cloud. It’s a big opportunity to miss out on.


Once the user firmware and Spark firmware are decoupled, this should no longer be an issue; your user code will no longer be blocked by connectivity issues, and vice versa.

Just wondering if this was addressed in a firmware update between Dec and now? I’m currently looking at a system where a 6 millisecond loop delay will kill the performance. Thanks.

I’m not sure if I understood you correctly, but delay() now calls SPARK_WLAN_Loop() so the connection to the cloud is not dropped.

@kennethlimcp, SPARK_WLAN_Loop() can take 5–6 ms to run!

I’m not looking to use any delay() commands, I want the main loop to be performing as fast as possible, I don’t need to be connecting to the cloud or to my internal network at any point during the loop for this function, everything is being output serially via USB.

The OP noted 2 microsecond response time for digital HIGH/LOW which is fine (though I’d be happier with 1 microsecond or faster!) for what I want to do within the loop - but a multi-millisecond delay at the end of every loop will be a major issue.

Zach had noted in his response in Dec 13 this might be addressed in a firmware update, but I couldn’t find a list of changes to firmware since then or any notes if this had been addressed.

If anyone has any detailed information on the timing within and between loops, and any way to speed response time, it would be much appreciated.

Hi @PaulR

You know you can turn off the Spark cloud connection and the WiFi connection now, right? I think trying that would be your best bet.

#include "spark_disable_wlan.h"
#include "spark_disable_cloud.h"

I did not know that - thank you!

Hi @PaulR - at this point it looks like we won’t be able to change the delays in the CC3000 host driver anytime soon. That said, besides the #includes that @bko mentioned, in the next month we will be releasing a number of new methods that will let you take control of the connectivity so that you can process messages to/from the Cloud on your own schedule. It won’t eliminate the delays, but it will at least put them under your control.


Will the 5 ms delay still persist even after disconnecting from the cloud with WiFi.on() and Spark.disconnect()?
5 ms is around 200 Hz and I need to sample a signal at 250 Hz. Where in the documentation can I check the clock configuration? I couldn’t find it.
Thanks and regards

@sakethurambhatla, I am not sure what you mean by the “clock configuration”. However, it does sound like you may need a hardware timer running at 250Hz to create an interrupt in order to sample a signal at that rate. I recommend using the SparkIntervalTimer library I created. It is available as a library in the web IDE. :smile:
