My Core has been dropping off my local network once or twice a day. I cannot get a response from it, although it’s still “breathing” cyan. I need to push the reset button, after which the Core reconnects and is fine for a while.
What is the underlying mechanism of the breathing? I suspect it’s an interrupt. Can I duplicate the process to create a watchdog that tests for a network connection, or periodically reset the Core via software?
I have a similar problem with the beehive monitor. Once a week or so it fails to make a good WiFi connection when it makes its hourly wake from deep sleep and sits there inaccessible and doing very little. If I don’t notice, the Spark runs its batteries flat in a few hours. If I notice I can power cycle and everything springs back to life. Since both alternatives involve putting on a bee suit I’d rather have a watchdog recognize I’m stuck and try a reboot. Is there any news on the watchdog timer front?
unsigned int testConnectCount = 0;

void loop() {
    if (testConnectCount == 0) { // Every couple thousand or so loops, check whether the Spark is connected
        if (!Spark.connected()) { // If the Spark's not connected, turn off WiFi, sleep 10 seconds and turn it on again
            Spark.sleep(10);
            while (WiFi.status() != WIFI_ON) delay(1000000); // Meant to wait a second, then check status
        }
        testConnectCount++;
    } else {
        testConnectCount = testConnectCount == 2000 ? 0 : testConnectCount + 1;
    }
    ...
}
I put it in last night and haven’t had to physically reset the Core since. In my case, I wait in a “while()” loop until the Spark reconnects. This is fine for my application. You might want to do something different.
Sorry about the code legibility, I don’t know how to make it look nice.
@ronm, delay() does NOT block the background task anymore (for quite a while actually). It will “appear” to block the user code but not the background task.
Thanks for the clarification. It must have been an old post; I didn’t look at the date. Then the code is working as I expected, except it’s sitting around about 1000x longer than I thought.
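For anyone copying the snippet: delay() takes its argument in milliseconds, so delay(1000000) waits roughly 1000 seconds rather than one. A minimal sketch of keeping the units explicit follows. The names MS_PER_SECOND, secondsToMs and checkUnits are hypothetical helpers for illustration, not part of the Spark firmware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names; not part of the Spark firmware API.
constexpr uint32_t MS_PER_SECOND = 1000;

// Convert whole seconds to the millisecond count that delay() expects.
constexpr uint32_t secondsToMs(uint32_t seconds) {
    return seconds * MS_PER_SECOND;
}

// Usage check: one second is 1000 ms, so a delay of 1000000 ms is 1000 seconds.
inline void checkUnits() {
    assert(secondsToMs(1) == 1000u);
    assert(1000000u / secondsToMs(1) == 1000u);
}
```

On the Core, the wait inside the while loop could then be written as delay(secondsToMs(1)).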
@ronm Thanks - I had overlooked the Spark.connected() test. That is just what I want to trap with my watchdog.
I gave your code a try. I modified it slightly as follows:
void loop() {
    // check that we are connected before proceeding - suggested by ronm :)
    if (!Spark.connected()) {
        digitalWrite(led, HIGH);
        Spark.sleep(10); // WiFi off for a bit
        while (WiFi.status() != WIFI_ON) delay(1000); // wait for WiFi to resume
        delay(5000); // allow time for re-connection
        digitalWrite(led, LOW);
    }
    else {
        if (udp.parsePacket() > 1) {
            …
I put in a 5 second delay and test on every pass through the loop rather than every 2000 passes. If I do have a connection, I only go round the loop 4 times before going back into deep sleep; if I don’t, I’m round the loop in microseconds. Maybe I need to give the Spark a bit longer than 5 seconds to connect here - we’ll see.
I’ll let you know how this version goes. One plus is that if I don’t have a connection, at least the WiFi is off for 2/3 of the time, so the battery won’t go flat so fast.
Your code toggles the WiFi if the connection is lost. If this doesn’t improve the connection reliability I wonder whether I could connect one of the digital pins to the reset pin with a 10k resistor and pull it down if I have a long interval with no connection.
I’ve been having trouble with Spark.connected(). It seems my Spark is hanging on a TCP read or write. Spark.connected() is still true, but TCP is hung. I can’t get status on the client or server, so I’m a little stuck. Since the Spark is still connected, I can reload the firmware, effectively rebooting it. I’m currently trying to get the CLI on my Pi to reboot the Spark when the TCP comms go dead.
@ronm did you get a lasting solution to this? Unfortunately, after a week or so it turned out that my variant of your code didn’t cure my hanging problem. Toggling WiFi doesn’t seem to be a “deep enough” reset and of course the trouble may lie at the router end. I have just tried another variant:
void loop() {
    if (millis() - startT > 30*60*1000) { // if Spark has been awake for more than 1/2 hr (i.e. stuck), send it to deep sleep for 30 secs to reset
        Spark.sleep(SLEEP_MODE_DEEP, 30);
    }
    if (udp.parsePacket() > 1) {
        ...
        ...
        startT = millis();
        ...
startT is set to millis() in setup() and on a successful UDP communication.
Correct operation is for the Spark to wake up, listen for UDP instructions and, after transmitting some measurements, receive a final instruction that sends it to sleep. If it is still awake after 1/2 hour, it has got stuck somehow and needs to be reset. Since toggling WiFi didn’t help, I’m trying 30 secs of deep sleep, which seems to be the closest thing to a software reset.
So far so good, but I need a couple of weeks’ running to check.
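The "still awake after half an hour" test can be sketched in portable C++ as below. This is only an illustration: StuckTimer, feed, expired and stuckTimerExample are hypothetical names, and on the Core the `now` argument would come from millis().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper; on the Core, `now` would be the value of millis().
struct StuckTimer {
    uint32_t startT;     // time of the last successful communication
    uint32_t timeoutMs;  // how long to stay awake before giving up

    // Record a successful exchange (the startT = millis() step).
    void feed(uint32_t now) { startT = now; }

    // True once more than timeoutMs has passed since the last feed.
    // Unsigned 32-bit subtraction stays correct across a rollover of the counter.
    bool expired(uint32_t now) const { return (now - startT) > timeoutMs; }
};

inline void stuckTimerExample() {
    StuckTimer t{0u, 30u * 60u * 1000u};  // 30 minutes, as in the post
    assert(!t.expired(1000u));            // one second in: still fine
    assert(t.expired(1800001u));          // just past 30 minutes: stuck
}
```

When expired() returns true, the firmware would take whatever recovery step is being tried, here Spark.sleep(SLEEP_MODE_DEEP, 30).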
I’m having a similar problem with TCP hanging up periodically.
It’s a bit hacky but if we can’t find a way to test for the problem in code perhaps just doing a Spark.reset() every 12 hours or so would be a workaround for now.
@mrOmatic - is that new? I couldn’t find a software reset command before. How do you call Spark.reset()? When I tried from the IDE I get: error: 'class SparkClass' has no member named 'reset'
and I can’t find a software reset in the documentation. On the Arduino I’m told the hack is to define a function pointer with the address 0 and call that.
Thanks - I can only program the bee Spark remotely so that would be useful.
Edited: Yes, I’ve checked. Spark.reset() is in spark_utilities.cpp but not in the IDE yet. However, NVIC_SystemReset(); does exactly the same job and can be called from the IDE.
Edit 2: System.reset() is in the IDE and also does the same thing.
Haven’t had a chance to try it on a Core yet, but from my up-to-date build environment I can confirm that System.reset(); compiles OK and Spark.reset(); fails.
I’ll do an actual test on a Core when I get a moment.
I am using System.reset() using the IDE and can confirm that it works.
I’ve made a small webserver, but the Spark Core is far from stable. Most often the Core stops functioning properly within ten or twenty minutes, and it takes a long time to reset by itself.
To get more insight into its behavior I have added a watchdog function to my loop() which checks every second to see when it was last called. If the function was called more than three seconds ago, it prints a warning message to the serial connection and to a logfile on SD. If it was called more than ten seconds ago, it prints the same message (which mentions the actual duration of the outage) and then calls the System.reset function.
Over the past days my logfile has recorded many of these watchdog resets. I’ve noticed that most of them are preceded by a 60 second outage. Not 57 seconds, not 62 seconds, always 60 seconds. To be more precise, the log also records the milliseconds, and I get outage durations such as 60.0, 60.880, 60.135, 60.679, etc. In other words, these outages always took 60 seconds (rounded to the second) before control was returned to my loop. On other occasions I also get 20 second outages, but those are far less frequent.
I can only conclude that there is a 60 second blocking time-out somewhere in the Spark Core firmware code or perhaps the WiFi driver, which returns control to the user code once it expires. In earlier versions my server would just not respond for long periods and sometimes reset automatically (I guess as instigated by the Spark firmware).
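A rough sketch of that threshold logic in portable C++ is below. It is not the poster's actual code: WatchdogAction, checkGap, the two constants and checkGapExample are hypothetical names, and on the Core the timestamps would come from millis().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; thresholds taken from the description above.
enum class WatchdogAction { None, Warn, Reset };

constexpr uint32_t WARN_AFTER_MS = 3000;    // log a warning past three seconds
constexpr uint32_t RESET_AFTER_MS = 10000;  // log and reset past ten seconds

// Decide what to do from the current time and the time of the previous call.
inline WatchdogAction checkGap(uint32_t now, uint32_t lastCall) {
    uint32_t gap = now - lastCall;  // unsigned, so a counter rollover is harmless
    if (gap > RESET_AFTER_MS) return WatchdogAction::Reset;
    if (gap > WARN_AFTER_MS) return WatchdogAction::Warn;
    return WatchdogAction::None;
}

inline void checkGapExample() {
    assert(checkGap(1000u, 0u) == WatchdogAction::None);    // normal one-second pass
    assert(checkGap(60135u, 0u) == WatchdogAction::Reset);  // a 60.135 s outage
}
```

On Warn the firmware would write the message to serial and SD; on Reset it would write the message and then call System.reset(). Because the check runs inside loop(), it can only fire once the user code gets control back.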
Unfortunately this instability makes the Spark Core unusable for my production environment. Hopefully the firmware will be improved soon.
To investigate a bit more, I went back to the beginning today. I took the sample blink program, added a tiny beep (using an active piezo buzzer) and adjusted it to blink three times, followed by a beep every second. The loop contains nothing but these blinks and the beep. All delays within the loop total one second, and all other code is only the digitalWrite of either the blue LED or the buzzer. In summary: this code should give a regular beep every second, accompanied by three blinks.
I found that when running this code (not containing any TCP calls) the stability of the Core was much better, but the beep was still not as regular as it should be. It sometimes skipped a second or more, I guess when executing Spark code.
I didn’t keep it running for a very long time, but long enough while typing this response to notice the irregularities…
I think there is a compiler bug (or at least a difference from the Arduino compiler). I used to be able to do something like this at the beginning of the function to ensure the function ran only once per second
This is a pretty odd gotcha, I wonder if it has to do with the 32 bit operating system? I guess subtracting two unsigned numbers can give you a negative in this compiler!
Edit
I am now seeing the problems you were having. Every once in a while the Spark Core will just sit there – it won’t blink, it won’t do anything. This is a pretty significant problem – hopefully it can get addressed soon!
The millis() function returns an unsigned long (32-bit unsigned integer), so your cast is throwing away the upper 16 bits. Try unsigned long or uint32_t as the type for the value returned by millis().
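The snippet in question did not survive in this thread, so the following is only a sketch under the assumption that the timestamps were cast to a 16-bit unsigned type. The names elapsed32, elapsedTruncated and truncationExample are hypothetical. It shows why the full-width subtraction is safe and how the truncated one can even come out negative where int is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Full-width version: unsigned 32-bit subtraction wraps modulo 2^32, so the
// result is the true elapsed time even after the counter rolls over.
inline uint32_t elapsed32(uint32_t now, uint32_t last) {
    return now - last;
}

// Assumed 16-bit version: the cast keeps only the low 16 bits of each
// timestamp, and where int is 32 bits both operands are promoted to int
// before subtracting, so the difference can be negative.
inline int elapsedTruncated(uint32_t now, uint32_t last) {
    uint16_t n = static_cast<uint16_t>(now);
    uint16_t l = static_cast<uint16_t>(last);
    return n - l;
}

inline void truncationExample() {
    // 66 seconds apart, but only the low 16 bits survive the cast.
    assert(elapsed32(70000u, 4000u) == 66000u);
    assert(elapsedTruncated(70000u, 4000u) == 464);
    // The low 16 bits wrapped between the two readings: negative result.
    assert(elapsedTruncated(65636u, 65500u) == -65400);
}
```

With a once-per-second guard such as "elapsed < 1000", the truncated value would start giving wrong answers after about 65 seconds of uptime.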
I ran a regular blink and the code had no problems for a good long while (over 1000 seconds). I then switched back to running my code which uses the TCPClient. My code attempts to connect to the TCPClient every 5 seconds (it never connects because currently the client is down). The code runs great for 310 seconds, and then every 15 seconds it has a long delay. There are a couple of possibilities here:
It is delayed every 3rd attempt at connecting to the server
It fails every 15 seconds for another reason (I don’t think it is my code)
I am going to shorten the connection time to test the first theory and get back to you.