It will reset the STM32 which should reinitialize the CC3000. There is no RESET line on the CC3000 so the only way to hard reset that is with a power cycle. So far I have seen the watchdog timer code work during CFOD though, after which the Core connects to the Cloud… so re-initializing the CC3000 seems to work in lieu of a hard reset.
No, I meant it’s automatically kicked in the main MAIN loop, found in main.cpp. It’s the main loop that basically does this:
while(1) {
    runBackgroundTasks();
    if(online) {
        runUserSetupOnce();
        runUserLoop();
        // proposing to put watchdog reload here
        if(8 seconds elapsed) kickTheDog();
    }
}
Now, to answer the question of why 8 seconds: I was initially thinking it would save time to not reload it every time we came out of the user code, basically shortening the background task time before we get back to the user code again. However, it appears the reload is really just one command wrapped in a function call… which is not much of a time difference compared to a counter comparison. I think we should just kick the dog each time we leave the user code, but before the background tasks are run again. This gives PLENTY of time to connect to the WLAN or CLOUD (26.208 seconds).
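To make that ordering concrete, here is a rough host-side simulation, not real firmware. `SimWatchdog`, `kick`, `tick`, and `wouldReset` are hypothetical names made up for this sketch, and it assumes the Core is online on every pass. On hardware, the kick would be the one-line reload (`IWDG_ReloadCounter()` in ST's standard peripheral library).

```cpp
#include <cstdint>

// Hypothetical host-side model of the IWDG: a down-counter in milliseconds.
struct SimWatchdog {
    std::uint32_t timeoutMs;
    std::uint32_t remainingMs;
    bool          resetFired;

    constexpr explicit SimWatchdog(std::uint32_t t)
        : timeoutMs(t), remainingMs(t), resetFired(false) {}

    // Reload the counter; on real hardware this is a single register write.
    constexpr void kick() { remainingMs = timeoutMs; }

    // Advance simulated time; latch a reset if the counter runs out.
    constexpr void tick(std::uint32_t elapsedMs) {
        if (elapsedMs >= remainingMs) {
            remainingMs = 0;
            resetFired = true;
        } else {
            remainingMs -= elapsedMs;
        }
    }
};

// Simulate `passes` trips through the main loop. Each pass spends
// backgroundMs in background tasks and userMs in user code, then kicks the
// dog on the way out of user code. Returns true if the watchdog would have
// reset the Core at some point.
constexpr bool wouldReset(std::uint32_t backgroundMs, std::uint32_t userMs, int passes) {
    SimWatchdog dog(26208);  // 26.208 s, the figure quoted above
    for (int i = 0; i < passes && !dog.resetFired; ++i) {
        dog.tick(backgroundMs);           // stands in for runBackgroundTasks()
        dog.tick(userMs);                 // stands in for runUserLoop()
        if (!dog.resetFired) dog.kick();  // reload before background tasks run again
    }
    return dog.resetFired;
}
```

With these assumptions, a user loop that returns quickly never trips the watchdog, while a single pass whose background time plus user time reaches the timeout does.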
The only thing about resetting the ORANGE/RED breathing state back to CYAN after a fixed uptime period of 6, 12, 24 hours is… if you come back and look at your core after 25 hours, you won’t know that it ever had an issue. I agree that breathing cyan is the best “look”, but somehow you need to permanently throw some kind of status that things went bad. You can always clear this in your user code after you log the condition, but if you’re not logging it… then you need to know about it. Perhaps there could be a different way to view the IWDG_RESET though… like breathe cyan but mix in a RED blip when the cyan fades out completely. It would be subtle, just enough to get the point across.
The only problem with NOT having the IWDG enabled is that if your code (any code) locks up before you enable it… you are toast. That’s why the watchdog is typically one of the first things you enable. And if your user code locks up immediately on entry, you’ll never loop through the background tasks enough (as in not at all) to catch the OTA update.
Basically, the mode button currently doesn’t do anything until you hold it down for 3 seconds… so a shorter press and release of, say, between 0.25 and 2.99 seconds could reset the USER_CODE_ENABLED flag and make the RGB breathe magenta to indicate it’s connected to the Cloud but NOT running your user code. Magenta is already associated with flashing user code, so it’s a good tie-in color, and breathing tends to indicate being connected to the Cloud.
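The press-length idea boils down to a small classifier. This is only a sketch: `ButtonAction` and `classifyPress` are hypothetical names, and the thresholds are simply the 0.25 s and 3 s figures suggested above, measured at button release.

```cpp
#include <cstdint>

// Hypothetical outcomes for a mode-button press, judged when it is released.
enum class ButtonAction {
    Ignore,           // shorter than 250 ms: treat as bounce or an accidental tap
    DisableUserCode,  // 250 ms up to just under 3 s: clear USER_CODE_ENABLED
    LongHold          // 3 s or more: whatever the existing long-hold behavior is
};

// Map how long the button was held (in milliseconds) to an action.
constexpr ButtonAction classifyPress(std::uint32_t heldMs) {
    return heldMs >= 3000 ? ButtonAction::LongHold
         : heldMs >= 250  ? ButtonAction::DisableUserCode
                          : ButtonAction::Ignore;
}
```

The lower bound doubles as a crude debounce, so a brush against the button would not knock out the user code.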
Since there is a very simple loop as depicted above, I think it makes sense to kick the dog every time through that process, but only when it’s ONLINE, because of issues like CFOD or the WLAN not connecting. Once CFOD is a vague memory, we could move the kicking of the dog to just before runBackgroundTasks(); but even if CFOD is not an issue anymore, the WLAN not connecting still could be… so perhaps the ONLINE part of the code is the best place overall. If you disable SPARK_WLAN_ENABLE, the IWDG still functions as a watchdog against user code locking up, and technically it CAN be disabled in user code if need be. I think there MAY be an issue with SmartConfig here, since that takes a while to work and is not considered ONLINE at that point. I have an old timing diagram that seems to indicate SmartConfig is its own tight loop. Anything like that obviously needs its own kicking of the dog within it.
How long to wait is a good question… but right now, in hardware, the longest delay the IWDG offers is 26.208 seconds. If we wanted to wait longer than that, I can’t currently think of a good way to add a foolproof software counter to extend it; it would be susceptible to lockups. 26.2 seconds seems like PLENTY of time to connect to your WLAN or the CLOUD. If not, it gets reset and can try again for another 26.2 seconds.
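For anyone wondering where 26.208 comes from, the arithmetic can be sketched as below. The inputs are my assumptions, not something stated in this thread: a nominal 40 kHz LSI clock, the largest prescaler (256), and the 12-bit reload value of 4095. `iwdgTimeoutMs` is a hypothetical helper name.

```cpp
#include <cstdint>

// Timeout in whole milliseconds for a given prescaler, reload tick count,
// and LSI clock frequency: ticks * prescaler / frequency.
constexpr std::uint64_t iwdgTimeoutMs(std::uint32_t prescaler,
                                      std::uint32_t reloadTicks,
                                      std::uint32_t lsiHz) {
    return static_cast<std::uint64_t>(prescaler) * reloadTicks * 1000u / lsiHz;
}

// Assumed values: prescaler 256, reload 4095, nominal 40 kHz LSI.
constexpr std::uint64_t kMaxTimeoutMs = iwdgTimeoutMs(256, 4095, 40000);  // 26208 ms
```

Counting 4096 ticks instead of 4095 gives about 26.214 seconds, which is the figure some ST tables show. The LSI is an untrimmed RC oscillator, so the real timeout on any given Core can drift noticeably from the nominal value; plugging an illustrative 60 kHz into the same formula gives roughly 17.5 seconds.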
It’s kind of hard to know WHAT failed when you reset from IWDG… unless we are constantly writing the state of where the code is to non-volatile memory as it’s looping… but I think we would wear out the memory pretty fast. Even if we knew it was USER code that locked up… why should we automatically prevent it from running again? Maybe it just locks up intermittently… say, exactly on the hour because of some counter wrapping or a bad compare of some time variables. If we just reset and run the code again, the user’s code will run for a whole hour before it locks up again… maybe providing them with precious sensor data. We can latch an indication on the RGB that an IWDG reset has occurred, which should help a user realize there is potentially a problem with their code or network. If there was an easy way to check uptime, a user could see the IWDG indication and send a request like: https://api.spark.io/v1/devices/?access_token=xxxxx to check uptime and see when it last reset. Uptime is basically the millis(); counter, and could easily be implemented in user code as a variable as well, but that wastes one of your available variables.
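As a rough illustration of the uptime idea, here is how a raw millis() value could be turned into something readable. `Uptime` and `uptimeFromMillis` are hypothetical names for this sketch, and it assumes a 32-bit millisecond counter, which wraps after roughly 49.7 days.

```cpp
#include <cstdint>

// Hypothetical breakdown of a millisecond counter into readable parts.
struct Uptime {
    std::uint32_t days;
    std::uint32_t hours;
    std::uint32_t minutes;
    std::uint32_t seconds;
};

// Convert a millis()-style counter value into days/hours/minutes/seconds.
// Fractions of a second are dropped.
constexpr Uptime uptimeFromMillis(std::uint32_t ms) {
    const std::uint32_t totalSeconds = ms / 1000u;
    return Uptime{
        totalSeconds / 86400u,          // whole days
        (totalSeconds % 86400u) / 3600u, // leftover hours
        (totalSeconds % 3600u) / 60u,    // leftover minutes
        totalSeconds % 60u               // leftover seconds
    };
}
```

Since the counter restarts at zero after any reset, a small uptime next to a latched IWDG indication would tell you roughly when the reset happened.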
User code should be able to block up to the IWDG timeout value, of course. I understand, though, that if user code blocks for more than 10-15 seconds the Core will drop off the Cloud. So why wait longer than that? To allow plenty of time to get the WLAN and CLOUD connected in the first place.
[quote=“zachary, post:56, topic:2693”]
A breathing red LED seems like a good signal something’s wrong. I’m not sure about whether it’s better to run the user code while the red LED breathes or to not run it. Maybe try running it with an orange breathing LED the first time, and if we fail again, breathe red and don’t run the user code.
[/quote]
I’ve experimented with orange on the RGB and it just looks yellow and/or red. It’s hard to tell it’s CLEARLY orange if you’ve never stared at all of the other colors. Perhaps the red blip idea woven into the cyan breathing would be best? It could even blip once, twice, thrice… for the number of resets. Anything over 3 is going to start being too many blips to count, so you can just assume it’s reset a lot. I do think we need to run the user code until the user decides not to… pretend everyone is designing a Black Rocket… mission critical stuff.
I think the watchdog code only works properly if it’s always running… i.e. the core-firmware sets it up… and user code can augment it (make it time out faster or disable it completely)… but you should not have to figure out how to set up the watchdog in your “arduino-like” code. You have better things to worry about! Understanding the codes is key, though, to knowing what’s working and what’s not.