I’m about to describe something really strange, and I hope someone can help. Our Escape Rooms use Argons running 2.0.1 firmware. Recently, some of the Argons stopped receiving published messages. I can still call their functions, and those respond correctly, but a published message is never received. OK, that’s strange part number 1.
Strange part number 2 is that, once they get into this state, the only way to get them out is to power them off for 10 seconds and then power them back up. Rebooting will not clear the state. Reflashing will not clear the state!
Some notes on the environment. As mentioned, I am using 2.0.1. I use PublishQueueAsyncRK, but I removed the “retained” keyword from the message queue. At first I thought that I hadn’t, and that “retained” was the source of the problem (I still think it is related), but when I checked, I had already removed it. I also use SYSTEM_THREAD(ENABLED). I commented it out to see if that would help, and it didn’t. This is also how I discovered that reflashing doesn’t clear the state.
Another item in the environment is that I use a C++ class as the subscription handler. I don’t know if that matters, but I’m trying to list as many details as possible in case one of them helps.
The problem has occurred on multiple, but not all, of my Argons. They all run the same framework, so the message handling, firmware version, and other items are identical, with only the game logic being different. It recurs randomly, sometimes within 30 minutes.
I am able to detect the current state by publishing a message that every device must respond to; any device that doesn’t respond is one that has the problem. We’ve had to hook up remote power switches to bring them back online. I have also had to take other measures (using published functions) to handle the case where the problem occurs in the middle of a game.
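To make the detection step concrete, here is a minimal sketch of the roll-call comparison. It is plain C++ for the controller side, not Particle firmware and not my actual code; the function name `findSilentDevices` and the device names in the usage note are made up for illustration. It assumes the controller already knows which devices should answer and has collected the names of the ones that did answer before a timeout.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: compare the devices that should answer the roll-call
// with the ones that actually answered before the timeout, and return the
// ones that stayed silent (the devices presumed to be in the stuck state).
std::vector<std::string> findSilentDevices(const std::set<std::string>& expected,
                                           const std::set<std::string>& responded) {
    std::vector<std::string> silent;
    for (const auto& name : expected) {
        // A device that never replied did not receive the published message.
        if (responded.count(name) == 0) {
            silent.push_back(name);
        }
    }
    return silent;
}
```

For example, if the expected set is `door`, `safe`, and `lights` and only `door` and `lights` reply, the function returns just `safe`, which is the device that needs its power cycled.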
Another room using a similar system does not have the problem, but that one is running 1.5.2 because we still have some Xenons there.
So, my suspicions:
- It is related to 2.0.1
- It is related to retained (because only a power-off clears it)
Question: Can I do a software reboot that clears all the retained values? If so, I could at least clear the situation with a reboot.
Thanks. It’s a real head scratcher, but hopefully someone has an idea.