I have been working for the past 3 months on a project to upgrade or add functionality to an existing product using either the Photon or the RedBear Duo.
Our system is a nurse call system that communicates with individual devices using a proprietary protocol very close to Zigbee. Our receivers take this information and return it to the main computer via a slow CANbus running at 20 kbit/s.
My project is to create a wireless link between two CANbus nodes to tie two systems together when no other wired link exists.
What I am finding is that a call to client.write can lock the system up, apparently permanently, which forces a push of the reset button. I recently noticed, however, that the lockup sometimes clears itself, occasionally within minutes but other times only after an hour or so.
I have tried the application watchdog, but it apparently has no effect when running in threaded mode: it doesn’t reset the Photon even after the loop has been stuck for a minute.
This is getting long, so let me get to some numbers and a few questions.
We process between 1.5 and 2 million CANbus messages per day. These arrive in batches of up to 150 messages per second, roughly 6.7 ms between messages. Each CANbus message is transferred over TCP immediately on receipt as a bundle of 22 bytes. Checking the return value of client.write (the number of bytes written) shows that the Photon locks up when it returns 0: it no longer runs through the loop. Time between lockups can be anything from a few minutes to well over a week.
The basic system, without TCP to transfer messages, is rock solid and has been running for well over 100 million CANbus messages. It’s only now that I’m trying to add TCP that I’m running into trouble.
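As a sanity check on those figures, here is the arithmetic written out as a small C++ sketch. The function and constant names are mine, made up for illustration, and the byte rate counts only the 22-byte payloads, not any TCP/IP overhead.

```cpp
#include <cassert>
#include <cstdint>

// Figure taken from the description above: 22 bytes sent per CANbus message.
constexpr std::uint32_t kBytesPerMessage = 22;

// Average gap between messages at a given rate, in whole microseconds.
inline std::uint32_t gapMicros(std::uint32_t messagesPerSecond) {
    assert(messagesPerSecond > 0);  // avoid dividing by zero
    return 1000000u / messagesPerSecond;
}

// Payload bytes per second at a given rate (protocol overhead not included).
inline std::uint32_t payloadBytesPerSecond(std::uint32_t messagesPerSecond) {
    return messagesPerSecond * kBytesPerMessage;
}

// Average messages per second for a given daily total (integer division).
inline std::uint32_t averagePerSecond(std::uint32_t messagesPerDay) {
    return messagesPerDay / 86400u;  // 86400 seconds in a day
}
```

At the peak of 150 messages per second this gives a gap of 6666 µs and 3300 payload bytes per second, while 2 million messages per day averages only about 23 per second. So the traffic is bursty but small in terms of raw bandwidth.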
There are a number of other threads that I have looked at, and it appears I am not alone.
The last one is probably the final solution and I will try this after I ask a few questions.
When opening a TCP connection, when should you close it? I suspect that I should not open and close the connection every 6 ms. Our system runs 24 hours a day with a maximum lull in communication of about 30 seconds. I suspect you open the connection and use it until client.connected fails or an artificial keep-alive signal fails, in which case you close the connection and try to open a new one. I find that sometimes client.connected fails many times a minute and other times it runs for days without failure, I guess depending on how noisy the Wi-Fi environment is.
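To make that question concrete, this is roughly the policy I have in mind, written as plain C++ so the logic can be checked off the device. It is only a sketch: LinkPolicy, LinkAction and decideLinkAction are names I made up, the timeout values would be my own guesses, and on the Photon the `connected` flag would come from client.connected() and the timestamps from millis().

```cpp
#include <cassert>
#include <cstdint>

// What to do with the TCP link before sending the next message.
enum class LinkAction { UseExisting, WaitBeforeRetry, Reconnect };

// Hypothetical tuning values; both are assumptions, not measured figures.
struct LinkPolicy {
    std::uint32_t keepAliveTimeoutMs;  // silence longer than this means "dead"
    std::uint32_t retryIntervalMs;     // minimum spacing between reconnects
};

// Keep one long-lived connection. Only tear it down when the client reports
// it is disconnected or nothing has been heard within the keep-alive window.
// Unsigned subtraction keeps the comparisons valid across counter wrap-around.
inline LinkAction decideLinkAction(const LinkPolicy& policy,
                                   bool connected,
                                   std::uint32_t nowMs,
                                   std::uint32_t lastHeardMs,
                                   std::uint32_t lastAttemptMs) {
    assert(policy.keepAliveTimeoutMs > 0);
    const bool alive =
        connected && (nowMs - lastHeardMs) <= policy.keepAliveTimeoutMs;
    if (alive) {
        return LinkAction::UseExisting;
    }
    if ((nowMs - lastAttemptMs) < policy.retryIntervalMs) {
        return LinkAction::WaitBeforeRetry;  // do not hammer the access point
    }
    return LinkAction::Reconnect;  // i.e. stop the client, then connect again
}
```

Since our longest lull is about 30 seconds, a keep-alive window of something like 60 seconds seems a plausible starting point, but that number is a guess on my part.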
When client.write returns 0, what should be done? Is the message stuck in a buffer? Is it lost? If this happens, should I close the connection?
If the solution offered in system firmware 0.8.0-rc.2 is what I should try, then what does the timeout mean? Again, is the message stuck in a buffer or is it lost? Should I close the connection after the timeout?
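One approach I am considering, so that a failed write does not mean a lost message, is to queue each 22-byte bundle and only remove it once the write reports all bytes accepted. The sketch below is plain C++ with names I invented (Frame, FrameQueue, drain) and an assumed queue depth of 256; the `write` argument stands in for a call to client.write. It does not handle partial writes: anything short of 22 bytes is treated as a failure, which on a real TCP stream would probably also require closing the connection so the receiver does not see half a message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kFrameSize = 22;    // bytes per bundled CANbus message
constexpr std::size_t kQueueDepth = 256;  // assumed depth, about 5.5 kB of RAM

struct Frame {
    std::uint8_t bytes[kFrameSize];
};

// Fixed-size FIFO so no heap allocation happens while traffic is flowing.
class FrameQueue {
public:
    bool push(const Frame& f) {
        if (count_ == kQueueDepth) return false;  // full: caller must decide
        frames_[(head_ + count_) % kQueueDepth] = f;
        ++count_;
        return true;
    }
    const Frame& front() const {
        assert(count_ > 0);
        return frames_[head_];
    }
    void pop() {
        assert(count_ > 0);
        head_ = (head_ + 1) % kQueueDepth;
        --count_;
    }
    std::size_t size() const { return count_; }
    bool empty() const { return count_ == 0; }

private:
    std::array<Frame, kQueueDepth> frames_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};

// Send queued frames oldest first. `write` takes (data, length) and returns
// the number of bytes accepted. A frame stays queued unless fully accepted.
template <typename WriteFn>
std::size_t drain(FrameQueue& queue, WriteFn write) {
    std::size_t sent = 0;
    while (!queue.empty()) {
        const std::size_t n = write(queue.front().bytes, kFrameSize);
        if (n != kFrameSize) break;  // stalled: retry on the next pass
        queue.pop();
        ++sent;
    }
    return sent;
}
```

With this arrangement a write that returns 0 just leaves the message at the head of the queue for the next pass through the loop. Whether that is safe depends on the answer to the question above, namely what the system firmware has actually done with the bytes when 0 comes back.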
I am running two Photons, one as server and one as client, both on system firmware 0.7.0-rc.7.
Both are in threaded mode.
RSSI is good, between 58 and 62.
Free memory as reported by System.freeMemory() is 41984 during moments without traffic, dipping to around 32000 when there is traffic.
I’m not a programmer by profession but have been learning over the last year as best I can. Any help with this issue is greatly appreciated. What I’m looking for is a way to keep the Photon running without having to reset it, as the final product is not readily accessible. The information carried by our CANbus is time sensitive, so we can’t afford to either lose messages or delay them by minutes, let alone hours.
Thanks in advance,