Photon freezes on startup after power disconnect


#1

I’m working with several Photons, developing a prototype for an in-home monitoring system.

The issue is that when the units are disconnected from power for a significant period of time, they freeze when starting back up. The location of the freeze varies if I add various delays, so I don’t think it’s a particular line of code.

If I enter safe mode and then hit the reset button, it boots correctly. If I momentarily disconnect the power and then reconnect it, it also boots correctly. Once up, they are rock solid, running for days.

I have eliminated the 5 GHz network in the location, so they should only be connecting to the 2.4 GHz network. It’s pulling an IP address, and it is getting past:

Particle.connect();
waitUntil(Particle.connected);

in the setup portion, so it’s getting to the cloud. It doesn’t seem to get out of the setup section before freezing.

Here’s the code if that would be helpful, but it’s pretty long as I’m connecting an SPI display, IR receiver, CAN bus, some buttons and a thermistor. The goal is to have a standard unit that can then be customized on the fly for various uses. This code is to test the various subsystems as part of our development process.

PRODUCT_ID(XXXX);
PRODUCT_VERSION(1);

#include "IRremote.h"
#include "papertrail.h"
#include <SPI.h>
#include <Wire.h>
#include <Adafruit_GFX.h>
#include <Adafruit_SH1106.h>
#include "photon-thermistor.h"

int chkID = A2;
int ledR = D7;
int ledB = D0;
int ledG = A6;

int switchPin = A1;
int buttonPin = D5;
int RECV_PIN = D3;
int buzzerPin = A0;

CANChannel can(CAN_D1_D2);

int pwrPin = TX;
int tempPin = A7;

#define OLED_DC     D4
#define OLED_RESET  D6

Adafruit_SH1106 display(OLED_DC, OLED_RESET);
#define LOGO16_GLCD_HEIGHT 16
#define LOGO16_GLCD_WIDTH  16
static const unsigned char PROGMEM logo16_glcd_bmp[] =
{ B00000000, B11000000,
  B00000001, B11000000,
  B00000001, B11000000,
  B00000011, B11100000,
  B11110011, B11100000,
  B11111110, B11111000,
  B01111110, B11111111,
  B00110011, B10011111,
  B00011111, B11111100,
  B00001101, B01110000,
  B00011011, B10100000,
  B00111111, B11100000,
  B00111111, B11110000,
  B01110000, B01110000,
  B01111100, B11110000,
  B00000000, B00110000 };

#if (SH1106_LCDHEIGHT != 64)
#error("Height incorrect, please fix Adafruit_SH1106.h!");
#endif

IRrecv irrecv(RECV_PIN);

decode_results results;

Thermistor *thermistor;

volatile int valID;
volatile int chkButton;
volatile int chkSwitch;
volatile int chkTime;
volatile int chkWeb;

PapertrailLogHandler papertailHandler("logs6.papertrailapp.com", ****, "****");

void setup()   {
  Serial.begin(9600);
  Serial.println("Starting serial port");

  chkButton = 0;
  valID = 0;
  chkSwitch = 0;
  chkTime = 0;
  chkWeb = 0;

  pinMode(ledG, OUTPUT);
  pinMode(ledR, OUTPUT);
  pinMode(ledB, OUTPUT);

  pinMode(chkID, INPUT);
  pinMode(buzzerPin, OUTPUT);
  pinMode(buttonPin, INPUT_PULLUP);
  pinMode(switchPin, INPUT_PULLUP);

  liteLED(1,1,1); // white
  delay(500);
  liteLED(1,0,0); // red
  Particle.connect();
  waitUntil(Particle.connected);
  liteLED(0,1,0); // green
  delay(5000);

  attachInterrupt(buttonPin, localSwitch, RISING);
  attachInterrupt(switchPin, remoteSwitch, RISING);
  Particle.function("webCall", webAlert);

  Serial.println("Loading interrupts");

  irrecv.enableIRIn(); // Start the receiver

  display.begin(SH1106_SWITCHCAPVCC);
  display.display();
  delay(1000);

  valID = analogRead(chkID);

  if (valID > 3000 && valID < 4000) {
    pinMode(pwrPin, OUTPUT);
    pinMode(tempPin, INPUT);
    thermistor = new Thermistor(A7, 10000, 4095, 10000, 25, 2700, 5, 20);
    Serial.println("Thermistor engaged");
  }

  // Check LEDs and buzzer
  liteLED(1,0,0); // red
  delay(500);
  liteLED(0,1,0); // green
  delay(500);
  liteLED(0,0,1); // blue
  delay(500);
  liteLED(1,1,1); // white
  delay(500);
  liteLED(0,0,0); // off
  digitalWrite(buzzerPin, HIGH);
  delay(500);
  liteLED(0,0,0); // off
  digitalWrite(buzzerPin, LOW);
  Serial.println("LEDs checked");

// Clear the buffer.
  setDisplay(String("Network:"), String(WiFi.localIP()), 2);
  Serial.println("Network checked");
  delay(2000);

  can.begin(50000);
  switch(can.errorStatus()) {
  case CAN_ERROR_PASSIVE:
    setDisplay(String("CAN bus:"), String("Passive"), 2);
    break;
  case CAN_NO_ERROR:
    setDisplay(String("CAN bus:"), String("Online"), 2);
    break;
  case CAN_BUS_OFF:
    setDisplay(String("CAN bus:"), String("Error"), 2);
    break;
  default:
    setDisplay(String("CAN bus:"), String("No data"), 2);
    break;
  }
  // Log after the switch so the message appears for every status, not only the default case.
  Serial.println("CAN network configured");

  delay(2000);

  setDisplay(String("ID value:"), String(valID), 2);
  Serial.println("ID determined");
  delay(2000);
  Serial.println("Starting loop");
}

void loop() {
  CANMessage message;
  if (can.available() > 0) {
    can.receive(message);
    liteLED(1,1,1); // white
    String canMsg = "";
    for (int i = 0; i < message.len; i++) {
      canMsg.concat(char(message.data[i]));
    }
    String twilioMsg = System.deviceID();
    twilioMsg.concat(": ");
    twilioMsg.concat(canMsg);
    Particle.publish("twilio_sms", twilioMsg, PRIVATE);
    setDisplay(String("CAN message:"), canMsg, 2);
  }
  if (irrecv.decode(&results)) {
    Serial.println("IR prompt");
    if (String(results.value) != String("4294967295")) {
      liteLED(0,1,0); // green

      setDisplay(String("IR receive code:"), String(results.value), 2);
    }
    irrecv.resume(); // Receive the next value
  }
  if (chkWeb == 1) {
    liteLED(1,0,0); // red
    chkWeb = 0;
  }
  if (chkSwitch == 1) {
    liteLED(0,0,1); // blue
    if (valID > 3000 && valID < 4000) {
      digitalWrite(pwrPin, HIGH);
      delay(10);
      float tempF = thermistor->readTempF();
      digitalWrite(pwrPin, LOW);
      setDisplay(String("Temp check:"), String(tempF), 2);
      Log.info("Temperature check");
    }
    else {
      setDisplay(String("Firmware version:"), String(System.version()), 2);
      Log.info("Firmware check");
//    Log.warn("This is warning message");
//    Log.error("This is error message");
    }
    chkSwitch = 0;
  }
  if (chkButton == 1) {
    CANMessage message;
    message.id = 0x100;
    message.extended = false;
    message.rtr = false;
    message.len = 5;
    message.data[0] = 0x48;
    message.data[1] = 0x65;
    message.data[2] = 0x6c;
    message.data[3] = 0x6c;
    message.data[4] = 0x6f;
    if (can.transmit(message)) {
      liteLED(0,1,0); // green
      setDisplay(String("CAN message:"), String("TX"), 2);
    }
    else {
      liteLED(1,0,0); // red
      setDisplay(String("CAN message:"), String("TX Error"), 2);
    }
    chkButton = 0;
  }
  if (millis()/60000 > chkTime) {
    Serial.println("Timer prompt");
    chkTime++;
    setDisplay(String("Timer minutes:"), String(chkTime), 2);
    liteLED(0,0,0); // off
    if (chkTime % 15 == 0) {
      Log.info("Heartbeat");
    }
  }
  delay(100);
}

void localSwitch() {
  chkButton = 1;
  Serial.println("chkButton engaged");
}

int webAlert(String alertStr) {
  chkWeb = 1;
  Serial.println("HTML received");
  setDisplay(String("HTML post:"), alertStr, 2);
  return 0;
}

void remoteSwitch() {
  Serial.println("remoteSwitch engaged");
  chkSwitch = 1;
}

void liteLED(int red, int green, int blue) {
  Serial.println("LED updated");
  if (red == 1) {
    digitalWrite(ledR, HIGH);
  }
  else {
    digitalWrite(ledR, LOW);
  }
  if (green == 1) {
    digitalWrite(ledG, HIGH);
  }
  else {
    digitalWrite(ledG, LOW);
  }
  if (blue == 1) {
    digitalWrite(ledB, HIGH);
  }
  else {
    digitalWrite(ledB, LOW);
  }
}

void setDisplay(const char* topMsg, const char* alertMsg, int msgSize) {
  Serial.println("Display updated");
  display.clearDisplay();
  display.setTextSize(1);
  display.setTextColor(WHITE);
  display.setCursor(0,0);
  display.println(topMsg);
  display.println(" ");
  display.setTextSize(msgSize);
  display.println(alertMsg);
  display.display();
}

Thanks in advance.


#2

One thing - not directly addressing the issue tho’.
You should put Particle.function() before your delay(5000) or even before Particle.connect().

And with all your devices connected, maybe one of them does have issues with a cold start.

A clumsy workaround - at least to test the hypothesis - might be to use a retained variable.
First you set up all your devices, without actually using them, and then check whether the variable was already initialised.
If it wasn’t, you initialise it and then call System.reset().
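
A minimal sketch of that test, assuming retained memory is enabled and that the backup RAM loses its contents during a long power-off. The marker value and the helper name isColdBoot are made up for illustration. The decision logic is kept separate from the device calls so it can be checked on a desktop compiler; the part inside #ifdef PARTICLE only builds as firmware:

```cpp
#include <cstdint>

// Hypothetical marker value; any constant unlikely to appear by chance will do.
constexpr uint32_t WARM_BOOT_MAGIC = 0xC0FFEE01u;

// Returns true if the marker does not hold the magic value, i.e. this is the
// first boot since the retained memory lost power. Stamps the marker so that
// the boot following the reset is treated as warm.
bool isColdBoot(uint32_t &marker) {
    if (marker == WARM_BOOT_MAGIC) {
        return false;
    }
    marker = WARM_BOOT_MAGIC;
    return true;
}

#ifdef PARTICLE
#include "Particle.h"

// Retained variables live in backup RAM and survive System.reset().
STARTUP(System.enableFeature(FEATURE_RETAINED_MEMORY));
retained uint32_t bootMarker;

void setup() {
    // ... initialise the display, IR receiver, CAN bus, etc. here ...

    if (isColdBoot(bootMarker)) {
        System.reset();  // the next pass through setup() is treated as a warm boot
    }
    // ... normal setup continues on the warm boot ...
}

void loop() {}
#endif
```

If the units start reliably with this in place, that would point at one of the peripherals (or its library) misbehaving on a cold start rather than at the cloud connection.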


#3

Still troubleshooting, and it’s still happening. The logs in both cases (working and freezing) show that the unit connects to the cloud. However, after a long time powered off (when it freezes) I see the following prior to the connection message:

[comm.sparkprotocol] INFO: Sending TIME request

I do not see this entry when the unit boots correctly.

On the freezing occasions the program stops at the following section:

void setup()   {
  liteLED(1,0,0); // turns diagnostic LED red

  waitUntil(WiFi.ready);
  liteLED(0,0,1); // turns diagnostic LED blue

  Particle.connect();
  waitUntil(Particle.connected);
  liteLED(0,1,0); // turns diagnostic LED green

The LED turns blue so the wifi is working. And I see in the logs that it is “connected” to the cloud:

[system] INFO: Cloud connected

However, it never gets to turning the LED green, which would indicate that it’s not getting past:

waitUntil(Particle.connected);

Could it be something to do with the system clock not working after a long time out, and that the client and server are unable to sync?


#4

To stay in control you could use waitFor() instead of waitUntil().

The time request should not make a difference. Its appearance after a long offline phase may just be a precaution against RTC drift, or due to power loss invalidating the RTC. When the time between the last sync and the reconnect is short enough and the RTC signals a valid time, the drift will be negligible and the request will be skipped.
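
A minimal sketch of the waitFor() approach, assuming a 30 second timeout and up to three attempts before resetting. Both numbers, the helper nextStep and the function connectOrReset are illustrative and not from this thread. waitFor(condition, timeout) returns false if the timeout expires, so setup() can decide what to do instead of hanging:

```cpp
// Hypothetical retry policy, kept free of device calls so it can be checked off-device.
enum class BootStep { Proceed, Retry, Reset };

BootStep nextStep(bool connected, int attempt, int maxAttempts) {
    if (connected) {
        return BootStep::Proceed;
    }
    return (attempt < maxAttempts) ? BootStep::Retry : BootStep::Reset;
}

#ifdef PARTICLE
#include "Particle.h"

const unsigned long CLOUD_TIMEOUT_MS = 30000;  // assumed timeout per attempt
const int MAX_ATTEMPTS = 3;                    // assumed retry limit

// Call from setup() in place of Particle.connect() + waitUntil(Particle.connected).
void connectOrReset() {
    for (int attempt = 1; ; ++attempt) {
        Particle.connect();
        bool connected = waitFor(Particle.connected, CLOUD_TIMEOUT_MS);

        switch (nextStep(connected, attempt, MAX_ATTEMPTS)) {
        case BootStep::Proceed:
            return;
        case BootStep::Retry:
            Particle.disconnect();  // start the next attempt from a clean state
            break;
        case BootStep::Reset:
            System.reset();
            break;
        }
    }
}
#endif
```

Lighting a different LED colour on each branch would also show which path the unit takes after a long power-off.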


#5

Update: Got it working by moving the papertrail logger to the end of setup along with a delay in front of it.

PapertrailLogHandler papertailHandler("logs6.papertrailapp.com", ****, "****");

I’m guessing that the creation of the log handler is somehow interfering with the Photon cloud connection. The changes were an attempt to make sure the cloud connection was up and stable before connecting to the log server.

So far the units are recovering from power disconnections. The main disadvantage is that I don’t get log information from the startup of the unit. Would love to figure out the actual culprit, but at least it’s working.


#6

You should definitely make sure WiFi.ready() is true before trying any network access. Maybe that’s a problem with the implementation of the PapertrailLogHandler, which may rely on an already established network connection (just assuming tho’).
A common “problem” with C++ code on these devices is the non-deterministic order of constructor execution for globally declared objects. Hence you’ll see well-formed libraries implement a begin(), init() or similar function in addition to the constructor: the constructor merely prepares the object, and the extra function actually starts execution later. This ensures that all other (potentially used) objects will also be up and ready to run once your active code starts doing its job.
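
The constructor/begin() split can be sketched as follows. DeferredLogger is a hypothetical class, not the PapertrailLogHandler API; the networkReady argument stands in for WiFi.ready() on the device, and the vector stands in for an actual network send:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical logger: the constructor only stores configuration, so it is
// harmless to run during global construction, before the network exists.
class DeferredLogger {
public:
    DeferredLogger(std::string host, int port)
        : host_(std::move(host)), port_(port) {}

    // Called from setup() once the network is up. Refuses to start otherwise.
    bool begin(bool networkReady) {
        if (!networkReady) {
            return false;
        }
        started_ = true;
        return true;
    }

    // Messages are rejected until begin() has succeeded.
    bool log(const std::string &msg) {
        if (!started_) {
            return false;
        }
        sent_.push_back(msg);  // stand-in for sending to host_:port_
        return true;
    }

    bool started() const { return started_; }
    std::size_t sentCount() const { return sent_.size(); }

private:
    std::string host_;
    int port_;
    bool started_ = false;
    std::vector<std::string> sent_;
};
```

With this shape, a globally declared logger does nothing network-related until setup() calls begin(), whatever order the global constructors happen to run in.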