Hello,
I am working on a device that reads continuously from Serial1 and have been seeing intermittent hard faults. When I remove all Serial1 reads from my code, the hard faults appear to stop. The main difficulty is that the faults happen at seemingly random times: after I flash a device, a hard fault can occur within five minutes of the code starting, or the code can run for eight hours before the fault triggers. I am working with multiple Argon devices running Device OS version 1.5.2.
The devices act solely as listeners on a serial communication network. The network runs at 38,400 baud and is half-duplex. In my current hardware, the serial network wire is connected to the RX pin of the Argon through a signal buffer to prevent interference, and the TX pin is left floating.
Because my code is large, I will only include the parts relevant to reading the Serial1 port.
I have been debugging these resets using a few retained variables -
// Retained integer used to debug the reset issue through the repeating loop function
// Search for the flag through PF#
retained uint16_t progressFlag = 0;
// Retained integer used to debug the reset issue, specifically within the readSerial function
// Search for the flag through SPF#
retained uint16_t serialProgressFlag = 0;
// Retained integer used to debug the consistency of the loop and only performing one loop at a time
// Value is incremented at the start of the loop and decremented at the end of the loop
// Value should always be one, if the loop is being called one at a time
retained uint16_t loopCallCheck = 0;
// Retained integer used to debug the consistency of the serial reading function and having that function called one at a time
// Value is incremented at the start of the serial reading and decremented at the end of the serial reading
// Value should always be one, if the function is being called one at a time
retained uint16_t serialCallCheck = 0;
// Retained integer used to debug the consistency of the readBytes function in the serial reading
// Value is incremented if the readBytes function returns a buffer size that is not the predefined message length
// Value should always be zero, if the readBytes function is working correctly
retained uint16_t bytesSizeReturn = 0;
- The variable defined as "progressFlag" is a retained variable that is published after a reset. Throughout my main loop() function, this variable is updated to specific values that relate to specific lines of code.
- The variable defined as "serialProgressFlag" is a retained variable that is published to the console after a reset, allowing me to debug where the hard faults are occurring.
- The variable defined as "serialCallCheck" is a retained variable that is published after a reset. I had to ensure that the serial function was being called only once at a time to prevent possible overlap.
- The variable defined as "loopCallCheck" is a retained variable that is published after a reset and shows me whether more than one call to the main loop() was in progress. This variable is incremented at the start of loop() and decremented at the end.
- The variable defined as "bytesSizeReturn" is a retained variable that is published after a reset and shows me how many times the device has read a packet with a length that is outside of the preset lengths.
When a hard fault reset occurs, I have my devices publish a block of variables to try and give more information as to why the reset happened -
char resetReason[10];
sprintf(resetReason, "%lu", System.resetReason());
Particle.publish("Reset", resetReason, PRIVATE, WITH_ACK);
delay(500);
sprintf(resetReason, "%lu", System.resetReasonData());
Particle.publish("Reset_Data", resetReason, PRIVATE, WITH_ACK);
delay(500);
sprintf(resetReason, "%d", progressFlag);
Particle.publish("Reset_Loc", resetReason, PRIVATE, WITH_ACK);
delay(500);
sprintf(resetReason, "%d", serialProgressFlag);
Particle.publish("Serial_Reset", resetReason, PRIVATE, WITH_ACK);
delay(500);
uint16_t loopTemp = loopCallCheck;
sprintf(resetReason, "%d", loopTemp);
Particle.publish("Loop_Call", resetReason, PRIVATE, WITH_ACK);
delay(500);
uint16_t serialTemp = serialCallCheck;
sprintf(resetReason, "%d", serialTemp);
Particle.publish("Serial_Call", resetReason, PRIVATE, WITH_ACK);
delay(500);
uint16_t bytesTemp = bytesSizeReturn;
sprintf(resetReason, "%d", bytesTemp);
Particle.publish("Bytes_Return", resetReason, PRIVATE, WITH_ACK);
delay(500);
So far, every reset has reported a reason of 130, which corresponds to a panic. I also print the extra data for that reset and it has always been 1, which corresponds to a true hard fault. My biggest question comes from the reset location, taken from the "progressFlag" variable. That variable is always returned as 32, which corresponds to the last line of loop(). The first line of loop() sets "progressFlag" to 0 and the last line sets it to 32. If my understanding is correct, that means my code is faulting between calls to loop(), and I have been unable to find the exact cause.
The loop() function runs on a one-second cycle. Variables and flags are checked, and once loop() reaches the bottom, a while-loop waits until a full second has passed. The while-loop is included below -
while (((micros() - startTime) < 1000000))
{
progressFlag = 800; //PF800
// Read the VR2 Serial Port until a full second has passed
if(!serialBusy)
{
while(!readSerial()){}
}
serialBusy = false;
serialProgressFlag = 0; //SPF0, to clear serial progress when reading serial has finished
// A short delay is added to allow each read of the serial port to be completed before the next call
progressFlag = 801; //PF801
delay(5);
progressFlag = 802; //PF802
// If a Bluetooth command has been received, parse and perform the associated function here
// This action is checked here to ensure a prompt response to any Bluetooth message
if(bluetoothCommand.commandActive)
{
progressFlag = 803; // PF803
LOGINFO(("Bluetooth Command - %s", bluetoothCommand.untilColon));
bleCommandReceive(bluetoothCommand.untilColon);
bluetoothCommand.commandActive = false;
progressFlag = 804; //PF804
}
wd.checkin();
progressFlag = 805; //PF805
}
The variable "startTime" is set to the return value of micros() at the start of loop(), and this while-loop waits until a full second has elapsed since then before loop() exits. The Bluetooth block only runs once a Bluetooth command has been received, and it has not run at any point during my current testing. The "serialBusy" boolean is a flag that is set to true inside the serial reading function and reset to false once the read is completed. This was done to ensure that the serial reading function is only called once at a time.
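For clarity, the timing condition in that while-loop can be isolated in a small standalone sketch. It assumes micros() behaves as a free-running unsigned 32-bit microsecond counter that eventually wraps; the helper name hasElapsed is hypothetical and does not appear in the firmware above. Because the subtraction is unsigned, the comparison should remain correct even if the counter wraps between the two samples.

```cpp
#include <cstdint>

// Hypothetical helper: returns true once at least intervalUs microseconds
// have passed between startUs and nowUs. Both timestamps are assumed to come
// from a free-running unsigned 32-bit microsecond counter.
bool hasElapsed(uint32_t nowUs, uint32_t startUs, uint32_t intervalUs)
{
    // Unsigned subtraction is modulo 2^32, so the difference is still the
    // true elapsed time when the counter wrapped between the two samples.
    return static_cast<uint32_t>(nowUs - startUs) >= intervalUs;
}
```

In the loop above, the equivalent exit condition would be hasElapsed(micros(), startTime, 1000000).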
My best assumption places the cause of the hard fault issue in how I am reading the Serial1 port, with the function below -
uint8_t readSerial(void)
{
serialBusy = true;
serialCallCheck++;
serialProgressFlag = 0; //SPF0
uint8_t availableChars = Serial1.available();
if(availableChars > 0)
{
size_t byteReturnLength = 0;
validHeader = true;
bool validMessage = false;
header = Serial1.read();
messageLength = 0;
// Message from the Power Module that confirms the system as successfully activating.
if(header == PMACTIVE)
{
messageLength = 3;
serialProgressFlag = 10; //SPF10
if(availableChars >= messageLength)
{
byteReturnLength = Serial1.readBytes(VR2dataBuffer, 3);
LOGINFO(("ON"));
validMessage = true;
}
else{}
serialProgressFlag = 11; //SPF11
}
// Message from the nVR2 version of the Power Module that confirms the system as successfully activating.
else if(header == NVR2PMACTIVE)
{
messageLength = 13;
serialProgressFlag = 20; //SPF20
if(availableChars >= messageLength)
{
byteReturnLength = Serial1.readBytes(VR2dataBuffer, 13);
LOGINFO(("NVR2 ON"));
validMessage = true;
}
else{}
serialProgressFlag = 21; //SPF21
}
// Message from the Power Module that contains information related to the battery gauge and lighting of the joystick.
else if(header == TRUCHARGE)
{
messageLength = 5;
serialProgressFlag = 30; //SPF30
if(availableChars >= messageLength)
{
byteReturnLength = Serial1.readBytes(VR2dataBuffer, 5);
validMessage = true;
}
else{}
serialProgressFlag = 31; //SPF31
}
// Message from the Joystick that contains information related to the functionality of the joystick.
else if(header == JOYSTICK)
{
messageLength = 5;
serialProgressFlag = 40; //SPF40
if(availableChars >= messageLength)
{
byteReturnLength = Serial1.readBytes(VR2dataBuffer, 5);
validMessage = true;
}
else{}
byteReturnLength = Serial1.readBytes(VR2dataBuffer, 5);
serialProgressFlag = 41; //SPF41
}
else
{
validHeader = false;
}
if((validHeader) && (byteReturnLength == messageLength) && (validMessage))
{
// If the checksum is calculated to be valid, process the message
serialProgressFlag++; //SPF6
if(loopVR2Receive.Checksum())
{
serialProgressFlag++; //SPF7
loopVR2Receive.onCheck();
serialProgressFlag++; //SPF8
switch(header)
{
case(PMACTIVE):
case(NVR2PMACTIVE):
{
serialProgressFlag = 100; //SPF100
loopVR2Receive.PMActive(); // Handle Power On messages from any type of power module
serialProgressFlag = 101; //SPF101
break;
}
case(TRUCHARGE):
{
serialProgressFlag = 200; //SPF200
loopVR2Receive.TruCharge(); // Handle TruCharge messages, which contain functional wheelchair data
serialProgressFlag = 201; //SPF201
break;
}
case(JOYSTICK):
{
serialProgressFlag = 300; //SPF300
loopVR2Receive.Joystick(); // Handle Joystick messages, which contain functional wheelchair data
serialProgressFlag = 301; //SPF301
break;
}
default:
{
break;
}
}
serialProgressFlag++; //SPF(Progress + 1)
}
// Message packets are tossed if the checksum calculation is invalid
else
{
serialProgressFlag = 400; //SPF400
}
}
else
{
if(byteReturnLength != messageLength)
{
bytesSizeReturn++;
}
}
}
serialCallCheck--;
delay(2);
return 1;
};
The above function reads the Serial1 port and extracts message packets from the serial network using predefined packet formats. Each packet header has its own defined message length, so the function checks whether that many bytes are available before reading the full packet. During normal operation, the Serial1 buffer never fills up: when I printed the value returned by Serial1.available(), it was always below 10.
I also saw in previous posts that memory corruption can occur if the serial buffer does not receive a null character ("\0") within 64 bytes of the previous one. I verified that my code was receiving null characters and that they were arriving frequently.
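The length check described above can also be sketched in standalone C++ so it can be exercised off-device. This is an illustration under stated assumptions, not the firmware itself: a std::deque stands in for the Serial1 receive buffer, the header byte values are placeholders (the real PMACTIVE, NVR2PMACTIVE, TRUCHARGE, and JOYSTICK constants are not shown in this post), and payloadLength and tryReadPacket are hypothetical names. The sketch peeks at the header and consumes bytes only once the header plus the full payload are queued.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Placeholder header values: the real constants are not shown in the post.
constexpr uint8_t PMACTIVE     = 0xA1;
constexpr uint8_t NVR2PMACTIVE = 0xA2;
constexpr uint8_t TRUCHARGE    = 0xA3;
constexpr uint8_t JOYSTICK     = 0xA4;

// Number of bytes that follow each header, taken from readSerial() above.
// Returns 0 for an unknown header.
size_t payloadLength(uint8_t header)
{
    switch (header) {
        case PMACTIVE:     return 3;
        case NVR2PMACTIVE: return 13;
        case TRUCHARGE:    return 5;
        case JOYSTICK:     return 5;
        default:           return 0;
    }
}

// Tries to pull one complete packet out of rx (a stand-in for the UART
// receive buffer). Returns true and fills header/payload on success.
// Returns false without consuming anything if the packet is incomplete.
// An unknown header byte is discarded so the parser can resynchronise.
bool tryReadPacket(std::deque<uint8_t>& rx, uint8_t& header,
                   std::vector<uint8_t>& payload)
{
    if (rx.empty()) {
        return false;
    }
    const uint8_t candidate = rx.front();      // peek, do not consume yet
    const size_t length = payloadLength(candidate);
    if (length == 0) {
        rx.pop_front();                         // drop the unrecognised byte
        return false;
    }
    if (rx.size() < length + 1) {               // header + payload not all here
        return false;
    }
    header = candidate;
    rx.pop_front();
    const auto payloadEnd = rx.begin() + static_cast<std::ptrdiff_t>(length);
    payload.assign(rx.begin(), payloadEnd);
    rx.erase(rx.begin(), payloadEnd);
    return true;
}
```

The sketch requires length + 1 queued bytes because the count is taken while the header is still in the buffer. In readSerial(), availableChars is also sampled before the header byte is read, so it includes that byte as well.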
My Serial1 port is opened in the setup() function with the following code -
//Serial interface (8bits, even parity, 2 stop bits, half-duplex)
Serial1.begin(38400, SERIAL_8E2);
pinMode(RX, INPUT);
Serial1.halfduplex(true);
I have created a class to handle how each packet is processed. The class is defined below -
class VR2_Receive
{
// Create the public prototypes for the VR2 message receive functions
public:
void PMActive(void); // Handle the messages for when a VR2 power module turns on the chair
void TruCharge(void); // Handle the messages defined as TruCharge packets
void Joystick(void); // Handle the messages defined as Joystick packets
uint8_t Checksum(void); // Perform the Checksum actions to verify valid packets
void onCheck(void); // Check to see if the chair is active when the device thinks otherwise
void stopMoving(void); // Universal function that processes when and why the chair has stopped moving
private:
};
Each packet has a checksum that is verified before the processing begins -
uint8_t VR2_Receive::Checksum(void)
{
// The checksum of the packet must be verified before the packet should be decomposed.
// The checksum is defined as: the sum of all preceding bytes in the packet, including the header.
// The sum from the previous step is then one's complemented.
// The checksum is that one's-complemented sum.
// The calculated checksum is then truncated to the least significant byte (0xFF mask) to match the checksum byte from the packet
uint16_t sum = header;
uint8_t sumLength = 0;
while(sumLength < (messageLength - 1))
{
sum += VR2dataBuffer[sumLength];
sumLength++;
}
uint16_t onesComplement = ~sum;
onesComplement = (onesComplement & 0xFF);
// Check if the calculated checksum matches the checksum sent with the packet and if the device knows that the chair is currently active
// If the chair is on and the device thinks otherwise, the device will perform the actions for when the chair turns on from an inactive state
// This allows the device to collect usage data regardless of when the device is installed on the chair.
if(onesComplement == VR2dataBuffer[messageLength - 1])
{
return 1;
}
return 0;
};
I had thought that the checksum calculations may be the root cause of the hard faults, but when that function is removed from my code, the faults still happen.
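For reference, the checksum rule described in the comments can be written as a standalone function that is easy to test off-device. This is a minimal sketch, assuming the checksum byte is the last byte of the payload and covers the header plus every preceding payload byte; checksumValid is a hypothetical name, and the bytes in the example after the code are made up rather than captured traffic.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring VR2_Receive::Checksum(): sums the header and
// all payload bytes except the last, takes the one's complement, keeps the
// low byte, and compares it with the last payload byte.
bool checksumValid(uint8_t header, const uint8_t* payload, size_t length)
{
    if (payload == nullptr || length == 0) {
        return false;                  // nothing to compare against
    }
    uint16_t sum = header;
    for (size_t i = 0; i + 1 < length; ++i) {
        sum = static_cast<uint16_t>(sum + payload[i]);
    }
    const uint8_t expected = static_cast<uint8_t>(~sum & 0xFF);
    return expected == payload[length - 1];
}
```

With a header of 0x10 and a payload of {0x01, 0x02, 0xEC}, the sum is 0x13 and the low byte of its complement is 0xEC, so the packet passes. The explicit guard for a zero length avoids indexing payload[length - 1] when nothing was read.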
The functions that process each packet are defined below -
void VR2_Receive::PMActive(void)
{
// This function is for handling the 'On' packets for the wheelchair sent by the standard and nVR2 Power Modules.
// The data contained within the two packets are different, depending on the type of Power Module, standard or nVR2.
// The data may be different between the two PM packets, but no extra data is taken between the two, so the same function is called.
// The 'On' packets contain information, but the only data relevant to the device is that these packets are transmitted.
// The 'On' packets are the responses to the joystick, confirming that the wheelchair will be activated.
// Variables will be set within these blocks if those variables have a setting for when the wheelchair is activated.
longBeep = false;
isOn = true;
singleErrorCode = false;
// Set the offCheck timer to three seconds
offCheck = 3;
if(isCharging)
{
chargerDelay = 3;
}
else
{
if(oneChargeByLights)
{
oneChargeByLights = false;
}
}
if(isTimeAligned)
{
onTime = Time.now();
}
else{}
// Check if the device has locked out charging detection from the lights
if(lightsChargeLock)
{
lightsLockCheck = 3;
LOGINFO(("Lights Detection Countdown Started!"));
}
else{}
formatAndstoredata.jsonify(POWER_ON, 0.0);
return;
};
void VR2_Receive::TruCharge(void)
{
// This block of code calculates all of the required variables for the comparisons to collect the TruCharge data.
powerDown = VR2dataBuffer[0] >> 7;
actuatorLights = (VR2dataBuffer[0] >> 5) & 3;
powerLatch = (VR2dataBuffer[0] >> 4) & 1;
actuatorFlash = (VR2dataBuffer[0] >> 3) & 1;
brakingStatus = (VR2dataBuffer[0] >> 2) & 1;
lockingStatus = (VR2dataBuffer[0] >> 1) & 1;
hornVolume = VR2dataBuffer[1] >> 4;
hornPattern = VR2dataBuffer[1] & 0x0F;
batteryLights = VR2dataBuffer[2] >> 4;
batteryPattern = VR2dataBuffer[2] & 0xF;
speedLights = VR2dataBuffer[3] >> 4;
speedPattern = VR2dataBuffer[3] & 0xF;
// If the chair is activated and the device thinks the joystick is locked, the device will wait for 1-2 seconds and check the lock status
// If the joystick is locked, the battery gauge LED's will be off and the speed LED's will be in the Bi-Ripple state (0 -> 1 -> ... -> 5 -> 4 -> ... -> 0)
if(lockDelay == 0)
{
if(lockStatus)
{
bool stillLocked = false;
if((VR2dataBuffer[2] & 0x0F) == 0)
{
if((speedPattern == 5) && (speedLights == 5))
{
stillLocked = true;
}
}
if(!stillLocked)
{
LOGINFO(("Joystick unlocked by TruC LED's"));
lockStatus = 0;
longBeep = false;
chairState = NORMAL;
// Save the new, unlocked joystick status
formatAndstoredata.jsonify(JOYSTICK_UNLOCK, 0.0);
}
}
// This else-statement is a conditional check to see if the joystick is locked and the device thinks the opposite.
// If the conditions of the joystick being locked are met and the device sees the joystick as unlocked,
// the device will change its status to locked and start the locking-unlocking pattern.
else
{
if(batteryPattern == 0)
{
if((speedPattern == 5) && (speedLights == 5))
{
lockStatus = true;
longBeep = false;
LOGINFO(("Joystick locked by TruC LED's"));
chairState = LOCKED;
// Save the new, locked joystick status
formatAndstoredata.jsonify(JOYSTICK_LOCK, 0.0);
}
}
}
}
// This block of code determines the horn pattern of the joystick
// A 'Long Beep' pattern, defined as joystick horn pattern #6, is used to notify users that the joystick has been locked
// The joystick will only be locked if the 'Long Beep' pattern has been sent in this packet during the On/Off cycle
// The longBeep boolean will be reset on each power on event and when the joystick goes from a locked to unlocked state
// If the joystick is currently locked, a 'Long Beep' pattern signifies that the joystick has been unlocked
// The device will save both the joystick locking and unlocking actions
if(hornPattern == 6)
{
longBeep = true;
if(lockStatus)
{
LOGINFO(("Joystick unlocked by alarm long beep"));
longBeep = false;
lockStatus = false;
chairState = NORMAL;
// Save the new, unlocked joystick status
formatAndstoredata.jsonify(JOYSTICK_UNLOCK, 0.0);
}
}
// This block of code handles the charger inhibit status of the chair through the TruCharge packet
// On an active charging cycle, the battery gauge LED's will steadily increase, following the 'Uni-Ripple' pattern (#4)
// A delay of 120 seconds is given to allow the charger to bring the battery voltage up to the charging level before the voltage readings can end a charging cycle
if(batteryPattern == 4)
{
if((!isCharging) && (!lightsChargeLock) && (!oneChargeByLights))
{
chargeStart(CHARGE_INHIBIT);
}
}
// This block of code will check for the charging LED pattern on start up of the chair, after a delay to allow the chair status to settle
// Any LED pattern outside of the 'Uni-Ripple' pattern will cause the charging cycle to be complete
if((chargerDelay == 0) && (isCharging) && (batteryPattern != 4))
{
chargeEnd();
}
// This block of code tracks when the chair enters the actuator state
// The actuator state is when the chair is set to manipulate the seating actuators
// The LED's are checked to ensure that the Actuator LED's are on and the Speed LED's are off
// If the chair is not in the actuator state, the device will then note the change in state
bool actuatorStatus = false;
if((actuatorLights != 0) && (speedPattern == 0))
{
actuatorStatus = true;
if(chairState != ACTUATOR)
{
chairState = ACTUATOR;
}
else{}
}
else if((!actuatorStatus) && (chairState == ACTUATOR))
{
chairState = NORMAL;
}
// This block of code collects error codes triggered by the Power Module
// These error codes are sent through the LED numbers and patterns within the packet, including battery, actuator, and speed LED's
if((batteryPattern == 3) && (!singleErrorCode))
{
uint8_t errorCode = VR2dataBuffer[2] >> 4;
// Check if the Joystick error code is for a functional fault (Code 7) or a communications fault (Code 11)
if(errorCode == 7)
{
// Check if the speed percentage LED's are flashing rapidly
if(speedPattern == 3)
{
errorCode = 11;
}
}
// Check if the Control System fault (Code 8) is actually an Actuator fault (Code 12)
else if(errorCode == 8)
{
// Check if the actuator LED's are flashing
if(actuatorFlash == 1)
{
errorCode = 12;
}
}
LOGINFO(("Power Module error code - %d", errorCode));
singleErrorCode = true;
chairState = ERROR;
// Save the error code once
formatAndstoredata.jsonify(ERROR_CODE, errorCode);
// Only save one error code per power cycle
// Prevent a second error code saving action until the chair has been turned off and back on again
}
// This block of code tracks if the wheelchair is in motion, detecting when motion starts and stops, and the duration of the chair movement
// The status of the wheelchair braking and the patterns of each LED set are checked to ensure the wheelchair is in the normal/default state
// Driving actions can only occur when the wheelchair is in the normal/default state as the other states prevent any motor movement
// The displacement of both joystick axes is checked to ensure the wheelchair is moving and the displacement is used to track wheelchair behavior
if(chairState == NORMAL)
{
bool movingStatus = false;
// Check that the braking status is set to 'not braking'
if(brakingStatus == 0)
{
// Check the battery LED's to be in the 'continuous on' (#1) pattern
// The battery LED's can also be in the 'slow flashing' pattern (#2), which refers to a low-charge battery
// The number of flashing LED's for low-battery can change, so the 'slow flashing' pattern is the important parameter
if((batteryPattern == 1) || (batteryPattern == 2))
{
// Check the speed LED's to be in the 'continuous on' pattern
if(speedPattern == 1)
{
// Check if the joystick is currently displaced or has returned to rest from a previous displacement
// The joystick can bounce when returning to rest, so the delay allows the joystick to settle before checking the movement again
if(((xDisplaced) || (yDisplaced)) && (!drivingDelay))
{
if(!chairMoving)
{
drivingStart = Time.now();
LOGINFO(("Chair moving time - %lu", drivingStart));
chairMoving = true;
chargeDetectionLockout = true;
// If the chair starts to move during the lockout countdown, reset that to prevent mistimed threshold calculations
if(chargeDetectionCountdown != 0)
{
LOGINFO(("Reset moving charge lockdown"));
chargeDetectionCountdown = 0;
}
}
movingStatus = true;
}
// If the joystick isn't displaced, check if the device thinks the chair is moving
// If the device thinks the chair is moving, the device will adjust its status to seeing the chair at rest
// The joystick can bounce when returning to rest, so a delay is included to allow the joystick time to settle before checking displacement again
else if(chairMoving)
{
stopMoving();
}
}
}
}
if((!movingStatus) && (chairMoving))
{
stopMoving();
}
}
else
{
if(chairMoving)
{
stopMoving();
}
}
// Check for the Power Key to be latched in the held position and for the power module to power down the chair
// The actions for power off are repeated, so the power off function was made atomic
if((powerLatch) && (powerDown))
{
powerOff();
}
return;
};
void VR2_Receive::Joystick(void)
{
// The joystick coordinate values are sent in the packet as a signed character.
signed char joystickY = VR2dataBuffer[2];
signed char joystickX = VR2dataBuffer[3];
// This block checks the joystick Y coordinate value to determine if the joystick is displaced out of the dead band
// For actuator movement, the actuator will only move when the Y axis of the joystick has been displaced out of the dead band.
// For tracking chair driving usage, both directional axes will be tracked for displacement.
// The joystick deadband is between (-30, -30) to (30, 30) for actuator and seating movement.
// The joystick deadband is between (-10, -10) to (10, 10) for driving actions and movement.
uint8_t absY = abs(joystickY);
uint8_t deadband = 0;
if(chairState == ACTUATOR)
{
deadband = ACTUATORDISPLACEMENT;
}
else
{
deadband = DRIVINGDISPLACEMENT;
}
if((!yDisplaced) && (absY > deadband))
{
yDisplaced = true;
}
else if((yDisplaced) && (absY <= deadband))
{
yDisplaced = false;
}
uint8_t absX = abs(joystickX);
if((!xDisplaced) && (absX > deadband))
{
xDisplaced = true;
}
else if((xDisplaced) && (absX <= deadband))
{
xDisplaced = false;
}
// This block handles joystick error codes, which are codes triggered specifically by the joystick module.
// These error codes are handled exactly as the power module error codes.
// The error codes are saved and a new error code can only be saved once the chair has been turned off and back on again.
if((joystickY < -100) && (joystickY > -110))
{
signed int baseError = (-1) * (joystickY) - 88;
uint8_t jsErrorCode = (uint8_t)baseError;
LOGINFO(("Joystick Error code!"));
singleErrorCode = true;
chairState = ERROR;
//Save the error code and lock out any subsequent error codes until the chair power has been cycled.
formatAndstoredata.jsonify(ERROR_CODE, jsErrorCode);
}
else if((joystickX < -100) && (joystickX > -110))
{
signed int baseError = (-1) * (joystickX) - 88;
uint8_t dualErrorCode = (uint8_t)baseError;
LOGINFO(("Dual Module Error code!"));
singleErrorCode = true;
chairState = ERROR;
// Save the error code and lock out any subsequent error codes until the chair power has been cycled.
formatAndstoredata.jsonify(ERROR_CODE, dualErrorCode);
}
return;
};
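The bit layout that TruCharge() unpacks from the first payload byte can be summarised in a small standalone sketch. The struct and function names are hypothetical; the shifts and masks are copied from the handler above, and the sample byte in the note after the code is invented for illustration.

```cpp
#include <cstdint>

// Hypothetical container for the flags packed into VR2dataBuffer[0].
struct TruChargeStatus {
    uint8_t powerDown;       // bit 7
    uint8_t actuatorLights;  // bits 6-5
    uint8_t powerLatch;      // bit 4
    uint8_t actuatorFlash;   // bit 3
    uint8_t brakingStatus;   // bit 2
    uint8_t lockingStatus;   // bit 1
};

// Splits one status byte into its fields using the same shifts and masks
// as VR2_Receive::TruCharge().
TruChargeStatus decodeStatus(uint8_t b)
{
    TruChargeStatus s;
    s.powerDown      = static_cast<uint8_t>(b >> 7);
    s.actuatorLights = static_cast<uint8_t>((b >> 5) & 0x03);
    s.powerLatch     = static_cast<uint8_t>((b >> 4) & 0x01);
    s.actuatorFlash  = static_cast<uint8_t>((b >> 3) & 0x01);
    s.brakingStatus  = static_cast<uint8_t>((b >> 2) & 0x01);
    s.lockingStatus  = static_cast<uint8_t>((b >> 1) & 0x01);
    return s;
}
```

For a status byte of 0xB4 (binary 1011 0100), this yields powerDown = 1, actuatorLights = 1, powerLatch = 1, actuatorFlash = 0, brakingStatus = 1, and lockingStatus = 0.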
Unfortunately, no matter what method I have attempted to diagnose this issue, I have been unable to determine the cause. Each reset publish shows that the hard fault occurred at random, intermittent times and the location of the hard fault is 32. That location should only be set that way if the faults are happening between calls to the loop() function.
To better understand why this issue is happening, I have restored my code to a previous version that ran a much simpler reading of the Serial1 port, but even that code produced the intermittent resets.
I am unsure as to the cause of this issue and how it can be resolved. If necessary, I can provide more of my code, but I wanted to limit what I included to reduce the complexity of my supporting information. I appreciate any assistance that can be given.
Thank you