Request: Particle endorsed Semi Automatic, Thread mode test case .ino

mterrill · February 5, 2020, 3:33am

Let’s explicitly request this.

Can Particle provide an endorsed Semi Automatic & Thread mode enabled .ino with skeleton code that:

Connects to Wifi
Connects to Particle cloud
Registers cloud functions and variables
Publishes ‘one time startup’ message(s)
Subscribes and calls a basic function
Publishes a routine message every n seconds
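
For the last item, here is a minimal host-side sketch of an "every n seconds" check that survives the millisecond counter wrapping around. `IntervalTimer` and `due()` are hypothetical names for illustration, not part of Device OS. On a device, the `now_ms` argument would presumably come from `millis()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Device OS API): reports when a fixed period has
// elapsed, using unsigned subtraction so a 32-bit millisecond counter can
// wrap around without breaking the comparison.
class IntervalTimer {
public:
    IntervalTimer(uint32_t period_ms, uint32_t start_ms)
        : period_ms_(period_ms), last_ms_(start_ms) {
        assert(period_ms > 0);  // a zero period would fire on every call
    }

    // Returns true at most once per period; call it on every loop() pass.
    bool due(uint32_t now_ms) {
        const uint32_t elapsed = static_cast<uint32_t>(now_ms - last_ms_);
        if (elapsed < period_ms_) {
            return false;
        }
        last_ms_ = now_ms;  // restart the period from this firing
        return true;
    }

private:
    uint32_t period_ms_;
    uint32_t last_ms_;
};
```

In a sketch, the routine publish would sit behind `if (timer.due(millis()))`, ideally also gated on the cloud being connected.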

There have been a number of breaking changes over the past 18 or so months. We had some bumps around 1.0.0, then after 1.1.1 a number of things seemed to change and break, e.g. WiFi.macAddress(), and the requirement for explicit wifi.connect / particle.connect calls instead of simply .connect().

It’s practically a necessity for Particle to open up their test cases to us and show what sketches they’re running to test platform changes and current cloud operation. If it doesn’t exist already, I suggest a Particle git repo.

We can start with semi automatic, thread mode enabled, connecting to cloud with functions and variables registering reliably and subscribe working as you’d expect.

For context, this was brought to my attention today when a customer with a perfectly fine internet connection in the UK was having issues with a device on 1.4.4 firmware. In the setup block it was not calling a cloud.publish and it was not subscribing, despite both being wrapped in a waitfor cloud_connected 15000. I fixed both issues by moving the relevant calls into the loop and executing them the first time Particle.connected() was true. I’ve never had this issue before, and we’re used to factory testing via a dodgy internet connection behind the Great Firewall of China.
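
A minimal sketch of that workaround, written so it can be exercised off-device. `CloudApi` and `FirstConnectTasks` are hypothetical names, the topic strings are placeholders, and retrying a step until it reports success is an assumption layered on top of the fix described above. On real firmware the three callbacks would presumably wrap Particle.connected(), Particle.subscribe() and Particle.publish().

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for the cloud calls, so the pattern can be tested on
// a host machine. None of these names are Device OS APIs.
struct CloudApi {
    std::function<bool()> connected;
    std::function<bool(const std::string& topic)> subscribe;
    std::function<bool(const std::string& topic, const std::string& data)> publish;
};

// Runs the one-time cloud setup from loop() instead of setup(): nothing is
// attempted until the cloud reports connected, and each step is retried on
// later passes until it reports success.
class FirstConnectTasks {
public:
    explicit FirstConnectTasks(CloudApi api) : api_(std::move(api)) {
        // All three callbacks must be set before polling starts.
        assert(api_.connected && api_.subscribe && api_.publish);
    }

    // Call on every loop() pass; returns immediately once both steps are done.
    void poll() {
        if (done() || !api_.connected()) {
            return;
        }
        if (!subscribed_) {
            subscribed_ = api_.subscribe("example/command");  // placeholder topic
        }
        if (!announced_) {
            announced_ = api_.publish("example/startup", "boot");  // placeholder event
        }
    }

    bool done() const { return subscribed_ && announced_; }

private:
    CloudApi api_;
    bool subscribed_ = false;
    bool announced_ = false;
};
```

The point of the latch is that setup() never blocks on the cloud: a slow registration only delays the one-time calls rather than silently skipping them.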

Particle team, show us what sketches you use to test your platform and firmware so we can implement the ‘golden method’ and have confidence that what worked yesterday will work tomorrow.

hwestbrook · February 5, 2020, 4:57am

I second this request, and propose a cellular version of this test. Mesh seems to have been killed in order to focus efforts on wifi AND cellular. Sorry to bloat your vision @mterrill, but I think the skeleton application outlined would need to include your list plus:

Sleep
SYSTEM_THREAD(ENABLED);
Cellular connection issue handling
PMIC management
Serial
I2C
SPI
Interrupts
Timers
System Events

If there was something like this it would be very easy to see where a user application deviated from Particle standard and how that might cause issues. It would also be a good way for Particle to introduce and explain new features or new limitations.

Maybe this used to be what Tinker was for? Tinker feels a little outdated now. If there was an application like this, I would be happy to run a test rig in our workspace for it to help with testing.

My context for wanting something like this is our very rocky road upgrading from 0.6.1 to 1.4.4. Our application is very stable in 0.6.1, and not so stable in 1.4.4… we’ve needed to make a number of changes in order to get back closer to the level of stability we had before. I think seeing an officially endorsed way of doing things would have made this transition easier.

shanevanj · February 5, 2020, 6:49am

To add to this - I started out with the Arduino ecosystem in the early days, and a favourite of mine for making sure that the basics were working was Firmata.

It is (in my view) a pretty comprehensive exercising of the hardware, and if customised to Gen 2 & 3 devices, i.e. adding a section for comms tests and variable and function publishing/subscription, I think this could become the “Gold” test for new releases of software or hardware. (There did seem to be a version around in the Spark days, but I cannot find it anymore.)

mterrill · February 5, 2020, 11:20pm

Great ideas, though for starters I’d love for Particle to focus on cloud connectivity in that particular semi automatic + system thread mode. I’m sure a lot of the more serious deployments are using that combo as it allows you to more easily trigger setup mode and avoid some of the otherwise blocking modes of particle firmware when you need a degree of asynchronous behaviour.

I strongly suspect there are existing current bugs in anything after 1.1.1. There have been multiple fast and loose breaking changes to cloud behaviour and I’d like someone at Particle to make a clear statement on what we can unreservedly expect to work and hold them to account to make sure that golden sketch always works.

Going forward I can see a handful of separate golden sketches following the topics you suggested with I/O, timers, etc etc. Quite a few of those probably should be in a large Swiss Army knife sketch, which could be a new tinker.

Spark/Particle started to connect IoT devices simply online. Let’s have a sketch that shows that working 110% reliably.

mterrill · February 5, 2020, 11:24pm

Whatever happens needs to be owned and driven by Particle. The same sketch has to be used by their firmware CI system, the same sketch that you can only hope is being automatically re-run every 5 minutes to check that Particle cloud is working.

It has to be baked into their systems, publicly available in a repo, on the Particle build libraries, and well communicated to support teams, so that if someone contacts them to say it’s not working, it’s a priority 1.

shanevanj · February 6, 2020, 7:09am

Agreed - there needs to be a set of tests that are used for full regression testing of all hardware on all current releases of DeviceOS, and nothing should be released until there is 100% compliance with the tests. There should be a large set of all versions of devices on the test racks somewhere, just running this code in an automated way, with a set of applications constantly interrogating these test devices and exercising the I/O etc. The same setup should be replicated in the major regional zones Particle want to play in, to ensure the networks and partnerships that they have are rock solid.

There probably (in the Particle team’s mind) is such testing, but with all the recent issues it is clearly not rigorous enough or complete enough to catch the fringe issues. What amazes me is that there is clearly a passionate community with centuries of experience in design and deployment, who would happily get involved in validating and would offer valuable experience and guidance from the “field” that would catapult Particle to greater heights - yet it does not seem to be mined at all.

The ETH wing is a prime example - it broke a whole set of valuable peripherals on the device - but instead of taking the feedback and revving a new version of the board (it could be easily detected in the DeviceOS using a cheap 4 I/O I2C port expander and some bit addressing), Particle went silent on the issue and in my case destroyed the advantage of using ETH in my projects. So now (again for me) two major advantageous subsystems have been removed due to poor conceptualising and planning. Again, in my case, it cost me a 30k device project. So many people want Particle to be successful and the potential is there.

tommy_boy · February 6, 2020, 2:40pm

+1 for this. I deployed my first industrial Argon-powered application one week ago, and it would give me peace of mind seeing these tests.

mterrill · February 8, 2020, 1:16am

Great points with fleet testing, I’m convinced that testing hasn’t been performed anywhere near the extent that we’d anticipate or would be able to advise on from hard won experience.

Just one of the gems from last year was a particle-cli release that simply had not been run against a device and would fail immediately on execution. Let’s be clear: a program whose job is to interact with physical devices had not been connected to a device and run through its functions.

Back before cloud was a thing my speciality with testing was end to end functional testing and monitoring. One example is if you’re processing millions of emails a day, it’s a pretty good idea to be automatically sending emails from multiple regional zones and major providers to test accounts on your system and tracking the time it takes for them to be received, grey/white listed, scanned, stored and available to end client software. At any point we knew exactly the volume of emails at each stage, how many/sec were being processed by each stage and how long the end to end tests were taking. A variance of more than a few seconds started lighting up the Christmas tree alert board.

The issue I encountered the other day with a client not registering cloud functions and subscriptions? My solid theory is it’s not code, it’s Particle cloud not registering clients fast enough. Hence why I’m asking for a golden sketch that works every time, because that particular flash light of focus will look into a few dark and cobweb lined corners.

mterrill · February 8, 2020, 1:23am

@Dave, who should we tag for the firmware sketch?

I’m working on a new firmware release that I was hoping to get out next week. I’d like to ensure it follows the golden endorsed method, as it gets expensive quickly to do RMAs if a device doesn’t go online.

I’ve already accidentally had two devices that got upgraded past 1.1.1 and then were flashed with the firmware sketch we used back on 1.1.1 …the simple result of that is the devices are effectively bricked as they don’t have the magical combination of wifi.connect and particle connect that >1.1.1 seems to expect from our trial and error.

Viscacha · February 8, 2020, 5:54pm

OK this thread has me intrigued. I’ve been away from the Particle world since 0.8.x was just about becoming a finished thing. At that point it was fairly clear that our application was not very fond of anything later than 0.6.3/4 and amongst other things it was these kinds of modes that had outstanding issues.

My project got revived. Relying on 0.6.3 to maintain support seems dangerous, and there are functions in 0.8/1.2 that are interesting from a diagnostics point of view, as well as helpful increases in publish size.

However, being away that long means all the little outstanding new or perceived issues that people encountered along that path have passed me by. It sounds like some of them are issues that would apply to the kind of things I was looking at back then (SYSTEM_THREAD(ENABLED); poor cellular signal etc etc). I am aware that often some of these issues are down to poor implementation when they appear in this forum, but they also arise because some of the sample code historically breaks rules described elsewhere (using strings in publish instead of char arrays), ignores any kind of mitigation (publishing without first establishing that a connection is there), or is simply too simple.

I agree a more complex example use case that implements more of the advanced features is something many people would find useful.

Dave · February 10, 2020, 7:09pm

Hi @mterrill,

I will try to set aside some time this week to write something up we can share and run it past the team. I really like this idea of sharing test cases and Particle Device OS Examples that could really help product creators and power users like yourself.

Thanks,
David

mterrill · February 11, 2020, 12:34am

There were some changes after 0.7.3 (? going from memory) that required careful use of waitfor on subscribes, but the major connect etc changes and bugs came in after 1.1.1. I noticed the RC for 1.5.0 has the fix for WiFi.macAddress(); the sequencing that makes that reliable may have fixed some other edge cases.

I’m wanting to move the fleet to 1.4.4 as we have feature updates to our firmware to deploy and all the betas have been on 1.4.x. However, I’ve observed flaky cloud behaviour, so I want to make sure it’s rock solid first. The best way is a validated/endorsed/tested/supported sketch.

mterrill · February 13, 2020, 2:22am

Hi @Dave, hope your week is travelling well. I’d really ask for the sketch again as it’s holding us up and we’ve been seeing errant cloud behaviour with publish/subscribe in the setup block.

Let me know how we can get this going as I’ve got a new app version out with IFTTT integration, but don’t have the matching device firmware published as I don’t want to dig myself into a hole with devices not going online. At the moment if we took down the fleet I imagine Particle Support would simply tell me to go fish. I also imagined that the sketch would be readily available and something @rickkas7 or you would be copy/pasting from a repo full of test cases or simply typing out from memory.

mterrill · February 14, 2020, 1:59am

@avtolstoy do you happen to have a semi automatic system thread enabled sketch that reliably connects to cloud, publishes and subscribes?

Chasing what is the test case for new firmware as we’ve seen instances lately where we had to move a .publish and a .subscribe out of the setup block (it was sitting in a waitfor 15 seconds) into the main loop with a watch for particle.connected. My theory is the cloud is registering instances too slowly occasionally. I presumed there was a test sketch that was used for CI …

calebatch · February 15, 2020, 4:25pm

Here is the Shell Code I start with. I have tested it reliably on the Boron 2G/3G.

mterrill · February 17, 2020, 12:13am

Interesting code, thanks!

I implemented a similar approach with subscribe/publish in the main loop after checking for .connected().

To me it highlights the need for Particle folk to step up and provide code for how they ensure it connects, publishes and subscribes in the most reliable and non blocking way. We’re all in the community simply trying to figure out what’s in the black box of particle cloud and what combination of firmware commands is the best way.

peekay123 · February 17, 2020, 1:00am

@calebatch, thanks for the code! I noticed you have a potential of two consecutive calls to Particle.connect() in setup():

    if(System.resetReason() == RESET_REASON_PANIC || System.resetReason() == RESET_REASON_PIN_RESET || System.resetReason() == RESET_REASON_WATCHDOG){
        Cellular.off();
        delay(200);
        Cellular.on();
        Particle.connect();   // <-- this line can be removed!
    }
    Particle.connect();//ready to connect to the cloud
    waitUntil(Particle.connected);//loops until connected or wdt triggers

Since the second Particle.connect() is unconditional, you can remove the Particle.connect() from the if(...) body as the code will fall through to that connect statement anyway.

UMD · February 17, 2020, 8:09am

I am behind you all the way here @mterrill!

For example, the fast flashing cyan (i.e. Particle connection dropped), which kicks off at random times and is only solved by a reset, is a worry…

In lieu of Particle input, I am wondering if a “COIN project” (where “COIN” = Community Of Interest Network) is the way to go here? The group would work together to build a “best practice” minimal application that can be used as a test jig and template.

ScruffR · February 17, 2020, 8:32am

Might be good if @oddwires could chime in on here

hwestbrook · February 17, 2020, 6:10pm

Hi @Dave,

Does Particle do long term test running? e.g. have some sort of test rig setup that tests device-os stability over a long period of time?

You would earn a lot of trust with me if you had a long term test rig setup for each of your “production” releases, then logged issues to the Particle Status page when those devices went offline, had to be reset, etc…
