I’m pretty sure I’ve diagnosed the problem, though I had to make some guesses about how the Spark core hardware interacts with sketches uploaded to it.
Summary:
I can’t flash the Spark core when the sketch it’s running rarely or never iterates over the Arduino-like main loop(). So long as the running sketch iterates over loop() frequently, there’s no problem. I’m guessing, but I think this is because the Spark only does certain WiFi admin work (e.g. listening for new firmware flashes) between iterations of loop().
The obvious workaround is to make sure sketches are constructed such that they frequently loop(). But it appears that iterating over loop() incurs overhead, and in my case the performance differences are dramatic: two stepper motors I’m controlling move 5x faster (!) in a sketch that avoids iterating over loop(), vs. an otherwise equivalent sketch that loop()s frequently.
I can cope with the problem by writing sketches that loop() with a frequency that trades off performance vs. access to the core over WiFi. Still, I’d recommend this behavior be documented clearly, because the source of and solution to the symptoms I observed were very non-obvious. I also think the behavior should be changed, i.e. fixed, if possible.
Details:
-
My diagnosis is consistent with the data I’ve gathered, with one important exception: as noted before, ‘spark list’ showed core(s) as online that I was unable to flash. I don’t have a great explanation for this. But: (1) I’ve noticed that ‘spark list’ reports lag behind reality, so maybe that’s what I saw. Consistent with this, I now see that ‘spark list’ eventually reports my unflashable cores as either offline or nonexistent. (2) It might be that the core hardware works in such a way that ‘flash’ depends on the sketch loop()ing, but ‘list’ does not. This, though, is pure speculation.
-
As I said, my diagnosis relies on some guesses about how the core hardware works. But try this: put a while(1) loop immediately inside your sketch’s main loop(), so it looks like this: void loop() { while(1) { …code… } }. This keeps the sketch from ever loop()ing. And if your results are like mine, any core running such a sketch will be stubbornly unflashable.
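Concretely, the never-returning structure I mean looks like this (a sketch fragment, not a complete program; the comment stands in for your real code):

```cpp
void setup() {
    // usual pin setup etc.
}

void loop() {
    while (1) {
        // ...code...
        // Because control never returns from loop(), the firmware
        // (on my theory) never gets a chance to service WiFi traffic,
        // including flash requests.
    }
}
```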
-
If I’m right, the real problem probably has nothing to do with either the CLI or OS X. It’s really about (a) the inability to talk to a core that loop()s rarely or never, and (b) the unexpectedly poor performance of some sketches that loop() frequently.
-
It would be easy to dismiss the problem – just as my initial post was evidently dismissed – as fundamentally about software that’s been designed badly, or at least idiosyncratically. That would be a mistake, because it would ignore the fact that so-deemed “good” software design can badly hurt performance. As an example, I wrote two sketches to control (x,y) stepper motors, moving a printer head about 10cm diagonally and back. One sketch loop()s very frequently, and completes the task in 50s. The other doesn’t loop() at all but is otherwise identical; it completes the same task in 11s.
Short-Term Recommendation: Document the Behaviors
It looks to me like there are two noteworthy behaviors:
- Cores running sketches that rarely or never loop() are unreachable over WiFi; and
- loop()ing incurs some overhead such that some frequently loop()ing sketches can be markedly and unexpectedly slow.
I suspect that if you look carefully enough at the Spark core documentation, you can find information that suggests or even explicitly informs readers about one or both of these behaviors. But I personally don’t remember anything like that, and I read the docs pretty carefully. Also, apparently no one who read my initial post could diagnose the symptoms I described. So my claim is that these behaviors are strange and important enough that you need to document them more thoroughly.
More important, the symptoms I saw when working with the Spark core seemed downright bizarre:
- Very simple sketch variants, with apparently trivial structural differences, yielded huge (5x) performance differences, which took me forever to debug.
- These apparently trivial changes in sketch structure also dictated whether a Spark core would subsequently respond to or ignore flash attempts.
- Sketches that plodded along on the 32-bit, 72MHz Spark core zipped snappily on an 8-bit, 16MHz Arduino Uno! Moreover, sketch variants that produced dramatic performance differences on the Spark seemed to have no effect at all on the Arduino.
- For most kinds of sketches, loop()-frequency variants seem to have no effect when run on the Spark core either. I suspect the problem only occurs in sketches that depend on very precise timing, i.e. on the order of microseconds. This is certainly true of my problematic sketches: there are several stepper-motor-controlling programs and libraries in the public domain, but common to most of them are delays between writes to output pins on the order of hundreds or even tens of microseconds. For sketches using such very short delays, could an innocuous-seeming loop() overhead end up slowing things down considerably?
Another clue comes from this Stack Exchange post, which explains that Arduino’s loop() overhead is so modest as to be more or less negligible. By contrast – and again I’m guessing – the Spark core’s loop() overhead (presumably spent supporting its WiFi comms) is expensive enough that in some cases it becomes meaningful relative to the time spent in a typical loop() iteration.
Of course, when all the evidence is summarized like this, the likely culprits suggest themselves. But piecing it all together took me many, many hours. In my opinion, today’s documentation doesn’t adequately (if at all) explain these behaviors, which come and go with small and seemingly arbitrary changes, and which tend to manifest in ways that obscure rather than identify what’s really going on.
So I think it would be better if the official documentation pointed out the gotchas more clearly.
Medium-Term Recommendation: Fix the Bug
I realize that some people will dispute whether these behaviors constitute a “bug”. So as evidence, let me offer the AccelStepper library for Arduino. AccelStepper improves on the standard Arduino IDE Stepper library in a variety of ways, and is by Arduino standards astonishingly well-documented. In my experience, AccelStepper is f***ing awesome. It’s not perfect, but honestly it’s beautiful.
So you can imagine my disappointment – and confusion – when my efforts to port AccelStepper to the Spark core failed miserably. I changed “Arduino.h” to “application.h” (aside: had to dig deeper for that one than seems reasonable), and I was careful to use the right GPIO pins for my various needs. But it just doesn’t work. Among other things, steppers controlled by AccelStepper are way too slow on a Spark core – much, much slower than the same code running on a dumb, slow Arduino.
Why? Until a few hours ago, I was stumped. But now I think I get it: unlike most other stepper-controlling software, AccelStepper employs an architecture centered on frequent calls to a certain administrative function called run(). The documentation advises sketches to call run() as frequently as possible so as to ensure that the controlled motor is stepped at just the right time, accelerating and decelerating, bobbing and weaving, etc. It really is something. But now I strongly suspect that AccelStepper commonly relies on very brief intervals between steps, relative to which the Spark core’s loop() overhead is material. So everything gets screwed up.
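For anyone unfamiliar with the library, the pattern its documentation prescribes looks roughly like this (condensed from AccelStepper’s documented examples; the pin numbers and motion parameters are arbitrary):

```cpp
#include <AccelStepper.h>

// Driver-style wiring: step on pin 2, direction on pin 3 (arbitrary example pins).
AccelStepper stepper(AccelStepper::DRIVER, 2, 3);

void setup() {
    stepper.setMaxSpeed(1000.0);
    stepper.setAcceleration(500.0);
    stepper.moveTo(2000);
}

void loop() {
    // run() must be called as often as possible: each call polls the clock
    // and issues at most one step when one is due. Any fixed per-loop()
    // overhead therefore inserts itself between every pair of steps.
    stepper.run();
}
```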
Because this reliance on frequent calls to run() lies at the heart of the AccelStepper library’s software design, porting the library to the Spark core is quite non-trivial. I’m certainly not going to do it. Instead, to migrate from Arduino to the Spark core, I’m going to abandon AccelStepper and either substantially modify some other stepper library, or write my own from scratch.
A big part of the Spark core’s value proposition is that it’s Arduino-compatible, i.e.: code written for Arduino will run on the Spark with little or no modification. As far as AccelStepper goes, that’s just plain untrue.
Whether designs like AccelStepper’s are common or rare, I don’t know. But it sure feels like a bug to me.
But if All This Is too Much Trouble…
Maybe we can still blame the whole thing on some silly noob and his flaky home WiFi network.