Local cloud - SOS panic flash with user firmware [Solved]

pixelboy · August 5, 2014, 12:42am

@gruvin You are correct even using setup it still sometime goes into SOS cycle. I am running the local server on my laptop… it is OSX and it is 64 bit if that’s relevant. I am going to try running it off a centOS server and see if I get different results.

@kennethlimcp @mdma I’m just going through all the comments now. Is there anything i can do at this point to help? Let me know.

mdma · August 5, 2014, 12:45am

@pixelboy - this is open source - you know your own skills, so you know best where to help! And of course, the help is truly appreciated - loved in fact!

kennethlimcp · August 5, 2014, 1:02am

I would say, try the 2 older tinker firmware posted above. The more people proving that the firmware causes issue the better.

We will have a reproducible test case then

pixelboy · August 5, 2014, 2:19am

Okay I have something, but I’m not sure if it helps at all. I just installed the spark local server on a remote webserver I have. The server is CentOS 64 bit. Everything is working flawlessly. Custom firmware works. Tinker works. I can power everything down and up and it still works.

The interesting thing was before I setup the port mapping so the server could communicate back to the core the core was perfectly happy. I don’t understand this because if the core didn’t get a response from the server it should flash cyan. This is where I wish I really knew what was going on under the hood.

@kennethlimcp I tried the old tinker firmware and it works way better.

Dave · August 5, 2014, 3:19am

I just wanted to pop in and say you guys are awesome, I’m watching this thread very excitedly.

Thanks,
David

kennethlimcp · August 5, 2014, 3:22am

It will be best if someone can compile locally some custom user firmware with all the 3 branches before V0.2.3 was introduced.

If that works well locally without any SOS flashes, we might have found out something

I did no setup my local environment and won’t be doing it anytime soon

kennethlimcp · August 5, 2014, 3:24am

It’s so weird why the firmware for V.0.2.3 works on a remote server but not locally.

I hope that i am right about the firmware causing the issue but it’s bugging me why remote vs local gives a different behavior for SOS.

gruvin · August 5, 2014, 3:25am

@kennethlimcp … FWIW, I found the local build environment setup pretty painless on a Mac.

Meanwhile, I got workaround going for the DEBUG macro to work. But then I found that none of the DEBUG macros are available from down inside spark_protocol.cpp, where the handshake code is.

This is getting frustrating. We’re working in the dark. The people who wrote this code originally would be able to nail it so much faster.

gruvin · August 5, 2014, 3:26am

Me too. I believe it is firmware though, because we're seeing a hard fault. Ideally, a hard fault should never happen -- no matter what the server does or does not do or when.

kennethlimcp · August 5, 2014, 3:27am

ok @gruvin, less talk more work. (kidding)

Go pull the 3 branches before V.0.2.3 and do a compile with some user firmware.

use this!!! @gruvin

gruvin · August 5, 2014, 3:28am

I'll have to figure out how to do that. Back to Git documentation. (I've been a Subversion user for years. Git is still very new to me.)

EDIT: I see the tags in the web browser. But how to clone or pull them? I'm lost.

kennethlimcp · August 5, 2014, 3:36am

You can download all the zip file and use the files.

They contain all the files tagged during the V.0.2.2 release and does not contain commits after that

gruvin · August 5, 2014, 3:39am

OK. Thanks. That works.

gruvin · August 5, 2014, 3:44am

I’ll edit this post as I go.

So far …

spark_6: still fails. Moving to spark_5 …
spark_5: still fails. Moving to spark_4 …
spark_4: still fails. Moving to spark_3, to clean compile locally as a sanity check …
spark_3: still fails! What the?! …

OK … so I’ll try the downloaded binary from spark_3 … works just fine.

HMMM.

Maybe that binary is actually from spark_2? Moving to download and locally compile spark_2 then …

There are no tags earlier than spark_3 for communication-lib or common-lib. spark_2 core-firmware does not compile against those.

kennethlimcp · August 5, 2014, 3:58am

Nah. Spark_4 was the link I paste

gruvin · August 5, 2014, 4:00am

Right. So I'm going to try the Spark HQ compiled binary from spark_7. This could be a local compiler issue. Will edit this post with results shortly.

That still fails. Far out. OK, so I'll try the Spark binary from spark_6 (and keep moving down the chain until I find the latest version that works).

The binary from spark_6 is working. No SOS.

But ALL versions I compile locally, from spark_3 to spark_7 inclusive, fail with the SOS.

So let's put that in our pipes and smoke it for a bit. Gee. Hmm. :-/

(And yes -- I always test at least three resets, to be sure I'm not being mislead. If I say it worked, it means the core connects first time, every time.)

Oh and all versions that I compile locally DO WORK just fine with the global cloud ... though right now, that's not making sense. So I'm gonna double check. ... and that is definitely working fine.

So ... only local builds and local server (and tag:spark_7's Spark compiled binary) are failing on the local server. The plot has thickened.

gruvin · August 5, 2014, 4:43am

spark flash --usb tinker exhibits the SOS on the local cloud server, too. That’s not surprising though, since the same is true for the core-firmware.bin binary in tag:spark_7. (Recalling that the binary from tag:spark_6 appears to be OK.)

I wonder if the Spark server build environment itself got an upgrade recently? Like the compiler and tools and stuff? I believe I am running the very latest release version of arm-gcc and have been since the (my) start.

That version is arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 4.8.3 20131129 (release) [ARM/embedded-4_8-branch revision 205641]

pixelboy · August 5, 2014, 4:29pm

I want to add one more weird thing that seems to be happening consistently… if I leave the core unplugged for a while (like hours) and plug it in… it seems to work the first time (not subsequent times). Is there some kind of cache? Some bit of connection data that resides in volatile memory?

Dave · August 5, 2014, 4:39pm

Hi @gruvin,

We did update the ARM toolchain recently on the build server so we could get the newlib stubs for the big ram improvement.

HMM. If you’re compiling locally, and you don’t have that newer ARM toolchain, I’m not sure why there would be local vs. public server differences. The handshakes should be essentially identical. Maybe certain size / types of local server / core crypto keys are causing a fault when compiled with different toolchains? – I’m guessing here.

Thanks,
David

Dave · August 5, 2014, 4:41pm

I’ll tryout the spark_7 binary in my local server and see if I can spot anything.

edit: argh. okay, no luck with that tagged binary either. Can someone send me a failing local cloud server key / core key to test, just in case?

Thanks,
David

Topic		Replies	Views
Local cloud server connection problem, spark core led turn red and then reset Cloud	8	2717	August 27, 2014
Local cloud: Persistent hard fault after core disconnection Troubleshooting	17	3255	August 7, 2014
Spark local Cloud (Beta) - Collection of issues Troubleshooting	8	2396	August 7, 2014
Tutorial: Local Cloud 1st Time instructions [01 Oct 15] Tutorials	144	43525	September 29, 2021
Local Cloud Beta Testing [Where is the Source code for the Cloud] Cloud	121	12603	November 9, 2014

Local cloud - SOS panic flash with user firmware [Solved]

Related topics