Fun times! I tried this, saw the same hard fault, and solved it. There are two issues. One is manually handling register state when the assembler isn't helping you. The other is ARM/Thumb-2 interworking.
TL;DR
- Always prepend 0xff, 0xb4 to your machine code.
- Always append 0xff, 0xbc, 0x70, 0x47 to your machine code.
- Call one byte after your pointer.
Explanation
Handling register state
Your code, whatever it is, changes some register values. When you return to the caller, those registers cause a fault because they don't hold their expected values, and they get used in some unintended way.
In assembly this problem is typically solved by pushing registers onto the stack at the start of a function (b4ff), and then popping them off the stack at the end (bcff). The ff in both cases says to push/pop all the low registers (r0–r7), just to be safe. There also exist general registers r8–r12, but the 16-bit Thumb push/pop encodings can't name those.
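The prepend/append rule can be sketched as a small buffer-building helper. This is only an illustration: the name wrap_thumb_payload and its signature are hypothetical, and it assumes the payload is already valid 16-bit Thumb code in memory byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Prologue and epilogue bytes from the TL;DR, already in the
   little-endian byte order they have in memory. */
static const unsigned char PROLOGUE[] = { 0xff, 0xb4 };             /* push {r0-r7} */
static const unsigned char EPILOGUE[] = { 0xff, 0xbc, 0x70, 0x47 }; /* pop {r0-r7}; bx lr */

/* Hypothetical helper: writes prologue + payload + epilogue into out.
   Returns the number of bytes written, or 0 if out is too small. */
size_t wrap_thumb_payload(const unsigned char *payload, size_t payload_len,
                          unsigned char *out, size_t out_cap)
{
    size_t total = sizeof(PROLOGUE) + payload_len + sizeof(EPILOGUE);

    assert(out != NULL);
    assert(payload != NULL || payload_len == 0);

    if (total > out_cap) {
        return 0;
    }
    memcpy(out, PROLOGUE, sizeof(PROLOGUE));
    if (payload_len > 0) {
        memcpy(out + sizeof(PROLOGUE), payload, payload_len);
    }
    memcpy(out + sizeof(PROLOGUE) + payload_len, EPILOGUE, sizeof(EPILOGUE));
    return total;
}
```

Calling it with a 2-byte payload and a large enough buffer yields 8 bytes: the push, the payload, the pop, and the bx lr.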
The final 4770 is just the bx lr "return" I suggested on March 29.
(These 16-bit Thumb instructions are 2 bytes wide and stored little-endian, so logical 4770 is stored in memory as 7047.)
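The byte swap is easy to get wrong by hand, so here is a minimal sketch of it. The helper name thumb16_store_le is made up for illustration, and it only covers 16-bit instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one 16-bit Thumb instruction in the
   order it appears in memory on a little-endian target. */
void thumb16_store_le(uint16_t insn, unsigned char out[2])
{
    assert(out != NULL);
    out[0] = (unsigned char)(insn & 0xff);        /* low byte first */
    out[1] = (unsigned char)((insn >> 8) & 0xff); /* high byte second */
}
```

Passing 0x4770 (bx lr) gives the bytes 0x70, 0x47, and 0xb4ff gives 0xff, 0xb4, matching the TL;DR.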
Address-based Interworking
ARM instructions are 4 bytes wide, whereas the Thumb instructions used here are 2 bytes wide (Thumb-2 also adds some 4-byte encodings). Both are always aligned in memory to start at even addresses.
In general a program can have both ARM and Thumb instructions, and so the processor needs to know which mode it's in. When you call a function, you might be switching between ARM and Thumb mode. The register-based branch instructions, such as bx and blx, determine what mode to use by looking at the least significant bit of the address. They can do this because instruction addresses are always even, so the least significant bit would otherwise always be zero, and ARM turned that bit into a mode flag.
Quoting a great article titled Branch and Call Sequences Explained on the ARM Processors blog:
Address-based interworking uses the lowest bit of the address to determine the instruction set at the target. If the lowest bit is 1, the branch will switch to Thumb state. If the lowest bit is 0, the branch will switch to ARM state. Note that the lowest bit is never actually used as part of the address as all instructions are either 4-byte aligned (as in ARM) or 2-byte aligned (as in Thumb).
Particle firmware is built entirely in Thumb mode to save code space, so we always want bit 0 of the branch address to be 1.
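To make the bit-0 rule concrete, here is a small sketch on plain integers. The function names are hypothetical, and the addresses in the usage note are arbitrary example values, not anything specific to Particle hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Set bit 0 so that bx/blx enters Thumb state at this address. */
uintptr_t thumb_entry(uintptr_t code_addr)
{
    assert((code_addr & 1u) == 0); /* code itself starts on an even address */
    return code_addr | 1u;
}

/* Returns 1 if the branch target selects Thumb state, 0 for ARM state. */
int selects_thumb(uintptr_t branch_target)
{
    return (int)(branch_target & 1u);
}

/* The address instructions are actually fetched from: bit 0 cleared. */
uintptr_t fetch_address(uintptr_t branch_target)
{
    return branch_target & ~(uintptr_t)1;
}
```

For code copied to 0x20000400, thumb_entry returns 0x20000401. That is the "one byte after your pointer" from the TL;DR, and fetch_address maps it back to 0x20000400.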
You can do the casting dance suggested on Stack Overflow to force the compiler to let you turn a data pointer into a function pointer (on other architectures they're not necessarily the same size) and point it at the right address like this:
void (*f)(); // declare function pointer f
*((void**)&f) = p + 1; // assign f to p+1, simpler syntax won't compile
f(); // call f
However, I personally find the single line of inline assembly simpler:
asm( "blx %0" : /* no outputs */ : "r" (p+1) : "lr", "cc", "memory" ); // "r": blx needs a register operand; blx overwrites lr
Example Code
const unsigned char d7on[] = {
    0xff, 0xb4,                                     // push {r0-r7}
    0x40, 0x21, 0x09, 0x02, 0x02, 0x20, 0x01, 0x43, // payload
    0x09, 0x04, 0x20, 0x20, 0x00, 0x02, 0x88, 0x61, // payload, continued
    0xff, 0xbc,                                     // pop {r0-r7}
    0x70, 0x47 };                                   // bx lr
const size_t CODE_LEN = sizeof(d7on) / sizeof(d7on[0]);
void setup() {
    pinMode(D7, OUTPUT);
}
void callMemFunc() {
    void *p = malloc(CODE_LEN);
    if (p == NULL) return; // out of heap, nothing to call
    memcpy(p, d7on, CODE_LEN);
    // either cast void* to a function pointer
    //void (*f)();
    //*((void**)&f) = p + 1;
    //f();
    // or just write the single branch instruction
    // ("r" keeps the target in a register; blx overwrites lr)
    asm( "blx %0" : /* no outputs */ : "r" (p+1) : "lr", "cc", "memory" );
    // Don't forget to clean up after yourself.
    free(p);
}
void loop() {
    delay(10000);
    callMemFunc();
    Particle.publish("I called a function in memory.");
}
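For anyone curious what the sixteen payload bytes in d7on do, here is a sketch that replays them on plain integers. The mnemonics are my own decoding of the bytes. Reading the result as "set bit 13 through a GPIO set/reset register at 0x40020018" is an assumption about the target's memory map, not something stated in the post.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t address; /* where the final str writes */
    uint32_t value;   /* what it writes */
} store_t;

/* Hypothetical helper: replays the eight payload instructions on
   ordinary variables instead of real registers. */
store_t replay_d7on_payload(void)
{
    uint32_t r0, r1;
    store_t s;

    r1 = 0x40;   /* 2140  movs r1, #0x40     */
    r1 <<= 8;    /* 0209  lsls r1, r1, #8    */
    r0 = 0x02;   /* 2002  movs r0, #2        */
    r1 |= r0;    /* 4301  orrs r1, r0        */
    r1 <<= 16;   /* 0409  lsls r1, r1, #16   */
    assert(r1 == 0x40020000u); /* base address built without a literal pool */
    r0 = 0x20;   /* 2020  movs r0, #0x20     */
    r0 <<= 8;    /* 0200  lsls r0, r0, #8    */

    s.address = r1 + 0x18; /* 6188  str r0, [r1, #0x18] */
    s.value = r0;
    return s;
}
```

The replay ends with a single store of 0x2000 to 0x40020018. The shifts and ORs are there because a 16-bit movs can only load an 8-bit immediate.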