Confusing loop timing

I’ve got another mystifying problem. Say I take this simple bit of code (note that it seems arbitrary, but my application does actually need this sort of setup) on manual threading:

ATOMIC_BLOCK() {
    digitalWriteFast(D1,HIGH); //for timing
    while(i<(10)){
        i++;
        if(i==1){
            variable=true;
        }
    }
    digitalWriteFast(D1,LOW); //for timing
}

Just counting up to 10, setting a variable when it gets to 1, and timing it. This takes 200 nanoseconds to run. I can make the loop run an arbitrary number of times with almost no change in timing (I tried while(i<(100000)){ and it was still 200 ns). Perfect. Next, I try:

ATOMIC_BLOCK() {
    digitalWriteFast(D1,HIGH); //for timing
    while(i<(10)){
        i++;
        if(i==1){
            analogWrite(DAC1, 4000);
        }
    }
    digitalWriteFast(D1,LOW); //for timing
}

Just counting up to 10, and setting a DAC value when it reaches 1 (the DAC itself doesn’t actually matter; I’ve tried it with digitalWrite calls etc.). On an oscilloscope, it takes about 4.5 us to run. My timing estimates show that the DAC write alone takes 4 us, so that’s perfectly fine.

Now, say I make it loop 100,000 times here. This should, according to my intuition, only take 4 us + 200ns, because it’s only setting the DAC value when it counts to 1.

And yet, this is not what I see. I get a time of 7.5 ms, as though it’s actually setting the DAC value 2000 times instead of once. I’ve replicated this on multiple P1s.

@Arthur_dent_42_121, where do you initialize i?

Sorry about that, didn’t show that. I have that initialized above outside the ATOMIC_BLOCK.

[quote=“Arthur_dent_42_121, post:1, topic:19553”]
(tried while(i<(100000)){, still 200ns).
[/quote] This does not make much sense to me. I’ll have to try timing this later tonight to see what I get.

I think the majority of the 200 ns comes from the digitalWriteFast calls that I use to time it. I think the loop itself is much faster than that.

@Arthur_dent_42_121, I missed the “fast” part! However, even at 1ns (highly unlikely) per loop, it should take 100,000ns!

Perhaps I misunderstood, but the digitalWriteFast isn’t actually in the loop; I’m just calling it before and after the loop to time it.

ATOMIC_BLOCK() {
    digitalWriteFast(D1,HIGH); //for timing
    while(i<(10)){
        i++;
        if(i==1){
            analogWrite(DAC1, 4000);
        }
    }
    digitalWriteFast(D1,LOW); //for timing
}

This is the code that takes 7.5 ms, where I think it should only take 4 us.

@Arthur_dent_42_121, I suggest you take the loop out and keep the analogWrite() along with the digitalWriteFast() calls to time JUST the DAC call. You should put that alone in loop().

I did that before, and I got a time of 4 us.

So now add a loop of 10 that contains ONLY the analogWrite() command and time that (no if (i==1)).

That takes 40us, exactly as one would expect. And 100 cycles is 370us.

If I take the same code that I just used to time 100 cycles, and add in the if(i==1){analogWrite}, it takes 10 us.

For analyzing performance at this level of depth and detail, it is a good idea to compile using the local toolchain. That will output a map file that shows your original source code and the assembler produced by the compiler. This should shed some light on the differences between the loop cases.


You also have to take the optimizer into consideration.

If it interprets your code as doing nothing useful except the one time when i==1, it might take the liberty of not doing the useless work at all.
