Reducing Photon run-time power consumption (HOWTO)

Not sure if this will help anyone else, but I have found a way to reduce the runtime power consumption of the Photon. These things could be done by modifying the bootloader, but as I am currently unable to reflash the bootloader, I wanted to find a way to reduce power consumption without touching the bootloader.

Note: This is highly experimental: it messes with the internal clocks, and there are various flow-on effects. However, I have always been able to recover to DFU mode even when things have gone wrong.

I would be interested in feedback / suggestions / improvements on the following code. I have noted some of the problems below the sample code… In normal operating mode, my Photon(s) tend to use about 32mA with WiFi disabled.

Let me explain how this works… The SYSCLOCK runs at 120MHz with the APB1 and APB2 prescalers set to 4 and 2 respectively, giving 30MHz on APB1 and 60MHz on APB2. We can reduce the CPU clock to 60MHz by setting the AHB prescaler to 2. Then, to retain the same speed for many of the peripherals, we reduce the APB1 and APB2 prescalers to 2 and 1…

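void stm32f2_lowpower3(void) {
    // Clear the AHB (HPRE), APB1 (PPRE1) and APB2 (PPRE2) prescaler fields
    RCC->CFGR &= ~0xfcf0;
    // AHB /2. Per the correction later in this thread, 0x0180 should be 0x1080 (APB1 /2)
    RCC->CFGR |= 0x0180;

    // Refresh SystemCoreClock, then reconfigure SysTick for the new clock
    SystemCoreClockUpdate();
    SysTick_Configuration();

    // Disable the FLASH prefetch buffer
    FLASH->ACR &= ~FLASH_ACR_PRFTEN;
}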

The function SystemCoreClockUpdate() updates the value of the variable SystemCoreClock. This then comes into play with SysTick_Configuration.

The above function reduces power consumption to about 19.5mA (a reduction of roughly 39% from 32mA) - everything seems to operate correctly (although I have yet to test things like I2C or SPI). Of course with the slower CPU speed, your code will run slower.

The FLASH->ACR line is disabling the Prefetch for FLASH memory, which seems to reduce consumption by about 0.3mA. Not much, but every bit counts. Also disabling the RGB LED drops consumption by up to 2mA or so.

Next step we do a similar sort of thing… divide the CPU clock by 2 again. However we can no longer reduce APB2, but we can still reduce APB1 to 1…

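void stm32f2_lowpower2(void) {
    // Clear the prescaler fields, then set AHB /4 with APB1 and APB2 at /1
    RCC->CFGR &= ~0xfcf0;
    RCC->CFGR |= 0x0090;

    // Refresh SystemCoreClock, then reconfigure SysTick for the new clock
    SystemCoreClockUpdate();
    SysTick_Configuration();

    // Take control of the RGB LED and turn it off
    RGB.control(true);
    RGB.color(0, 0, 0);

    // Disable the FLASH prefetch buffer
    FLASH->ACR &= ~FLASH_ACR_PRFTEN;
}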

The above code has SYSCLK running at 120MHz / 4 = 30MHz. Peripherals attached to APB1 will be running at normal speed, however those attached to APB2 will be running at half speed. Current consumption drops to about 13.8mA (about 55% reduction).

Finally, we can keep reducing the SYSCLK, however there is a diminishing return, so the final bit of code is about the slowest we can operate:

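void stm32f2_lowpower1(void) {
    // Clear the prescaler fields, then set AHB /64 with APB1 and APB2 at /1
    RCC->CFGR &= ~0xfcf0;
    RCC->CFGR |= 0x00c0;

    // Refresh SystemCoreClock, then reconfigure SysTick for the new clock
    SystemCoreClockUpdate();
    SysTick_Configuration();

    // Take control of the RGB LED and turn it off
    RGB.control(true);
    RGB.color(0, 0, 0);

    // Disable the FLASH prefetch buffer
    FLASH->ACR &= ~FLASH_ACR_PRFTEN;
}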

The third bit of code above reduces consumption to about 8.75mA (roughly 27% of the original 32mA).
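
To sanity-check the values written to RCC->CFGR in the three functions above, here is a minimal host-side sketch that decodes the prescaler fields into bus clocks. It assumes a 120MHz clock feeding the AHB prescaler and the STM32F2 field layout (HPRE in bits 7:4, PPRE1 in bits 12:10, PPRE2 in bits 15:13). The names BusClocks and decode_cfgr are made up for this illustration and are not part of any library.

```cpp
#include <cstdint>

// Bus clocks that result from a given set of RCC->CFGR prescaler bits.
struct BusClocks {
    double hclk_mhz;  // AHB / CPU clock
    double apb1_mhz;  // APB1 peripheral clock
    double apb2_mhz;  // APB2 peripheral clock
};

// HPRE (bits 7:4): 0xxx = /1, 1000 = /2, 1001 = /4, 1010 = /8, 1011 = /16,
// 1100 = /64, 1101 = /128, 1110 = /256, 1111 = /512 (there is no /32 step).
static unsigned ahb_divider(uint32_t cfgr) {
    static const unsigned table[8] = {2, 4, 8, 16, 64, 128, 256, 512};
    uint32_t hpre = (cfgr >> 4) & 0xF;
    return (hpre & 0x8) ? table[hpre & 0x7] : 1;
}

// PPRE1 / PPRE2 (3-bit fields): 0xx = /1, 100 = /2, 101 = /4, 110 = /8, 111 = /16.
static unsigned apb_divider(uint32_t field) {
    return (field & 0x4) ? (2u << (field & 0x3)) : 1;
}

// Assumption: sysclk_mhz is the 120MHz clock mentioned in the first post.
BusClocks decode_cfgr(uint32_t cfgr, double sysclk_mhz = 120.0) {
    double hclk = sysclk_mhz / ahb_divider(cfgr);
    return { hclk,
             hclk / apb_divider((cfgr >> 10) & 0x7),
             hclk / apb_divider((cfgr >> 13) & 0x7) };
}
```

Decoded this way, the default 0x9400 gives 120/30/60MHz (HCLK/APB1/APB2), 0x0090 gives 30/30/30MHz and 0x00c0 gives 1.875MHz on all three. The value 0x0180 decodes to 60/60/60MHz, which puts APB1 above its usual 30MHz, while 0x1080 gives 60/30/60MHz. That is consistent with the correction posted later in the thread.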

There are a couple of problems at this point that need to be overcome:

  • The RGB LED stepping / breathing behaves oddly (effectively flashing rather than breathing).
  • Any timers set prior to the above code will need to be adjusted. The SparkIntervalTimer library needs to be patched to support the above code.
  • With the highest power saving setting (the third code segment above) the serial output can misbehave, so I suspect other things like I2C and SPI will need some messing around.

During my experiments I have also found the timers / clocks seem a little “out”, however I have yet to follow up on that (they tend to drift when compared to another independent time source). In my case I need relatively accurate timing and losing a few seconds every couple of hours is not good.

So… rather than continuing to dig myself… anyone have any other suggestions???

Note: I know I said this earlier… but this is highly experimental code… so take care…

I have done some testing with the OneWire library and a DS18S20 temperature sensor. The OneWire library does need a little patch for the delayMicroseconds… This function normally ends up calling HAL_Delay_Microseconds, however this seems to generate the wrong delay when the clocks are modified.

Does anyone know if HAL_Delay_Microseconds is compiled into the bootloader, or is it compiled into the user code? Either way, rather than call delayMicroseconds within OneWire, it is better to call a function similar to the following:

inline void mydelay(uint32_t uSec) {
    volatile uint32_t DWT_START = DWT->CYCCNT;
    volatile uint32_t DWT_TOTAL = (SYSTEM_US_TICKS * uSec);

    while((DWT->CYCCNT - DWT_START) < DWT_TOTAL)
        HAL_Notify_WDT();
}

Using the above delay loop, I can get the sensor (generally) working with the earlier function stm32f2_lowpower3. However at higher levels of power savings (slower clock), the timing is wrong for OneWire as it is tied to APB2.
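
One property of the delay loop above worth spelling out: because it compares the elapsed count (current minus start) rather than an absolute end value, it keeps working when the 32-bit cycle counter wraps. The sketch below models that on the host with plain integers instead of DWT->CYCCNT. The helper names cycles_elapsed and delay_done are hypothetical, and ticks_per_us stands in for SYSTEM_US_TICKS.

```cpp
#include <cstdint>

// Cycles elapsed between two readings of a free-running 32-bit counter.
// Unsigned subtraction is modulo 2^32, so the result is still correct
// if the counter wrapped once between the two readings.
uint32_t cycles_elapsed(uint32_t start, uint32_t now) {
    return now - start;
}

// True once at least `us` microseconds' worth of cycles have passed.
// ticks_per_us is the number of counter ticks per microsecond
// (an assumption standing in for SYSTEM_US_TICKS).
bool delay_done(uint32_t start, uint32_t now, uint32_t us, uint32_t ticks_per_us) {
    return cycles_elapsed(start, now) >= us * ticks_per_us;
}
```

For example, with start = 0xFFFFFFF0 and a reading of 0x00000010 after the wrap, cycles_elapsed returns 0x20, so a loop built on it exits at the right time.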

Are any of the pins tied to APB1? In which case I might be able to get OneWire functioning with stm32f2_lowpower2.

Note: There are far more CRC errors, which I suspect are due to OneWire and the delay function. It may need to be slightly more accurate… as SYSCLOCK is slowed, instructions take longer, so a delay of 3us might actually end up being longer at slower clock speeds.

Very nice - thank you!
Just tested your code and could confirm your stated power savings. Just for the first second after rebooting the power consumption is higher (around the original values)… I assume this could only be changed by touching the bootloader, uff?!

I tested these settings in combination with an SPI controlled eInk shield/display - all three settings including the lowest power setting worked out of the box, excellent :wink:

I am slowly working out some of the bugs mainly in the OneWire library. SPI and I2C I think run on hardware, and seem less prone to timing errors. However the more common OneWire library uses software for timing and as such is very prone to errors.

There are a couple of other tricks. For instance if you shut down all the peripheral clocks you can save a tiny bit more… however if you are using something like SPI or I2C you need to figure out which clocks to leave running. The pin_map helps with this - but I am focused on other issues right now. For example, the following will shut everything down except for the LED on pin D7:

/* Disable all the peripheral clocks */
RCC_APB1PeriphClockCmd(0xfffffffe, DISABLE); /* Bit 0x00000001 = LED flash */
RCC_APB2PeriphClockCmd(0xffffffff, DISABLE);
RCC_AHB1PeriphClockCmd(0xffffffff, DISABLE);
RCC_AHB2PeriphClockCmd(0xffffffff, DISABLE);
RCC_AHB3PeriphClockCmd(0xffffffff, DISABLE);
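
If some peripherals need to stay clocked, the mask passed to calls like the ones above can be built from the bits to keep. The following is only a sketch: the three constants are local names for bit positions in the STM32F2 APB1 enable register, not the library's own macros. Which timer, I2C or SPI instance a given Photon pin actually uses is an assumption to check against the pin map.

```cpp
#include <cstdint>

// Bit positions in RCC_APB1ENR on the STM32F2 (local names for this sketch).
constexpr uint32_t APB1_TIM2 = 1u << 0;
constexpr uint32_t APB1_SPI3 = 1u << 15;
constexpr uint32_t APB1_I2C1 = 1u << 21;

// Mask of peripheral clocks to disable: every bit except the ones in `keep`.
constexpr uint32_t disable_mask(uint32_t keep) {
    return ~keep;
}
```

disable_mask(APB1_TIM2) gives 0xfffffffe, the same value used in the first call above. Keeping I2C1 as well, disable_mask(APB1_TIM2 | APB1_I2C1), gives 0xffdffffe, which could be passed to RCC_APB1PeriphClockCmd in place of the literal.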

Right now I can get it down to about 8.25mA - and based on some other posts I have been able to change the WiFi to a lower power / higher latency connection - however I have no power measurements on this yet…

You are correct - there is a spike on restart as all the settings are coded into the bootloader. While there are recommendations not to change the clock speeds once the device is running, changing the prescalers seems to be a somewhat better option. My actual aim is to get the consumption to about 4mA while still being able to use SPI, I2C and OneWire stuff.

Hopefully in a week or so I will have a OneWire delay function that works. The other alternative is to use hardware rather than software for OneWire and see if that fixes the timing problem. If I recall correctly I also had an SD card wired in and working in lower power modes… again, trying to nut out the OneWire stuff first.

After a lot of testing the following shows the problems with the OneWire library. dM is the built-in delayMicroseconds and md is the same code as delayMicroseconds, but compiled into the user code. By using the clock cycle counter, the below is the actual number of clock cycles to call each function, divided by the clock speed…

(all delays in us)
            dM(1)   dM(10)  md(1)   md(0)   md(3)   md(10)
normal      1.56    10.56   1.55    0.52    3.36    10.55
lowpower3   3.15    21.17   1.83    1.08    3.68    10.88
lowpower2   6.30    42.33   3.67    2.17    4.97    12.17

So for example, calling delayMicroseconds(1) under normal conditions results in a delay of 1.56us. No doubt many of you have already noticed the problem - all of these delays are inaccurate, particularly below 10us. Even in the best case, calling my custom delay function for 3us under the lowpower2 function from the first post actually results in a delay of about 5us.
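
One way to read the dM columns: they look like what you would get if the built-in delay kept counting cycles for a 120MHz clock while the core actually ran at 60MHz or 30MHz. That is an interpretation of the measurements, not something confirmed from the firmware source at this point. A small model of it follows. The function name observed_delay_us is made up for this sketch.

```cpp
// Delay actually observed when a routine counts cycles for an assumed clock
// (assumed_mhz) while the core really runs at actual_mhz.
double observed_delay_us(double nominal_us, double assumed_mhz, double actual_mhz) {
    double cycles = nominal_us * assumed_mhz;  // cycles the routine waits for
    return cycles / actual_mhz;                // time those cycles really take
}
```

Feeding in the normal-mode figure for dM(10), observed_delay_us(10.56, 120, 60) comes out at about 21.1us and observed_delay_us(10.56, 120, 30) at about 42.2us, close to the measured 21.17us and 42.33us. The md columns, which use the updated clock value, stay near their nominal delays apart from a fixed overhead.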

Looking through the OneWire library, we can see several delays of 10us or less. Given the inaccuracies in the delayMicroseconds loop (ignoring lower CPU speeds) at smaller delays, I would say things like occasional CRC errors and some of the odd behaviour in OneWire library may be due to the inaccuracies in the delays.

So next I need to see if I can create a delay function for the OneWire library that will create the correct delay (unless someone has something already).

Try using the system ticks counter API for these very small delays:
https://docs.particle.io/reference/firmware/photon/#system-cycle-counter

If you encounter any unwanted overhead you can try using it directly like this. Of course this example shows a 1 second delay, but at the time I did this to remove all doubt in timing since SYSTEM_TICK_COUNTER is a macro for a hardware counter.

Defined as such:
#define SYSTEM_TICK_COUNTER (DWT->CYCCNT)

There is entry and exit code for all functions which will add some delay. As @BDub points out, for zero-overhead, precise timing, use the system ticks counter directly and divide by the clock rate.

As it is presently, system routines are unaware of changes to the system clock and will continue to assume the “normal” clock rate. There are steps we can take to allow the application to change the system clock and keep the system informed.

As an alternative to reducing the clock frequency, how about putting the MCU in stop mode when there is no work to do, and set an interrupt to fire for when it should wake? I recall seeing an implementation of OneWire that used the USART to perform the timings needed, which would function well here, allowing the MCU to sleep while delaying.

Try using the system ticks counter API

The problem is delayMicroseconds calls HAL_Delay_Microseconds which actually uses DWT->CYCCNT. As I highlighted earlier, for some reason this function does not adjust to the new clock speed, despite SYSTEM_US_TICKS being defined as SystemCoreClock / 1000000. And I have checked - SystemCoreClock is being adjusted by my call to SystemCoreClockUpdate(). If I copy the HAL_Delay_Microseconds into my application, and call that directly, it does adjust correctly after the CPU scaling is adjusted.

Regardless, it turns out the problem with the OneWire library was two-fold. Firstly the delay function itself is inaccurate at sub 10us, and secondly many of the delay times in the software library are not standard. By correcting these two things, I have OneWire working reliably at 120MHz, 60MHz and 30MHz (less than 1% CRC errors returned on a DS18S20 compared to 5% error rate at 120MHz and 95% error rate at 60MHz previously).

WiFi has been confirmed as working at 60MHz... Consumption for WiFi running normally at 120MHz was 75-105mA... with the clock running at 60MHz and low power WiFi mode I have been able to reduce consumption to about 30mA. At 30MHz WiFi generates a fault condition.

As an alternative to reducing the clock frequency, how about putting the MCU in stop mode

I considered that, but at the interrupt frequency and delays during wake-up, I decided to opt for something a little more challenging - but more rewarding. This way you can gain the benefit of WiFi always being on, yet if running off battery get longer battery life.

Let's assume a 1000mAh battery... right now at about 100mA that gives 10 hours of run time. Looking at this I suspect I can get consumption down to 30mA with WiFi connected, and less than 8mA with it off. Let's say I can do burst transmissions resulting in a 10mA average... giving 100 hours of runtime.
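
The arithmetic behind those figures can be sketched as follows. It assumes an ideal battery (no derating, self-discharge or regulator losses), and the helper names runtime_hours and average_current_ma are made up for this illustration.

```cpp
// Runtime in hours for a battery of capacity_mah at a constant average draw.
double runtime_hours(double capacity_mah, double average_ma) {
    return capacity_mah / average_ma;
}

// Average current when the device spends active_fraction of the time
// drawing active_ma and the rest of the time drawing idle_ma.
double average_current_ma(double active_ma, double idle_ma, double active_fraction) {
    return active_ma * active_fraction + idle_ma * (1.0 - active_fraction);
}
```

With the numbers above, runtime_hours(1000, 100) is 10 hours and runtime_hours(1000, 10) is 100 hours. Reaching a 10mA average from 30mA (WiFi on) and 8mA (WiFi off) would mean WiFi is on for roughly 9% of the time, since average_current_ma(30, 8, 1.0 / 11.0) is about 10mA.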

I have already seen the benefit from this work anyway: One of my controllers was always running "hot to touch" drawing over 150mA. It now draws less than 30mA and is a smidge warm, yet performing exactly the same function. The next three devices have to run cold, and have long on-time.

@Nathan, hope you share your code when you feel it’s ready; I’d love to get that kind of battery life!

Will share the code - most of it is already in the thread, however I am improving and patching heaps of things at the same time. OneWire needs a whole lot of patching to replace the calls to delayMicroseconds with the following:

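inline void mydelay(uint32_t uSec) {
    // Target cycle count for the end of the delay, less a fixed 38 cycles
    volatile uint32_t DWT_END = DWT->CYCCNT + (SYSTEM_US_TICKS * uSec) - 38;
    // Busy-wait; note this comparison is not safe across a CYCCNT wrap-around
    while (DWT->CYCCNT < DWT_END)
        ;
}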

It is brutal and there are a couple of small problems with it, but works fairly well. To enable lowest power mode on the WiFi I modified wlan_connect_init as follows:

int wlan_connect_init()
{
    wiced_network_up_cancel = 0;
    wwd_wifi_enable_powersave();
    return 0;
}

Again, brutal - you also have to reflash with part1, part2 and your app. Just had no idea how to call the powersave function directly from my app.

Eventually I might chuck everything into a nice neat class and do things like adjust timers etc (if required). I would like to get to the stage where I can dynamically adjust the CPU speed whenever I want. There is actually a bug in lowpower3 function above - 0x0180 should actually be 0x1080.

There are two copies of the SYSTEM_US_TICKS value - one in the system module and one in the user application. This is because the CMSIS library in the platform module is statically linked.

Going forward, we can add APIs for adjusting the system clock.

Thank you, Nathan, for sharing this with us.

Recently I have been working on a project that needs to conserve battery power as much as possible, yet the device has to remain responsive to sensor and button input. I was using System.sleep(,,) but :frowning: then I was not able to wake up from the buttons and sensors. The solution Nathan posted here, slowing down the MPU clock speed, is brilliant and it is working perfectly :grinning:

I have constructed two states for the device: normal and low power. In the low power state, I used RCC_CFGR_HPRE_DIV64 || RCC_CFGR_PPRE1_DIV1 || RCC_CFGR_PPRE2_DIV1 (that is stm32f2_lowpower1() above) and I am happy to find that all the modules attached to the SPI and I2C buses were working fine! :sweat_smile: (I have a 128x64 OLED display, a BME280 and a SI7021 on the I2C bus; a character ROM (for the display) and a micro-SD card on the SPI bus.) But as pointed out in the post, the Serial output was a mess :slight_smile:

One peculiar behavior I observed was when I tried to set the MPU to the normal state by using 0x9400, that is (RCC_CFGR_HPRE_DIV1 || RCC_CFGR_PPRE1_DIV4 || RCC_CFGR_PPRE2_DIV2) - the default Photon setting. I found that the Photon was either unresponsive or misbehaving. This call was the first call in setup() and it is not supposed to have any effect since I just set RCC->CFGR to the default setting. I put it there just as a safeguard after Reset (but found that it was not necessary - a Reset will set the Photon back to its normal operation speed. This is good!). Therefore, I have no explanation for the behavior, but I haven’t dug deeper into it either. Everything was back to normal once I commented out this call.

I’ll see if I have time to look into this weird behavior later.

A more readable implementation of:

RCC->CFGR &= ~0xfcf0;
RCC->CFGR |= 0xXXXX;

would be:

RCC_PCLK1Config(RCC_HCLK_Div1);
RCC_PCLK2Config(RCC_HCLK_Div1);
RCC_HCLKConfig(RCC_SYSCLK_DivXX);

2 months later … Geez! I realized my mistake! Guess that’s why you shouldn’t code without any alcohol in your blood stream…

(RCC_CFGR_HPRE_DIV1 || RCC_CFGR_PPRE1_DIV4 || RCC_CFGR_PPRE2_DIV2) should be

(RCC_CFGR_HPRE_DIV1 | RCC_CFGR_PPRE1_DIV4 | RCC_CFGR_PPRE2_DIV2)
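
The difference between the two expressions is easy to check on the host. In the sketch below the constants are local stand-ins carrying the field encodings for these prescaler settings (they are not the device header's macros), so that the snippet compiles without the STM32 headers.

```cpp
#include <cstdint>

// Field encodings for the CFGR prescaler settings (local names for this sketch).
constexpr uint32_t HPRE_DIV1  = 0x0000;  // AHB /1
constexpr uint32_t PPRE1_DIV4 = 0x1400;  // APB1 /4
constexpr uint32_t PPRE2_DIV2 = 0x8000;  // APB2 /2

// Logical OR collapses the whole expression to true/false, i.e. 1 or 0.
constexpr uint32_t logical_or = (HPRE_DIV1 || PPRE1_DIV4 || PPRE2_DIV2);

// Bitwise OR combines the individual field values into one register value.
constexpr uint32_t bitwise_or = (HPRE_DIV1 | PPRE1_DIV4 | PPRE2_DIV2);
```

Here logical_or is 1 while bitwise_or is 0x9400, the default setting quoted earlier. A value of 1 does not touch the prescaler fields at all. It lands in the lowest bits of CFGR, which hold the system clock source selection, so depending on how it was written to the register that may explain the odd behaviour described above.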

I have been using this code and it is successfully knocking about 2.5mA off of the power consumption!

But.. I think it's extremely odd that all of the hardware I'm using seems to continue functioning as it should...
I2C works fine, USB works fine, GPIO works fine.... so clearly it is not actually having the effect of disabling all of the peripheral clocks. I have done some investigation, but I'm at a loss to understand if some of these calls are simply failing or if there is something coming in behind it and switching most of it back on.

Unfortunately I don't have a JTAG development setup to step through everything and watch the registers.