We're developing a high-speed SPI application on the M-SoM and have hit a performance limitation with GPIO manipulation. We need your help resolving it.
Current Situation:
Board: M-SoM M404 (RTL8722DM, ARM Cortex-M33 @ 200MHz)
Using direct register access for GPIO control (base address 0x48014000)
Current method: Read-modify-write operations using OR/AND/XOR on the Data Register
Performance: Toggle-to-toggle timing is ~1.4μs (approximately 280 CPU cycles for what should be 6-8 cycles)
I've resorted to this approach because the Arduino and DeviceOS I/O functions are even slower.
The Problem: The read-modify-write operations are too slow for our data collection application. The 1.4μs toggle time suggests either:
The GPIO peripheral is on a much slower bus than the 200MHz CPU
There are wait states or synchronization delays
I'm missing the correct register access method
What I've Tried: I've tested writing to offsets 0x08, 0x10, 0x14, and 0x18 from the GPIO base address, hoping to find atomic set/clear/toggle registers (similar to STM32's BSRR or Nordic's OUTSET/OUTCLR), but these addresses don't appear to provide atomic bit manipulation on the RTL8722DM.
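For readers following along, here is a minimal sketch of the two access patterns being compared: a read-modify-write on a single data register, and a shadow-copy variant that only ever writes. The register pointer is passed in as a parameter so the code also runs on a host; on the board it would point somewhere in the block at 0x48014000, but the data register's offset and the pin masks are assumptions not given in this thread. The shadow copy is a generic technique, not something Realtek or Particle documents for this part, and it is only safe if nothing else writes the same port.

```cpp
#include <cstdint>

// Read-modify-write: every update costs one bus read plus one bus write.
inline void rmwSet(volatile uint32_t* data, uint32_t mask)    { *data = *data | mask; }
inline void rmwClear(volatile uint32_t* data, uint32_t mask)  { *data = *data & ~mask; }
inline void rmwToggle(volatile uint32_t* data, uint32_t mask) { *data = *data ^ mask; }

// Shadow-copy variant: the last written value lives in RAM, so each update
// is a single bus write and the slow peripheral read is skipped.
// Assumption: this code is the only writer of the port.
struct ShadowPort {
    volatile uint32_t* data;  // hypothetical: address of the GPIO data register
    uint32_t shadow;          // last value written to *data

    void set(uint32_t mask)    { shadow |= mask;  *data = shadow; }
    void clear(uint32_t mask)  { shadow &= ~mask; *data = shadow; }
    void toggle(uint32_t mask) { shadow ^= mask;  *data = shadow; }
};
```

If the cost is dominated by the peripheral bus itself, dropping the read would at best halve the number of slow accesses per update; it would not get anywhere near the 6-8 cycle figure.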
Questions:
Does Realtek's GPIO peripheral implementation in the RTL8722DM include atomic bit manipulation registers? If so, what are the correct register offsets?
What is the bus architecture for GPIO on the RTL8722DM? Is it on AHB (200MHz) or a slower APB bus? This would explain the timing discrepancy.
Is there detailed register-level documentation available for the RTL8722DM GPIO peripheral? The current Realtek documentation doesn't cover low-level register manipulation.
Are there any Particle-specific optimizations or recommended approaches for high-speed GPIO on the M-SoM?
Why This Matters: I need to achieve reliable SPI communication at several MHz. With the current 1.4μs toggle time, I'm limited to roughly 350kHz, which is insufficient for our application.
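The arithmetic behind those numbers, as a small sketch using only the figures quoted in this post (200MHz core clock, 1.4μs per toggle). The function names are mine. One bit-banged clock period needs two toggles, a rising and a falling edge, which is where the ~350kHz ceiling comes from.

```cpp
// CPU cycles that elapse during one GPIO toggle.
constexpr double cyclesPerToggle(double cpuHz, double toggleSeconds) {
    return cpuHz * toggleSeconds;
}

// Highest bit-banged clock rate: one period = two toggles (rise + fall).
constexpr double maxBitBangClockHz(double toggleSeconds) {
    return 1.0 / (2.0 * toggleSeconds);
}

// cyclesPerToggle(200e6, 1.4e-6)  -> about 280 cycles
// maxBitBangClockHz(1.4e-6)       -> about 357 kHz
```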
I understand the GPIO peripheral design is Realtek's implementation choice rather than an ARM Cortex-M33 feature. Any documentation, register maps, or guidance you can provide would be invaluable.
Thank you for your time and assistance. I'm happy to provide any additional test results or information that might help resolve this.
I’ve looked through the Ameba Arduino port and only see the base 0x48014000 and the usual port code in the variant. Any help would be greatly appreciated.
There is no way to interact with the GPIO directly on RTL872x. It’s on a separate, really slow bus.
One option is to use the SPI peripheral for GPIO. See the Neopixel library for an example. This will be pin restricted and there are a limited number of ports.
The only other alternative is to use a separate microcontroller for the high-speed GPIO and have it connect back to the RTL872x by serial, I2C, or SPI. It may be possible to make this remotely upgradeable by using Asset OTA.
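To illustrate the SPI-as-GPIO option mentioned above: the general trick is to let the SPI peripheral shift out a precomputed bit pattern on MOSI, so the waveform is produced by hardware rather than by GPIO register writes. The sketch below shows the common "three SPI bits per data bit" expansion used by SPI-driven NeoPixel code, where a 1 becomes `110` and a 0 becomes `100`. I have not checked how the Particle Neopixel library does it on RTL872x, so treat the encoding and the function name as assumptions.

```cpp
#include <array>
#include <cstdint>

// Expand one data byte (MSB first) into 24 SPI bits packed into 3 bytes.
// Each data bit becomes three SPI bits: 1 -> 110, 0 -> 100.
std::array<uint8_t, 3> encodeByte(uint8_t value) {
    uint32_t bits = 0;
    for (int i = 7; i >= 0; --i) {
        const uint32_t symbol = ((value >> i) & 1u) ? 0x6u : 0x4u;  // 110 or 100
        bits = (bits << 3) | symbol;
    }
    return { static_cast<uint8_t>(bits >> 16),
             static_cast<uint8_t>(bits >> 8),
             static_cast<uint8_t>(bits) };
}
```

The encoded buffer would then be sent in one SPI transfer, with the SPI clock chosen so that three SPI bit times equal one bit time of the target device.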
@holla2040 the MSoM has two SPI buses. SPI runs at 50MHz and SPI1 runs at 25MHz. If you dedicate one of those buses to a single device, you may be able to eliminate the need for a CS line altogether, depending on the device.
@holla2040 I think there are two solutions you could use, depending on whether sustained, long-duration SPI transactions are necessary or not. Assuming you do “burst” data collection:
Use an external MCU that handles all SPI transactions with the ADC. This could be an ATtiny or an STM32F0-class MCU (which already has Asset OTA support). The “tiny” MCU would be connected to the ADC and buffer the entire “burst” of data needed. A GPIO from the MSoM could be used to trigger the data collection. The MCU would be connected to the MSoM via a second SPI bus in slave mode, with the MSoM using a “slow” GPIO for the CS. The buffered data on the MCU could then be read by the MSoM in a single DMA transaction.
Use an “auto CS generator” that uses the SCK from the MSoM to produce a CS signal that meets the ADC datasheet requirements. This requires a retriggerable one-shot along with some XOR gates and a Schmitt trigger that slightly delays the MSoM SCK going to the ADC, so that CS goes LOW just before the first rising SCK edge and goes HIGH slightly after the last SCK pulse. (I can provide a schematic, but see the questions below.)
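For the first option, the MSoM-side flow could look roughly like the sketch below: pulse a trigger, wait for the helper MCU to finish buffering, then pull the whole burst in one transfer with a slow GPIO as CS. Everything here is hypothetical. `BurstLink` is an interface I made up so the flow can be tried on a host, the "burst ready" signal is an assumption (nothing above says how the helper MCU reports completion), and on the MSoM the hooks would wrap the real GPIO and DMA SPI calls.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical hardware hooks; not a Particle or Realtek API.
struct BurstLink {
    virtual ~BurstLink() = default;
    virtual void pulseTrigger() = 0;                       // slow GPIO: start capture on helper MCU
    virtual bool burstReady() = 0;                         // assumption: helper MCU signals completion
    virtual void select(bool asserted) = 0;                // slow GPIO used as CS
    virtual void readBlock(uint8_t* rx, size_t len) = 0;   // one SPI (DMA) transfer
};

// Trigger a capture, poll until the burst is buffered, then read it in one block.
// Returns false if the helper MCU never reports ready within maxPolls polls.
bool collectBurst(BurstLink& link, uint8_t* rx, size_t len, unsigned maxPolls) {
    link.pulseTrigger();
    for (unsigned i = 0; i < maxPolls; ++i) {
        if (link.burstReady()) {
            link.select(true);
            link.readBlock(rx, len);
            link.select(false);
            return true;
        }
    }
    return false;
}

// Host-side stand-in for the helper MCU, for trying the flow off-target.
struct FakeLink : BurstLink {
    std::vector<uint8_t> samples;
    unsigned pollsUntilReady;
    bool triggered = false;
    int csEdges = 0;

    FakeLink(std::vector<uint8_t> s, unsigned polls)
        : samples(std::move(s)), pollsUntilReady(polls) {}
    void pulseTrigger() override { triggered = true; }
    bool burstReady() override {
        if (!triggered) return false;
        if (pollsUntilReady > 0) { --pollsUntilReady; return false; }
        return true;
    }
    void select(bool) override { ++csEdges; }
    void readBlock(uint8_t* rx, size_t len) override {
        for (size_t i = 0; i < len && i < samples.size(); ++i) rx[i] = samples[i];
    }
};
```

The point of the design is that the slow GPIO is touched only a handful of times per burst (trigger, CS low, CS high), so its 1.4μs cost no longer limits the sample rate.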
Questions:
Is the “data burst” assumption correct? If not, how are you intending to interact with the ADC?
How fast are you hoping to drive the SPI bus given the ADC can operate at 20MHz?
Were you intending to use the IRQ output of the ADC? If you were, I believe the tiny MCU solution is the only viable one.
@holla2040, here is the schematic for the auto-CS generator FYI. It needs 1% resistors and C0G/NP0 capacitors for stable timing, but it should work. I’ve also included the timing diagrams running SPI at 5MHz for anyone interested.