Update:
I posted a related issue here:
It seems that if you keep hammering the modem with Cellular.getDataUsage() (on the order of twice per second) once an Electron has successfully connected to the internet, then subsequent calls to Particle.publish() don't block for very long.
My working theory is that if the modem doesn't receive an AT command from the STM32 regularly (say, for more than a second), it drops into some sort of lower-power quasi-sleep mode, which makes it slower to respond to subsequent AT commands. Hammering it keeps the modem "awake", which reduces the latency of synchronous AT commands.
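For anyone who wants to try this, here is a rough sketch of how the polling could be rate-limited from loop(). ModemKeepAlive is a hypothetical helper I made up for illustration, not part of the Particle API, and the 500 ms interval is just my "twice per second" guess from above. The firmware wiring in the trailing comment assumes the Cellular.getDataUsage(CellularData&) form of the call and is untested.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper (not part of the Particle API): runs `action` at most
// once every `intervalMs` milliseconds when polled from a main loop.
class ModemKeepAlive {
public:
    ModemKeepAlive(uint32_t intervalMs, std::function<void()> action)
        : intervalMs_(intervalMs), action_(std::move(action)) {}

    // Call on every loop iteration with the current time in milliseconds.
    // Returns true if the action ran on this call.
    bool poll(uint32_t nowMs) {
        // Unsigned subtraction stays correct when the millisecond counter wraps.
        if (started_ && (nowMs - lastMs_) < intervalMs_) {
            return false;
        }
        started_ = true;
        lastMs_ = nowMs;
        action_();
        return true;
    }

private:
    uint32_t intervalMs_;
    std::function<void()> action_;
    uint32_t lastMs_ = 0;
    bool started_ = false;
};

// Assumed firmware wiring (untested sketch):
//
//   ModemKeepAlive keepAlive(500, [] {
//       CellularData data;
//       Cellular.getDataUsage(data);  // any cheap AT round trip might do
//   });
//
//   void loop() {
//       if (Particle.connected()) {
//           keepAlive.poll(millis());
//       }
//   }
```

Bear in mind that this only keeps the modem busy. If the quasi-sleep theory is wrong, it won't help, and the extra AT traffic will presumably cost some power.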