To start with, I wouldn’t worry too much about encoding; simply try streaming the audio as 8-bit, 8 kHz PCM and see if you can get that to work. Encoding only adds complexity, which you can deal with once you’ve got the basic audio rendering working.
Once 8 kHz, 8-bit audio is working, you can increase the sample rate or sample width until playback starts jittering. At that point you know the maximum bandwidth the Spark can handle. The next step is to use a codec to squeeze more information into the available bandwidth and improve audio quality (if necessary).
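As a rough guide to the numbers involved, the raw data rate of uncompressed mono PCM is just the sample rate times the sample width. A minimal sketch in C (the function name is made up for illustration):

```c
#include <stdint.h>

/* Raw data rate of uncompressed mono PCM, in bytes per second.
 * Hypothetical helper, only meant to show the arithmetic. */
uint32_t pcm_bytes_per_second(uint32_t sample_rate_hz, uint32_t bits_per_sample)
{
    return sample_rate_hz * bits_per_sample / 8u;
}
```

So 8 kHz, 8-bit audio needs 8000 bytes per second from the network, while 16 kHz, 16-bit audio needs 32000. That is the figure to watch as you step the rate up.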
Given the limited horsepower on the Spark, I would suggest you look at either mu-law encoding or differential PCM (DPCM) encoding. Both can reduce the data size by roughly a factor of 2; mu-law, for example, stores each 16-bit sample in 8 bits.
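To give an idea of how cheap mu-law is, here is a sketch of an encoder and decoder in C, following the usual G.711-style layout (1 sign bit, 3 exponent bits, 4 mantissa bits, all inverted). The function and constant names are my own, and I haven't tuned this for the Spark, so treat it as a starting point rather than a drop-in codec:

```c
#include <stdint.h>

#define MULAW_BIAS 0x84   /* 132, added before finding the exponent */
#define MULAW_CLIP 32635  /* largest magnitude that fits after biasing */

/* Compress one signed 16-bit PCM sample to an 8-bit mu-law code. */
uint8_t mulaw_encode(int16_t pcm)
{
    int32_t mag = pcm;    /* widen first so -32768 can be negated safely */
    uint8_t sign = 0;
    if (mag < 0) { mag = -mag; sign = 0x80; }
    if (mag > MULAW_CLIP) mag = MULAW_CLIP;
    mag += MULAW_BIAS;    /* now in the range 132..32767 */

    /* Exponent = how far the highest set bit sits above bit 7. */
    uint8_t exponent = 7;
    for (int32_t mask = 0x4000; (mag & mask) == 0 && exponent > 0; mask >>= 1)
        exponent--;

    uint8_t mantissa = (uint8_t)((mag >> (exponent + 3)) & 0x0F);
    return (uint8_t)~(sign | (exponent << 4) | mantissa);
}

/* Expand an 8-bit mu-law code back to a signed 16-bit PCM sample. */
int16_t mulaw_decode(uint8_t code)
{
    code = (uint8_t)~code;
    int32_t exponent = (code >> 4) & 0x07;
    int32_t mantissa = code & 0x0F;
    int32_t mag = (((mantissa << 3) + MULAW_BIAS) << exponent) - MULAW_BIAS;
    return (int16_t)((code & 0x80) ? -mag : mag);
}
```

The sender would call `mulaw_encode` on each 16-bit sample and the Spark would call `mulaw_decode` on each received byte. On the Spark side the decoder could also be replaced by a 256-entry lookup table if the shifts turn out to be too slow inside the interrupt.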
As to hardware to play the audio, you will need at least an 8-bit DAC. I’ll leave it to the hardware gurus to recommend one, but I would probably go for something simple like this MCP4725.
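If you do go with the MCP4725, note that it is a 12-bit I2C part, so an 8-bit sample has to be scaled up and split into two data bytes. The sketch below packs a sample for what the datasheet calls a fast-mode write, as I understand it (power-down bits zero, then the 12 data bits); the function name is hypothetical and the byte layout is worth double-checking against the datasheet before relying on it:

```c
#include <stdint.h>

/* Pack an unsigned 8-bit sample into the two data bytes of an
 * MCP4725 fast-mode write (assumed layout: 0 0 PD1 PD0 D11..D8, then D7..D0).
 * The I2C address byte and the actual bus transfer are left out. */
void mcp4725_pack_fast_write(uint8_t sample, uint8_t out[2])
{
    uint16_t value = (uint16_t)sample << 4;  /* scale 8-bit up to 12-bit */
    out[0] = (uint8_t)((value >> 8) & 0x0F); /* PD bits = 0: normal operation */
    out[1] = (uint8_t)(value & 0xFF);
}
```

The two bytes would then be sent to the DAC over I2C. Keep in mind that each sample costs three bytes on the bus including the address, so the I2C clock speed puts its own ceiling on the sample rate.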
On the software side, the audio rendering can work like this:
- a sound buffer, which is a shared circular buffer to store sound data
- the main loop pulls data from the network and pushes it to the sound buffer. (This operation blocks if the sound buffer is full.)
- an interrupt-driven audio renderer that runs on a timer, fetches the next sample from the sound buffer, and sends it to the DAC.
With the circular buffer, audio data is pushed in by the main loop, and then read out again by the audio renderer.
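A minimal sketch of that arrangement in C, assuming a single producer (the main loop) and a single consumer (the timer interrupt). All names and the buffer size are placeholders, and the timer setup and the DAC write are left out because they depend on your hardware:

```c
#include <stdbool.h>
#include <stdint.h>

#define SOUND_BUF_SIZE 256u  /* power of two, so wrap-around is a simple mask */

static volatile uint8_t  sound_buf[SOUND_BUF_SIZE];
static volatile uint16_t head = 0;  /* written only by the main loop */
static volatile uint16_t tail = 0;  /* written only by the timer interrupt */

/* Main loop side: returns false when the buffer is full. */
bool sound_buf_push(uint8_t sample)
{
    uint16_t next = (uint16_t)((head + 1u) & (SOUND_BUF_SIZE - 1u));
    if (next == tail) return false;  /* full: one slot is always kept free */
    sound_buf[head] = sample;
    head = next;                     /* publish only after the data is stored */
    return true;
}

/* Interrupt side: returns false when the buffer is empty. */
bool sound_buf_pop(uint8_t *sample)
{
    if (tail == head) return false;
    *sample = sound_buf[tail];
    tail = (uint16_t)((tail + 1u) & (SOUND_BUF_SIZE - 1u));
    return true;
}

/* Body of the timer interrupt: returns the value to write to the DAC.
 * On underrun it repeats the last sample instead of jumping to zero. */
uint8_t audio_tick(void)
{
    static uint8_t last = 128;  /* mid-scale, i.e. silence for unsigned 8-bit */
    uint8_t s;
    if (sound_buf_pop(&s)) last = s;
    return last;
}
```

The main loop would block with something like `while (!sound_buf_push(b)) { }` for each byte it receives from the network. Because each index is written by only one side, this should not need interrupts disabled on a 32-bit core like the Spark's, but that is an assumption worth testing.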
I coded a circular buffer in flash memory; you could adapt it to work with a buffer in RAM if you don’t find a circular buffer implementation elsewhere.
I hope that gets you started! Good luck!
EDIT: Eh, sorry I see now that you said you can handle the app and hardware!
Still, the codec details are there which is what you wanted!