Decoding URLencode

I'm confusing myself a bit with URL encoding, I think, but my understanding is that if my string was

this is special char Ö

and I URL encode it, it becomes

this%20is%20special%20char%20%C3%96

So %20 is a space, and %C3%96 is the Ö.
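To make that mapping concrete, here is a minimal sketch of percent-encoding on a desktop C++ compiler. The function name percentEncode is hypothetical (it is not part of the message torch code), and it assumes the input is already UTF-8, so Ö arrives as the two bytes 0xC3 0x96. Every byte outside the unreserved set (letters, digits, '-', '.', '_', '~') is written as % followed by two uppercase hex digits.

```cpp
#include <string>

// Hypothetical helper, for illustration only: percent-encode a UTF-8 string.
std::string percentEncode(const std::string& utf8)
{
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char b : utf8) {
        bool unreserved = (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') ||
                          (b >= '0' && b <= '9') ||
                          b == '-' || b == '.' || b == '_' || b == '~';
        if (unreserved) {
            out += static_cast<char>(b);   // safe to send as-is
        } else {
            out += '%';
            out += kHex[b >> 4];           // high nibble as a hex digit
            out += kHex[b & 0x0F];         // low nibble as a hex digit
        }
    }
    return out;
}
```

Calling percentEncode("this is special char \xC3\x96") should give this%20is%20special%20char%20%C3%96, matching the string above.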

On the message torch there is a function to decode the URL-encoded message. Could someone explain how the line c = (hexToInt(aText[i+1])<<4) + hexToInt(aText[i+2]); works? And why is the part that looks for 0xC3 not inside the if statement that handles the 2 chars following the %? Maybe I'm missing something that gets done when the Spark function transfers the string to the core?

int newMessage(String aText)
{
  // URL decode
  text = "";
  int i = 0;
  char c;
  while (i<(int)aText.length()) {
    if (aText[i]=='%') {
      if ((int)aText.length()<=i+2) break; // end of text
      // get hex
      c = (hexToInt(aText[i+1])<<4) + hexToInt(aText[i+2]);
      i += 2;
    }
    // Ä = C3 84
    // Ö = C3 96
    // Ü = C3 9C
    // ä = C3 A4
    // ö = C3 B6
    // ü = C3 BC
    else if (aText[i]==0xC3) {
      if ((int)aText.length()<=i+1) break; // end of text
      switch (aText[i+1]) {
        case 0x84: c = 0x80; break; // Ä
        case 0x96: c = 0x81; break; // Ö
        case 0x9C: c = 0x82; break; // Ü
        case 0xA4: c = 0x83; break; // ä
        case 0xB6: c = 0x84; break; // ö
        case 0xBC: c = 0x85; break; // ü
        default: c = 0x7F; break; // unknown
      }
      i += 1;
    }
    else {
      c = aText[i];
    }
    // put to output string
    text += String(c);
    i++;
  }
  // initiate display of new text
  textPixelOffset = -ledsPerLevel;
  textCycleCount = 0;
  repeatCount = 0;
  return 1;
}

int hexToInt(char aHex)
{
  if (aHex<'0') return 0;
  aHex -= '0';
  if (aHex>9) aHex -= 7;
  if (aHex>15) return 0;
  return aHex;
}

Hi @Hootie81

There are two things going on here. First there is URL encoding, which makes special characters safe to transmit, and then there is the two-byte UTF-8 encoding of the letters with umlauts or other diacritical marks.

The code above looks for the URL encoding marker for a special character (the % character) and then decodes the next two characters as hex values into one output character. So %20 becomes a space.
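As a small sketch of that step: each hex digit is worth 4 bits, so the first digit is shifted left by 4 (multiplied by 16) and the second digit is added into the low 4 bits. hexToInt below follows the logic of the function quoted above; hexPairToByte is a hypothetical wrapper added here only to isolate the expression.

```cpp
// Same logic as the hexToInt() quoted in the thread.
int hexToInt(char aHex)
{
    if (aHex < '0') return 0;
    aHex -= '0';               // '0'..'9' become 0..9
    if (aHex > 9) aHex -= 7;   // skip the 7 ASCII characters between '9' and 'A'
    if (aHex > 15) return 0;   // anything past 'F' (including lowercase) yields 0
    return aHex;
}

// Hypothetical wrapper: combine two hex digits into one byte value.
int hexPairToByte(char hi, char lo)
{
    // hi supplies bits 7..4, lo supplies bits 3..0
    return (hexToInt(hi) << 4) + hexToInt(lo);
}
```

For %20 this gives (2 << 4) + 0 = 0x20, the space character; for %C3 it gives (12 << 4) + 3 = 0xC3. Note that a lowercase digit such as 'c' falls through to the return 0 case, so this only decodes uppercase hex.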

But if you look at the code, there is an else if branch that does not handle the % special character. It instead assumes that there is a raw 0xC3 byte in the encoded string, but there is not.

I'm just playing around a bit now and I'm wondering if the cloud is modifying the string. I'm sending a URL-encoded string, but if I print the string that arrives on the core, it prints as normal text?

I tested this by doing a serial print before the decode step and then again after it.

Edit: further testing. If I send a URL-encoded % character, i.e. %25, the message torch drops the % and the 2 characters after it:
string sent: %25chris, string received: %chris, string decoded: ris
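That result can be reproduced off the device with a rough sketch. decodeOnce below is a hypothetical, simplified port of just the % branch of newMessage() to std::string (the 0xC3 branch and the display variables are left out). One assumption is labeled in the code: a decoded zero byte is skipped, on the understanding that an Arduino-style String built from a '\0' character is empty and so appends nothing.

```cpp
#include <string>

// Same logic as the hexToInt() quoted in the thread.
int hexToInt(char aHex)
{
    if (aHex < '0') return 0;
    aHex -= '0';
    if (aHex > 9) aHex -= 7;
    if (aHex > 15) return 0;
    return aHex;
}

// Hypothetical port of the '%' handling in newMessage(), for testing on a PC.
std::string decodeOnce(const std::string& aText)
{
    std::string text;
    int i = 0;
    while (i < (int)aText.length()) {
        char c;
        if (aText[i] == '%') {
            if ((int)aText.length() <= i + 2) break;  // end of text
            c = (char)((hexToInt(aText[i + 1]) << 4) + hexToInt(aText[i + 2]));
            i += 2;                                   // consume the two hex digits
        } else {
            c = aText[i];
        }
        // Assumption: mimic String(c) being empty when c is '\0'.
        if (c != '\0') text += c;
        i++;
    }
    return text;
}
```

With this sketch, decodeOnce("%25chris") returns "%chris", and running it a second time on "%chris" returns "ris": 'c' and 'h' are not valid uppercase hex digits, so both map to 0, the resulting zero byte is dropped, and the two characters are consumed anyway. That is consistent with the string being decoded once before it reaches the core and once more by newMessage().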

So I'm now going to assume the cloud decodes the string somewhere along the line.

That's part of the reason I was confused, @bko, because the URL-encoded two-byte character would be %C3%96, so if the message were arriving URL encoded, you would look after the % for the C3 and after the next % for the 96.

But because of the else if in the code, it just looks for the already-decoded C3 and 96 bytes.

I'm wondering if the original intention was to look for the C3 after the %, and when that didn't work he moved it outside the if statement and it worked. But now, because of the leftover code that checks after the %, some characters get dropped after a properly encoded % sign.
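If that assumption is right and the string really does arrive already decoded, one possible direction (a sketch only, not the author's code) is to skip the % handling entirely and keep just the umlaut remapping. mapUmlauts is a hypothetical name; the byte values and the 0x80 to 0x85 font codes are taken from the switch statement in newMessage() above.

```cpp
#include <string>

// Hypothetical variant: remap UTF-8 umlauts only, with no '%' decoding.
// Assumes the input has already been URL-decoded before it reaches the core.
std::string mapUmlauts(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        if (b == 0xC3) {
            if (i + 1 >= in.size()) break;   // truncated two-byte sequence
            unsigned char c;
            switch (static_cast<unsigned char>(in[i + 1])) {
                case 0x84: c = 0x80; break;  // Ä
                case 0x96: c = 0x81; break;  // Ö
                case 0x9C: c = 0x82; break;  // Ü
                case 0xA4: c = 0x83; break;  // ä
                case 0xB6: c = 0x84; break;  // ö
                case 0xBC: c = 0x85; break;  // ü
                default:   c = 0x7F; break;  // unknown
            }
            out += static_cast<char>(c);
            ++i;                             // skip the second byte of the pair
        } else {
            out += in[i];                    // '%' and everything else pass through
        }
    }
    return out;
}
```

With this version a literal % in the received text passes through untouched, so "%chris" stays "%chris". It would break if the string ever did arrive still URL encoded, so it depends on confirming what the cloud actually does.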