martmists
10/17/2024, 7:33 PM
toChar()
but that fails for 3-byte values like 0x122C5 (CUNEIFORM SIGN SHID TIMES IM)
Kirill Grouchnikov
10/17/2024, 9:33 PM
martmists
10/17/2024, 9:35 PM
okarm
10/17/2024, 9:49 PM
okarm
10/17/2024, 9:51 PM
okarm
10/17/2024, 9:53 PM
To encode U+10437 (𐐷) to UTF-16:
Subtract 0x10000 from the code point, leaving 0x0437.
For the high surrogate, shift right by 10 (divide by 0x400), then add 0xD800, resulting in 0x0001 + 0xD800 = 0xD801.
For the low surrogate, take the low 10 bits (remainder of dividing by 0x400), then add 0xDC00, resulting in 0x0037 + 0xDC00 = 0xDC37.
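The three steps above could be sketched in Kotlin roughly like this. `encodeUtf16` is a hypothetical helper name, not something from the thread or the standard library, and the sketch assumes Kotlin 1.5+ for `Char.code`:

```kotlin
// Hypothetical helper (name is an assumption): encodes one Unicode code point
// as one or two UTF-16 code units, following the steps described above.
fun encodeUtf16(codePoint: Int): CharArray {
    require(codePoint in 0..0x10FFFF) { "Not a valid Unicode code point: $codePoint" }

    // Code points up to 0xFFFF fit in a single 16-bit Char.
    // (Values in 0xD800..0xDFFF are passed through unchanged here.)
    if (codePoint < 0x10000) return charArrayOf(codePoint.toChar())

    val offset = codePoint - 0x10000        // 20-bit value
    val high = 0xD800 + (offset shr 10)     // top 10 bits -> high surrogate
    val low = 0xDC00 + (offset and 0x3FF)   // low 10 bits -> low surrogate
    return charArrayOf(high.toChar(), low.toChar())
}

fun main() {
    val units = encodeUtf16(0x10437)
    // prints: d801 dc37
    println(units.joinToString(" ") { it.code.toString(16) })
    // The two Chars together form the one-character string for U+10437.
    println(units.concatToString())
}
```

For the 0x122C5 example from earlier, the same arithmetic gives 0xD808 0xDEC5. On the JVM, `Character.toChars(codePoint)` performs this conversion already, so a hand-written version like this is probably only needed on other targets.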
Daniel Pitts
10/17/2024, 9:57 PM
Daniel Pitts
10/17/2024, 9:57 PM