ansi_notepad.txt
54 65 73 74 3A 20 73 69 B9 9C E6

54 = T, ANSI ASCII
65 = e, ANSI ASCII
73 = s, ANSI ASCII
74 = t, ANSI ASCII
3A = :, ANSI ASCII
20 = [space], ANSI ASCII
73 = s, ANSI ASCII
69 = i, ANSI ASCII
B9 = ą, Windows CP-1250 (http://pl.wikipedia.org/wiki/Windows-1250)
9C = ś, Windows CP-1250
E6 = ć, Windows CP-1250

iso_8859_2_gedit.txt
54 65 73 74 3A 20 73 69 B1 B6 E6 0A

54 = T, ANSI ASCII
65 = e, ANSI ASCII
73 = s, ANSI ASCII
74 = t, ANSI ASCII
3A = :, ANSI ASCII
20 = [space], ANSI ASCII
73 = s, ANSI ASCII
69 = i, ANSI ASCII
B1 = ą, ISO 8859-2 (http://pl.wikipedia.org/wiki/ISO_8859-2)
B6 = ś, ISO 8859-2
E6 = ć, ISO 8859-2
0A = LF (Line Feed), ANSI ASCII

utf_8_gedit.txt
54 65 73 74 3A 20 73 69 C4 85 C5 9B C4 87 0A

54 = T, code less than 128, 1 byte per character, equal to ANSI ASCII
65 = e, code less than 128, 1 byte per character, equal to ANSI ASCII
73 = s, code less than 128, 1 byte per character, equal to ANSI ASCII
74 = t, code less than 128, 1 byte per character, equal to ANSI ASCII
3A = :, code less than 128, 1 byte per character, equal to ANSI ASCII
20 = [space], code less than 128, 1 byte per character, equal to ANSI ASCII
73 = s, code less than 128, 1 byte per character, equal to ANSI ASCII
69 = i, code less than 128, 1 byte per character, equal to ANSI ASCII
C4 85: the first digit is C (16) = 1100 (2), which starts with 110, so we have 2 bytes per character coded as 110+xxxxx 10+xxxxxx
C4 85 = 110+00100 10+000101 = 00100000101 = 001 0000 0101 = 105 (16) = ą (U+0105) (http://unicode-table.com/en/#0105)
C5 9B: the first digit is C (16) = 1100 (2), which starts with 110, so we have 2 bytes per character coded as 110+xxxxx 10+xxxxxx
C5 9B = 110+00101 10+011011 = 00101011011 = 001 0101 1011 = 15B (16) = ś (U+015B)
C4 87: the first digit is C (16) = 1100 (2), which starts with 110, so we have 2 bytes per character coded as 110+xxxxx 10+xxxxxx
C4 87 = 110+00100 10+000111 = 00100000111 = 001 0000 0111 = 107 (16) = ć (U+0107)
0A = LF (Line Feed), ANSI ASCII

utf_8_notepad.txt
EF BB BF 54 65 73 74 3A 20 73 69 C4 85 C5 9B C4 87

EF BB BF = BOM (UTF-8)
54 65 73 74 3A 20 73 69 C4 85 C5 9B C4 87 = as before (see utf_8_gedit.txt)

unicode_big_endian_notepad.txt
FE FF 00 54 00 65 00 73 00 74 00 3A 00 20 00 73 00 69 01 05 01 5B 01 07

FE FF = BOM (UTF-16, BE)
00 54 = 0054 = T, the same code as before (see utf_8_gedit.txt), now stored in 2 bytes
00 65 = 0065 = e
00 73 = 0073 = s
00 74 = 0074 = t
00 3A = 003A = :
00 20 = 0020 = [space]
00 73 = 0073 = s
00 69 = 0069 = i
01 05 = 0105 = ą (U+0105)
01 5B = 015B = ś (U+015B)
01 07 = 0107 = ć (U+0107)

unicode_notepad.txt
FF FE 54 00 65 00 73 00 74 00 3A 00 20 00 73 00 69 00 05 01 5B 01 07 01

FF FE = BOM (UTF-16, LE)
54 00 = because we have LE, the bytes are swapped and the correct order is 0054 = 54 = T
65 00 = 0065, as before (see unicode_big_endian_notepad.txt)
73 00 = 0073
74 00 = 0074
3A 00 = 003A
20 00 = 0020
73 00 = 0073
69 00 = 0069
05 01 = 0105
5B 01 = 015B
07 01 = 0107
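
The dumps above can be reproduced with a few lines of Python. This is only a sketch: the names `TEXT`, `hexdump`, `decode_two_byte_utf8` and `dumps` are made up for illustration, and I assume Python's standard codecs `cp1250`, `iso8859_2`, `utf_8`, `utf_8_sig`, `utf_16_be` and `utf_16_le` correspond to what Notepad and gedit wrote. The trailing `0A` in the gedit files is left out here, since it is just the newline gedit appends.

```python
import codecs

# The sample text from the files: "Test: siąść"
TEXT = "Test: si\u0105\u015b\u0107"


def hexdump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


def decode_two_byte_utf8(b1: int, b2: int) -> int:
    """Decode a 2-byte UTF-8 sequence 110xxxxx 10yyyyyy into a code point.

    Simplified on purpose: it checks only the marker bits and does not
    reject overlong encodings.
    """
    if (b1 & 0xE0) != 0xC0 or (b2 & 0xC0) != 0x80:
        raise ValueError("not a 2-byte UTF-8 sequence")
    # Keep the 5 payload bits of the first byte and the 6 of the second.
    return ((b1 & 0x1F) << 6) | (b2 & 0x3F)


# The UTF-16 codecs with an explicit byte order do not write a BOM,
# so it is prepended by hand to mimic what Notepad saves.
dumps = {
    "cp1250": hexdump(TEXT.encode("cp1250")),
    "iso8859_2": hexdump(TEXT.encode("iso8859_2")),
    "utf_8": hexdump(TEXT.encode("utf_8")),
    "utf_8_sig": hexdump(TEXT.encode("utf_8_sig")),
    "utf_16_be": hexdump(codecs.BOM_UTF16_BE + TEXT.encode("utf_16_be")),
    "utf_16_le": hexdump(codecs.BOM_UTF16_LE + TEXT.encode("utf_16_le")),
}

if __name__ == "__main__":
    for name, dump in dumps.items():
        print(f"{name:10} {dump}")
    for pair in ((0xC4, 0x85), (0xC5, 0x9B), (0xC4, 0x87)):
        code_point = decode_two_byte_utf8(*pair)
        print(f"{pair[0]:02X} {pair[1]:02X} -> U+{code_point:04X} {chr(code_point)}")
```

The hand-written decoder follows the same 110+xxxxx 10+xxxxxx bit split as the walkthrough; in real code `bytes.decode("utf-8")` does this, along with the validation the sketch skips.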