KOI8-RU is an 8-bit character encoding designed to cover Russian, Ukrainian, and Belarusian, all of which use Cyrillic alphabets. It is closely related to KOI8-R, which covers Russian and Bulgarian, but replaces ten box drawing characters with five Ukrainian and Belarusian letters, Ґ, Є, І, Ї, and Ў, each in both upper case and lower case. It is even more closely related to KOI8-U, which does not include Ў but otherwise makes the same letter replacements. The additional letter allocations are matched by KOI8-E, except for Ґ, which is added in KOI8-F.
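The letter replacements described above can be sketched in Python. The standard library has no KOI8-RU codec, so the helper `decode_koi8_ru_letters` and the table `KOI8_RU_LETTERS` are hypothetical names introduced here for illustration. The sketch overrides only the ten letter positions and takes every other byte from Python's `koi8_r` codec, which is an assumption: it may not match every published KOI8-RU table outside the letters.

```python
# Byte positions where KOI8-RU places letters instead of the
# box drawing characters found in KOI8-R (illustrative table).
KOI8_RU_LETTERS = {
    0xA4: "\u0454",  # є
    0xA6: "\u0456",  # і
    0xA7: "\u0457",  # ї
    0xAD: "\u0491",  # ґ
    0xAE: "\u045E",  # ў
    0xB4: "\u0404",  # Є
    0xB6: "\u0406",  # І
    0xB7: "\u0407",  # Ї
    0xBD: "\u0490",  # Ґ
    0xBE: "\u040E",  # Ў
}


def decode_koi8_ru_letters(data: bytes) -> str:
    """Decode bytes, using the ten overrides and KOI8-R for everything else.

    Assumption: positions outside the ten letters follow KOI8-R.
    """
    return "".join(
        KOI8_RU_LETTERS.get(b) or bytes([b]).decode("koi8_r") for b in data
    )


# The Ukrainian word "Київ": К, и, and в sit at their KOI8-R positions,
# while ї comes from the override table.
print(decode_koi8_ru_letters(bytes([0xEB, 0xC9, 0xA7, 0xD7])))
```

Falling back to `koi8_r` keeps the sketch short. A complete codec would need the full 128-entry upper half of the KOI8-RU table.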
In IBM, KOI8-RU is assigned code page/CCSID 1167.
KOI8 remains much more commonly used than ISO 8859-5, which never really caught on. Another common Cyrillic character encoding is Windows-1251. Both may eventually give way to Unicode.
KOI8 stands for Kod obmena informatsiey, 8 bit (Код обмена информацией, 8 бит), which means "Code for Information Exchange, 8 bit".
The KOI8 character sets have the property that the Russian Cyrillic letters are in pseudo-Roman order rather than the natural Cyrillic alphabetical order used in ISO 8859-5. Although this may seem unnatural, it has the useful property that if the eighth bit is stripped, the text can still be read (or at least deciphered) in case-reversed transliteration on an ordinary ASCII terminal. For instance, "Код Обмена Информацией" in KOI8-RU becomes kOD oBMENA iNFORMACIEJ (the Russian meaning of the "KOI" acronym) if the 8th bit is stripped.
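The bit-stripping behaviour can be reproduced with a minimal Python sketch. The function name `strip_eighth_bit` is hypothetical. Because Python's standard library ships no KOI8-RU codec, the sketch encodes with `koi8_r` instead, on the assumption that the Russian letters occupy the same positions in both encodings.

```python
def strip_eighth_bit(text: str, codec: str = "koi8_r") -> str:
    """Encode text in a KOI8 variant, clear bit 7 of each byte, read as ASCII."""
    raw = text.encode(codec)
    # Masking with 0x7F clears the high bit, as a 7-bit channel would.
    return bytes(b & 0x7F for b in raw).decode("ascii")


print(strip_eighth_bit("Код Обмена Информацией"))  # kOD oBMENA iNFORMACIEJ
```

The case reversal comes from the layout: upper-case Cyrillic letters sit 0x80 above lower-case Latin letters, and lower-case Cyrillic letters sit 0x80 above upper-case Latin letters.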
The following table shows the KOI8-RU encoding. Each character is shown with its equivalent Unicode code point.
Although RFC 2319 says that character 0x95 should be U+2219 (∙), it may also be U+2022 (•) to match the bullet character in Windows-1251.
Some references have a typo and incorrectly state that character 0xB4 is U+0403, rather than the correct U+0404. This typo is present in Appendix A of RFC 2319 (but the table in the main text of the RFC gives the correct mapping).