Wiki info

Most Windows text files use "ANSI", "OEM", "Unicode" or "UTF-8" encoding. What Windows terminology calls "ANSI encodings" are usually single-byte ISO/IEC 8859 encodings (i. e. ANSI in the Microsoft Notepad menus is really "System Code Page", non-Unicode, legacy encoding), except for in locales such as Chinese, Japanese and Korean that require double-byte character sets. ANSI encodings were traditionally used as default system locales within Windows, before the transition to Unicode. By contrast, OEM encodings, also known as DOS code pages, were defined by IBM for use in the original IBM PC text mode display system. They typically include graphical and line-drawing characters common in DOS applications. "Unicode"-encoded Windows text files contain text in UTF-16 Unicode Transformation Format. Such files normally begin with Byte Order Mark (BOM), which communicates the endianness of the file content. Although UTF-8 does not suffer from endianness problems, many Windows programs (i. e. Notepad) prepend the contents of UTF-8-encoded files with BOM, to differentiate UTF-8 encoding from other 8-bit encodings.