This package provides conversion in both directions between UTF-8 Unicode and more than thirty 7-bit ASCII equivalents, including RFC 2396 URI format and RFC 2045 Quoted Printable format, the representations used in HTML, SGML, XML, OOXML, the Unicode standard, Rich Text Format, POSIX portable charmaps, POSIX locale specifications, and Apache log files, and the escapes used for including Unicode in Ada, C, Common Lisp, Java, Pascal, Perl, PostScript, Python, Scheme, and Tcl.
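To make two of these ASCII equivalents concrete, the following is a minimal sketch using only the Python standard library. It is not the package's own code or interface; the function names are hypothetical and chosen for illustration. It shows the same UTF-8 text as URI percent-escapes and as Quoted Printable, and converts each back.

```python
import quopri
from urllib.parse import quote, unquote


def to_uri_escapes(text: str) -> str:
    """Percent-escape every byte of the UTF-8 encoding that is not URI-safe."""
    return quote(text, safe="")


def from_uri_escapes(escaped: str) -> str:
    """Reverse to_uri_escapes, decoding the escaped bytes as UTF-8."""
    return unquote(escaped)


def to_quoted_printable(text: str) -> str:
    """Encode the UTF-8 bytes of text as Quoted Printable (=XX escapes)."""
    return quopri.encodestring(text.encode("utf-8")).decode("ascii")


def from_quoted_printable(encoded: str) -> str:
    """Reverse to_quoted_printable."""
    return quopri.decodestring(encoded.encode("ascii")).decode("utf-8")


sample = "naïve café"
uri_form = to_uri_escapes(sample)       # pure ASCII, e.g. %C3%A9 for "é"
qp_form = to_quoted_printable(sample)   # pure ASCII, e.g. =C3=A9 for "é"
```

Both forms carry the UTF-8 bytes of each non-ASCII character as two hexadecimal digits, so the original text can be recovered exactly.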

Such ASCII equivalents are useful when including Unicode text in program source, when debugging, and when entering text into web programs that can handle the Unicode character set but are not 8-bit safe. For example, Movable Type, the blog software, truncates posts as soon as it encounters a byte with the high bit set. However, if Unicode is entered in the form of HTML numeric character entities, Movable Type will not garble the post.
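As an illustration of the HTML numeric character entity form, here is a small Python sketch built on the standard library. It is an assumption-laden example, not the package's implementation, and the function names are hypothetical. Every non-ASCII character becomes a decimal reference such as `&#233;`, so no byte in the result has the high bit set.

```python
import html


def to_numeric_entities(text: str) -> str:
    """Replace each non-ASCII character with a decimal reference (&#NNN;).

    ASCII characters, including "&" and "<", are left unchanged.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_numeric_entities(text: str) -> str:
    """Convert character references back to Unicode.

    Note: html.unescape also expands named entities such as &amp;.
    """
    return html.unescape(text)


post = "Déjà vu at the café"
safe_post = to_numeric_entities(post)
```

The result contains only 7-bit characters, so it can pass through software that is not 8-bit safe, and a browser will render it as the original text.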

It also provides ways of converting non-ASCII characters to similar ASCII characters, e.g. by stripping diacritics.
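One common way to strip diacritics can be sketched in Python as follows. This is a generic technique based on Unicode canonical decomposition, not necessarily how this package does it, and the function name is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD).

    Characters without a decomposition, such as "ø" or "ß", are left as they are.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


plain = strip_diacritics("Zürich café")
```

Unlike the reversible escapes above, this conversion loses information: the accented and unaccented forms become indistinguishable.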


The -h flag provides fairly detailed usage information. Standard Unix man pages are provided with the source.

License   Verified by   Verified on      Notes
GPLv3     Ted Teah      3 October 2005

Leaders and contributors

Bill Poser Maintainer

Resources and communication

Audience                            Resource type   URI
Bug Tracking, Developer, Support    E-mail          mailto:billposer@alum.mit.edu

This entry (in part or in whole) was last reviewed on 8 January 2017.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the page “GNU Free Documentation License”.

The copyright and license notices on this page only apply to the text on this page. Any software or copyright-licenses or other similar notices described in this text has its own copyright notice and license, which can usually be found in the distribution or license text itself.

