
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Foundation.

What is the purpose of this program

The purpose of the recode program is to convert files between various character sets and usages. When exact transliterations are not possible, as is often the case, the program may get rid of the offending characters or fall back on approximations.
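
The two fallback strategies mentioned above, dropping a character or approximating it, can be illustrated with a small Python sketch. This is only an illustration built on Python's standard library, not recode's own method or tables; the function names `to_ascii_drop` and `to_ascii_approx` are made up for the example.

```python
import unicodedata


def to_ascii_drop(text: str) -> str:
    """Drop every character that has no ASCII equivalent."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def to_ascii_approx(text: str) -> str:
    """Approximate accented letters by their unaccented base letter.

    NFKD normalization splits a letter such as e-acute into the base
    letter followed by a combining accent; the accent is then dropped
    because it is not ASCII.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", errors="ignore").decode("ascii")


# "\u00e9" is e-acute, written as an escape so the sample is unambiguous.
sample = "Universit\u00e9 de Montr\u00e9al"
print(to_ascii_drop(sample))    # Universit de Montral
print(to_ascii_approx(sample))  # Universite de Montreal
```

Dropping loses information silently, while approximating keeps the text readable at the cost of exactness.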

Let us coin the term charset to represent, without distinction, a character set "per se" or a particular usage of a character set. This program recognizes or produces around 150 such charsets. Since it can convert each charset to almost any other one, many thousands of different conversions are possible.
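
As a rough check of the "many thousands" figure, the sketch below counts ordered pairs of distinct charsets. It assumes exactly 150 charsets and that every pair is convertible, so the result is an upper bound; the text only says "around 150" and "almost any other one". The function name `ordered_pairs` is made up for the example.

```python
def ordered_pairs(n: int) -> int:
    """Number of (source, target) pairs with source different from target."""
    return n * (n - 1)


print(ordered_pairs(150))  # 22350
```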

This tool pays special attention to superimposition of diacritics for French representation. This orientation is mostly historical; it does not impair the usefulness, generality or extensibility of the program.

Overview: Overview of charsets
Contributing: Contributions and bug reports

Overview of charsets

Recoding is currently possible between most of the charsets described in RFC 1345. See section Charsets from RFC 1345.

Recode also handles some charsets in more specialized ways. These are:

usual 7-bit ASCII: without any diacritics, or else: using backspace for overstriking; Unisys' ICON convention; TeX/LaTeX coding; easy French conventions for electronic mail;
8-bit extensions to ASCII: ISO Latin-1, Atari ST code, IBM's code for the PC, Apple's code for the Macintosh, NeXTSTEP code;
6-bit escaped ASCII based on CDC display code: 6/12 code from NOS; bang-bang code from Université de Montréal;
non-ASCII codes: three flavors of EBCDIC.

The recent introduction of RFC 1345 in GNU recode has brought with it a few charsets having the functionality of older ones, yet differing from them in subtle ways. The effects have not been fully investigated yet, so for now clashes are avoided by keeping the old and new charsets well separate. For example, wizards may be interested in comparing the output of these two commands:

recode -vh ibmpc:applemac
recode -vh ibm437:macintosh

The first command uses only charsets that predate the introduction of RFC 1345. The two methods give different recodings, and the first also properly recodes end of lines. These differences are annoying; the fuzziness will have to be explained and settled one day.
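
To get a feel for what a byte-to-byte recoding table between a PC charset and a Macintosh charset looks like, here is a minimal Python sketch. It uses Python's own `cp437` and `mac_roman` codecs as stand-ins; these are assumptions chosen for illustration and need not agree with the tables recode uses for either `ibmpc:applemac` or `ibm437:macintosh`. The function name `build_table` is made up, and the sketch does not deal with end of lines at all.

```python
from typing import List, Optional


def build_table(src: str, dst: str) -> List[Optional[int]]:
    """Map each source byte to the target byte for the same character.

    Entries are None where the target charset has no exact counterpart.
    """
    table: List[Optional[int]] = []
    for byte in range(256):
        char = bytes([byte]).decode(src)
        try:
            encoded = char.encode(dst)
        except UnicodeEncodeError:
            table.append(None)  # no exact transliteration exists
        else:
            table.append(encoded[0])
    return table


table = build_table("cp437", "mac_roman")
missing = [b for b, target in enumerate(table) if target is None]
print(f"{len(missing)} of 256 byte values have no exact counterpart")
print(hex(table[0x82]))  # e-acute: 0x82 in cp437, 0x8e in mac_roman
```

Two tables built this way from different charset definitions can be compared entry by entry, which is one way to pin down where two recodings differ.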

Contributions and bug reports

Although I am the recode author and current maintainer, I am no specialist in charset standards. I only made recode over the years to solve my own needs, but felt it could be extended for the needs of others. Some GNU people liked the program structure and suggested making it more widely available. I rely on GNU users' judgement for what is best to be done next.

Properly protecting GNU recode against possible copyright fights is a pain for me and for contributors, but we cannot avoid addressing the issue in the long run. Besides, the Free Software Foundation, which mandates the GNU project, is very sensitive to this matter. GNU standards require that I be cautious before looking at copyrighted code. The safest and simplest way for me is to gather ideas and reprogram them anew, even if this might slow me down considerably. For contributions going beyond a few lines of code here and there, the FSF definitely requires employer disclaimers and copyright assignments.

Many users have contributed to GNU recode already, and I am grateful to them for their interest and involvement. Some suggestions can be integrated quickly while others have to be delayed: when the time comes to make a new release, I have to draw a line somewhere between what goes in it and what goes in the next. Also, when you contribute something to recode, please explain what it is about. Do not take for granted that I know those charsets which are familiar to you. Your explanations could well find their way into this documentation, too.

Mail suggestions, documentation errors and bug reports to bug-gnu-utils@prep.ai.mit.edu or, if you prefer, directly to Francois Pinard `pinard@iro.umontreal.ca'. Do not be afraid to report details, because this program is the mere aggregation of hundreds of details.
