TTF 2 PFA Readme


True Type to Postscript Font converter

My mind is still reeling from the discovery that I was able to write this program. What it does is it reads a Microsoft TrueType font and creates a Postscript font. '_A_ postscript font', that is, not necessarily the same font, you understand, but a fair imitation.


Run Like This:

ttf2pfa fontfile.ttf fontname

The first parameter is the truetype filename, the second is a stem for the output file names. The program will create a fontname.pfa containing the Postscript font and a fontname.afm containing the metrics.

The motivation behind this is that in Linux if you do not have a Postscript printer, but only some other printer, you can only print Postscript by using Ghostscript. But the fonts that come with Ghostscript are very poor (they are converted from bitmaps and look rather lumpy). This is rather frustrating as the PC running Linux probably has MS-Windows as well and will therefore have truetype fonts, but which are quite useless with Linux, X or Ghostscript.

The program has been tested on over a hundred different TrueType fonts from various sources, and seems to work fairly well. The converted characters look OK, and the program doesn't seem to crash any more. I'm not sure about the AFM files though, as I have no means to test them.

The fonts generated will not work with X, as the font rasterizer that comes with X only copes with Type 1 fonts. If I have the time I may modify ttf2pfa to generate Type 1s.


Copyright issues

I am putting this program into the public domain, so don't bother sending me any money, I'd only have to declare it for income tax.

Copyright on fonts, however, is a difficult legal question. Any copyright statements found in a font will be preserved in the output. Whether you are entitled to translate them at all I don't know.

If you have a license to run a software package, like say MS-Windows, on your PC, then you probably have a right to use any part of it, including fonts, on that PC, even if not using that package for its intended purpose.

I am not a lawyer, however, so this is not a legal opinion, and may be garbage.

There shouldn't be a any problem with public domain fonts.


About the Program

It was written in C on a IBM PC running Linux.

The TrueType format was originally developed by Apple for the MAC, which has opposite endianness to the PC, so to ensure compatibility 16 and 32 bit fields are the wrong way round from the PC's point of view. This is the reason for all the 'ntohs' and 'ntohl' calls. Doing it this way means the program will also work on big-endian machines like Suns.

I doubt whether it will work on a DOS-based PC though.

The program produces what technically are Type 3 rather than Type 1 fonts. They are not compressed or encrypted and are plain text. This is so I (and you) can see what's going on, and (if you're a Postscript guru and really want to) can alter the outlines.

I only translate the outlines, not the 'instructions' that come with them. This latter task is probably virtually impossible anyway. TrueType outlines are B-splines rather than the Bezier curves that Postscript uses. I believe that my conversion algorithm is reasonably correct, if nothing else because the characters look right.


Problems that may occur

Most seriously, very complex characters (with lots of outline segments) can make Ghostscript releases 2.x.x fail with a 'limitcheck' error. It is possible that this may happen with some older Postscript printers as well. Such characters will be flagged by the program and there are basically two things you can do. First is to edit the .pfa file to simplify or remove the offending character. This is not really recommended. The second is to use Ghostscript release 3, if you can get it. This has much larger limits and does not seem to have any problems with complex characters.

Then there are buggy fonts (yes, a font can have bugs). I try to deal with these in as sane a manner as possible, but it's not always possible.


Encodings

A postscript font must have a 256 element array, called an encoding, each element of which is a name, which is also the name of a procedure contained within the font. The 'BuildChar' command takes a byte and uses it to index the encoding array to find a character name, and then looks that up in the font's procedure table find the commands to draw the glyph. However, not all characters need be in the encoding array. Those that are not cannot be drawn (at least not using 'show'), however it is possible to 're-encode' the font to enable these characters. There are several standard encodings: Adobe's original, ISO-Latin1 and Symbol being the most commonly encountered.

TrueType fonts are organised differently. As well as the glyph descriptions there are a number of tables. One of these is a mapping from a character set into the glyph array, and another is a mapping from the glyph array into a set of Postscript character names. The problems are:

  1. Microsoft uses Unicode, a 16-bit system, to encode the font.
  2. that more than one glyph is given the same Postscript name.
I deal with (1) by assuming a Latin1 encoding. The MS-Windows and Unicode character sets are both supersets of ISO-8859-1. This usually means that most characters will be properly encoded, but you should be warned that some software may assume that fonts have an Adobe encoding. Symbol, or Dingbat, fonts are in fact less of a problem, as they have private encodings starting at 0xF000. It is easy to just lose the top byte.

Postscript fonts can be re-encoded, either manually, or by software. Groff, for example, generates postscript that re-encodes fonts with the Adobe encoding. The problem here is that not all characters in the Adobe set are in the MS-Windows set. In particular there are no fi and fl ligatures. This means that conversions of the versions of Times-New-Roman and Arial that come with MS-Windows cannot be used blindly as replacements for Adobe Times-Roman and Helvetica. You can get expanded versions of MS fonts from Microsoft's web site which do contain these ligatures (and a lot else besides).

I deal with (2) by creating new character names. This can be error-prone because I do not know which of them is the correct glyph to give the name to. Some (buggy) fonts have large numbers of blank glyphs, all with the same name.

(almost every TrueType font has three glyphs called .notdef, one of them is usually an empty square shape, one has no outline and has zero width, and one has no outline and a positive width. This example is not really a problem with well formed fonts since the .notdef characters are only used for unprintable characters, which shouldn't occur in your documents anyway).