A troff to HTML Conversion Program
troff_to_html - a TROFF to HTML Convertor
Paul E. Dunne
Department of Computer Science
University of Liverpool
Liverpool L69 3BX
The following describes a very basic text conversion facility
for translating a troff source file (using the -ms macro set) to
an HTML file suitable for viewing using Mosaic or Netscape.
The translator is invoked (locally) via the command
~ped/bin/troff_to_html < input_file > output_file
where input_file is a troff source file (which may include directives
for the pre-processors pic and eqn), built using the -ms macro set.
output_file will be an HTML source file.
A (uuencoded compressed) binary (HP7000 Series) is here
Source program (written in Ada) is here
A C++ version (compiling under c89) has been produced by Dave McGaw and
may be found here. Please note that this
requires a dictionary file for special characters, located here,
and an include (.h) file that is here.
-ms macros Recognised and Translated
.NH Numbered headings (parameters to .NH are ignored, e.g. .NH 2 processes as .NH)
.SH Un-numbered headings
.LP New paragraph and serves as break for .IP (see below)
.PP New paragraph and serves as break for .IP (see below)
.IP Indented paragraphs (paragraph labels are ignored); these are processed
as un-ordered lists, the list being terminated when a .LP or .PP directive is encountered.
.DS/.DE The text between these is left unaltered, hence .DS/.DE translate to HTML <pre> and </pre>.
.EQ/.EN An (extremely crude) attempt is made to process the eqn text between these.
.PS/.PE Translated to an anchor to a given file `picture<n>.gif' (where <n> starts from 1).
This file should be created (by the user) to hold the relevant picture.
.TS/.TE Simple table handling.
No limit on number of rows/columns. New version (July 1996) uses HTML <table>.
Also incorporates HTML <table> (errors in previous version corrected, July 1996).
Multiple row formats allowed (max 70 rows/20 columns).
Note: No font changes or eqn inside table entries. Multi-column spans generally
don't look very good. It is assumed that the table specification is valid (otherwise
the conversion program will crash).
Extensions August 1996: Two artificial directives have been added to allow printing
of $ symbols. These are:
- .NE (No Eqn) Treat $ in text as a character, not as the opening of an eqn expression. (default)
- .EO (Eqn On) Treat $ in text as the start of an eqn expression.
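As an illustration of the -ms translations listed above, the following is a minimal Python sketch, not the author's Ada program. The function name `translate_ms`, the choice of `<h2>` for headings and `<p>` for paragraphs, the anchor text, and the escaping of `<` and `&` are all assumptions made for this example; the description above only fixes the use of un-ordered lists, `<pre>`, and the `picture<n>.gif` file names. Heading titles are assumed to occupy a single line, and tables and eqn blocks are left out.

```python
import html


def translate_ms(lines):
    """Translate a handful of -ms macros to HTML (illustrative sketch only)."""
    out = []
    in_list = False   # inside a run of .IP paragraphs
    in_pre = False    # between .DS and .DE
    in_pic = False    # between .PS and .PE
    heading = False   # the next text line is a heading title
    pictures = 0      # counter for picture<n>.gif, starting from 1

    def close_list():
        nonlocal in_list
        if in_list:
            out.append("</ul>")
            in_list = False

    for line in lines:
        macro = line.split()[0] if line.startswith(".") else ""
        if in_pic:
            # pic source is skipped; the user supplies the .gif separately
            if macro == ".PE":
                in_pic = False
            continue
        if in_pre:
            if macro == ".DE":
                out.append("</pre>")
                in_pre = False
            else:
                # assumption: escape markup characters so the <pre> stays valid
                out.append(html.escape(line))
            continue
        if macro in (".NH", ".SH"):
            close_list()
            heading = True          # parameters to .NH are ignored
        elif macro in (".LP", ".PP"):
            close_list()            # .LP/.PP terminate an .IP list
            out.append("<p>")
        elif macro == ".IP":
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append("<li>")      # the paragraph label is ignored
        elif macro == ".DS":
            out.append("<pre>")
            in_pre = True
        elif macro == ".PS":
            pictures += 1
            name = f"picture{pictures}.gif"
            out.append(f'<a href="{name}">{name}</a>')
            in_pic = True
        elif macro:
            continue                # unrecognised directive: ignored
        elif heading:
            out.append(f"<h2>{html.escape(line)}</h2>")
            heading = False
        else:
            out.append(html.escape(line))
    close_list()
    return out
```

Calling `translate_ms` on a list of source lines returns a list of HTML lines; a run of `.IP` entries followed by `.PP` yields one `<ul>` with one `<li>` per entry, closed before the new paragraph starts.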
troff Directives Recognised and Translated
In addition, diacritical marks (umlaut, grave, and acute accents) are dealt with.
.ce Printed as a 3rd level heading. Parameters to .ce are ignored.
.ft Change font. Roman (R), Italic (I), Bold (B), Courier (C) are all recognised.
.ul Print next line of text in italic font.
.so Creates an anchor to the file indicated by the .so command. The link in the
text will be indicated as `anchor<n>'.
.br/.sp Cause a line break. Parameters to .sp are ignored.
Text in the form $...$ is regarded as introducing an eqn expression. If the first
character after $ is the character ^ then the text is left in its literal form.
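The $...$ rule, combined with the .NE and .EO switches, can be sketched in Python as follows. This is a hedged illustration rather than the converter's actual logic: the names `split_dollar_text` and `translate_dollars` are hypothetical, an unmatched $ is assumed to be ordinary text, and for the ^ case the sketch assumes the ^ marker is dropped and the rest of the enclosed text is kept verbatim, which the description does not spell out.

```python
def split_dollar_text(line, eqn_on=False):
    """Split a text line into ("text", s), ("eqn", s) and ("literal", s) parts."""
    if not eqn_on:
        # .NE mode: $ is an ordinary character
        return [("text", line)]
    segments = []
    pos = 0
    while True:
        start = line.find("$", pos)
        end = line.find("$", start + 1) if start != -1 else -1
        if start == -1 or end == -1:
            # no further complete $...$ pair: the rest is plain text
            if pos < len(line):
                segments.append(("text", line[pos:]))
            return segments
        if start > pos:
            segments.append(("text", line[pos:start]))
        body = line[start + 1:end]
        if body.startswith("^"):
            # assumption: drop the ^ marker, keep the remainder untouched
            segments.append(("literal", body[1:]))
        else:
            segments.append(("eqn", body))
        pos = end + 1


def translate_dollars(lines):
    """Track .NE/.EO and split each text line accordingly."""
    eqn_on = False  # .NE is the documented default
    result = []
    for line in lines:
        first = line.split()[:1]
        if first == [".EO"]:
            eqn_on = True
        elif first == [".NE"]:
            eqn_on = False
        elif not line.startswith("."):
            result.append(split_dollar_text(line, eqn_on))
    return result
```

With the default setting a line such as `cost $5 and $6` passes through as a single text segment; after `.EO`, the same function returns separate eqn segments that a real converter would hand to its eqn processing.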
Note: All lines opening with a full-stop are ignored unless the line
is a directive recognised by the translator.
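The troff directives listed above, together with the rule that unrecognised lines opening with a full stop are ignored, might be handled along the following lines. This Python sketch is only an approximation under stated assumptions: `translate_troff` is a hypothetical name, the tags `<i>`, `<b>`, `<tt>` and `<br>` are assumed choices for fonts and line breaks, `.ft` without an argument is treated as a return to Roman, and diacritical marks are not handled.

```python
import html

# Assumed HTML tags for each font code; Roman (R) maps to no tag at all.
FONT_TAGS = {"I": "i", "B": "b", "C": "tt"}


def translate_troff(lines):
    """Translate a few plain troff directives to HTML (illustrative sketch)."""
    out = []
    font = None      # currently open font tag, if any
    pending = None   # tag to wrap around the next text line ("h3" or "i")
    anchors = 0      # counter for the anchor<n> link text

    def set_font(tag):
        nonlocal font
        if font:
            out.append(f"</{font}>")
        font = tag
        if font:
            out.append(f"<{font}>")

    for line in lines:
        if not line.startswith("."):
            text = html.escape(line)
            if pending:
                text = f"<{pending}>{text}</{pending}>"
                pending = None
            out.append(text)
            continue
        parts = line.split()
        name, args = parts[0], parts[1:]
        if name == ".ce":
            pending = "h3"           # parameters to .ce are ignored
        elif name == ".ul":
            pending = "i"            # next line of text in italic
        elif name == ".ft":
            code = args[0] if args else "R"
            if code in ("R", "I", "B", "C"):
                set_font(FONT_TAGS.get(code))
        elif name == ".so" and args:
            anchors += 1
            target = html.escape(args[0], quote=True)
            out.append(f'<a href="{target}">anchor{anchors}</a>')
        elif name in (".br", ".sp"):
            out.append("<br>")       # parameters to .sp are ignored
        # any other line opening with a full stop is silently ignored
    set_font(None)                   # close a font left open at end of input
    return out
```

A line such as `.xx junk` produces no output at all, so directives outside the supported set disappear from the HTML rather than appearing as stray text.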