vulcanize - Convert LaTeX files to HTML

SYNOPSIS

vulcanize [ filename... ]

DESCRIPTION

vulcanize does its best to convert its input from LaTeX to HTML. If filenames are given on the command line, the input is the concatenation of the named files, in order; if no files are given, the input is the standard input. The resulting HTML is written to the standard output.
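The input convention described above is the usual Unix filter convention. As a minimal sketch of that convention only (in Python rather than vulcanize's own Perl, and with the hypothetical helper name read_input, which does not come from the vulcanize source), it might look like this:

```python
import sys

def read_input(paths, stdin=sys.stdin):
    """Return the text to convert (hypothetical helper, not vulcanize code).

    With no paths, read everything from stdin; otherwise concatenate the
    named files in the order given.
    """
    if not paths:
        return stdin.read()
    parts = []
    for path in paths:
        with open(path) as handle:
            parts.append(handle.read())
    return "".join(parts)

if __name__ == "__main__":
    # A real converter would transform the text; this sketch just echoes it.
    sys.stdout.write(read_input(sys.argv[1:]))
```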

BUGS AND CAVEATS

The most important thing to remember about vulcanize is that it does not work, and it will never work, because the problem it is trying to solve is completely intractable. However, it adequately solves a worthwhile subset of this intractable problem. For a program that doesn't work, it is remarkably successful. See the Raison d'Être section below.

vulcanize does not properly handle nested environments. For example,

          {\em italics {\bf bold face } more italics }

becomes

          <em>italics <b>bold face </em> more italics </b>
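One way to get the closing tags in the right order is to keep a stack of the closing tags owed by each open brace group. The following is a sketch of that general technique, not of how vulcanize works; it is written in Python, and the names TAGS, TOKEN, and convert_groups, as well as the choice of <em> and <b>, are assumptions made for illustration. It handles only \em and \bf at the start of a group.

```python
import re

# Assumed mapping from LaTeX font switches to HTML tags (illustrative only).
TAGS = {"em": "em", "bf": "b"}

# A group opened with a font switch, a plain open brace, or a close brace.
TOKEN = re.compile(r"\{\\(em|bf)\b ?|\{|\}")

def convert_groups(latex):
    out = []
    stack = []  # closing text owed by each currently open brace group
    pos = 0
    for match in TOKEN.finditer(latex):
        out.append(latex[pos:match.start()])  # copy text between tokens
        pos = match.end()
        if match.group(1):
            tag = TAGS[match.group(1)]
            out.append("<%s>" % tag)
            stack.append("</%s>" % tag)
        elif match.group(0) == "{":
            stack.append("")  # plain group: nothing to emit when it closes
        else:
            # "}" closes the innermost open group; stray braces are dropped.
            out.append(stack.pop() if stack else "")
    out.append(latex[pos:])
    return "".join(out)

print(convert_groups(r"{\em italics {\bf bold face } more italics }"))
# prints: <em>italics <b>bold face </b> more italics </em>
```

Because each close brace pops the most recently opened group, the inner </b> is always emitted before the outer </em>.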

vulcanize doesn't convert ~ (tilde) properly. It should be converted to &nbsp;, the HTML entity for a non-breaking space. However, Mosaic for X doesn't understand &nbsp;, so vulcanize replaces ~ with a breakable space instead.
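The two behaviors can be compared with a small sketch. It is written in Python and is not taken from vulcanize; the names TIE and convert_ties and the nbsp flag are assumptions for illustration. It leaves \~ (the LaTeX tilde accent command) alone and does not try to handle other corner cases, such as a tilde inside verbatim text.

```python
import re

# A tilde that is not preceded by a backslash; "\~" is the tilde accent.
TIE = re.compile(r"(?<!\\)~")

def convert_ties(text, nbsp=True):
    """Replace LaTeX ties with &nbsp;, or with a plain space if nbsp is False."""
    return TIE.sub("&nbsp;" if nbsp else " ", text)

print(convert_ties("Figure~3"))              # prints: Figure&nbsp;3
print(convert_ties("Figure~3", nbsp=False))  # prints: Figure 3
```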

vulcanize only converts \verb sequences when the verbatim text is delimited by plus signs.
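LaTeX accepts nearly any non-letter character as the \verb delimiter, not just the plus sign. A pattern that captures the delimiter and looks for its next occurrence covers the general case. This is a sketch in Python, not vulcanize's code; the names VERB and convert_verb and the choice of <code> as the output tag are assumptions.

```python
import html
import re

# \verb or \verb*, then one delimiter character (not a letter, star, or
# whitespace), then the shortest run of text up to the same delimiter.
VERB = re.compile(r"\\verb\*?([^A-Za-z*\s])(.*?)\1")

def convert_verb(text):
    """Wrap each \\verb body in <code>, escaping HTML metacharacters."""
    def repl(match):
        return "<code>%s</code>" % html.escape(match.group(2), quote=False)
    return VERB.sub(repl, text)

print(convert_verb(r"use \verb|a<b| or \verb+x_1+ here"))
# prints: use <code>a&lt;b</code> or <code>x_1</code> here
```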

Like many Perl programs, vulcanize is an abominable morass of unreadable cryptica.

RAISON D'ÊTRE

Let's consider a simple and common LaTeX construct: $x_2$, which should appear as x with a subscript 2. Now how should we translate this to HTML? HTML doesn't have a way to specify subscripts, and most WWW clients aren't capable of displaying subscripts.

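We have two choices. We can ignore such constructs, in the hope that a smart human will come along later and decide what to do about them, or we can do what Nikos Drakos, author of LaTeX2HTML, does, which is to have LaTeX typeset the construct and then include the result in the HTML document as an inline image.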

Nikos' method has a number of serious drawbacks. It is slow and it is a lot of work. Even after it's done, the result is not satisfactory.

This simple example demonstrates that good LaTeX to HTML translation is, in general, impossible.

Even barring typesetting quality problems like this, complete LaTeX to HTML translation is impossible. HTML documents are supposed to be displayable on a wide variety of output devices, and so HTML avoids assumptions about screen width and available fonts; this means that commands like \marginpar and \Big have no HTML equivalents. Tables can be typeset, but typically they will be displayed in fixed-width fonts. \hrules and \vrules cannot appear at all. (<HR> is not the same as \hrule.) The bottom line is that TeX and LaTeX are complete typesetting and layout systems of immense power and flexibility, while HTML is a presentation-independent document structuring language with only a few simple structures; there is no good mapping between the two sets of capabilities.

In the face of these insurmountable difficulties, vulcanize sticks its head in the sand. It does the best it can at translating most common, simple LaTeX constructions, and it ignores the rest. For a large class of documents, this is enough. In cases where it is not, there are a number of approaches that may be more satisfactory. One is to make a DVI or PostScript version of your document available. These document formats contain formatting, layout, and font information, and can be printed or viewed on many output devices without any unnecessary loss of quality.

SEE ALSO

The LaTeX2HTML Translator

vulcanize source code

AUTHOR

Mark-Jason Dominus, University of Pennsylvania

EXPLANATION OF NAME

To vulcanize rubber is to improve its strength, resiliency, and freedom from stickiness and odor, by combining it with sulfur or other additives in the presence of heat and pressure.

Rubber consists of many long, parallel polymer chains. The chains can bunch up, which is why rubber is springy and flexible. Vulcanization cross-links these chains.


M-J. Dominus, mjd@saul.cis.upenn.edu