Eric S. Raymond wrote:
Howard Chu <hyc(a)symas.com>:
> The patch changes much more than your bug report mentions. The error
> message you provide is pretty ambiguous, in particular you haven't
> mentioned exactly which markup element is out of order. Without this
> information we can't confirm what you're fixing.
Globally, I'm trying to fix up the entire Linux manual page corpus so
it can be automatically lifted to clean HTML. I've been working on
this intermittently since 2002, and am not far from the goal. In
approximately 12,000 pages carried in stock Ubuntu, 94% now lift
without errors or warnings. For all but 17 of the 6% remaining I have
fix patches which I'm trying to get merged upstream.
I entered one of those patches in your bugtracker, but failed to notice
that the error code in my bug database entry for slapd.conf.5 was
incorrect (failed to match the patch). I apologize for this;
full explanation follows.
My conversion tool is doclifter, which you can read about at
;. It lifts manual pages to
XML-DocBook, which in turn is very easily rendered to HTML. This
sequence produces better-quality HTML than tools like man2html
can generate, because of the amount of content analysis done
in the doclifter intermediate step.
There are some legal troff constructions that doclifter cannot
parse well. These are the same sorts of things that any man-page
renderer other than groff itself is likely to get wrong; thus, they
are likely to confuse tools such as XMan, TkMan, and Rosetta as well.
Hm. I use my own man2html http://highlandsun.com/hyc/man2html.c, which gives
pretty good-looking output for us. Really, if you're developing a tool that
claims to read troff input, it has to actually do so. I mean, the point of
tools such as this is to be able to convert existing documents without
modifying them, isn't it?
The patch I sent you introduces an .EX/.EE macro pair for framing
code examples, a common extension often found on older Unix manual
pages (I believe it originated in DEC Ultrix). This macro
encapsulates a bunch of low-level troff operations that are hard to
lift well in isolation. There's a ruleset in doclifter that recognizes
.EX/.EE, analyzes the example content, and generates <programlisting>
or <literallayout> tags as appropriate.
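To make the idea concrete, here is a minimal Python sketch of what
recognizing an .EX/.EE frame might look like. It is not doclifter's
actual ruleset: the function name lift_examples is hypothetical, it
does no analysis of the example content, and it always emits
<programlisting> rather than choosing between <programlisting> and
<literallayout>.

```python
from xml.sax.saxutils import escape


def macro_name(line):
    """Return the leading troff request/macro (e.g. '.EX'), or '' if none."""
    fields = line.split()
    return fields[0] if fields and fields[0].startswith(".") else ""


def lift_examples(lines):
    """Wrap .EX/.EE-framed regions in <programlisting>; pass other lines through.

    Illustrative only: a real lifter would inspect the block content
    before deciding which DocBook element to emit.
    """
    out = []
    block = None  # None outside an example, else the lines collected so far
    for line in lines:
        name = macro_name(line)
        if name == ".EX" and block is None:
            block = []
        elif name == ".EE" and block is not None:
            body = escape("\n".join(block))  # escape &, < and > for XML
            out.append("<programlisting>\n" + body + "\n</programlisting>")
            block = None
        elif block is not None:
            block.append(line)
        else:
            out.append(line)
    if block is not None:
        # Unterminated .EX: put the lines back untouched rather than lose them.
        out.extend([".EX"] + block)
    return out


sample = [".PP", "Example:", ".EX", "access to * by <who> read", ".EE", ".PP"]
for item in lift_examples(sample):
    print(item)
```

The point of the sketch is that a single framing pair gives the
translator an unambiguous start and end for the example, which bare
.nf/.fi or .RS/.RE pairs do not.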
In some cases .EX/.EE replaces .RS/.RE pairs; in others it replaces
.nf/.fi pairs; in still others it replaces .TP; and I think there
are a couple of odd combinations of .RS/.RE with .TP in there too. The fact
that it has to mess with all of these is what makes the patch so
complex. But the result is simple: to express the "example" intention
better so it can be structurally translated.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/