[Grok-dev] Re: To a multipage html tutorial

Philipp von Weitershausen philipp at weitershausen.de
Sat Apr 7 06:29:10 EDT 2007


Darryl Cousins wrote:
> I've had a go at scripting the generation of a multi-page html version
> of the grok tutorial.

Great, thanks for looking into this!

> Firstly I tried digging into docutils internals which didn't get me very far
> (though I've learnt some). I found 2 references from docutils mailing list which
> appeared to advise preprocessing the restructured text document. That did get me
> part of the way and I have posted the result of this attempt [1]_

I personally think this is the preferable way; the detour via
LaTeX just adds more and more machinery. On the other hand, the docutils
guys seem to want this functionality (splitting up documents over
several HTML files) as well, and they seem to think it requires a lot of
work in docutils itself[1]. Our use case is probably simple enough,
though, that some home-brew code will do.

I looked at your code in which you try to chunk up the reST file. You're
looking for lines that start with '====='. This is quite a hack. reST
doesn't mandate that sections be underlined with '====='; other
punctuation characters are just as valid. It would also fail for a really
short section heading, because the underline only needs to be as long as
the title text.
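
To make the problem concrete, here is a small sketch. The sample text and
the helper name are made up for illustration, and the check is only my
reading of what the chunking code does, not a copy of it:

```python
# Three valid reST section headings; only one is underlined with
# five or more '=' characters.
SOURCE = """\
Introduction
============

Some text.

Setup
-----

More text.

FAQ
===

Even more text.
"""


def naive_heading_underlines(text):
    """Return the lines a "starts with '====='" test treats as underlines."""
    return [line for line in text.splitlines() if line.startswith('=====')]


found = naive_heading_underlines(SOURCE)
print(len(found), "of 3 headings detected")  # prints: 1 of 3 headings detected
```

The "Setup" heading is missed because of its underline character, and
"FAQ" because its underline is only three characters long.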

It's also a hack because docutils provides a DOM-like representation of 
a parsed document and a way to only publish parts of the DOM. You could 
therefore simply walk each top-level section in the node tree and 
publish them individually.

More information is given in the docutils docs[2]. You probably want to 
pay close attention to the "Modifying the Document Tree Before It Is 
Written" section. Basically, walking a document's sections could look 
like this::

   >>> import docutils.core
   >>> source = open('tutorial.txt').read()
   >>> document = docutils.core.publish_doctree(source)

   >>> for node in document:
   ...     if node.tagname == 'section':
   ...         pass  # do something with the section here...
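
Filling in the "do something" part could look roughly like the sketch
below. It is untested against the real tutorial. The sample text stands in
for tutorial.txt, publish_sections is a name I made up, and wrapping each
section in a fresh document from docutils.utils.new_document() is just one
way I can think of to publish a single section, not necessarily what the
docutils docs recommend:

```python
import docutils.core
import docutils.utils
from docutils import nodes

# Stand-in for open('tutorial.txt').read(): two top-level sections.
SOURCE = """\
First steps
===========

Text of the first section.

Going further
=============

Text of the second section.
"""


def publish_sections(source):
    """Return a list of (title, html) pairs, one per top-level section."""
    document = docutils.core.publish_doctree(source)
    pages = []
    for node in document:
        if not isinstance(node, nodes.section):
            continue
        # Wrap a copy of the section in a fresh, otherwise empty document.
        page = docutils.utils.new_document('<section>')
        page.append(node.deepcopy())
        html = docutils.core.publish_from_doctree(
            page,
            writer_name='html',
            # Ask for a text string instead of encoded bytes.
            settings_overrides={'output_encoding': 'unicode'},
        )
        title = node.next_node(nodes.title).astext()
        pages.append((title, html))
    return pages


pages = publish_sections(SOURCE)
for title, html in pages:
    print(title, len(html))
```

Each html string is a complete HTML page that could be written to its own
file. Links between sections are not adjusted at this point.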

> What is missing here however is a contents listing of the entire multipage
> document. Back into the docutils internals but without success.

The contents listing is just another node in the document tree and could 
likely be rendered separately. The tricky part will be to adjust the 
links to the different output pages. Perhaps this is best done in a 
post-processing step.
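
A post-processing step along those lines might look like this sketch. The
function name, the page file names and the id-to-page mapping are all
hypothetical; the mapping could probably be collected from each section's
'ids' attribute while walking the tree:

```python
import re


def rewrite_links(html, id_to_page, current_page):
    """Point same-document '#id' links at the page that now holds the id."""
    def fix(match):
        target = match.group(1)
        page = id_to_page.get(target)
        if page is None or page == current_page:
            return match.group(0)  # unknown id or same page: leave untouched
        return 'href="%s#%s"' % (page, target)
    return re.sub(r'href="#([^"]+)"', fix, html)


# Hypothetical mapping from section ids to output pages.
ID_TO_PAGE = {'first-steps': 'page1.html', 'going-further': 'page2.html'}

toc = '<a href="#first-steps">First</a> <a href="#going-further">Next</a>'
fixed = rewrite_links(toc, ID_TO_PAGE, current_page='page1.html')
print(fixed)
```

Links whose target lives on the current page stay as plain fragments; all
others get the file name of the target page prepended.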

> The second attempt I made with ``latex2html``. This actually looks a little more
> hopeful although I've made no attempt at styling. [2]_
> 
> Although I produced ``tutorial.tex`` using ``rst2latex`` (also borrowing
> from ``grok2pdf.sh``) I found I needed to do some editing of tutorial.tex to get
> rid of some errors and warnings when running ``latex2html``. The sidenotes are
> lost at this stage (I did get ``\fbox`` to work in a separate text.tex file but
> the resulting image was not very pretty). I think they will need to be rendered
> with inline tex markup rather than use the ``\fbox`` markup.

It should be no problem to add a LaTeX stylesheet with a new definition 
of \fbox{} that simply inlines the text or whatever. Manual re-editing 
shouldn't be necessary by any means.

It would be a shame to lose the visual sidebars, though.

[1] http://docutils.sourceforge.net/docs/dev/todo.html#document-splitting
[2] http://docutils.sourceforge.net/docs/dev/hacking.html

-- 
http://worldcookery.com -- Professional Zope documentation and training

