[Grok-dev] Re: To a multipage html tutorial

Darryl Cousins darryl at darrylcousins.net.nz
Sat Apr 7 19:20:50 EDT 2007


Thanks for the comments.

On Sat, 2007-04-07 at 12:29 +0200, Philipp von Weitershausen wrote:
> Darryl Cousins wrote:
> > I've had a go at scripting the generation of a multi-page html version
> > of the grok tutorial.
> Great, thanks for looking into this!
> > Firstly I tried digging into docutils internals which didn't get me very far
> > (though I've learnt some). I found 2 references from docutils mailing list which
> > appeared to advise preprocessing the restructured text document. That did get me
> > part of the way and I have posted the result of this attempt [1]_
> I personally think this is the preferable way; the detour via 
> LaTeX just adds more and more machinery. On the other hand, the docutils 
> guys seem to want this functionality (splitting up documents over 
> several HTML files) as well, and they seem to think it requires a lot of 
> work in docutils itself[1]. Our use case, though, is probably simple 
> enough that it works with some home-brew code.
> I looked at your code in which you try to chunk up the reST file. You're 
> looking for lines that start with '====='. This is quite a hack. reST 
> doesn't mandate that sections must be underlined with '=====' and it 
> would also fail if there was a really short section heading.

Yes, I point out this shortcoming in the docstring.
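
For illustration only, a slightly less fragile line-based check might look
like the sketch below. It assumes a section title is any unindented line
followed by a line made of one repeated punctuation character that is at
least as long as the title. The function name is made up for this example,
and the check knows nothing about literal blocks or tables, so it is still a
heuristic and no substitute for the docutils parser.

```python
import string


def find_underlined_titles(text):
    """Return (line_index, title, adornment_char) for each underlined title."""
    lines = text.splitlines()
    found = []
    for i in range(1, len(lines)):
        title = lines[i - 1].rstrip()
        underline = lines[i].rstrip()
        # Both lines must be non-blank and the title must not be indented.
        if not title or not underline or title[0].isspace():
            continue
        char = underline[0]
        is_adornment = (char in string.punctuation
                        and underline == char * len(underline))
        # Skip pairs where the "title" is itself a line of adornment.
        title_is_adornment = (title[0] in string.punctuation
                              and title == title[0] * len(title))
        # Accept any punctuation character, not just '=', and require the
        # underline to be at least as long as the title text.
        if (is_adornment and not title_is_adornment
                and len(underline) >= len(title)):
            found.append((i - 1, title, char))
    return found
```

Because the underline length is compared with the title length, a short
heading such as ``Go`` underlined with ``--`` is still found, which a fixed
search for '=====' would miss.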

> It's also a hack because docutils provides a DOM-like representation of 
> a parsed document and a way to only publish parts of the DOM. You could 
> therefore simply walk each top-level section in the node tree and 
> publish them individually.
> More information is given in the docutils docs[2]. You probably want to 
> pay close attention to the "Modifying the Document Tree Before It Is 
> Written" section. Basically, walking a document's sections could look 
> like this::
>    >>> import docutils.core
>    >>> source = open('tutorial.txt').read()
>    >>> document = docutils.core.publish_doctree(source)
>    >>> for node in document:
>    ...     if node.tagname == 'section':
>    ...         pass  # placeholder: do something with the section here

I did try this approach. But in order to create a separate document
tree for each section, it seemed to me that I needed the reST source of
that section (perhaps there is another approach?). Unfortunately, the
reST source does not seem to be available from the document tree.
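
One possible way around needing the reST source, sketched here under
assumptions and not tested against the real tutorial: copy each top-level
section node into a fresh, empty document and hand that document to
``docutils.core.publish_from_doctree``. The helper name and the sample text
are made up for this example, and cross-references or footnotes that point
into other sections are not handled.

```python
import docutils.core
import docutils.nodes
import docutils.utils

# Stand-in for the tutorial source; the real text would be read from a file.
SAMPLE = """\
Tutorial
========

Intro paragraph.

First steps
-----------

Text of the first section.

Second steps
------------

Text of the second section.
"""


def publish_sections(source):
    """Return a list of (title, html) pairs, one per top-level section."""
    document = docutils.core.publish_doctree(source)
    pages = []
    for node in document.children:
        if not isinstance(node, docutils.nodes.section):
            continue
        # A fresh, empty document that reuses the parsed document's settings.
        page = docutils.utils.new_document('<section>', document.settings)
        # Copy the section so the original tree is left untouched.
        page.append(node.deepcopy())
        html = docutils.core.publish_from_doctree(
            page, writer_name='html',
            settings_overrides={'output_encoding': 'unicode'})
        # The first child of a section node is its title.
        pages.append((node[0].astext(), html))
    return pages
```

Calling ``publish_sections(SAMPLE)`` should give one complete HTML page per
section, each of which could then be written to its own file.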

> > What is missing here however is a contents listing of the entire multipage
> > document. Back into the docutils internals but without success.
> The contents listing is just another node in the document tree and could 
> likely be rendered separately. The tricky part will be to adjust the 
> links to the different output pages. Perhaps this is best done in a 
> post-processing step.

Yes, so it seemed to me also.
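
As a rough idea of what such a post-processing step could look like, the
sketch below assumes each page is available as an HTML string, that section
anchors appear as ``id="..."`` attributes, and that contents links have the
form ``href="#some-id"``. The function names and page file names are made up
for this example; a real HTML parser would be more robust than regular
expressions.

```python
import re

ID_RE = re.compile(r'\bid="([^"]+)"')
LOCAL_HREF_RE = re.compile(r'href="#([^"]+)"')


def build_anchor_index(pages):
    """Map each id="..." found in a page to that page's file name."""
    index = {}
    for filename, html in pages.items():
        for anchor in ID_RE.findall(html):
            # Keep the first page that defines an anchor.
            index.setdefault(anchor, filename)
    return index


def fix_links(html, current_file, index):
    """Point href="#x" at the page defining x when that is another page."""
    def replace(match):
        anchor = match.group(1)
        target = index.get(anchor, current_file)
        if target == current_file:
            # Same page or unknown anchor: leave the link as it is.
            return match.group(0)
        return 'href="%s#%s"' % (target, anchor)
    return LOCAL_HREF_RE.sub(replace, html)
```

The index is built once over all generated pages, and ``fix_links`` is then
run on each page (including the contents page) before it is written out.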

> > The second attempt I made with ``latex2html``. This actually looks a little more
> > hopeful although I've made no attempt at styling. [2]_
> > 
> > Although I produced ``tutorial.tex`` using ``rst2latex`` (also borrowing
> > from ``grok2pdf.sh``) I found I needed to do some editing of tutorial.tex to get
> > rid of some errors and warnings when running ``latex2html``. The sidenotes are
> > lost at this stage (I did get ``\fbox`` to work in a separate text.tex file but
> > the resulting image was not very pretty). I think they will need to be rendered
> > with inline tex markup rather than use the ``\fbox`` markup.
> It should be no problem to add a LaTeX stylesheet with a new definition 
> of \fbox{} that simply inlines the text or whatever. Manual re-editing 
> shouldn't be necessary by any means.
> It would be a shame to lose the visual sidebars, though.

Yep. That would require more knowledge of LaTeX. This is my first contact
with LaTeX and my knowledge is definitely limited (of docutils too, for
that matter).

Sincere regards,

> [1] http://docutils.sourceforge.net/docs/dev/todo.html#document-splitting
> [2] http://docutils.sourceforge.net/docs/dev/hacking.html
