Server IP : 127.0.0.2 / Your IP : 18.222.48.95 Web Server : Apache/2.4.18 (Ubuntu) System : User : www-data ( ) PHP Version : 7.0.33-0ubuntu0.16.04.16 Disable Function : disk_free_space,disk_total_space,diskfreespace,dl,exec,fpaththru,getmyuid,getmypid,highlight_file,ignore_user_abord,leak,listen,link,opcache_get_configuration,opcache_get_status,passthru,pcntl_alarm,pcntl_fork,pcntl_waitpid,pcntl_wait,pcntl_wifexited,pcntl_wifstopped,pcntl_wifsignaled,pcntl_wexitstatus,pcntl_wtermsig,pcntl_wstopsig,pcntl_signal,pcntl_signal_dispatch,pcntl_get_last_error,pcntl_strerror,pcntl_sigprocmask,pcntl_sigwaitinfo,pcntl_sigtimedwait,pcntl_exec,pcntl_getpriority,pcntl_setpriority,php_uname,phpinfo,posix_ctermid,posix_getcwd,posix_getegid,posix_geteuid,posix_getgid,posix_getgrgid,posix_getgrnam,posix_getgroups,posix_getlogin,posix_getpgid,posix_getpgrp,posix_getpid,posix,_getppid,posix_getpwnam,posix_getpwuid,posix_getrlimit,posix_getsid,posix_getuid,posix_isatty,posix_kill,posix_mkfifo,posix_setegid,posix_seteuid,posix_setgid,posix_setpgid,posix_setsid,posix_setuid,posix_times,posix_ttyname,posix_uname,pclose,popen,proc_open,proc_close,proc_get_status,proc_nice,proc_terminate,shell_exec,source,show_source,system,virtual MySQL : OFF | cURL : ON | WGET : ON | Perl : ON | Python : ON | Sudo : ON | Pkexec : ON Directory : /usr/share/doc/docutils-doc/docs/dev/ |
Upload File : |
<?xml version="1.0" encoding="utf-8" ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <meta name="generator" content="Docutils 0.12: http://docutils.sourceforge.net/" /> <title>Docutils Hacker's Guide</title> <meta name="author" content="Lea Wiemann" /> <meta name="date" content="2012-01-03" /> <meta name="copyright" content="This document has been placed in the public domain." /> <link rel="stylesheet" href="../../css/html4css1.css" type="text/css" /> </head> <body> <div class="document" id="docutils-hacker-s-guide"> <h1 class="title"><a class="reference external" href="http://docutils.sourceforge.net/">Docutils</a> Hacker's Guide</h1> <table class="docinfo" frame="void" rules="none"> <col class="docinfo-name" /> <col class="docinfo-content" /> <tbody valign="top"> <tr><th class="docinfo-name">Author:</th> <td>Lea Wiemann</td></tr> <tr><th class="docinfo-name">Contact:</th> <td><a class="first last reference external" href="mailto:docutils-develop@lists.sourceforge.net">docutils-develop@lists.sourceforge.net</a></td></tr> <tr><th class="docinfo-name">Revision:</th> <td>7302</td></tr> <tr><th class="docinfo-name">Date:</th> <td>2012-01-03</td></tr> <tr><th class="docinfo-name">Copyright:</th> <td>This document has been placed in the public domain.</td></tr> <tr class="field"><th class="docinfo-name">Prerequisites:</th><td class="field-body">You have used <a class="reference external" href="http://docutils.sourceforge.net/rst.html">reStructuredText</a> and played around with the <a class="reference external" href="../user/tools.html">Docutils front-end tools</a> before. Some (basic) Python knowledge is certainly helpful (though not necessary, strictly speaking).</td> </tr> </tbody> </table> <div class="abstract topic"> <p class="topic-title first">Abstract</p> <p>This is the introduction to Docutils for all persons who want to extend Docutils in some way.</p> </div> <div class="contents topic" id="contents"> <p class="topic-title first">Contents</p> <ul class="simple"> <li><a class="reference internal" href="#overview-of-the-docutils-architecture" id="id1">Overview of the Docutils Architecture</a><ul> <li><a class="reference internal" href="#reading-the-document" id="id2">Reading the Document</a></li> <li><a class="reference internal" href="#parsing-the-document" id="id3">Parsing the Document</a></li> <li><a class="reference internal" href="#transforming-the-document" id="id4">Transforming the Document</a></li> <li><a class="reference internal" href="#writing-the-document" id="id5">Writing the Document</a></li> </ul> </li> <li><a class="reference internal" href="#extending-docutils" id="id6">Extending Docutils</a><ul> <li><a class="reference internal" href="#modifying-the-document-tree-before-it-is-written" id="id7">Modifying the Document Tree Before It Is Written</a></li> <li><a class="reference internal" href="#the-node-interface" id="id8">The Node Interface</a></li> </ul> </li> <li><a class="reference internal" href="#what-now" id="id9">What Now?</a></li> </ul> </div> <div class="section" id="overview-of-the-docutils-architecture"> <h1><a class="toc-backref" href="#id1">Overview of the Docutils Architecture</a></h1> <p>To give you an understanding of the Docutils architecture, we'll dive right into the internals using a practical example.</p> <p>Consider the following reStructuredText file:</p> <pre class="literal-block"> My *favorite* language is Python_. .. _Python: http://www.python.org/ </pre> <p>Using the <tt class="docutils literal">rst2html.py</tt> front-end tool, you would get an HTML output which looks like this:</p> <pre class="literal-block"> [uninteresting HTML code removed] <body> <div class="document"> <p>My <em>favorite</em> language is <a class="reference" href="http://www.python.org/">Python</a>.</p> </div> </body> </html> </pre> <p>While this looks very simple, it's enough to illustrate all internal processing stages of Docutils. Let's see how this document is processed from the reStructuredText source to the final HTML output:</p> <div class="section" id="reading-the-document"> <h2><a class="toc-backref" href="#id2">Reading the Document</a></h2> <p>The <strong>Reader</strong> reads the document from the source file and passes it to the parser (see below). The default reader is the standalone reader (<tt class="docutils literal">docutils/readers/standalone.py</tt>) which just reads the input data from a single text file. Unless you want to do really fancy things, there is no need to change that.</p> <p>Since you probably won't need to touch readers, we will just move on to the next stage:</p> </div> <div class="section" id="parsing-the-document"> <h2><a class="toc-backref" href="#id3">Parsing the Document</a></h2> <p>The <strong>Parser</strong> analyzes the the input document and creates a <strong>node tree</strong> representation. In this case we are using the <strong>reStructuredText parser</strong> (<tt class="docutils literal">docutils/parsers/rst/__init__.py</tt>). To see what that node tree looks like, we call <tt class="docutils literal">quicktest.py</tt> (which can be found in the <tt class="docutils literal">tools/</tt> directory of the Docutils distribution) with our example file (<tt class="docutils literal">test.txt</tt>) as first parameter (Windows users might need to type <tt class="docutils literal">python quicktest.py test.txt</tt>):</p> <pre class="literal-block"> $ quicktest.py test.txt <document source="test.txt"> <paragraph> My <emphasis> favorite language is <reference name="Python" refname="python"> Python . <target ids="python" names="python" refuri="http://www.python.org/"> </pre> <p>Let us now examine the node tree:</p> <p>The top-level node is <tt class="docutils literal">document</tt>. It has a <tt class="docutils literal">source</tt> attribute whose value is <tt class="docutils literal">text.txt</tt>. There are two children: A <tt class="docutils literal">paragraph</tt> node and a <tt class="docutils literal">target</tt> node. The <tt class="docutils literal">paragraph</tt> in turn has children: A text node ("My "), an <tt class="docutils literal">emphasis</tt> node, a text node (" language is "), a <tt class="docutils literal">reference</tt> node, and again a <tt class="docutils literal">Text</tt> node (".").</p> <p>These node types (<tt class="docutils literal">document</tt>, <tt class="docutils literal">paragraph</tt>, <tt class="docutils literal">emphasis</tt>, etc.) are all defined in <tt class="docutils literal">docutils/nodes.py</tt>. The node types are internally arranged as a class hierarchy (for example, both <tt class="docutils literal">emphasis</tt> and <tt class="docutils literal">reference</tt> have the common superclass <tt class="docutils literal">Inline</tt>). To get an overview of the node class hierarchy, use epydoc (type <tt class="docutils literal">epydoc nodes.py</tt>) and look at the class hierarchy tree.</p> </div> <div class="section" id="transforming-the-document"> <h2><a class="toc-backref" href="#id4">Transforming the Document</a></h2> <p>In the node tree above, the <tt class="docutils literal">reference</tt> node does not contain the target URI (<tt class="docutils literal"><span class="pre">http://www.python.org/</span></tt>) yet.</p> <p>Assigning the target URI (from the <tt class="docutils literal">target</tt> node) to the <tt class="docutils literal">reference</tt> node is <em>not</em> done by the parser (the parser only translates the input document into a node tree).</p> <p>Instead, it's done by a <strong>Transform</strong>. In this case (resolving a reference), it's done by the <tt class="docutils literal">ExternalTargets</tt> transform in <tt class="docutils literal">docutils/transforms/references.py</tt>.</p> <p>In fact, there are quite a lot of Transforms, which do various useful things like creating the table of contents, applying substitution references or resolving auto-numbered footnotes.</p> <p>The Transforms are applied after parsing. To see how the node tree has changed after applying the Transforms, we use the <tt class="docutils literal">rst2pseudoxml.py</tt> tool:</p> <pre class="literal-block"> $ rst2pseudoxml.py test.txt <document source="test.txt"> <paragraph> My <emphasis> favorite language is <reference name="Python" <strong>refuri="http://www.python.org/"</strong>> Python . <target ids="python" names="python" <tt class="docutils literal"><span class="pre">refuri="http://www.python.org/"</span></tt>> </pre> <p>For our small test document, the only change is that the <tt class="docutils literal">refname</tt> attribute of the reference has been replaced by a <tt class="docutils literal">refuri</tt> attribute—the reference has been resolved.</p> <p>While this does not look very exciting, transforms are a powerful tool to apply any kind of transformation on the node tree.</p> <p>By the way, you can also get a "real" XML representation of the node tree by using <tt class="docutils literal">rst2xml.py</tt> instead of <tt class="docutils literal">rst2pseudoxml.py</tt>.</p> </div> <div class="section" id="writing-the-document"> <h2><a class="toc-backref" href="#id5">Writing the Document</a></h2> <p>To get an HTML document out of the node tree, we use a <strong>Writer</strong>, the HTML writer in this case (<tt class="docutils literal">docutils/writers/html4css1.py</tt>).</p> <p>The writer receives the node tree and returns the output document. For HTML output, we can test this using the <tt class="docutils literal">rst2html.py</tt> tool:</p> <pre class="literal-block"> $ rst2html.py --link-stylesheet test.txt <?xml version="1.0" encoding="utf-8" ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <meta name="generator" content="Docutils 0.3.10: http://docutils.sourceforge.net/" /> <title></title> <link rel="stylesheet" href="../docutils/writers/html4css1/html4css1.css" type="text/css" /> </head> <body> <div class="document"> <p>My <em>favorite</em> language is <a class="reference" href="http://www.python.org/">Python</a>.</p> </div> </body> </html> </pre> <p>So here we finally have our HTML output. The actual document contents are in the fourth-last line. Note, by the way, that the HTML writer did not render the (invisible) <tt class="docutils literal">target</tt> node—only the <tt class="docutils literal">paragraph</tt> node and its children appear in the HTML output.</p> </div> </div> <div class="section" id="extending-docutils"> <h1><a class="toc-backref" href="#id6">Extending Docutils</a></h1> <p>Now you'll ask, "how do I actually extend Docutils?"</p> <p>First of all, once you are clear about <em>what</em> you want to achieve, you have to decide <em>where</em> to implement it—in the Parser (e.g. by adding a directive or role to the reStructuredText parser), as a Transform, or in the Writer. There is often one obvious choice among those three (Parser, Transform, Writer). If you are unsure, ask on the <a class="reference external" href="../user/mailing-lists.html#docutils-develop">Docutils-develop</a> mailing list.</p> <p>In order to find out how to start, it is often helpful to look at similar features which are already implemented. For example, if you want to add a new directive to the reStructuredText parser, look at the implementation of a similar directive in <tt class="docutils literal">docutils/parsers/rst/directives/</tt>.</p> <div class="section" id="modifying-the-document-tree-before-it-is-written"> <h2><a class="toc-backref" href="#id7">Modifying the Document Tree Before It Is Written</a></h2> <p>You can modify the document tree right before the writer is called. One possibility is to use the <a class="reference external" href="../api/publisher.html#publish_doctree">publish_doctree</a> and <a class="reference external" href="../api/publisher.html#publish_from_doctree">publish_from_doctree</a> functions.</p> <p>To retrieve the document tree, call:</p> <pre class="literal-block"> document = docutils.core.publish_doctree(...) </pre> <p>Please see the docstring of publish_doctree for a list of parameters.</p> <!-- XXX Need to write a well-readable list of (commonly used) options of the publish_* functions. Probably in api/publisher.txt. --> <p><tt class="docutils literal">document</tt> is the root node of the document tree. You can now change the document by accessing the <tt class="docutils literal">document</tt> node and its children—see <a class="reference internal" href="#the-node-interface">The Node Interface</a> below.</p> <p>When you're done with modifying the document tree, you can write it out by calling:</p> <pre class="literal-block"> output = docutils.core.publish_from_doctree(document, ...) </pre> </div> <div class="section" id="the-node-interface"> <h2><a class="toc-backref" href="#id8">The Node Interface</a></h2> <p>As described in the overview above, Docutils' internal representation of a document is a tree of nodes. We'll now have a look at the interface of these nodes.</p> <p>(To be completed.)</p> </div> </div> <div class="section" id="what-now"> <h1><a class="toc-backref" href="#id9">What Now?</a></h1> <p>This document is not complete. Many topics could (and should) be covered here. To find out with which topics we should write about first, we are awaiting <em>your</em> feedback. So please ask your questions on the <a class="reference external" href="../user/mailing-lists.html#docutils-develop">Docutils-develop</a> mailing list.</p> <!-- Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: --> </div> </div> </body> </html>