Tuesday, February 25, 2020

From Sphinx to DocFX - Generating Python API Documentation with DocFx

Overview


In a previous post, Sphinx Quickstart, we covered a very basic setup of Sphinx. In this post, we go further and talk about Sphinx DocFX YAML, an exporter for the Sphinx Autodoc module. Our goal is to produce YAML files that can be consumed by DocFX, a documentation generator for .NET that also converts YAML files to HTML. Many doc sets at https://docs.microsoft.com/ are generated with DocFX, including Python doc sets that use Sphinx to generate YAML, which is then converted to HTML with DocFX.

Terminology


To understand how to go from Sphinx to DocFX using the Sphinx DocFX YAML exporter, we need to break down some of the terms used...or at least we did to make sense of it all.

Sphinx

  • Sphinx is a documentation generator. It was originally created for Python documentation, but it can be used for a range of languages.
  • Sphinx uses reStructuredText (reST) as its markup language. Sphinx's utility comes from the power and straightforwardness of reST and its parsing and translating suite, Docutils. reST is used both in .rst files and in docstrings in .py files.
  • In the Quickstart, we created an example reST file (foo.rst) and built HTML documentation from it.
  • Autodoc is an extension for Sphinx. (Sphinx is extensible to support the needs of different projects. An extension is simply a Python module.)
  • Autodoc adds directives like "autofunction" and "automodule". These directives determine which parts of the API are documented.
  • When using the autodoc extension (added in the conf.py file) with Sphinx, you are including documentation from Python docstrings. A Python docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.
  • When you run the command sphinx-build (or make html, if you created a make file), Sphinx autodoc generates the API documentation for your Python project code using the index.rst (this is the default name, but it can be any name you want). Sphinx imports the code via standard Python import mechanisms, and then generates the proper reST in the Sphinx Python domain. The reST files are then parsed to create doctree files used internally in Sphinx to generate HTML. If you only want the HTML output from Sphinx (and not DocFX), then you can stop here. This is the point at which the post Sphinx Quickstart stops.
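To make the docstring point concrete, here is a minimal sketch. The function name area is hypothetical and not part of the example repo. It shows that the first string literal in a definition ends up on the object's __doc__ attribute, which is the text autodoc reads.

```python
import inspect


def area(width, height):
    """Return the area of a rectangle."""
    return width * height


# The docstring is stored on the function's __doc__ special attribute.
raw_doc = area.__doc__

# inspect.getdoc() returns the same text with indentation cleaned up.
clean_doc = inspect.getdoc(area)

print(clean_doc)
```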

sphinx-build

  • Usage: sphinx-build [options] <sourcedir> <outdir> [filenames...]
  • (If you ran sphinx-quickstart, you had the option of creating a make file so that you can just type make html instead of sphinx-build.)
  • This command creates documentation from files in <sourcedir> and places the output in <outdir>.
  • This command looks for <sourcedir>/conf.py for configuration settings.
  • This command creates documentation in different formats. A format can be specified on the command line, otherwise it defaults to HTML. (Check the conf.py file if in doubt.)
  • By default, everything that is outdated is built. Output only for selected files can be built by specifying individual filenames.
  • Since Sphinx has to read and parse all source files before it can write an output file, the parsed source files are cached as “doctree pickles”. Normally, these files are put in a directory called .doctrees under the build directory.
  • If you didn't run the Sphinx Quickstart and don't have an index.rst file to start with, then you could use the sphinx-apidoc command to create module .rst files that would be equivalent to index.rst.
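If you prefer to drive the build from a script, the usage line above can be sketched in Python as follows. The helper names sphinx_build_args and run_sphinx_build are our own, not part of Sphinx, and the sketch assumes sphinx-build is on your PATH.

```python
import subprocess


def sphinx_build_args(sourcedir, outdir, builder="html", filenames=()):
    """Assemble: sphinx-build -b <builder> <sourcedir> <outdir> [filenames...]"""
    return ["sphinx-build", "-b", builder, sourcedir, outdir, *filenames]


def run_sphinx_build(sourcedir, outdir, builder="html", filenames=()):
    """Run sphinx-build; raises CalledProcessError if the build fails."""
    args = sphinx_build_args(sourcedir, outdir, builder, filenames)
    subprocess.run(args, check=True)


if __name__ == "__main__":
    # Show the command line that would be run for the default HTML build.
    print(" ".join(sphinx_build_args("source", "build")))
```

Passing one or more file names limits the output to those files, as described in the list above.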

Sphinx DocFx YAML

  • Sphinx DocFX YAML is an exporter for the Sphinx Autodoc module that produces YAML files adhering to the DocFX YAML metadata specification. For more information, see readthedocs.
  • DocFX YAML describes language metadata for programming languages. The main user scenario for language metadata is to generate reference documentation. Specifically, we can use the YAML as input to DocFX and let DocFX generate HTML.
  • YAML files represent the API documentation. Example.
  • DocFX stands for Document Frameworks. To use the Sphinx DocFX YAML exporter, add the extension to the source\conf.py file like so:

    extensions = ['sphinx.ext.autodoc', 'docfx_yaml.extension']
  • With the exporter added to conf.py, use Sphinx as usual by running the command make html.

DocFX

  • DocFX generates API reference documentation from triple-slash comments in C#/VB source code. Or, it can consume YAML files and render them as HTML.
  • It also allows you to use Markdown files to create additional topics such as tutorials and how-tos, and to customize the generated reference documentation.
  • The punchline is this: From a Python project using Autodoc and the Sphinx DocFX YAML exporter, you can generate YAML files to be used with DocFX. This is what the example below does.
  • Why? Because HTML generated from DocFX has a number of benefits beyond the HTML generated from Sphinx, including API cross-referencing, generating from Markdown files (.md) alongside the API reference, and customizable themes and templates.

An Example


Prerequisites:


Step 1: Clone the travelmarx-blog repo and start in the sphinx-docfx-example directory.

The sphinx-docfx-example folder is the root folder in subsequent steps. You should have the following:
.
└───mycode
    ├───core_api
    │   ├───package1
    │   └───package2
    └───test_api

Step 2: Create conf.py and index.rst files.

See the Sphinx Quickstart for information about running the sphinx-quickstart command. Your folder structure should look like this.
.
├───build
├───mycode
│   ├───core_api
│   │   ├───package1
│   │   └───package2
│   └───test_api
└───source
    ├───_static
    └───_templates

Step 3: Edit source\conf.py.

Configure the extensions:
extensions = ['sphinx.ext.autodoc', 'docfx_yaml.extension']
Point to the code folder:
import os
import sys
sys.path.insert(0, os.path.abspath('../mycode'))
Some of these lines may already exist in the conf.py file, in which case you only have to uncomment them.

Step 4: Run sphinx-apidoc to create .rst (reStructuredText) files describing the code.

Starting in the sphinx-docfx-example (root) folder, run:
sphinx-apidoc -o source .\mycode
This creates .rst files in the \source folder.

Step 5: Modify the source\index.rst to include modules to document.

Running sphinx-apidoc will produce a source\modules.rst file by default. The modules.rst file is the entry point for documenting the code in \mycode.

source\index.rst (snippet; add the modules entry under the toctree directive)

Test documentation
=======================
.. toctree::
    :maxdepth: 4
    :caption: Table of Contents
 
    modules

Step 6: Run sphinx-build to create Sphinx's HTML.
sphinx-build source build
Besides building the Sphinx HTML (which you may not care about), this also creates .yml files in the \build\docfx_yaml folder. These will be used in a later step with DocFx.

To view the Sphinx HTML, starting in the root folder, run:
build\index.html
For comparison with the DocFX HTML (which is generated in Step 10), here is the Sphinx-generated HTML:



Step 7: Confirm that YAML files were generated.

Starting in root folder, run:
dir build\docfx_yaml
You should see a listing of .yml files like "core_api.package1.someclass.SomeClass.yml".
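The same check can be scripted. This is a minimal sketch that assumes the build\docfx_yaml layout from Step 6. The function name list_docfx_yaml is hypothetical.

```python
from pathlib import Path


def list_docfx_yaml(build_dir="build"):
    """Return the sorted names of .yml files under <build_dir>/docfx_yaml."""
    yaml_dir = Path(build_dir) / "docfx_yaml"
    if not yaml_dir.is_dir():
        return []
    return sorted(p.name for p in yaml_dir.glob("*.yml"))


if __name__ == "__main__":
    names = list_docfx_yaml()
    if names:
        print("\n".join(names))
    else:
        print("No .yml files found; check the extensions list in conf.py.")
```

An empty result usually means the exporter was not added to the extensions list in conf.py, or that sphinx-build has not been run yet.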

Step 8: Generate an initial docfx.json file.

Starting in the root folder, run:
docfx init -q
This will create a docfx_project folder with the docfx.json configuration file.

Step 9: Copy the Sphinx YAML files to the \docfx_project folder.

Copy .\build\docfx_yaml\* to .\docfx_project\api\*
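If you'd rather script the copy than do it by hand, here is a minimal sketch using the standard library. The function name copy_yaml_to_docfx is our own, and the default paths assume you run it from the root folder.

```python
import shutil
from pathlib import Path


def copy_yaml_to_docfx(src="build/docfx_yaml", dest="docfx_project/api"):
    """Copy every file from src into dest and return the copied file names."""
    src_dir, dest_dir = Path(src), Path(dest)
    # Create the destination folder if docfx init has not created it yet.
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src_dir.iterdir()):
        if path.is_file():
            shutil.copy2(path, dest_dir / path.name)
            copied.append(path.name)
    return copied


if __name__ == "__main__":
    for name in copy_yaml_to_docfx():
        print("copied", name)
```

Files that already exist in the destination with the same name are overwritten.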

Step 10: Build the DocFx HTML and serve the docs.

Starting in the root folder, run:
docfx docfx_project\docfx.json --serve

Step 11: View the HTML docs produced by DocFx.

Go to http://localhost:8080.



Some points to note:

  • The difference in the look between Sphinx HTML and DocFx HTML. Both can be customized as needed.
  • How the link to "AnotherClass" is an active link in the DocFX screenshot. This is one of the benefits of using DocFx, cross reference linking.
  • We didn't add any "Articles" (.md files), but integrating API and conceptual (articles) docs is also a nice feature of DocFX. For an example of how that could be done, see our Scrapbook101core site.
  • On subsequent runs through the steps above (say, if you changed a docstring in the code), you will typically:
    • delete content in \build folder
    • run steps 6, 9, and 10.
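The rerun loop above can be sketched as one script. This is only an illustration under a few assumptions: sphinx-build and docfx are on your PATH, the folder names match the steps above, and Python 3.8 or later is used (for dirs_exist_ok). The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def rebuild_commands():
    """Commands for steps 6 and 10, as argument lists."""
    return [
        ["sphinx-build", "source", "build"],
        ["docfx", str(Path("docfx_project") / "docfx.json"), "--serve"],
    ]


def rebuild(root="."):
    """Delete the build folder, then rerun steps 6, 9, and 10 from root."""
    root = Path(root)
    # Remove stale output; sphinx-build recreates the folder.
    shutil.rmtree(root / "build", ignore_errors=True)
    build_cmd, serve_cmd = rebuild_commands()
    subprocess.run(build_cmd, cwd=root, check=True)            # step 6
    shutil.copytree(root / "build" / "docfx_yaml",
                    root / "docfx_project" / "api",
                    dirs_exist_ok=True)                        # step 9
    subprocess.run(serve_cmd, cwd=root, check=True)            # step 10


if __name__ == "__main__":
    rebuild()
```

The last command keeps running while the docs are served, so stop it (Ctrl+C) before starting another run.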

Next Steps:

  • Customize docfx.json file.
  • Read up on cross-linking with DocFx.
  • Add other markdown files (.md) along with API docs.



Monday, February 24, 2020

Sphinx Quickstart


Generate files


The instructions here are for Windows. With slight modifications, they can be applied to other platforms. The code for this post is at https://github.com/travelmarx/travelmarx-blog/tree/master/sphinx-quickstart.

Make sure you have Sphinx installed, then clone the travelmarx-blog repo to your local environment. Starting in the sphinx-quickstart directory you should have the following:

> tree
│   .gitignore

└───mycode
        myclasses.py
        __init__.py

Run the Sphinx quickstart command.

> sphinx-quickstart

Accept defaults for everything except these parameters.
  • Separate source and build directories (y/n) [n]: Y
  • Project name: MyTestDocs
  • Author name(s): your-alias
  • autodoc: automatically insert docstrings from modules: (y/n) [n]: Y 
The last setting, which enables autodoc, is important. When answering the quickstart questions, it can be easy to accept the default for this setting, which is not to enable it. The autodoc extension is configured in the source\conf.py file like so:

extensions = ['sphinx.ext.autodoc']

Build the HTML. The command make html is a convenience for running the command sphinx-build -b html sourcedir builddir. The make file assumes current directory is source directory, and it creates the build directory "build". HTML is the default doc type produced.

> make html

> tree
├───build
│   ├───doctrees
│   └───html
│       ├───_sources
│       └───_static

├───mycode
└───source
    ├───_static
    └───_templates

Open the docs.

> build\html\index.html

At this point you have basically a framework to build on, but not much else. The index.html page should look like this.


The index.rst file


In the sphinx-quickstart\source folder there should be an index.rst file. Edit the file to add the automodule directive, which automatically documents the members of the myclasses module (myclasses.py).

> type index.rst
.. MyTestDocs documentation master file, created by
   sphinx-quickstart on Thu Jun 20 14:06:30 2019.
   Adapt this file to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to MyTestDocs's documentation!
======================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

.. automodule:: myclasses
   :members:

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

The index file is the initial documentation file. You can see that lines in the index.rst appear in the index.html shown above. The index.rst file can contain reStructuredText documentation and directives (the same that appear in Python docstrings). In the example above, we are using automodule to indicate that the docstrings in myclasses should be documented.

Add test Python code


If you cloned the repo, you should have the following:

  • mycode\myclasses.py module.
  • mycode\__init__.py file, which signals that the directory contains a package.


Edit source\conf.py so that Sphinx can find the code. Here are the lines:

import os
import sys
sys.path.insert(0, os.path.abspath('../mycode'))

Make sure these lines are not commented out, that is, they should not have a "#" in front of them.
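To see what that relative path resolves to, here is a small sketch. It relies on Sphinx running conf.py with the current directory set to the folder that contains conf.py (source in this post). The helper name code_dir_from_conf is hypothetical.

```python
import os
import sys


def code_dir_from_conf(conf_dir, relative="../mycode"):
    """Resolve relative against conf_dir, the way os.path.abspath('../mycode')
    does when the current directory is conf_dir."""
    return os.path.normpath(os.path.join(os.path.abspath(conf_dir), relative))


# Run from the root folder: "source" is where conf.py lives.
code_dir = code_dir_from_conf("source")
print(code_dir)              # absolute path ending in "mycode"
print(code_dir in sys.path)  # whether Python already searches that folder
```

If the printed path is not the folder that holds myclasses.py, adjust the relative part accordingly.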

Here are the two files __init__.py and myclasses.py.

mycode\__init__.py
import myclasses

mycode\myclasses.py
class SimpleClass:
    """A simple example class"""
    i = 12345

    def f(self):
        return 'hello world'

class Person:
    """Creates a Person based on name and age."""
    def __init__(self, name, age):
        self.name = name
        self.age = age


Back in the root folder of sphinx-quickstart, rebuild:

> make html

The output should look something like this, which includes docstrings in myclasses.py:


If your code folder is outside the sphinx-quickstart folder, change the path passed to os.path.abspath in the conf.py file as appropriate.

Your final directory structure should look like this:

>tree
├───build
│   ├───doctrees
│   └───html
│       ├───_sources
│       └───_static
├───mycode
│   └───__pycache__
└───source
    ├───_static
    └───_templates


reStructuredText


Let's add a little more functionality to this quickstart. Suppose we have a file foo.rst that contains documentation we want to include as well. Then we can add foo.rst and make sure it is documented by adding a reference to foo in index.rst:

source\foo.rst

foo module
==========

This is the foo module description.

.. note::

   This is a note.

source\index.rst (the added line is the foo entry under toctree)

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   foo

.. automodule:: myclasses
   :members:

The syntax you can use inside of foo.rst is described in Sphinx reStructuredText. At this point in our quickstart, we have generated HTML documentation with some content coming from reStructuredText in an .rst file and some content coming from reStructuredText in docstrings in .py files.
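As an illustration of reStructuredText inside docstrings, here is a hypothetical variant of the Person class. The field lists (:param:, :returns:), the note directive, and the greeting method are our additions and are not in the repo.

```python
class Person:
    """Creates a Person based on name and age.

    :param name: Full name of the person.
    :param age: Age in years.
    """

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greeting(self):
        """Return a short greeting.

        .. note::

           This note uses the same directive as the one in foo.rst.

        :returns: A greeting that includes ``self.name``.
        """
        return f"Hello, {self.name}"


print(Person("Ann", 30).greeting())
```

The same directives that work in foo.rst work here, because autodoc hands the docstring text to the reST parser.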

Build the docs again:

> make html

Notice that "foo" appears in doc contents.

If interested, go to the next post in the series: From Sphinx to DocFX - Generating Python API Documentation with DocFx.