
Tuesday, March 10, 2020

Working with Sphinx Extensions and Building to DocFx Output


DocFx output using Sphinx .yml files and Sphinx extensions

Overview


The previous Sphinx/DocFx posts are:

In those posts, we talked about the Sphinx conf.py configuration file for configuring Sphinx to use extensions. An extension is simply a Python module that can be used to extend Sphinx functionality. We used the "sphinx.ext.autodoc" extension to get docstring comments from Python files and we used the "docfx_yaml.extension" extension to instruct Sphinx to export YAML files. In the Sphinx to DocFx post, we had a conf.py file with:

extensions = ["sphinx.ext.autodoc",  "docfx_yaml.extension"]

In this post, we'll discuss two more extensions you can use, so that we have:

extensions = ["sphinx.ext.autodoc", "sphinx.ext.intersphinx", 
              "sphinx.ext.extlinks", "docfx_yaml.extension"]

where:
sphinx.ext.autodoc Import modules for documentation, including pulling in content from docstrings.

docfx_yaml.extension An exporter for the Sphinx autodoc module to produce YAML files for use with DocFX. Order is important: in our experience, the docfx extension needs to be at the end of the extensions list.

sphinx.ext.intersphinx Generate automatic links to the documentation in other projects, like Python base classes. Depends on the intersphinx_mapping variable in conf.py. See the interpret community repo for an example.

sphinx.ext.extlinks Allows creating a short link for commonly used links that go to subpages of one site.
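
Since the position of the docfx extension matters, a small sanity check can catch a misplaced entry before a build. The following is a minimal sketch, not part of Sphinx or docfx_yaml; the helper name docfx_is_last is hypothetical.

```python
def docfx_is_last(extensions):
    """Return True if docfx_yaml.extension is the final entry in the list."""
    return bool(extensions) and extensions[-1] == "docfx_yaml.extension"


# The extensions list used in this post.
extensions = ["sphinx.ext.autodoc", "sphinx.ext.intersphinx",
              "sphinx.ext.extlinks", "docfx_yaml.extension"]

print(docfx_is_last(extensions))
```

You could paste the function into conf.py and assert on it, or just use it as a reminder when reordering the list.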

sphinx.ext.intersphinx


The intersphinx_mapping configuration value lives in conf.py. It maps a name to the base URL of another documentation set, so that your docstrings can link to objects documented outside of your own project.

We'll use the following intersphinx_mapping:

intersphinx_mapping = {
    'Python': ('https://docs.python.org/3', None),
    'Pillow': ('https://pillow.readthedocs.io/en/latest/', None),
    'NumPy': ('http://docs.scipy.org/doc/numpy/', None),
    'pandas': ('http://pandas.pydata.org/pandas-docs/stable/', None),
    'sklearn': ('http://scikit-learn.org/stable', None),
    'matplotlib': ('http://matplotlib.sourceforge.net/', None)
}
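
Each entry pairs a base URL with an inventory location. When the second element is None, Sphinx looks for a file named objects.inv under the base URL. The sketch below only illustrates that lookup; the helper inventory_url is hypothetical and is not a Sphinx API.

```python
import posixpath


def inventory_url(target, inventory=None):
    """Return where the inventory file is expected for one mapping entry.

    With inventory=None, the file objects.inv is expected directly
    under the target URL.
    """
    if inventory is None:
        return posixpath.join(target, "objects.inv")
    return inventory


print(inventory_url("https://docs.python.org/3"))
print(inventory_url("https://pillow.readthedocs.io/en/latest/"))
```

If a link does not resolve, checking that the inventory URL is reachable is a good first step. Running python -m sphinx.ext.intersphinx followed by the inventory URL should list the names the inventory contains.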

To get a link to a Python builtin type at docs.python.org, we need only specify the type name in the type or rtype docstring field as follows:

"""
:type: bool
:rtype: list
"""

Sphinx then links these types to the Python docset when it builds. To link to a type in text outside of these fields, use a role like so:

"""
This is a link to the Python built-in string type: :class:`str`.
"""

It works the same for other doc sets specified in the intersphinx_mapping.

"""
This is a link to :class:`pandas.DataFrame`, this to :mod:`matplotlib.image`,
and this to :func:`numpy.array`.

:type: pandas.DataFrame or numpy.array
:rtype: matplotlib.image
""" 
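
To show where such docstrings sit in real code, here is a minimal sketch of a function documented with these fields. The function count_words is hypothetical and uses only built-in types so that it runs without extra packages; it uses the named form of the field (:type text:) for a parameter.

```python
def count_words(text):
    """Count how often each word occurs in a :class:`str`.

    :param text: The text to split on whitespace.
    :type text: str
    :return: A mapping of word to number of occurrences.
    :rtype: dict
    """
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


print(count_words("to be or not to be"))
```

With the 'Python' entry in intersphinx_mapping, the str and dict names in this docstring are the ones Sphinx would try to link to docs.python.org.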

The big gotcha with intersphinx_mapping is that the two renderers treat the example above differently. In the Sphinx rendered HTML, pandas.DataFrame, matplotlib.image, and numpy.array are correctly linked to their library types in both the descriptive text and the :type: and :rtype: markup.

In the DocFx rendered HTML (from the same docstring), the links in the descriptive text do resolve, but those in the :type: and :rtype: markup do not. To get around this problem, we need to use a cross reference file in DocFx as described here. This will be the subject of a future post. (For numpy links, see the Stack Overflow question.)


sphinx.ext.extlinks


This extension is a convenience that saves you from repeatedly typing the URL of a web site you reference frequently. The docs for extlinks show how you might use this to link back to GitHub issues. In our tutorial here, we'll create links back to Wikipedia.


In the conf.py file, make sure you have the extlinks configuration parameter defined like so:

extlinks = {'wiki': 
  ('https://en.wikipedia.org/wiki/%s','Wikipedia: ')
}

Now, suppose in a docstring you want to reference these three Wikipedia pages: https://en.wikipedia.org/wiki/Machine_learning, https://en.wikipedia.org/wiki/Supervised_learning, and https://en.wikipedia.org/wiki/Unsupervised_learning. Your docstring would look like this:

"""
Here are links using markup to make external links easier to
work with. See :wiki:`Machine_learning`, :wiki:`Supervised_learning`,
and :wiki:`Unsupervised_learning`.
"""

This docstring would create the three links like so: Wikipedia: Machine_learning, Wikipedia: Supervised_learning, and Wikipedia: Unsupervised_learning.
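
To make the substitution concrete, here is a rough sketch of how a role target turns into a URL and link text. It is an illustration only, not the Sphinx implementation, and the helper expand_extlink is hypothetical. It treats the caption as a prefix, as in this post, and also accepts a caption containing %s, which to our understanding is the form newer Sphinx releases expect (for example 'Wikipedia: %s').

```python
def expand_extlink(extlinks, role, target):
    """Return (url, link_text) for a role such as :wiki:`Machine_learning`."""
    url_template, caption = extlinks[role]
    url = url_template % target
    if caption is None:
        text = url                   # no caption: show the full URL
    elif "%s" in caption:
        text = caption % target      # caption used as a template
    else:
        text = caption + target      # caption used as a prefix
    return url, text


# The extlinks value from this post.
extlinks = {'wiki': ('https://en.wikipedia.org/wiki/%s', 'Wikipedia: ')}

print(expand_extlink(extlinks, 'wiki', 'Machine_learning'))
```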

You can experiment with the presentation by modifying the extlinks configuration parameter.

Build Example


The steps below follow the post From Sphinx to DocFX - Generating Python API Documentation with DocFx. In this tutorial, our goal is to use the new extensions we enabled and show how they appear in the DocFX HTML. The prerequisites are:

  • Sphinx installed
  • DocFx installed
  • Optional: having read or run through the previous tutorial

Step 1: Clone the repo travelmarx-blog and start in the \sphinx-extensions-example folder.

Step 2: Create conf.py and index.rst files.

Use the instructions in the Sphinx Quickstart post to generate these files, following the suggested answers for the prompts. Or, if you already have these files from that project, you can reuse them here. Your folder structure should look like this.
.
├───build
├───mycode
│   ├───core_api
│   │   ├───package1
│   │   └───package2
│   └───test_api
└───source
    ├───_static
    └───_templates

Step 3: Edit source\conf.py to include extensions.

In this tutorial, we are working with the four extensions mentioned in the intro:

extensions = ['sphinx.ext.autodoc', 'sphinx.ext.intersphinx', 
   'sphinx.ext.extlinks', 'docfx_yaml.extension']

Point to the code folder:

import os
import sys
sys.path.insert(0, os.path.abspath('../mycode'))

Add intersphinx_mapping configuration:

intersphinx_mapping = {
    'Python': ('https://docs.python.org/3', None),
    'Pillow': ('https://pillow.readthedocs.io/en/latest/', None),
    'NumPy': ('http://docs.scipy.org/doc/numpy/', None),
    'pandas': ('http://pandas.pydata.org/pandas-docs/stable/', None),
    'sklearn': ('http://scikit-learn.org/stable', None),
    'matplotlib': ('http://matplotlib.sourceforge.net/', None)
}

Add extlinks configuration:

extlinks = {'wiki': 
  ('https://en.wikipedia.org/wiki/%s','Wikipedia: ')
}

Some of these lines may already exist in the conf.py file, in which case you'll have to uncomment and/or modify them.

Step 4: Run sphinx-apidoc to create .rst (reStructuredText) files describing the code.

In the root folder, run:

sphinx-apidoc -o source .\mycode

This creates .rst files in the \source folder.

Step 5: Modify the source\index.rst file to include the modules.rst file which is the entry point for code to document.

Running sphinx-apidoc will produce a source\modules.rst file by default. The modules.rst file is the entry point for documenting the code in \mycode.

source\index.rst (snippet, add the "modules" line)
Test documentation
=======================
.. toctree::
    :maxdepth: 4
    :caption: Table of Contents  

    modules

Step 6: Run sphinx-build to create Sphinx HTML and create YAML files for DocFx.

In the root folder, run

sphinx-build source build

You may not be interested in Sphinx's HTML, but this is the step that creates the YAML files. It doesn't hurt to check what Sphinx's HTML looks like. To see it, open:

build\index.html

If you are interested in only the Sphinx generated HTML, then you can stop here.

Step 7: Confirm the YAML files were generated.

Starting in root folder, run:

dir build\docfx_yaml

You should see a listing of .yml files like "core_api.package1.someclass.SomeClass.yml".
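
If you prefer to check from Python (for example on a non-Windows machine), the same listing can be sketched with pathlib. The helper list_yaml_files is hypothetical, and the default folder assumes you run it from the root folder.

```python
from pathlib import Path


def list_yaml_files(folder="build/docfx_yaml"):
    """Return the sorted names of the .yml files in the given folder."""
    return sorted(p.name for p in Path(folder).glob("*.yml"))


# Usage, from the root folder:
#     for name in list_yaml_files():
#         print(name)
```

An empty result usually means docfx_yaml.extension was not picked up, so re-check the extensions list in conf.py.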

Step 8: Generate an initial docfx.json file.

In the root folder, run:

docfx init -q

This will create a docfx_project folder with the docfx.json configuration file.

Step 9: Copy the Sphinx YAML files to the docfx_project folder.

Copy .\build\docfx_yaml\* to .\docfx_project\api\*
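
The copy can also be scripted. This is a minimal sketch, assuming the folder names from the steps above and a run from the root folder; the helper copy_yaml is hypothetical.

```python
import shutil
from pathlib import Path


def copy_yaml(src="build/docfx_yaml", dst="docfx_project/api"):
    """Copy every file in src into dst and return the copied file names."""
    dst_path = Path(dst)
    dst_path.mkdir(parents=True, exist_ok=True)  # create dst if it is missing
    copied = []
    for item in sorted(Path(src).iterdir()):
        if item.is_file():
            shutil.copy2(item, dst_path / item.name)
            copied.append(item.name)
    return copied


# Usage, from the root folder:
#     print(copy_yaml())
```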

Step 10: Build the DocFx HTML and serve the docs on localhost.

Starting in the root folder, run:

docfx docfx_project\docfx.json --serve

Step 11: View the HTML docs produced by DocFx.

Go to http://localhost:8080.

You will notice some warnings in the DocFx output, mostly about invalid links, that you may want to address.

Build Cycle


Running through the build once is great, but what about an authoring flow that works for the fix-build-verify cycle? That's where we've found it gets messy with both Sphinx and DocFx. Cleaning out previous build artifacts and rebuilding everything works best for us. To this end, we created a simple batch file (for Windows) that looks like the following:

REM Clean out the output directories from previous builds

del build\* /Q
del build\docfx_yaml\* /Q
rmdir docfx_project\_site /S /Q
rmdir docfx_project\obj\.cache /S /Q

del source\core_*.rst /Q
del source\modules.rst /Q
del source\test_*.rst /Q

REM Run sphinx commands

sphinx-apidoc -o source .\mycode -f
sphinx-build -a source build

REM Copy the sphinx generated yaml files to the docfx folder

copy .\build\docfx_yaml\* .\docfx_project\api\*

REM Build and serve docfx html
docfx docfx_project\docfx.json --serve
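
For those not on Windows, the same cycle can be sketched in Python. This is an untested-in-production outline that mirrors the batch file, assuming the same folder layout and that sphinx-apidoc, sphinx-build, and docfx are on the PATH; the function names clean, build_commands, and rebuild are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def clean(root="."):
    """Remove the artifacts of a previous build, like the del/rmdir lines."""
    root = Path(root)
    # Delete files (not folders) directly under build and build/docfx_yaml.
    for folder in (root / "build", root / "build" / "docfx_yaml"):
        if folder.is_dir():
            for item in folder.iterdir():
                if item.is_file():
                    item.unlink()
    # Remove the DocFx output and cache folders entirely.
    for folder in (root / "docfx_project" / "_site",
                   root / "docfx_project" / "obj" / ".cache"):
        shutil.rmtree(folder, ignore_errors=True)
    # Delete the .rst files that sphinx-apidoc regenerates.
    for pattern in ("core_*.rst", "modules.rst", "test_*.rst"):
        for rst in (root / "source").glob(pattern):
            rst.unlink()


def build_commands():
    """Return the two Sphinx commands, with the flags used in the batch file."""
    return [
        ["sphinx-apidoc", "-f", "-o", "source", "mycode"],
        ["sphinx-build", "-a", "source", "build"],
    ]


def rebuild(root="."):
    """Clean, run Sphinx, copy the YAML files, then build and serve DocFx."""
    root = Path(root)
    clean(root)
    for command in build_commands():
        subprocess.run(command, cwd=root, check=True)
    api = root / "docfx_project" / "api"
    api.mkdir(parents=True, exist_ok=True)
    for item in (root / "build" / "docfx_yaml").iterdir():
        if item.is_file():
            shutil.copy2(item, api / item.name)
    subprocess.run(["docfx", "docfx_project/docfx.json", "--serve"],
                   cwd=root, check=True)


# Usage, from the root folder:
#     rebuild()
```

Passing check=True stops the cycle at the first failing command, so a Sphinx error does not get buried under DocFx output.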

