Here is another technical note, this time with some code which I keep rewriting in different projects, so I decided to put it all in one place.

Packaging configuration files in Python modules

Packaging configuration files in Python is fairly tricky to get right, though there are now a number of tools around which help. It’s also important to make sure that a user can write their own customisations to the configuration somewhere, and that’s a bit tricky too.

First, assuming that you keep your default configuration file in the main directory of your package, you’ll need to include it specifically in the package manifest. The package which I was putting together today is called elk, so you’ll see references to it in the code snippets in this post.

To do that you’ll need to add a line something like this to the manifest file (MANIFEST.in, if you’re using setuptools) in the root of the project.

include elk/elk.conf

You’ll also need to enable the packaging of data files in the setup script for the package.
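
As a rough sketch of what that can look like, assuming the project is built with setuptools and a setup.py script: the version number below is a placeholder, and the only setting that matters here is include_package_data.

```python
# setup.py: a minimal sketch, the metadata values are placeholders
from setuptools import setup

setup(
    name="elk",
    version="0.1.0",            # placeholder, not the real version
    packages=["elk"],
    # Install the data files inside the package that MANIFEST.in lists
    include_package_data=True,
)
```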


Now, in order to load the configuration file you’ll need to use the pkg_resources package, which is distributed with setuptools rather than with Python itself. I’m using the built-in INI config parser here, but you could adapt this fairly easily to use e.g. JSON format config files.

import os

from pkg_resources import resource_string

try:
    # If we're using Python 2 load ConfigParser
    import ConfigParser as configparser
except ImportError:
    # Python 3 renamed the module to configparser
    import configparser

Now the configuration file which is installed with the package can be loaded as a string.

default_config = resource_string(__name__, '{}.conf'.format(__packagename__))

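That string can then be passed through to the config parser. On Python 3 resource_string returns bytes, so they are decoded before being handed to read_string (Python 2’s ConfigParser has no read_string, and would need readfp with a file-like wrapper instead).

config = configparser.ConfigParser()
config.read_string(default_config.decode("utf-8"))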
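
As an aside, recent setuptools releases mark pkg_resources as deprecated in favour of the standard library’s importlib.resources. Here is a minimal sketch of the same load-and-parse step done that way, assuming Python 3.9 or later. The helper name load_default_config is hypothetical and isn’t part of elk.

```python
import configparser
from importlib import resources


def load_default_config(package, filename):
    """Return a ConfigParser loaded from a file shipped inside `package`."""
    # files() locates the importable package; read_text() returns str,
    # so there is no need to decode bytes before parsing.
    text = resources.files(package).joinpath(filename).read_text(encoding="utf-8")
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser
```

With the package from this post installed, something like load_default_config("elk", "elk.conf") should give the same defaults as the pkg_resources version.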


A config file isn’t a lot of good if someone can’t change it, however, so the package needs to look in some sensible places for user customisations. Standard locations for this include the current working directory, the user’s .config directory (inside their home directory; this is a de facto standard in many Linux distributions), a dotfile in their home directory, or /etc (normally for system-wide configurations).

So we need to check each of these places, and update the configuration of the package depending on what we find there.

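conf_name = "{}.conf".format(__packagename__)
# Highest precedence first; the dotfile and /etc file names are assumptions.
config_locations = [os.path.join(os.curdir, conf_name),
                    os.path.expanduser(os.path.join("~", ".config", __packagename__, conf_name)),
                    os.path.expanduser(os.path.join("~", "." + conf_name)),
                    os.path.join("/etc", conf_name)]
config_locations.reverse()  # read the lowest precedence file first
config.read(config_locations)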

ConfigParser will read each file in the order given, skipping any which don’t exist, and replace the current value of each configuration variable with the latest one it’s found. This means that we need to read them in increasing order of precedence: system-wide defaults, followed by user defaults, followed by the current-directory configuration.
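
The override behaviour is easy to check with a small sketch. The helper name read_layered_config and its arguments are hypothetical, made up for this illustration; it assumes Python 3, and relies on ConfigParser.read() quietly skipping files which are missing.

```python
import configparser


def read_layered_config(default_text, locations):
    """Parse the defaults, then overlay any files found in `locations`.

    `locations` is ordered from highest to lowest precedence.
    """
    parser = configparser.ConfigParser()
    parser.read_string(default_text)
    # read() returns the list of files it actually managed to parse.
    # Reading the lowest precedence file first lets later files
    # overwrite values set by earlier ones.
    loaded = parser.read(list(reversed(locations)))
    return parser, loaded
```

A value set in both a system-wide file and a user file ends up with the user’s setting, while anything the user leaves out falls back to the system file or to the packaged defaults.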

And that’s it. A flexible way of reading configuration data.

Mucking around with Emacs Lisp

I keep my bibliographical database using org-mode in Emacs. Each paper gets its own heading in the file, and then the properties data for the heading contains the data necessary to construct a BibTeX format file. Then I can keep notes in the org file without these appearing in the exported BibTeX file.

My latest plan is to try and write a script to update any entries in the database which started out as arXiv postings, and which have since been published, to include the journal details and the final DOI for the paper.

Well, to do this I need to learn some Emacs Lisp, which should be a fun challenge. Here’s an initial attempt at a function which parses the database file to pull out the IDs of all of the entries which have arXiv e-prints listed as the journal, and prints them to a temporary buffer.

(defun org-bibtex-get-arxiv-ids ()
  "Produce a list of entries which point to arxiv rather than a journal."
  (let (($headings nil))
    ;; Visit every heading in the buffer and collect the matching IDs.
    (org-map-entries
     (lambda ()
       (if (string-equal (org-bibtex-get "JOURNAL") "arXiv e-prints")
           (push (org-bibtex-get "CUSTOM_ID") $headings))))
    (with-output-to-temp-buffer "*Arxiv IDs*"
      (dolist (customid $headings)
        (print customid)))))