Automatic Indexing of LaTeX Documents

A couple of weeks ago I mentioned in a post that I was working on a Python script to automatically generate indexes for books written in the LaTeX typesetting system.  At the time I promised to post the script in “a couple of days”.  Predictably, weeks have passed, my little script has ballooned into a full-on open-source software project, and the code is now too long to post (or explain) in a single blog article.  If you’re interested, however, you can now download my alpha release from SourceForge.

The package includes two Python programs.  Indexmeister is a console utility that reads a file (in several formats, not just LaTeX) and suggests terms for indexing.  It uses three different methods to figure out which terms are important.  Imbrowse is a curses program that helps you interactively browse multi-file LaTeX books and quickly insert the right tags to generate an index.
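
To give a rough idea of what "inserting the right tags" means: LaTeX builds an index from \index{...} commands placed next to the words you want listed. The snippet below is only a minimal sketch of that tagging step, not code from Imbrowse; the function name tag_term and its behaviour are made up for illustration, and the real program works interactively rather than tagging every match blindly.

```python
import re


def tag_term(latex_source: str, term: str) -> str:
    """Return latex_source with \\index{term} appended after each
    whole-word, case-insensitive occurrence of term.

    Hypothetical helper for illustration only (not part of the package).
    """
    # The lookbehind skips matches that directly follow a backslash,
    # so command names such as \index itself are left alone.
    pattern = re.compile(r"(?<!\\)\b" + re.escape(term) + r"\b", re.IGNORECASE)
    # A function replacement avoids backslash-escape processing in the
    # replacement text.
    return pattern.sub(lambda m: m.group(0) + "\\index{" + term + "}", latex_source)


if __name__ == "__main__":
    sample = "Kerning adjusts spacing. Good kerning matters."
    print(tag_term(sample, "kerning"))
```

This naive version should be run once on untagged text, since a second pass would also match the term inside the tags it already added. For the tags to produce anything, the document also needs the usual makeidx setup (\makeindex in the preamble and \printindex where the index should appear).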

I made this video tutorial to show how the system works:

In the future I am thinking of adding a plug-in for LibreOffice, and possibly a graphical interface (probably using GTK bindings). Porting it to Windoze is not a priority, however.
