Automatic Indexing of LaTeX Documents

A couple of weeks ago I mentioned in a post that I was working on a Python script to automatically generate indexes for books written in the LaTeX typesetting system.  At the time I promised to post the script in “a couple of days”.  Predictably, weeks have passed, my little script has ballooned into a full-on open-source software project, and the code is now too long to post (or explain) in a single blog article.  If you’re interested, however, you can now download my alpha release from SourceForge.

The package includes two Python programs.  Indexmeister is a console utility which reads a file (in several formats, not just LaTeX) and suggests terms for indexing.  It uses three different methods to figure out which terms are important.  Imbrowse is a Curses program which helps you interactively browse multi-file LaTeX books and quickly insert the right tags to generate an index.
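
To give a rough idea of what a term suggester and tag inserter can look like, here is a minimal sketch.  It is not the code from Indexmeister or Imbrowse: the function names, the tiny stop-word list, and the choice of plain word frequency as the ranking method are all assumptions made for illustration.  The only LaTeX-specific fact it relies on is that index entries are marked with the standard \index{...} command.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tool would need a much larger one.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "it",
              "that", "for", "on", "with", "as", "are", "by", "too"}


def strip_latex(source):
    """Crudely remove comments, command names, and braces, leaving prose."""
    source = re.sub(r"(?<!\\)%.*", "", source)                     # comments
    source = re.sub(r"\\[A-Za-z]+\*?(\[[^\]]*\])?", " ", source)   # \commands[opts]
    return re.sub(r"[{}]", " ", source)                            # braces


def suggest_terms(source, top_n=5, min_length=4):
    """Rank candidate index terms by raw frequency (an assumed method)."""
    words = re.findall(r"[A-Za-z]+", strip_latex(source).lower())
    counts = Counter(w for w in words
                     if len(w) >= min_length and w not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]


def tag_first_occurrence(source, term):
    """Append \\index{term} after the first whole-word match of term."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    # A function replacement avoids backslash-escape processing in the tag.
    return pattern.sub(lambda m: m.group(0) + r"\index{" + term + "}",
                       source, count=1)


if __name__ == "__main__":
    sample = "Kerning matters. \\emph{Kerning} and leading. Good kerning helps.\n"
    for term in suggest_terms(sample, top_n=1):
        sample = tag_first_occurrence(sample, term)
    print(sample)
```

Raw frequency is about the simplest ranking you could pick, and it will happily suggest common but unimportant words, which is presumably why a real tool would combine several methods.  The tagger is equally naive: it does not check whether the first match sits inside a command argument or a math environment.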

I made this video tutorial to show how the system works:

In the future I am thinking of adding a plug-in for LibreOffice, and possibly a graphical interface (probably using GTK bindings). Porting it to Windoze is not a priority, however.

New Short Story Release

Last night my new contemporary fantasy short story went live on Amazon.  You can buy it right here:

The folks at Creative Minority were good enough to do a news blurb about it on their website.  Just to be clear, this story was not actually published by Creative Minority.  Anyway, I think their post sums things up fairly well:

NON-FICTION WRITER BRANCHES INTO FANTASY

Montrose, CA, May 19, 2015

Kevin A. Straight, best known for blogging about literature and history and for his monograph Freight Forwarding Cost Estimation: An Analogy Based Approach (2014), ventured into new territory, self-publishing The Phylactery, a contemporary fantasy short story.

“I’ve actually been writing fiction, including fantasy, since 4th grade,” explains Mr. Straight.  “I thought it was probably time some of it saw the light of day.”

When asked why he chose to self-publish instead of following a more traditional route, he replied, “I think soon most short fiction, especially genre fiction, is going to be self-published. There are relatively few magazines left that handle speculative fiction, and most of them are trying to position themselves in a more literary way. Not that there’s anything wrong with that, but it means that there aren’t really any good intermediaries for pulp sci-fi and fantasy. If you want to sell it, you’re better off selling directly to the readers. Ultimately, I think disintermediation will be a good thing, because readers will have better access and writers will get to keep more of the value from the product.”

The Phylactery is set in a slightly fictionalized Riverside, CA in the present day and follows the misadventures of an evil wizard trying to salvage an evil scheme in which everything seems to be going wrong. It is available world-wide in the Amazon Kindle Store.

Kevin A. Straight is currently writing a non-fiction book, 14/2: A History of Outside Scholarship and the Fourteenth Amendment, which will be published by Creative Minority Productions in 2017.

The Phylactery: A Short Story, Kevin A. Straight, 2015. cover image


Five Giveaways that a Book is Self-Published

“This looks really nice for a self-published book,” said my friend as she handed me a novel.  Looking down at it, I too knew immediately that it was self-published.  How was it so obvious to both of us?  The cover was professionally done with a well-composed photograph and good use of color.  The bar code was in the right place, and the page and cover stock were normal commercial grade.  But to people who knew what they were looking at, the book screamed “self-published”.  We talked about it, and came up with a list of the top five tip-offs that give away a self-published book.

The good news is that most readers are probably not sophisticated enough to pick up on these tells.  My friend is a librarian and I am a writer and sometime editor.  Between us we have handled many thousands of books.  Unfortunately, if you are a writer, we are exactly the sort of people you need to impress to get your book reviewed or have it added to the order sheet for a major library system.  If serious “book people” read your book you don’t want them to think “this is pretty good for a self-published book”, but just “this is a good book.”

The Five Giveaways

1. (Lack of) Editing.  I couldn’t finish three of the last five self-published books I read because the editing was so poor.  Being an editor myself, my hand kept jerking uncontrollably towards the cup where I keep my red pens.  It took me completely out of the plot and ruined the books for me.  Nearly all new writers underestimate the role that editors play in a finished book.  In fact, editing is at least as important as writing, and it’s hard to edit your own writing effectively.  If you truly can’t afford to hire an editor, then at least find a fellow writer and trade editing services with them.  Usually, however, hiring a real freelance editor is worth the investment.  If you go this route, be aware that a legitimate editor will be able to provide references from former clients and will have some sort of free trial plan (in which you send them a few pages to edit so you can evaluate their work).  You should also try to hire someone with experience in your genre.  Also, be aware that there are different types of editors.  Line editors tell you how to improve your plot, what you should cut out, and how you can improve your style.  Copy editors catch mistakes in grammar, punctuation, and usage (and no, your word processor’s grammar checker is no substitute).  Technical editors (also called technical consultants) will catch mistakes you make in facts and details.  For instance, if you are writing a book about the military and you were never in the military, you absolutely need a military person to read your manuscript and tell you where you screwed up.  Otherwise your readers will point it out, brutally, in your Amazon reviews.  Few people are good at all three kinds of editing, so you may need to find more than one editor.

2. “Typeset” in Microsoft Word.  Many printers these days require a “print ready” .pdf file.  Most self-publishing authors generate it by exporting from the same word processor they use for writing.  Unfortunately, word processors do a horrible job of spacing and justifying text.  True, you can often fix badly spaced lines by manually moving hyphens and adjusting kerning, but it’s easier just to use software that’s actually designed for the job.  There are plenty of good, relatively affordable desktop publishing packages available.  Or be like the real power users and typeset in LaTeX, a free typesetting system designed for creating publication-ready documents.  The learning curve in LaTeX is a bit steep; expect to spend two or three weeks doing tutorials before you can create anything useful.  I don’t know anyone who has invested the time and regretted it afterwards, however.

3.  Fonts.  Most writers are sophisticated enough to stay away from tacky and hard-to-read fonts.  In fact, most seem to be too conservative and opt for Times New Roman (TNR) for their body text.  TNR is a decent all-purpose font, but I would never use it for a self-published book.  First of all, it was designed and optimized for newspaper text, not book text.  It is narrower than most serif fonts, because newspaper columns are relatively narrow.  A font designed for books, such as Minion or Palatino, is more likely to give you an optimum number of characters per line for comfortable reading.  More important, however, is that TNR is the default font in most word processing programs.  As soon as I see it I think “word processor = amateur job”.  If you are interested in learning more about typography, you may want to download Peter Wilson’s free, incredibly detailed e-book.

4.  Cheap Binding.  Larger printers use a large, automated “perfect binding” machine which uses hot glue to attach book covers to the pages.  Hardbacks and higher-quality trade paperbacks also have each folded signature of pages stitched together before the gluing step, resulting in an extremely durable book.  In contrast, smaller print shops and most print-on-demand (POD) operations rely on a small desktop thermal binding machine to attach covers.  Unfortunately, bindings produced on the smaller machines have a reputation for being less durable and shedding pages.  Pros can usually look at the glue strip of a binding and tell which kind it is.  Public librarians in particular tend to avoid ordering books with cheap bindings, because they worry about pages coming out.  Ask ahead of time about the binding technology a printer uses.  If possible, try to examine another book from the same printer to make sure it’s a quality product.

5.  Bad Blurb.  The blurb or summary on the back of your book is one of your most critical pieces of marketing communication.  Imagine that your potential reader is about to get on an airplane and has only a few minutes to pick out a book.  Is she going to read a boring full-page blurb?  Is she going to read at all if the first sentence doesn’t grab her attention?  Actually, if your book is already being sold in an airport then you don’t need any advice from me, but it’s still useful to visualize the airport scenario.  The back covers of many self-published books are covered with text that reads like the review the author hopes someone will write.  No one is going to read that.  Write three or four good sentences that command attention and say what the book is actually about, and go with that.  My theory is that authors write such long blurbs because they are self-conscious about not having any review excerpts, and they try to fill white space.  They aren’t fooling anyone.

There are other giveaways which I could mention, but I think these are the five biggest red flags.  You’ve worked hard to write your book, maybe for years.  It is worth taking a little more time to attend to the details and publish a professional product that will make a good impression on the people who matter.