Sunday, May 12, 2013

Blogger to LaTeX book converter

Someone I know wanted to print a book from a Blogger blog. We tried the software from one book printing site, but it kept crashing. After some searching, I found a Perl script that converts a blog to Adobe InDesign format. But InDesign is expensive, so I modified the script to output LaTeX instead.

Here is my script.

It's no doubt buggy, and you will probably need to make some manual adjustments to the output. Post comments here if it doesn't work for you.

This may also work for non-Blogger blogs that use Atom XML. Let me know if it does.
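If you want to check whether an export from another platform is even a candidate, one quick test is to see whether it parses as an Atom feed. Here is a minimal Python sketch, separate from the Perl script. The function name count_atom_entries and the sample feed are made up for illustration; the only fixed piece is the standard Atom namespace.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace; a feed's root element should be <feed> in this namespace.
ATOM_NS = "http://www.w3.org/2005/Atom"


def count_atom_entries(xml_text):
    """Return the number of top-level Atom <entry> elements,
    or None if the root element is not an Atom <feed>."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{ATOM_NS}}}feed":
        return None
    return len(root.findall(f"{{{ATOM_NS}}}entry"))


# Tiny hypothetical feed standing in for a real blog export.
sample = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>"""

print(count_atom_entries(sample))
```

A result of None means the file is not Atom at all. A number only tells you the file has the right general shape, not that the converter will handle every element in it.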

Here are some quick instructions.

Download an XML backup of your blog. Then edit config.cfg to point to that file and set the conversion options. Edit header.tex to set the title, author, and similar information, and to customize formatting. Much of the formatting is done via macros, so you can customize a lot of it there.

After making sure you have all the needed Perl packages (see here for Perl and package information), run:

perl format_for_tex.pl

Then process output.tex (or whatever you specified as the output file in config.cfg) with a modern LaTeX distribution that fetches missing packages automatically.

You can edit which posts and comments are included by editing output.tex. A post begins with

\begin{blogpost}{1}{other stuff}

The {1} means the post is included. To exclude it, just change the {1} to {0}.

Similar things can be done with comments.
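If there are many posts to switch off, editing each flag by hand gets tedious. Here is a rough Python sketch that flips the flag for posts chosen by their position in the file. It assumes only the \begin{blogpost}{1}{...} form shown above; the function name exclude_posts and the sample text are hypothetical, so try it on a copy of output.tex first.

```python
import re

# Matches the start of a blogpost environment and captures the 0/1 include flag.
BLOGPOST_RE = re.compile(r"(\\begin\{blogpost\}\{)([01])(\})")


def exclude_posts(tex, excluded):
    """Set the include flag to 0 for posts whose 1-based position
    in the file is in `excluded`; leave all other flags unchanged."""
    counter = 0

    def repl(match):
        nonlocal counter
        counter += 1
        flag = "0" if counter in excluded else match.group(2)
        return match.group(1) + flag + match.group(3)

    return BLOGPOST_RE.sub(repl, tex)


# Hypothetical stand-in for the contents of output.tex.
sample = (
    "\\begin{blogpost}{1}{first}\nText\n\\end{blogpost}\n"
    "\\begin{blogpost}{1}{second}\nText\n\\end{blogpost}\n"
)

print(exclude_posts(sample, {2}))
```

To use it on a real file, read output.tex into a string, pass it through exclude_posts with the set of post positions to drop, and write the result back out.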
