
About this site
Published 2013-01-31
I (re)built Axoplasm in about one week in Django.
For the longest time — OK, about four years — this site was built with Drupal. Before that it was on Blogger. Before that it was a pile of PHP files, occasionally loosely connected to a MySQL database.
All the blog posts should (theoretically) be the same as on my old Drupal website. Most of the URLs are the same. None of the comments are here. Tags and a few other bloggy ephemera are “here,” loosely, but not well-wired.
How I did this
When I set about migrating my site, I gave myself one requirement: I wouldn’t edit any data by hand. I’m not a “programmer” and database migrations are kind of tricky for me, so I did this in a really goofy, roundabout way.
First I built an empty house
I created a Django model that contained what I considered the bare-minimum blog components. Title, date, text body … that was pretty much it.
I also created two views and two templates that would show:
- A detail page like this one.
- A big index of everything.
Then I dumped all my old stuff onto the street
My old Drupal website had all my bloggy content scattered across a bunch of MySQL tables. There wasn’t one big table called “blog” or some such, which meant tons of difficult (for me) joins. I maybe could have learned all that, but I’m impatient.
Moreover, Django has groovy database-agnostic data dump/load utilities. It dumps and loads easy-for-humans-to-read JSON, without regard to the database on the backend.
So I built a Drupal view that output certain data fields (like “title” and “date”) from all 547 blog entries in a really raw form: basically just HTML <div>s with identifying classes like “title” and “date”. I loaded that view in a browser window and saved the HTML to my desktop.
Then I packed everything up in nice JSON boxes
This was the trickiest part, and it was still pretty easy. In short: I wrote a Python script that walked through the HTML file (using BeautifulSoup), cleaned up some characters that JSON parsers hate, and spit the whole thing out in nice, clean JSON.
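A stripped-down sketch of that kind of script, assuming markup like the <div>s described above — the sample HTML, class names, and model label are illustrative, not the actual dump:

```python
import json

from bs4 import BeautifulSoup

# A stand-in for the saved Drupal view: one wrapper <div> per entry,
# with identifying classes on the fields (my assumed markup).
html = """
<div class="post">
  <div class="title">Hello, world</div>
  <div class="date">2004-06-01</div>
  <div class="body">First post &amp; so on.</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
fixtures = []
for pk, post in enumerate(soup.find_all("div", class_="post"), start=1):
    fixtures.append({
        "model": "blog.post",  # app_label.model_name, as loaddata expects
        "pk": pk,
        "fields": {
            # get_text() also decodes entities like &amp; that would
            # otherwise trip up a JSON consumer.
            "title": post.find("div", class_="title").get_text(strip=True),
            "date": post.find("div", class_="date").get_text(strip=True),
            "body": post.find("div", class_="body").get_text(strip=True),
        },
    })

print(json.dumps(fixtures, indent=2))
```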
Before I cracked open BeautifulSoup, I tried to do this with regular expressions. You know the old joke about regular expressions?
Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.
BeautifulSoup (and, OK, a few regular expressions) turned <div>s into nested JSON for me. So nice.
Finally I let Django move the JSON boxes into my new house
Using loaddata, natch. The only catch was fields that weren’t unique on my Drupal site but needed to be unique on my new one. I wrote a separate regex pass that walked back through the JSON and made those values unique.
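The post describes a regex pass for this; the same uniquifying step can be sketched more simply on the parsed fixtures — the slug field and fixture shape here are my assumptions, not the actual data:

```python
# Hypothetical fixtures with a duplicated slug, to illustrate the pass.
fixtures = [
    {"model": "blog.post", "pk": 1, "fields": {"slug": "hello"}},
    {"model": "blog.post", "pk": 2, "fields": {"slug": "hello"}},
    {"model": "blog.post", "pk": 3, "fields": {"slug": "goodbye"}},
]

# Walk back through the data and suffix any repeated value so every
# slug comes out unique.
seen = {}
for item in fixtures:
    slug = item["fields"]["slug"]
    if slug in seen:
        seen[slug] += 1
        item["fields"]["slug"] = f"{slug}-{seen[slug]}"
    else:
        seen[slug] = 1
```

After a pass like that, something like `python manage.py loaddata posts.json` pulls everything into the new database.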