Migrating a WordPress blog to GitHub Pages

WordPress has done a decent job of running my blog, but I’ve longed for something leaner for writing developer stories with code, as the rich HTML text editor tends to introduce too much complexity.

Initially I thought about writing a custom blog web app, but decided it would be better to consider existing platforms that can handle my requirements. Ultimately I decided to migrate my WordPress blog to GitHub Pages.

Here’s a bunch of things I like about GitHub Pages compared with my self-hosted WordPress site:

  • Rather than editing posts in a small scrolling view, I can use a desktop code editor like VS Code, which supports full-screen text editing and live preview. (I’m aware this might not be everyone’s cup of tea, but writing developer blog posts inside a developer tool feels right at home for me!)
  • Markdown is much easier for writing developer documentation and inline code snippets, but I can also inject HTML when needed.
  • GitHub acts as the server - I don’t need to provide my own hosting or database. Free hosting is nice!
  • GitHub has partnered with Let’s Encrypt so you get HTTPS for free - just check a box!
  • You can set up a custom domain for both an apex domain and the www subdomain, and GitHub Pages will handle the redirect.

    1. Add A records with your DNS provider for the apex domain (without www.):

      Domain              IP address
      YOUR_DOMAIN.com.    185.199.108.153
      YOUR_DOMAIN.com.    185.199.109.153
      YOUR_DOMAIN.com.    185.199.110.153
      YOUR_DOMAIN.com.    185.199.111.153

    2. Add one CNAME record for the www subdomain:

      Domain    Canonical name
      www       USERNAME.github.io.

    3. Set the GitHub Pages > Settings > Custom Domain field to www.YOUR_DOMAIN.com (this adds a CNAME file to the root directory of the GitHub Pages branch).
    4. You can then enable Enforce HTTPS in GitHub Pages > Settings.
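
As a quick sanity check on the DNS steps above, here is a minimal Ruby sketch that compares a domain's A records against the four addresses listed in step 1. The constant and method names (GITHUB_PAGES_IPS, missing_pages_ips, resolve_a_records) are my own for illustration, and the lookup assumes network access and that DNS changes have already propagated.

```ruby
require "resolv"

# The A-record targets listed in step 1 above.
GITHUB_PAGES_IPS = %w[
  185.199.108.153
  185.199.109.153
  185.199.110.153
  185.199.111.153
].freeze

# Return the expected addresses that are absent from the resolved list.
def missing_pages_ips(resolved_ips)
  GITHUB_PAGES_IPS - resolved_ips
end

# Look up the A records for a domain (needs network access).
def resolve_a_records(domain)
  Resolv::DNS.open do |dns|
    dns.getresources(domain, Resolv::DNS::Resource::IN::A)
       .map { |record| record.address.to_s }
  end
end

# Example usage (YOUR_DOMAIN.com is a placeholder):
#   missing = missing_pages_ips(resolve_a_records("YOUR_DOMAIN.com"))
#   puts missing.empty? ? "All A records found" : "Missing: #{missing.join(', ')}"
```

Keeping the comparison separate from the lookup means the check can be exercised without touching the network.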

However, there are some downsides:

  • Not as easy to set up as WordPress.
  • Don’t expect the migration of WordPress HTML to MD files to be perfect! In my case the conversion tool had problems preserving spacing in code blocks, and some character conversions may need fixing.
  • By default all updates will be public on GitHub, so you have to think about that if you need to support private posts.

How to migrate from WordPress to GitHub Pages (Jekyll)

GitHub Pages is powered by Jekyll. To import your WordPress XML archive and run Jekyll locally, you will need to install Ruby 2.1 or later.

  1. If you are migrating an existing blog, export your content from the WordPress admin control panel.

    The general form of the URL is as follows: https://YOUR-USER-NAME.wordpress.com/wp-admin/export.php

  2. Import the WordPress XML archive file as described in the Jekyll docs on importing from WordPress.

    Install the required Ruby gems:

    gem install jekyll-import
    gem install hpricot
    gem install open_uri_redirections
    

    Convert the WordPress XML archive to HTML files and download images to the ‘assets/images’ directory:

    ruby -r rubygems -e 'require "jekyll-import";
    JekyllImport::Importers::WordpressDotCom.run({
      "source" => "C:/Users/USERNAME/Downloads/REPLACE_USING_YOUR_FILE_NAME.wordpress.YYYY-MM-DD.xml",
      "no_fetch_images" => false,
      "assets_folder" => "assets/images"
    })'
    
  3. Convert the HTML files to Markdown files. You can try any number of tools to see what works best for you; I tried several, including the reverse_markdown Ruby gem and the html2text Python script. To batch process the files I created a WordPress HTML to MD gist that finds all *.html files in the ‘_posts’ directory and converts them to *.md files using the reverse_markdown gem.

    ‘wordpress-html-to-md.rb’ gist usage:

    gem install reverse_markdown
    
    ruby ./wordpress-html-to-md.rb "_posts"
    

    html2text usage:

    ./html2text.py C:/Users/USERNAME/git/blog/_posts/YYYY-MM-DD-filename.html
    

    NB. Don’t use “\” in the path, otherwise you will get a file-not-found error; use “/” instead.

  4. To show code syntax highlighting you will need to add some styles for Rouge (the GitHub Pages syntax highlighter). You can use Rougify to copy GitHub’s code syntax highlighting to a stylesheet.

    gem install rouge
    
    rougify style github > _sass/styles/_rouge.scss
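
The batch conversion in step 3 can be sketched roughly as follows. This is not the gist itself, just a minimal illustration that assumes each imported post starts with a YAML front matter block that should be left untouched. The method names (split_front_matter, convert_posts) are hypothetical, and the converter is passed in as a lambda so any HTML-to-Markdown tool can be plugged in.

```ruby
# Split a Jekyll post into its YAML front matter and the remaining body.
# Returns ["", text] when the file has no front matter block.
def split_front_matter(text)
  match = text.match(/\A(---[ \t]*\n.*?\n---[ \t]*\n)(.*)\z/m)
  match ? [match[1], match[2]] : ["", text]
end

# Convert every *.html file in dir to a *.md file next to it.
# converter is any object responding to call(html) and returning Markdown.
# Returns the list of Markdown paths written.
def convert_posts(dir, converter)
  Dir.glob(File.join(dir, "*.html")).sort.map do |html_path|
    front_matter, body = split_front_matter(File.read(html_path))
    md_path = html_path.sub(/\.html\z/, ".md")
    File.write(md_path, front_matter + converter.call(body))
    md_path
  end
end

# Example usage with the reverse_markdown gem (install it first):
#   require "reverse_markdown"
#   convert_posts("_posts", ->(html) { ReverseMarkdown.convert(html) })
```

The original *.html files are left in place, so you can compare the output of different converters before deleting them.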
    

After this you might decide to apply one of the built-in GitHub Pages themes, use a remote theme, or create your own theme. In my case I added the Foundation XY-Grid module for responsive grid layouts. One thing I would like to see in GitHub Pages is support for npm packages. Everyone seems to have their own way of building this out, and it would be nice just to provide a package.json file and let GitHub take care of the rest. One nice solution might be to roll out the node_modules dependencies as part of a remote theme, but at this early stage I prefer to keep it all together in one repo until I have proved that everything works over time.

Resources

You can find the source code of this website in my GitHub Pages blog repo. I’ve also included a list of references below which I found useful while creating this new GitHub Pages blog.

GitHub Pages settings

Jekyll blog

YML Config reference docs

Disqus

Disqus can be added to a Jekyll site to enable comments on blog posts.

Foundation

I added the XY-Grid SASS classes for responsive design layouts.

A Jekyll-generated JSON feed can be used as the search index for lunr.js.

Gulp

One gulp command deploys the production build to my gh-pages branch!