May 19, 2019 19:57:21

Migrating Off WordPress

by @valentino | 4028 words | 60🔥 | 358💌

Valentino Urbano

Notice: I migrated my blog off WordPress and onto Jekyll a few years ago, but I never wrote much about it. Many of the choices I made would be different if I were doing the same thing today (mainly which frameworks I would choose).


This is going to be a series on how I migrated my personal website off WordPress and what you need to do if you want to embark on the same journey.


I had been thinking about doing it for almost 5 years before actually going through with it. I had always been scared of totally breaking the website by migrating it off something as well established as WordPress, which I'd been using for years at that point, to something less "mainstream". My process looked something like this: I would write each post in Markdown in Byword (both on iOS and Mac OS X), convert it to HTML, and paste it manually into the WordPress text box. Doing all of this each time felt like too much friction. It wasn't broken, but it surely could have been improved.


Early on I decided that I would want to go for a static site. The content of the site would not change much and it surely wasn't dynamic. I could easily rebuild all the static pages when I published an article.



Static Website using Markdown - What are the options?


Since I've been interested in doing this for years, I've read about and tried locally many markdown static website generators.

This is a comprehensive list of everything I've tried:

- Jekyll - The standard that started the whole movement of static markdown websites.

- Octopress - Opinionated customization of Jekyll built for blogging.

- Second Crack - A hobby project of a developer I follow. It works well, but I wouldn't use it in production since it doesn't really have a community behind it.

- Hugo - Really fast builds, but lacks plugins and extensibility.

- Ghost - Great look out of the box, needs a VPS.

- Brunch

- Hexo

- Gatsby - Based on React. Great if you're already using React in other projects or if you'd like to learn it.

- Gitbook - Super simple to set up and use, but limited in extensibility and customization. Great for documentation, books, and tutorials.


I'm sure I missed many of them, but I feel like these ones are the most used and/or famous. Of all of these, I particularly liked Gatsby, Jekyll, and Ghost.

A few of these solutions (Ghost and Second Crack for example) need either a VPS to build the website or a manual deployment process where you build the website yourself and deploy the generated files to your server.

Nowadays you can use Netlify to automatically run a "cloud function" after each deploy to generate your website and publish it, so it's less of a hassle, but it's still not straightforward to set up. I've tried Netlify with both Jekyll and Gatsby and it works well.


Why I chose Jekyll


In the end, I chose Jekyll over Gatsby and Ghost for multiple reasons:


1. Community

It's the framework with the widest adoption, a history of consistent updates over a decade (it was launched in 2008), and the most activity on GitHub. Newer frameworks might overtake it in the future, and if that happens it might be worth migrating, but for now Jekyll is still the king.


2. Automatic integration with Github

GitHub Pages runs on Jekyll with almost no setup. I could simply host my website there and have all the goodies of GitHub included for free. Otherwise I would have been forced to use two different services instead of one: GitHub to host the code and Netlify to host the site. I tend to avoid that whenever possible.


3. Longevity

Its age might be seen as a negative by some people, and for bloated old software it is, but I don't think that is the case here. The framework is very much alive and used by most static websites, and the fact that GitHub officially supports it means many more people using it and contributing code and fixes. To be fair, it is unlikely that either of the other 2 frameworks I was considering (Ghost and Gatsby) is going to shut down any time soon, since they're both widely used.


Content


What would be a blog without content? Practically worthless.

During this restructuring I also realized that I didn't like having 2 separate websites, each with its own blog, when there could be just the one.

Disclaimer: I used to have one website for my personal blog, where I wrote everything that came to mind, and one website for my work, where I wrote everything about my development gigs.

I thought for a very long time about whether it was worth having two different blogs with their topics clearly separated, and at the time I decided it was. Google still penalizes websites that mix different kinds of content, but the content of my two blogs was not different enough for that to take effect, and I'm not in the business of SEO anyway (at least not excessively). I still think I had a solid point at the time, but now I feel like a simple `/patchnotes` page for each of my apps is enough. If I really need to say anything more, I can just write it on my normal blog, which will now cover both areas: me, my hobbies, my business, and whatever I feel is worth sharing.

As for including the business side in my personal blog: it's a part of me, of what I'm doing and who I am. I'm not trying to sell you anything; I'm just writing about what I'm doing. I'll always strive to find the balance in writing without feeling like I'm pressuring my readership into buying my products or that I'm only talking about them. I think a balance can be found, and that I've proved over the past years that I have found it. I know that this is a difficult area to tackle: I've unsubscribed from great blogs whose content I still enjoyed just because they were self-promoting way too much and I couldn't stand all the spam I was forced to read. I'm definitely aware of the problem and I will try my best to avoid it.


Jekyll


Jekyll is the most widely used static site generator. It is written in Ruby and it generates static pages from your Markdown files. You can fully control the styling and plugins of your website. For example, I added a script to show the number of words in an article and another one to show an estimate of the reading time.


I love this part of Jekyll's README:

Put simply, Jekyll gets out of your way and allows you to concentrate on what truly matters: your content.


Truth be told, most static generators are like that. They are customizable if needed, but they don't force you to customize anything: you can simply choose one of the many themes and get started.

To start we will install a local development environment on your Mac so that you can test your website design and content without actually publishing it. The neat thing is that once the website is ready you can keep using it to preview new articles, and see how they will really look, before syncing the changes to the server.


This article will focus on Mac OS X since it's the OS that I'm using. If you need to install Jekyll on a different OS, the official website has an installation guide for each one.


Install RVM and Ruby


Before trying to install any dev tools on a Mac it is a good idea to install the Xcode Command Line Tools from Apple. Open Terminal.app and run `sudo xcode-select --install`. It will ask for your Mac's password and install the developer tools.

The version of Ruby that comes installed with OSX is really outdated and it shouldn't be used — ever.

Open Terminal.app again and install RVM (a version manager for Ruby that lets you install and switch between different versions of Ruby with ease), and the latest version of Ruby:

"\curl -sSL https://get.rvm.io | bash -s stable --ruby"

After that it is good practice to update your gems (the Ruby 'libraries' you have installed). Run:

"gem update"

After all of that is done it is time to install Jekyll itself.


Install Jekyll


Open terminal again and run:

"gem install jekyll"

Additionally, install "bundler", we'll need it later:

"gem install bundler"

During the installation nokogiri might fail to install. If it does, try building it against the system libraries instead of its bundled ones. Before trying to install Jekyll again run:

"gem install nokogiri -- --use-system-libraries --with-xml2-include=/usr/include/libxml2 --with-xml2-lib=/usr/lib"

If it fails again try following this thread for troubleshooting steps.
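Before moving on, it can help to confirm that the tools installed so far are actually on your `PATH`. The following is a minimal sketch, not part of the original setup: `require_tools` is a hypothetical helper name, and the list of tools passed to it is an assumption based on the steps above.

```shell
# Hypothetical helper: report which of the given commands are available.
# Returns 0 if all of them are found, 1 if at least one is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Check the tools this guide relies on (prints one line per tool).
require_tools ruby gem jekyll bundle || echo "install the missing tools before continuing"
```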


Setting up your GitHub pages repository

Log in to GitHub and create a new public repository (unless you plan on forking a theme). Choose a name, ignoring the `Need inspiration? How about verbose-octo-waffle.` suggestion proposed by GitHub. I'm sure you can come up with a better name yourself.

Before doing anything else you need to think about which url you want to use:

Decide if your website is going to be at `http://USERNAME.github.io/` or at `http://USERNAME.github.io/REPOSITORY_NAME`. For now, I plan to have my website at `http://valeIT.github.io/`. Of course, if you plan to redirect to it from your main website URL, the choice is not really relevant. In my case http://www.valentinourbano.com will bring you there, so it's not really important where it is located.

Usually, `http://USERNAME.github.io/REPONAME` is used for single projects, while `http://USERNAME.github.io/` is used for personal sites not related to a specific project.

Create a new repository there, or just fork whatever theme you plan on using (I'm using the 'Mediator' theme at the moment, but kasper and HMFAYSAL OMEGA look great as well).

Please note that this article is mainly based on forking the Mediator theme, so that's the easiest way to follow the guide if you're new to Jekyll.



If you forked a theme go here to set up your theme. Otherwise continue with setting up a new repository:


New Repository

If you have a new repository, running `jekyll new` will automatically generate a base template for you to get started. Note that this will create a basic site (best suited for blogs); if you have different needs you should use a [theme](#forked-a-theme). For now leave everything as it is and [skip](#testing-the-website) the next section.

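- Open Terminal.app, navigate (`cd`) to your repository folder, and run `jekyll new .` (the command needs a target path; add `--force` if the folder already contains files such as a README).

- Run `jekyll serve` and open your browser of choice to http://localhost:4000 (Jekyll's default port).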


Forked a theme

If you've forked a theme instead, you should follow the theme's installation instructions. For example, using the mediator theme the process is as follows:

- Fork the repository from Github to your account

- Clone it to your local machine: `git clone https://github.com/USERNAME/mediator`

- Open Terminal.app and navigate (`cd`) to the folder: `cd mediator`

- From the Terminal run `bundle install`

- Run `jekyll serve` to build your website and serve it locally

- Open your browser of choice to http://localhost:4000 (Jekyll's default port)

Mediator is a medium-like theme and it's the theme that I'm currently using for my personal website. If that's not the theme for you, a good source is Jekyll Themes.


Jekyll File Structure

The structure of your new website is not super complicated. Everything that starts with an underscore "_" is your source that will generate the website. Once generated the site will live in the "_site" folder.

- _config.yml

This file stores the configuration of your website like the URL, the name, setup defaults and so on.

- _data

This folder includes files used to get data that will be accessible from anywhere else on the site. For example you could have a .yaml file with a list of your projects for a portfolio or a list of links.

- _drafts

The folder to store the drafts of your articles, which will not be published on the web until they get moved to the _posts folder.

- _includes

Files here can be included from anywhere else on the website so you usually put here your headers and footers, but also utilities.

- _layouts

Here are stored templates for your pages. You could have one template for your blog posts and another template for your pages.

- _posts

It includes all your published blog posts. Keep in mind that the filename should follow this convention: YEAR-MONTH-DAY-title.extension

- _sass

It includes all of your Sass stylesheets, which will be compiled into one CSS output file during the build.

- _site

Once built your site will be generated inside this folder.

- other folders

Any other folder will be copied to your website, so you could have a folder called "folder1" that maps to mysite.com/folder1 and so on.
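The `_posts` naming convention above (YEAR-MONTH-DAY-title.extension) can be sketched as a small shell helper. This is only an illustration: `post_filename` is a hypothetical name, the slug rules (lowercase, dashes instead of anything that is not a letter or digit) are an assumption, and the `.md` extension is just one of the extensions Jekyll accepts.

```shell
# Hypothetical helper: build a Jekyll post filename from a date and a title.
post_filename() {
  date_part="$1"   # expected as YYYY-MM-DD
  title="$2"
  # Lowercase the title, replace every run of characters that are not
  # a-z or 0-9 with a single dash, then trim dashes at both ends.
  slug=$(printf '%s' "$title" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed -e 's/^-//' -e 's/-$//')
  printf '%s-%s.md\n' "$date_part" "$slug"
}

post_filename "2019-05-19" "Migrating Off WordPress"
# prints 2019-05-19-migrating-off-wordpress.md
```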


Setting up your machine for SSH access to Github

If you've already set up your machine for SSH access, or you're used to doing it, feel free to skip this section.

First you will need to make sure that OpenSSH (which provides `ssh-keygen`) is installed. macOS ships with it by default; if it's missing you can install it using Homebrew: `brew install openssh`.

Generate a key

If you are on Windows you can read my guide on how to set up an SSH key on Windows.

Open up your terminal and generate the key by running (taking care to use your own email instead of the example one):

`ssh-keygen -t rsa -b 4096 -C "your_email@example.com"`

You can leave the default filename or choose a custom one. If you don't need a custom name, leave the default.

Choose a strong password to protect your key and save it in your password manager.
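Before running the `ssh-keygen` command above, it is worth checking whether this machine already has a key, so you don't overwrite it by accident. A minimal sketch, assuming the default `~/.ssh/id_rsa` location; `ssh_key_exists` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if both halves of the key pair exist.
ssh_key_exists() {
  key_path="${1:-$HOME/.ssh/id_rsa}"   # private key; the public key is "$key_path.pub"
  [ -f "$key_path" ] && [ -f "$key_path.pub" ]
}

if ssh_key_exists; then
  echo "An SSH key already exists at the default location, you can reuse it"
else
  echo "No key found at the default location, generate one with ssh-keygen"
fi
```

If you chose a custom filename, pass its path as the first argument.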


Add the key to the SSH agent

- Start the ssh-agent if it's not already running by typing `eval "$(ssh-agent -s)"` in your Terminal

- Edit `~/.ssh/config` (for example by running `nano ~/.ssh/config`) and set up the SSH agent to automatically load keys on startup on top of allowing it to save passwords to the keychain so you don't have to type it every time:

```
Host *
  AddKeysToAgent yes
  UseKeychain yes
  IdentityFile ~/.ssh/id_rsa
```

- Finally add the key to the agent by running `ssh-add -K ~/.ssh/id_rsa`


Add the key to Github

Now that you have generated a key and added it to the agent, you can log in to GitHub and add the key to your account.

- Log in to your GitHub account

- Go to "Settings" > "SSH and GPG keys"

- Open your terminal and print your public key by running `cat ~/.ssh/id_rsa.pub`, copy the output

- Go back to Github and click on "New SSH key"

- Paste your PUBLIC key (you can also give it a title to remember the name of the device)


Testing out the connection

- Go back to your Terminal and run `ssh -T git@github.com`. Type yes when it prompts you to continue (if you want to make sure that you're really connecting to GitHub, you can compare the RSA fingerprint displayed in your terminal against the ones published by GitHub).

- If you see "You've successfully authenticated, but GitHub does not provide shell access." you've set up everything correctly; otherwise check out GitHub's troubleshooting steps.

We are not going to set up or push to a repository yet. We've set up the key now so that we don't have to go through the lengthy process of setting up a key once everything is ready. Next up is actually setting up the website with Jekyll.


Testing the Website

Once everything is set up:

- If you have a file called `Gemfile`, run `bundle install` to install the dependencies, followed by `bundle exec jekyll serve`. Otherwise simply run `jekyll serve`.

- Go to http://localhost:4000 and poke around at the website.
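The Gemfile check from the first step can be scripted. This is a sketch under the assumption that you point it at your site's root folder; `jekyll_serve_command` is a hypothetical helper that only prints the command to run, so you can inspect it before starting anything.

```shell
# Hypothetical helper: print the serve command that fits the given site folder.
jekyll_serve_command() {
  site_dir="${1:-.}"
  if [ -f "$site_dir/Gemfile" ]; then
    # A Gemfile means the dependencies are managed by Bundler.
    echo "bundle exec jekyll serve"
  else
    echo "jekyll serve"
  fi
}

jekyll_serve_command .
```

To actually start the server from the site folder, run `bundle install` first if a Gemfile is present, then `$(jekyll_serve_command .)`.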


Initial Configuration

Open `_config.yml` and set up the basic information for your website, along with most of the configuration details:

NOTE: For the images you can check the average dimension they should be by going to `assets/images`

- **`gems:`** : Plugins in use (Jekyll 3.5 and later call this key `plugins:`)

- **`title`** : The title of your website

- **`description`** : The "motto" that will appear right under the title

- **`email`** : Your email

- **`logo`** : The image to be used as logo

- **`cover`** : The image to be used as cover image

- **`name`** : Your name or pseudonym

- **`author`** : Your name or pseudonym, or the name of the default writer for the site

- **`author_image`** : Your picture (small)

- **`paginate`** : The number of articles per page

- **`url`** : The URL of your website (root url)

- **`baseurl`** : The url for your jekyll site (usually empty `baseurl: ""`, fill it in if you want only a folder of your site to be managed by Jekyll)

- **`twitter_handle`** : Your twitter handle, starting with `@` (not the url)

- **`social`** : For each social link you want to add you'll need to add the following snippet:

```
- icon: twitter
  url: https://twitter.com/valentinourbano
  desc: Follow me on twitter
  share_url:
  share_title:
  share_link:
```

icon: The icon to use, go to [FontAwesome](https://fortawesome.github.io/Font-Awesome/icons/) to find all the names (for example `github` will become an icon with the GitHub logo).

url: The URL of your social profile

desc: The description that will appear while hovering

share_ : If you want to automatically prefill a tweet or a YouTube post with some information

Note that this can also be used with non-social links if you want. I've done something similar here using the apple icon to link to my apps:

```
- icon: apple
  url: /apps
  desc: Check out my apps
  share_url:
  share_title:
  share_link:
```

- **`permalink`** : Which kind of permalink you want for your articles. Default is `/year/month/day/title.html`, I changed it to `/title.html` using: `permalink: /:title.html`. You could also add a `/blog/title.html` if you prefer, doing so: `permalink: /blog/:title.html`


Plugins


**Note**: According to an answer to my Stack Overflow question and to answers from GitHub directly, plugins are not officially supported by GitHub Pages. Use them at your own risk.

**Update**: According to GitHub, **some** plugins are now officially supported.

While building the site I noticed that a few important features were missing from the theme. I always wanted to have an estimate of how long it's going to take to read an article and how many words it has.

While doing so I also [opened a pull request](https://github.com/gjtorikian/jekyll-time-to-read/pull/1) to have the first letter made uppercase, because it looks way better.

Plugins installed on top of the ones already included with the theme:

- jekyll-time-to-read

- jekyll-sitemap

To install them add them both to `_config.yml` and to the `Gemfile` (Example: `gem "jekyll-time-to-read"`). Open the Terminal and run `bundle install`.

To properly configure the `jekyll-time-to-read` plugin I added it after the author field on top of each article. I only want it for articles so I added it to `post.html` in the `_layouts` folder:

```
<h4 class="author-name" itemprop="author" itemscope itemtype="http://schema.org/Person">{{ site.author }}</h4>
on
<time datetime="{{ page.date | date: "%F %R" }}">{{ page.date | date_to_string }}</time>
- {{ page.content | number_of_words }} Words - {{ content | reading_time_as_i }}.
</div>
```
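To get a feel for the numbers the layout will show, you can approximate them from the command line. This is only a rough sketch: it counts whitespace-separated words in the raw file (front matter and markup included) and uses an assumed reading speed of 200 words per minute, an illustrative value that is not necessarily what `jekyll-time-to-read` uses. `reading_time_minutes` is a hypothetical helper name.

```shell
# Hypothetical helper: estimate the reading time of a file in whole minutes.
reading_time_minutes() {
  file="$1"
  wpm="${2:-200}"                        # assumed words per minute
  words=$(wc -w < "$file" | tr -d ' ')   # strip the padding some wc versions add
  echo $(( (words + wpm - 1) / wpm ))    # integer division, rounded up
}

# Demo on a throwaway file containing 450 words.
sample=$(mktemp)
awk 'BEGIN { for (i = 0; i < 450; i++) print "word" }' > "$sample"
reading_time_minutes "$sample"   # prints 3 (450 words at 200 wpm, rounded up)
rm -f "$sample"
```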


More About Your Site Structure


We've already gone through a basic introduction to the site structure and what each folder contains. Now we will go into more detail for each section:

- **`_config.yml`** : The main configuration file for your site.

- **`Gemfile`** : Includes all your plugins.

- **`_includes`** : The HTML files for your header and footer. Here you can add additional files to be included in any of your layouts. For example I added a file for my navigation menu so I don't have to copy it to every page each time.

- **`_layouts`** : The different layouts, the default ones are for pages, articles and a general one. You can freely add more or modify the existing ones.

- **`_posts`** : Here is where all your articles reside. Every post in this folder will be published; create a `_drafts` folder to store unpublished drafts. To be parsed, each post needs to start with:

```
---
layout: post
title: Title of the Article
date: 2019-05-16 18:19:03.000000000 +01:00
type: post
published: true
status: publish
author: Valentino Urbano
---

Content of the Post
```

(Note the `---` at the top and after the front matter)

- **`assets`** : All the images and downloadable content for your website need to be put here. They'll be accessible directly from `assets/fileName`

- **`css`** : All the css will go here

- **`_sass`** : All the sass will go here (just leave it be if you don't know what it is and don't want to use it for now). It will automatically be compiled down to css by Jekyll.

- **Other Folders** : You can add pages, for example `/about` by simply adding folders and putting `index.md / index.markdown / index.html` files in it. Note that the file needs to be named "index". Every page to be parsed needs to start with:

```
---
layout: page
title: Title of the page
---
```

- **`index.html`** : The homepage of your website

- **`_site`** : This folder is generated automatically, you can look at it, but don't modify anything (If you do it'll just get overwritten the next time you build)

- **`CNAME`** : If you're using a custom domain name you will need to include it here. The content of mine is: `www.valentinourbano.com`
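Putting the post conventions together (the `_posts` folder, the dated filename, and the front matter block), a new post can be scaffolded from the shell. A minimal sketch: `new_post` is a hypothetical helper, it writes only a subset of the front matter fields shown above, and it assumes the title contains no double quotes.

```shell
# Hypothetical helper: create a post file with minimal front matter.
# Arguments: posts folder, date (YYYY-MM-DD), filename slug, title.
new_post() {
  posts_dir="$1"
  post_date="$2"
  slug="$3"
  title="$4"
  mkdir -p "$posts_dir"
  post_path="$posts_dir/$post_date-$slug.md"
  cat > "$post_path" <<EOF
---
layout: post
title: "$title"
date: $post_date
published: true
---

Content of the Post
EOF
  # Print the path so the caller can open the file in an editor.
  printf '%s\n' "$post_path"
}

# Demo in a throwaway folder so no real site is touched.
demo_dir=$(mktemp -d)
new_post "$demo_dir/_posts" "2019-05-16" "title-of-the-article" "Title of the Article"
```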



Migrating

After much theory we've reached the exciting part of the journey: automatically migrating all of your posts from your old WordPress blog to Jekyll.

My previous blog was on WordPress and I needed to migrate all of my posts. The current instructions to migrate a `wordpress.org` blog (so the self-hosted one) are buggy, so here I'll be using the ones for a `wordpress.com` installation, which also work for the self-hosted version.

Go to your WordPress dashboard (for me, `http://www.valentinourbano.com/wp-admin/`). Go to `Tools` -> `Export` -> `Export All`. Doing that will download an .xml file to your computer containing all your articles. To automatically import it we can use the Jekyll Importer tool.

Open Terminal and install the Importer with `gem install jekyll-import`. After installing it run:

```
ruby -r rubygems -e 'require "jekyll-import";
    JekyllImport::Importers::WordpressDotCom.run({
      "source" => "FILENAME.xml",
      "no_fetch_images" => false,
      "assets_folder" => "assets"
    })'
```

If it complains about missing dependencies, run `gem install DEPENDENCY_NAME` for each missing dependency and try again. Once finished you should have a few folders: a `_posts` folder with as many `.html` files as you had articles, a `_pages` folder with all your pages, and finally an `_assets` folder with all your images.

You can copy those to your Jekyll site and overwrite the default skeleton folders in there and your site should work as it did before with all the posts.
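A quick sanity check after the import is to count the generated files and compare the number with the post count shown in your WordPress dashboard. A minimal sketch; `count_files` is a hypothetical helper name, and it assumes you run it from the folder where the importer created `_posts`.

```shell
# Hypothetical helper: count the files with a given extension directly inside a folder.
count_files() {
  dir="$1"
  ext="$2"
  find "$dir" -maxdepth 1 -type f -name "*.$ext" | wc -l | tr -d ' '
}

# Only report when the importer output is present in the current folder.
if [ -d _posts ]; then
  echo "imported posts: $(count_files _posts html)"
fi
```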


Converting the HTML files to Markdown


The manual process is illustrated below. I've built markdown.love to make all of this easier.

NOTE: Make a backup of your html files before trying this!!!
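One simple way to make that backup is a timestamped copy of the whole folder next to the original. A minimal sketch; `backup_posts` is a hypothetical helper name and the `_backup_` suffix is an arbitrary choice.

```shell
# Hypothetical helper: copy a folder to "<folder>_backup_<timestamp>" and print the new path.
backup_posts() {
  src="${1:-_posts}"
  dest="${src}_backup_$(date +%Y%m%d%H%M%S)"
  cp -R "$src" "$dest" && printf '%s\n' "$dest"
}

# Only run when a _posts folder exists in the current directory.
if [ -d _posts ]; then
  backup_posts _posts
fi
```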

Having all the files in HTML is not pretty though, so I tried to find a way to convert them all to Markdown. Most of the solutions I found did a terrible job at it and completely broke the formatting in most of the articles. The only decent one I found is written in JavaScript, so you will need to have `npm` installed.

Install NPM

Download `node.js` for your OS. After installing it, update `npm` to the latest version by running `sudo npm install npm -g`.

Install HTMLMD

The tool I found is called `htmlmd`; install it by running `npm -g install html-md`. I also wrote a very basic script to automatically convert all of the content of a folder from HTML to Markdown.

- Open Terminal and navigate to the desktop (this is assuming your `_posts` folder is on the desktop)

- Copy this script to your desktop and call the file `scanFolder.sh`


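    # Convert every .html post in the current folder to Markdown,
    # writing "name.md" next to each "name.html"
    for entry in *.html
    do
        htmlmd "$entry" > "${entry%.html}.md"
    done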


- In terminal navigate to the `_posts` folder: `cd _posts`

- Run the script: `bash "../scanFolder.sh"` or wherever you saved the script

- Wait until the script finishes (it will take a long time if you have more than a hundred posts)

- Delete all the .html files (just sort them by extension)

- Copy the folder back to your Jekyll installation and run the server

- Check if everything's ok, if it's not restore the html from the backup [And let me know if you find a better converter]
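The steps above can also be wrapped in a slightly more defensive function that never overwrites an existing `.md` file and cleans up after a failed conversion. This is a sketch, not the script I used: `convert_posts` is a hypothetical name, and it assumes, like the loop above, that `htmlmd` prints the converted Markdown to standard output. The converter command is a parameter so you can swap in a different tool.

```shell
# Hypothetical helper: convert every .html file in a folder to .md.
# Arguments: posts folder, converter command (defaults to htmlmd).
convert_posts() {
  posts_dir="$1"
  converter="${2:-htmlmd}"   # assumption: prints Markdown to stdout
  for html_file in "$posts_dir"/*.html; do
    [ -e "$html_file" ] || continue          # the folder has no .html files
    md_file="${html_file%.html}.md"
    if [ -e "$md_file" ]; then
      echo "skip: $md_file already exists" >&2
      continue
    fi
    if "$converter" "$html_file" > "$md_file"; then
      echo "converted: $html_file"
    else
      echo "failed: $html_file" >&2
      rm -f "$md_file"                       # don't leave a half-written file behind
    fi
  done
}
```

Run it as `convert_posts _posts` from the folder that contains `_posts`. The `.html` files are left in place, so you can compare the results before deleting them.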
