loading words...

Jun 28, 2019 15:27:38

Migrating Off WordPress

by @valentino | 284 words | 60🔥 | 358💌

Valentino Urbano

Current day streak: 60🔥
Total posts: 358💌
Total words: 170604 (682 pages 📄)

This is a continuation of my series on migrating my web site off WordPress and onto Jekyll

Cleaning up the mess left by WordPress

I've used the [WordpPress to Jekyll][11] exporter to convert the articles to Markdown. Once you export all of the markdown files you need might think that the work is done and you just need to drop them into your new Jekyll site and build it. Almost, but not quite.

Metadata

Every single header of every blog post was messed up by some tag that each WordPress plugin or WordPress itself decided to add for its own reason. This was so bad that even the 'author' field was scrambled and Jekyll couldn't properly parse it so it wouldn't show the author in any of the articles if you tried publishing them if it would even build.

A normal Jekyll header looks like this:

---
layout: post
title: Information overload
date: 2012-02-05 18:19:03.000000000 +01:00
type: post
published: true
status: publish
categories:
- Tech
author: Valentino Urbano
---

The headers from the WordPress export looked like this:

layout: post
title: Information overload
date: 2012-02-05 18:19:03.000000000 +01:00
type: post
published: true
status: publish
categories:
- Tech
 -tags: \[\]
 -meta:
 -\_wpcom\_is\_markdown: '1'
 -\_edit\_last: '1'
 -\_yoast\_wpseo\_focuskw: ''
 -\_yoast\_wpseo\_title: ''
 -\_yoast\_wpseo\_metadesc: ''
 -\_yoast\_wpseo\_meta-robots-noindex: '0'
 -\_yoast\_wpseo\_meta-robots-nofollow: '0'
 -\_yoast\_wpseo\_meta-robots-adv: none
 -\_yoast\_wpseo\_sitemap-include: '-'
 -\_yoast\_wpseo\_sitemap-prio: '-'
 -\_yoast\_wpseo\_canonical: ''
 -\_yoast\_wpseo\_redirect: ''
 -author:
 -login: Myshar
 -email: [email protected]
 -display\_name: Valentino Urbano
 -first\_name: ''
 -last\_name: ''
  ---

I made a really dumb and inefficient script to clean them all. I'm not going to share it for now, because it's tailored to my installation and it might do huge amount of damage to your posts if their structure is different from mine. Since things like installed plugins (and more) affect the structure of the export the chance of a mistake is quite high. I don't want that responsibility.

What I'm going to do is give you the structure of the script. It scans the file, and stops one line after categories (I wanted to keep only one category per post, so I'm ditching all the others). It replaces everything before --- with author: Valentino Urbano. Nothing more.

Fortunately all of the other tags did export correctly so they didn't need any tweaking.

contact: email - twitter / Terms / Privacy