Converting WordPress to Hugo

Continuing the discussion from Hugo 0.20 Released: Custom Output Formats!:

There is a WordPress-to-Hugo plugin, but it is basically a copy of the Jekyll Exporter, and many folks suggest using the latter to export your posts and then dropping the output into your Hugo site.

I was curious as to what that would output, so I ran it on my site and got ~1,200 Markdown text files. Spot-checking them, I realized that my site is all kinds of wonky!

The Jekyll exporter grabs metadata fields from the posts and loads them into the frontmatter. I know I’ve spun off and remerged my content over the years (going back to 2007 in WordPress), but looking at my frontmatter I can see why some basic features don’t work on my site now. I have GUIDs from other sites, loads of custom fields that show up for months at a time, and then some posts with just a title and date.

As the exporter only handles the text and no other assets, I am going to need to go through each of these posts and make sure my images are working as expected, but really I need to clean up the frontmatter. And that is just as well, because I’ve long wanted to actually reorganize and redirect a bunch of my content; over the years I’ve had too many posts that include an updated post link that sends folks four clicks to find the most current info, and I can now properly file those under docs or wherever.

I know that I will be converting at least one of my blogger clients to Hugo this summer, so if anyone has WP to Hugo specific questions, this thread is available (in addition to the Hugo forums). :slight_smile:

What is frontmatter? Is it the same thing as Middleman: Frontmatter?

Could you drop some example of your frontmatter being all over the place? I’m curious =)

In short, frontmatter is a block of text at the top of text files that various static site generators read for metadata. Stuff like title, publish date and tags. But also anything you want, which you can then tell your generator to do something with.

As the exporter grabs meta fields from WordPress, converting to text revealed values for my posts that were not apparent from the UI. Here are some examples that show the variety:

---
id: 786
title: White Light
date: 2007-11-02T02:06:46+00:00
layout: post
guid: http://cloud.interi.org/2007/11/02/white-light/
permalink: /2007/11/white-light/
categories:
  - journal
---

---
id: 55
title: 'A tough day…'
date: 2008-01-15T20:30:00+00:00
layout: post
guid: http://interi.org/2008/01/16/a-tough-day
permalink: /2008/01/a-tough-day/
categories:
  - journal
---

---
id: 4418
date: 2012-09-17T17:21:23+00:00
layout: post
guid: http://interi.solanin.net/?p=4418
permalink: /2012/09/4418/
categories:
  - journal
format: status
---

---
id: 6094
title: Violence in games
date: 2013-02-07T00:07:27+00:00
excerpt: Struggling with violence in games, and how I want to deal with it.
layout: post
guid: http://interi.org/?p=6094
permalink: /2013/02/violence-in-games/
categories:
  - Game Notes
  - journal
  - notes
---

---
id: 7532
date: 2015-06-07T08:41:35+00:00
layout: post
guid: http://interi.org/?p=7532
permalink: /2015/06/7532/
standard_seo_post_level_layout:
  - ""
standard_link_url_field:
  - ""
standard_seo_post_meta_description:
  - ""
categories:
  - journal
format: status
---

---
id: 7627
title: Discourse spam
date: 2016-06-07T10:06:11+00:00
author: maiki
excerpt: Discourse handles spam well, and follows through with all the actions one needs to block it.
layout: post
guid: https://interi.org/?p=7627
permalink: /2016/06/discourse-spam/
mf2_cite:
  - 'a:1:{s:6:"author";a:0:{}}'
prompt_no_email:
  - "1"
prompt_no_featured_image:
  - ""
prompt_excerpt_only:
  - ""
prompt_recipient_ids:
  - 'a:1:{i:0;i:2;}'
categories:
  - journal
---

---
id: 7837
date: 2017-01-09T03:50:31+00:00
author: maiki
layout: post
guid: https://interi.org/?p=7837
permalink: /2017/01/7837/
prompt_no_email:
  - "1"
prompt_no_featured_image:
  - ""
prompt_excerpt_only:
  - ""
prompt_recipient_ids:
  - 'a:2:{i:0;i:2;i:1;i:8;}'
categories:
  - Notable
---

The additional cruft in later posts is expected, because those fields come from plugins that I turned on for months or years at a time. Some of the older posts, though, are from sites that I merged in, and I was surprised to still see the GUID showing their old URLs. With the REST API being promoted, I wonder what those stale GUIDs will do to assumptions about rendering content in other ways; GUID is normally a strong value to key off, but for older sites like mine it won’t be.

And those were pulled over in the WordPress export file (WXR), so this isn’t a maiki-specific issue, like so many of my challenges in computing. :slight_smile:
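
For the frontmatter cleanup itself, here is a minimal Python sketch of one way to strip the plugin cruft from an exported file. It is an illustration only, not the process used in this thread: the names `DROP_KEYS` and `DROP_PREFIXES` are hypothetical, the choice of what to drop is an assumption based on the examples above, and the line-based parsing assumes the exporter's layout (top-level keys unindented, list items beneath them).

```python
# Hypothetical choices: which top-level frontmatter keys count as cruft.
DROP_KEYS = {"guid"}
DROP_PREFIXES = ("prompt_", "standard_", "mf2_")


def clean_front_matter(text):
    """Return `text` with unwanted top-level frontmatter keys removed.

    Assumes the file starts with a '---' line and that the frontmatter
    is closed by another '---' line, as in the exporter output above.
    """
    lines = text.split("\n")
    if not lines or lines[0] != "---":
        return text  # no frontmatter, nothing to do
    try:
        end = lines.index("---", 1)
    except ValueError:
        return text  # unterminated frontmatter, leave it alone

    kept = []
    dropping = False
    for line in lines[1:end]:
        # A new top-level key starts at column 0 and is not a list item.
        is_top_level = (
            bool(line) and not line[0].isspace() and not line.startswith("-")
        )
        if is_top_level:
            key = line.split(":", 1)[0].strip()
            dropping = key in DROP_KEYS or key.startswith(DROP_PREFIXES)
        # Indented lines inherit the decision made for their parent key.
        if not dropping:
            kept.append(line)

    return "\n".join(["---"] + kept + ["---"] + lines[end + 1:])
```

Running something like this over a copy of the export first, and diffing the result, seems safer than editing ~1,200 files in place.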

As my site will be in flux in the immediate future, I am going to detail my process here; I’ll do a post-mortem once it is up and running.

Okay, I have two goals right now, and they are butting heads, so I’ve got to figure it out:

  1. I am applying for more gigs/jobs, and need a portfolio to reflect my work.
  2. I am re-making my site.

I don’t want to wait to apply, so I need to make sure my refresh doesn’t get in the way.

The essential first step is to move my current site onto a sub-domain, so I can still operate on it. https://archive.interi.org seems like a good candidate. That way I can just rewrite the base URL, and everything will still be available. I do this all the time when migrating servers, so I’ve got that down.

So let’s think about portfolios… I could build something in Hugo, but that is taking me a while, and I don’t want to rush that. My thinking is putting up a one-page landing site that introduces me and maybe explains I am rebuilding my site, but pointing to another site for viewing my work. Something like https://portfolio.interi.org.

As for that actual site, I should probably focus less on how I want to ultimately have it look and function, and think of it as marketing, which it of course is. And that means using a WordPress theme I can load with content quickly. I can always redirect that sub-domain to the main portfolio, once I figure out how that will work.

And for the actual apex domain, I will look at one of the one-page Hugo themes so it isn’t naked. Again, this isn’t about me expressing myself and my thoughts, but just a marketing interval.

And if I get those done I should be able to relax while I apply and work on my site. :slight_smile:

After getting too bogged down by the “refresh” side of this, I’ve decided to just replace the current site with a one-pager so folks can get a hold of me:

https://allthe.codes/maiki/interi/issues/26

Once I get that out of the way I can continue to generate work, while not feeling like crafting my website is costing me work…

Also! After writing the last reply here, I decided that I am going to build a hybrid site system, using Hugo on the main domain, and using WordPress at https://app.interi.org. There are so many uses for WordPress that make sense for me:

  • Forms: contact, surveys, onboarding, support, etc. So many forms!
  • Search index: there are actually a few options for handling search in Hugo, but they either involve loading the entire site index for a client js library to parse, or a third-party index service. I guess I’d call this a first-party index service…
  • Data source: Mika has written about both serving and embedding data from WP into Hugo, and I already have some excellent uses for this.
  • Support site: for my client work it is useful to have an authenticated way to interact with resources specific to them.
  • Billing: I currently use Freshbooks, but in the past I’ve used WordPress for payments, and this remains an option.
  • Media library: while I dig Hugo for site generation, and I tend to use substantially more words than images (even at a thousand words per image ratio!), I don’t like the idea of keeping my images and videos in version control. Even with Git LFS, my assets are just too large for me to track them that way. Also, I am super happy with the post-processing hooks I use when uploading media in WP. For instance, I can smush images and rewrite URLs for my CDN, which makes it easier to build a Hugo shortcode generator into the theme, for easy media inclusion.

I can’t exactly recommend this course for most folks, but it certainly hits all my needs. :slight_smile:

I’ve made the base changes of moving the former WordPress site to https://last.interi.org, and have started publishing new content at https://interi.org.

I still have a bunch of issues, but I feel like I am much further along now.

Also, check out that CI deploy script! I will break that down eventually. It handles all the SSH stuff so I can build and copy the site over to the web server securely with rsync.

I still need to tweak rsync, to get it to move files without updating the timestamp of the destination files on each upload (I’d like unchanged files to not be updated).
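
One possible explanation, offered as an assumption rather than something confirmed in this thread: rsync's `-a` implies `-t`, which copies each source file's modification time to the destination. If the CI job regenerates every file on each build, every source file has a fresh timestamp, so rsync treats it as changed and updates the destination to match. Comparing by content with `--checksum` and leaving out `-t` should leave unchanged files alone. A sketch of building such a command in Python (the function name and the paths in the usage note are hypothetical):

```python
def build_rsync_command(source_dir, destination):
    """Build an rsync argument list that skips files whose contents match.

    Deliberately avoids -a, because -a implies -t (preserve times), which
    would push the freshly built files' timestamps to the server.
    """
    # Trailing slash: copy the directory's contents, not the directory itself.
    source = source_dir.rstrip("/") + "/"
    return [
        "rsync",
        "--recursive",  # descend into directories (implied by -a)
        "--links",      # keep symlinks as symlinks (implied by -a)
        "--compress",   # compress data in transit
        "--checksum",   # compare file contents instead of size and mtime
        "--delete",     # remove destination files that are gone from the build
        source,
        destination,
    ]
```

It could be run with something like `subprocess.run(build_rsync_command("public", "user@example.org:/srv/www/"), check=True)`. The trade-off is that `--checksum` reads every file on both ends, which is slower than the default quick check but fine for a small site.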

Hm, that’s strange. You’re using the -a flag, which as I understand it is ‘archive’ mode. It’s not supposed to alter destination timestamps.

Also, the lack of RSS makes me a little sad.

That said, I LOVE the crazy-small payload of your site. 2.4K?! Yessss.

https://allthe.codes/maiki/interi/issues/64

I need to add a redirect, but I am still deciding on how I want to add .htaccess to the server; on one hand it would be useful to keep it in version control, but I have this aversion to shipping server-specific config in my repo…

I am gonna give myself a break and just do that. It is silly to be uptight about that, and if nothing else that config should follow the site around.
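
If the redirects end up pointing old post URLs at the archived copy, the `.htaccess` lines could be generated from the `permalink` values in the exported frontmatter. This is a sketch under that assumption only; the thread does not say which redirect is needed, and the function name and default archive host are illustrative. It emits Apache mod_alias `Redirect` directives.

```python
def redirect_rules(permalinks, archive_base="https://last.interi.org"):
    """Return one mod_alias 'Redirect 301' line per old permalink.

    Assumes each permalink is a slug-style path beginning with '/',
    like the 'permalink' values in the exported frontmatter.
    """
    base = archive_base.rstrip("/")
    rules = []
    for path in permalinks:
        # Same path on the archive host, e.g. /2007/11/white-light/
        rules.append("Redirect 301 %s %s%s" % (path, base, path))
    return rules
```

Writing `"\n".join(redirect_rules(paths))` into a file kept in the repo would let the config follow the site around, as decided above.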

“Archiving the archive” goes into detail on how I added the ~1,200 posts to my Hugo site’s content without rendering or publishing them. Getting there! :slight_smile:

In a twist no one saw coming… I am going back to WordPress!

I still like Hugo, and I will be using it in future projects, specifically for brochure-type sites that don’t need a database waiting behind them to get hacked. But for my own blogging use, WordPress just has too many advantages:

  • Media library
  • Commenting/forms/search
  • Data-driven views that can do math in real-time
  • I just know how to customize it really well

I am super happy about this experiment, though! It has taught me two important lessons: first, I need a professional space where I can work towards my goals of making tech easier and training folks on how to be collaborative information workers, and second, I need a space to tinker and talk nonsense about games and Clover.

So I’ll be going back to an older plan and relaunching https://interi.org, while moving a bunch of my experimenting and blog posts to https://maiki.blog.
