Website migration, whether to a new version of your content management system or to an entirely new platform, is among the most stressful projects an enterprise can take on, and it will test every assumption the team holds and every analytics figure it relies on. Before starting down this path, there are two key questions you will want to answer:

  1. Does automating the process save me time?
  2. Does automating the process save me money?

Below is our guide to answering these questions. Of course, every scenario is different and requires evaluation and analysis before the final decisions are made.

Determining an Approach

When determining the approach for migrating content, several factors come into play in deciding which solution will work best.

How many pages are there? If your website is relatively simple and has a small number of pages (<500), it is likely going to be easier, faster, and cheaper to copy and paste your content from one website to another. The exact threshold is yours to determine, but we have found that migration scripts start to become the more efficient option for sites with more than 500 pages.

Importance of Site Crawls. Crawling the existing website is a critical step in determining the approach to take for a website migration. This will give you a quick summary of the volume of pages, images, and documents. Tools like Screaming Frog will allow you to export the list to a spreadsheet where you can create a more thorough inventory and conduct a ROT analysis (Redundant, Outdated, Trivial) on your content to determine how much of this content actually needs to be moved to the new platform.
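As a rough illustration of turning a crawl export into a volume summary, here is a minimal Python sketch. It assumes the export is a CSV with "Address" and "Content Type" columns; those column names, the sample rows, and the list of document MIME types are assumptions for illustration, so adjust them to match whatever your crawler actually produces.

```python
import csv
import io
from collections import Counter

# Assumed MIME type prefixes that count as "documents"; extend as needed.
DOCUMENT_TYPES = ("application/pdf", "application/msword", "application/vnd.")


def summarize_crawl(csv_text, type_col="Content Type"):
    """Count crawled URLs by broad category: pages, images, documents, other."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        content_type = (row.get(type_col) or "").strip().lower()
        if content_type.startswith("text/html"):
            counts["pages"] += 1
        elif content_type.startswith("image/"):
            counts["images"] += 1
        elif content_type.startswith(DOCUMENT_TYPES):
            counts["documents"] += 1
        else:
            counts["other"] += 1
    return dict(counts)


# Hypothetical export rows, for illustration only.
SAMPLE_EXPORT = """Address,Content Type
https://example.com/,text/html; charset=UTF-8
https://example.com/about,text/html; charset=UTF-8
https://example.com/logo.png,image/png
https://example.com/report.pdf,application/pdf
https://example.com/site.css,text/css
"""

print(summarize_crawl(SAMPLE_EXPORT))
```

The "pages" count is the number to compare against your own page threshold, and the "images" and "documents" counts hint at how much media handling the migration will need.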

Where do the current pages live? Is the content that you are migrating in a database somewhere or does it live in static HTML files on a server? Where the content lives is important because it limits the options available for importing and working with the content. It also determines the structure of the content.

Database Content. Content that is stored in a database is typically more structured and logically separated. This gives more flexibility when writing migration scripts because it is already in a consumable format. It is unlikely that this will be an “easy” 1:1 mapping from point A to point B, but the first step of getting the content into a consumable format is done.
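To show why structured content is easier to script, even when the mapping is not 1:1, here is a small Python sketch that reads rows from a source database and reshapes them into target fields. The table name, column names, and target field names are all hypothetical, and SQLite is used only because it ships with Python; your source is more likely MySQL or similar.

```python
import sqlite3

# Hypothetical mapping: target field -> (source column, cleanup function).
FIELD_MAP = {
    "title": ("headline", lambda v: (v or "").strip()),
    "body": ("body_html", lambda v: v or ""),
    "created": ("published_on", lambda v: (v or "")[:10]),  # keep the date part
}


def map_row(row):
    """Reshape one source row into the target field layout."""
    return {
        target: transform(row[source])
        for target, (source, transform) in FIELD_MAP.items()
    }


def read_source(conn):
    """Return every article in the (hypothetical) source table, mapped."""
    conn.row_factory = sqlite3.Row
    query = "SELECT headline, body_html, published_on FROM articles ORDER BY id"
    return [map_row(row) for row in conn.execute(query)]
```

Each entry in FIELD_MAP is one place where the source and target disagree, which makes the mapping easy to review with content owners before any script runs.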

Static HTML. Content that is in a static HTML format is going to be harder to work with. In these cases, you will likely need to use a web scraper tool to break apart your pages and get them into a consumable format for migration to the new platform. You will face numerous edge cases based on how each page is built, and the process may require a lot of trial and error to get right.
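As a sketch of what "breaking apart" a static page can look like, the Python example below pulls the title and paragraph text out of an HTML string using only the standard library. The choice of `<title>` and `<p>` as the elements worth keeping is an assumption for illustration; real pages will need rules tuned to their own markup, and a dedicated scraping library may be a better fit.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the page title and the text of each paragraph."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._current = None  # tag whose text is being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return  # closing an inline tag such as <b>; keep collecting
        text = " ".join("".join(self._buffer).split())  # collapse whitespace
        if tag == "title":
            self.title = text
        elif text:
            self.paragraphs.append(text)
        self._current = None


def extract_page(html):
    """Return a dict with the page title and a list of paragraph texts."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "body": parser.paragraphs}
```

Note that this version discards inline markup such as links and bold text, which is one of the edge cases you would need to decide how to handle.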

How well are the static pages structured/formatted? In any website migration, consistency is key to making the process effective. You will want to determine whether the pages follow consistent patterns so that you can create a repeatable process. If the pages have no uniform formatting or consistent markup, writing a migration script that works as a repeatable process will be more difficult and time-consuming.
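One way to get a quick read on consistency is to group pages by a crude layout "signature". The Python sketch below uses the set of element ids on each page as that signature, which is an assumption chosen for simplicity; class names or heading structure may be better signals on your site. The function names are illustrative only.

```python
from collections import defaultdict
from html.parser import HTMLParser


class _IdCollector(HTMLParser):
    """Record every tag#id pair found in a page."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(f"{tag}#{value}")


def layout_signature(html):
    """Return a sorted tuple of tag#id pairs as a rough layout fingerprint."""
    collector = _IdCollector()
    collector.feed(html)
    collector.close()
    return tuple(sorted(collector.ids))


def group_by_layout(pages):
    """Group page URLs by signature; `pages` maps URL -> HTML string."""
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[layout_signature(html)].append(url)
    return dict(groups)
```

A few large groups suggest a scripted migration can cover most pages with a handful of templates, while many small groups point toward more manual work.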

Does the content have references to media (images & files)? If your content contains references to media, this adds another step in the process. Your migration scripts will need to not only handle the migration of the assets but also alter the markup to replace the links/references to these assets.
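The markup-rewriting half of that step can be sketched in Python as below. It assumes you have already moved the assets and built a mapping from old URLs to new ones; the mapping, the list of media extensions, and the function name are hypothetical. A regular expression keeps the example short, though an HTML parser is the safer choice for messy markup.

```python
import re

# Assumed file extensions that identify media assets.
MEDIA_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".pdf", ".docx")

# Matches src="..." or href="..." (single or double quotes), but not data-src.
ATTR_RE = re.compile(
    r"""(?<![\w-])(?P<attr>src|href)=(?P<q>["'])(?P<url>.*?)(?P=q)""",
    re.IGNORECASE,
)


def rewrite_media_links(html, url_map):
    """Swap old asset URLs for new ones; report media links with no mapping."""
    missing = []

    def swap(match):
        old = match.group("url")
        new = url_map.get(old)
        if new is None:
            if old.lower().endswith(MEDIA_EXTENSIONS):
                missing.append(old)  # asset was referenced but never migrated
            return match.group(0)  # leave the attribute untouched
        quote = match.group("q")
        return f"{match.group('attr')}={quote}{new}{quote}"

    return ATTR_RE.sub(swap, html), missing
```

Returning the unmapped media links alongside the rewritten markup gives you a checklist of assets that still need to be migrated.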

What tools are available on the platform I am building the new website on? Most modern CMS platforms provide some level of migration support for getting content from point A to point B. A majority of the work that we do is in WordPress and Drupal, so below is a quick list of migration options for each platform.

WordPress Migration Options

  1. WordPress All Import Tool
  2. WordPress Import Tool (Blogger, BlogRoll, LiveJournal, RSS, Tumblr, WordPress)
  3. FG Drupal to WordPress
  4. HTML Import 2

Drupal Migration Options

  1. Drupal 8 Migrate
  2. Drupal 7 Migrate
  3. Feeds Module
  4. Content Import

Other Useful (Platform Agnostic) Tools to help with content cleanup:

  1. htmLawed
  2. PHP DOM Manipulation
  3. Site Sucker
  4. Example Python Web Scraping

Determining the Right Solution

Now that you have all of the information you need, you can answer the two questions:

  1. Does automating the process save me time?
  2. Does automating the process save me money?

You are now equipped to make an informed decision on the approach you should be taking. You have a good grasp of what your source content looks like, how easy it is going to be to work with, and which tools can give you a head start. Your next step is to crunch some numbers and get high-level estimates of the level of effort (LoE) for writing these migrations.
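To make the number crunching concrete, here is a minimal Python sketch that compares a manual copy/paste effort with a scripted one. Every input (minutes per page, script development hours, review time, hourly rate) is a placeholder you would replace with your own estimates, and the simple linear cost model is an assumption, not a rule.

```python
import math


def migration_estimate(pages, manual_minutes_per_page, script_dev_hours,
                       review_minutes_per_page, hourly_rate):
    """Compare manual and scripted migration effort under a linear model."""
    manual_hours = pages * manual_minutes_per_page / 60
    # Scripted work = one-time development plus a per-page review pass.
    automated_hours = script_dev_hours + pages * review_minutes_per_page / 60
    hours_saved = manual_hours - automated_hours

    # Page count at which the script pays for itself, if it ever does.
    per_page_gain = manual_minutes_per_page - review_minutes_per_page
    break_even_pages = (
        math.ceil(script_dev_hours * 60 / per_page_gain)
        if per_page_gain > 0 else None
    )
    return {
        "manual_hours": manual_hours,
        "automated_hours": automated_hours,
        "hours_saved": hours_saved,
        "money_saved": hours_saved * hourly_rate,
        "break_even_pages": break_even_pages,
    }


# Hypothetical inputs: 1,200 pages, 10 minutes each by hand, 80 hours to
# write scripts, 2 minutes of review per migrated page, $100 per hour.
print(migration_estimate(1200, 10, 80, 2, 100))
```

A positive "hours_saved" answers the time question and a positive "money_saved" answers the money question; if your page count sits below "break_even_pages", manual migration is likely the better choice under these assumptions.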

Trying to understand if website migration is the right approach for your organization? Bluetext can help.