- In migrations, URLs change. You need to redirect from old URLs to new.
- Don't just have a long list of one-to-one redirects, but define logical rules.
- Ideally your redirect engine also reports on redirects and 404s, to help improve redirects.
Why redirects during a migration?
Your site, before the migration, has a set of pages with URLs. Let's say you have these URLs:
The corresponding new URLs are:
Out of the box, the core CMS in the new system will only serve pages at the new URLs. Meanwhile, Google search results, other websites, and bookmarks still point to the old URLs. You need to redirect from the old URLs to the new ones so visitors don't get 404s.
A redirect is when the browser requests a URL but the web server tells the browser to go to another URL instead. This is different from a rewrite, where the server looks at the URL and, invisibly to the user and browser, returns the contents of another file.
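To make the distinction concrete, here is a minimal sketch in Python. It is not tied to any particular web server: the `Response` class, the `PAGES` dictionary, and the paths in it are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for files the server can read; not a real site layout.
PAGES = {"/articles/index.html": "<h1>Articles</h1>"}


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


def redirect(new_url: str) -> Response:
    """Tell the browser to request another URL; the address bar changes."""
    return Response(status=301, headers={"Location": new_url})


def rewrite(internal_path: str) -> Response:
    """Serve another file's contents; the browser never sees the new path."""
    return Response(status=200, body=PAGES[internal_path])
```

With a redirect the browser makes a second request to the `Location` URL, so search engines and bookmarks can learn the new address. With a rewrite there is only one request, and the old URL stays in place.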
Why do URLs change?
Although it's theoretically possible that none of the URLs will change during a migration, there are many reasons you can end up with changed URLs, including:
- There's a new URL strategy, for instance, one to improve SEO.
- Content is deleted.
- Content is reformatted (for instance from PDF to HTML).
- Content is restructured (what was on multiple pages is now on one or vice versa).
- The overall URL strategy stays the same but the components change (for instance, if your URL contains taxonomy terms then perhaps those change).
- The CMS has a default way of generating URLs that isn't worth the effort of changing.
Sometimes content is so unused and unimportant that we don't care if we have a redirect for the old URL. But usually it's better to redirect to something. For instance, it might make sense to redirect all old article URLs to the articles listing page.
Why a redirect ENGINE?
You could hard-code (for instance, in the web server configuration) all the one-to-one mappings of the old URLs to the new ones (https://davidhobbsconsulting.com/how-to-articles/rules-content-migration-panning-gold goes to https://davidhobbsconsulting.com/articles/rules-content-migration-panning-gold, etc.), but this has a lot of problems:
- This doesn't make any semantic sense (when, for example, all how-to-articles/* should go to articles/*).
- Relatedly, this is tedious (impossible?) to maintain.
- Redirects can't be managed from the CMS, where you manage the content.
- This provides no reporting (although you could do web server log analysis).
- The redirects are not in any kind of version control.
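As a rough illustration of a rule rather than a list of one-to-one mappings, the how-to-articles example above could be written as a single pattern. This is only a sketch: the `RULES` list and `apply_rules` function are hypothetical names, and a real engine would hold many rules, ideally in a file under version control.

```python
import re
from typing import Optional

# Each rule pairs a pattern for the old path with a template for the new path.
# The single rule here mirrors the how-to-articles/* to articles/* example.
RULES = [
    (re.compile(r"^/how-to-articles/(?P<slug>[^/]+)/?$"), r"/articles/\g<slug>"),
]


def apply_rules(old_path: str) -> Optional[str]:
    """Return the new path for an old path, or None if no rule matches."""
    for pattern, template in RULES:
        match = pattern.match(old_path)
        if match:
            return match.expand(template)
    return None
```

One rule now covers every article under the old section, so a newly discovered old URL in that section needs no extra configuration.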
A strong redirect engine
- Rules based: not only one-to-one mappings.
- Makes semantic sense.
- Provides reporting, preferably differentiating between robot and user traffic, and ranked by frequency.
- Redirects can be version controlled.
- Can handle all domains being migrated into the new CMS.
- Some components can be handled by content editors in the CMS.
- Handles different types of redirects: URL structure changes, title changes, etc.
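The reporting item in this list could look something like the following sketch, which counts hits per path, separates robot from user traffic, and ranks by frequency. The `RedirectReport` class is hypothetical, and the bot check is a deliberately crude assumption; a real engine would rely on a maintained list of crawler user agents.

```python
from collections import Counter

# Crude, assumed bot detection based on common substrings in user agents.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_robot(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


class RedirectReport:
    """Counts responses per (status, path), split into robot and user traffic."""

    def __init__(self) -> None:
        self.counts = {"robot": Counter(), "user": Counter()}

    def record(self, path: str, status: int, user_agent: str) -> None:
        audience = "robot" if is_robot(user_agent) else "user"
        self.counts[audience][(status, path)] += 1

    def top(self, audience: str, status: int, n: int = 10) -> list:
        """Return up to n (path, hits) pairs for one status, most frequent first."""
        rows = [
            (path, hits)
            for (row_status, path), hits in self.counts[audience].items()
            if row_status == status
        ]
        return sorted(rows, key=lambda row: (-row[1], row[0]))[:n]
```

Calling `top("user", 404)` after launch would list the missing pages that real visitors hit most often, which are usually the first redirects worth adding.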
The lifecycle of a redirect
Before starting on the lifecycle of a redirect, consider that there is a primary domain for your new website (for instance, davidhobbsconsulting.com). You may be merging other domains into your new website (for instance, merging go.davidhobbsconsulting.com into davidhobbsconsulting.com).
In general, as I often write, I am against one-offs, especially those done solely to get around current technical limitations (rather than resolving the underlying issues). But if you do a one-off, make sure you own the domain for the URLs (rather than using the domain of another service that you use). For instance, in a moment of weakness I used another service to create landing pages, but I made sure to use go.davidhobbsconsulting.com for those pages so that I would have control over redirects in the future.
There are three layers to a redirect engine (with only one domain, omit the Redirect Server layer):
- Redirect Server. This is a server that responds to all the other domains, and redirects to your URLs on your new domain (for instance, http://go.davidhobbsconsulting.com/microsite-checklist/ to https://davidhobbsconsulting.com/resources/microsite-checklist).
- Core CMS. This is the CMS doing what it naturally does: responding to URLs and serving content. Notably, it doesn't know anything about the old URLs (note: some CMSs support a limited set of old-URL-to-new-URL redirects, in which case the CMS would do some of the redirecting).
- Bail out redirector. This is the core of the redirect engine. It could be in front of the CMS (in which case it really isn't a "bail out" redirector), but logically it is the redirector that knows how to change old URLs (on the primary domain) to new URLs.
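The three layers above can be sketched as a single resolution function. Everything here is illustrative: the `/resources` prefix for the merged domain is inferred from the one example given, `CMS_PATHS` is a hypothetical stand-in for whatever the CMS actually serves, and the bail-out rule is the how-to-articles example from earlier.

```python
from typing import Optional, Tuple

PRIMARY = "davidhobbsconsulting.com"

# Layer 1 data (assumed): where each merged domain's pages land on the primary.
MERGED_DOMAIN_PREFIXES = {"go.davidhobbsconsulting.com": "/resources"}

# Layer 2 data (assumed): paths the new CMS serves natively.
CMS_PATHS = {
    "/resources/microsite-checklist",
    "/articles/rules-content-migration-panning-gold",
}


def bail_out(path: str) -> Optional[str]:
    """Layer 3: map an old path on the primary domain to a new path."""
    prefix = "/how-to-articles/"
    if path.startswith(prefix):
        return "/articles/" + path[len(prefix):]
    return None


def resolve(host: str, path: str) -> Tuple[int, str]:
    """Return (status, target) for a request, trying each layer in turn."""
    path = path.rstrip("/") or "/"
    # Layer 1: the redirect server answers for every non-primary domain.
    if host != PRIMARY:
        prefix = MERGED_DOMAIN_PREFIXES.get(host)
        if prefix is None:
            return 404, ""
        return 301, f"https://{PRIMARY}{prefix}{path}"
    # Layer 2: the core CMS serves any URL it knows.
    if path in CMS_PATHS:
        return 200, path
    # Layer 3: the bail-out redirector handles what the CMS could not.
    target = bail_out(path)
    if target is not None:
        return 301, f"https://{PRIMARY}{target}"
    return 404, ""
```

Under these assumptions, `resolve("go.davidhobbsconsulting.com", "/microsite-checklist/")` returns a 301 to https://davidhobbsconsulting.com/resources/microsite-checklist, and anything that falls through all three layers is a 404 worth reporting on.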
How to decide what to redirect
There are primarily two ways to decide what to redirect:
- Your content manifest. Ideally, before migration, you have a complete manifest of all the content and the decision of how it will be treated during the migration. This can be used to define redirects.
- Live reporting. Especially at launch, it is extremely useful to have a dashboard that surfaces 404s. This can be done in Google Analytics or from the redirect engine itself. Either way, you can spot live problems and correct them as they appear.
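As a sketch of the manifest approach, suppose the manifest is a CSV with one row per old page. The column names (`old_path`, `decision`, `new_path`), the sample rows, and the choice of `/articles` as the fallback for deleted content are all assumptions made for illustration, not a prescribed format.

```python
import csv
import io

# Hypothetical manifest; column names and rows are illustrative only.
MANIFEST_CSV = """old_path,decision,new_path
/how-to-articles/rules-content-migration-panning-gold,move,/articles/rules-content-migration-panning-gold
/old-press-release,delete,
/about,keep,/about
"""

FALLBACK = "/articles"  # assumed landing page for deleted content


def redirects_from_manifest(text: str, fallback: str = FALLBACK) -> dict:
    """Build an old-path to new-path mapping from the manifest decisions."""
    redirects = {}
    for row in csv.DictReader(io.StringIO(text)):
        old, new = row["old_path"], row["new_path"]
        if row["decision"] == "delete":
            # Deleted content still redirects somewhere useful.
            redirects[old] = fallback
        elif new and new != old:
            redirects[old] = new
    return redirects
```

Pages whose URL does not change produce no entry, so the output contains only the redirects that are actually needed. The result can then be reviewed for patterns that are better expressed as rules.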
In general, you'll want to do 301 (permanent) redirects. An exception might be when you are truly temporarily redirecting as a brief step in a migration; in that case, use a 302 (temporary) redirect.
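In code, that choice comes down to a single status value. The helper below is a hypothetical sketch that uses the status constants from Python's standard library.

```python
from http import HTTPStatus


def redirect_status(truly_temporary: bool = False) -> HTTPStatus:
    """Default to 301 (permanent); use 302 only for a brief, temporary step."""
    return HTTPStatus.FOUND if truly_temporary else HTTPStatus.MOVED_PERMANENTLY
```

Making the permanent redirect the default means a temporary one has to be requested explicitly, which matches how rarely it should be needed.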