Ditch the spreadsheet for content analysis.

Four Things You're Doing Wrong In Migration and Content Transformation Planning

Key Points

  • Don't do line-by-line inventory analysis for all content.
  • Don't think about migration / content transformation late.
  • Don't approach from a single approach or discipline.
  • Don't conflate decision and action.
Related resource
Dispositions Cheat Sheet | Use this sheet on your projects.

Most content migrations and content transformations are planned and executed poorly. This leads to:

  • "Surprises" that should have been anticipated.

  • Inefficient and wasteful execution.

  • Inflexible approaches that are less able to respond to truly-unforeseen issues.

  • Work spent on what's expedient rather than what's important. 

  • Launching with poor or missing quality.

  • Missing opportunities to improve content.

  • Not leveraging the migration toward longer-term quality and approaches.

Note: this advice is not for small sites.

Many of the things I highlight as wrong below may be just fine for small sites. But they are wrong for medium to large websites.

I'd like to highlight four pervasive migration issues — you're doing it wrong if you're doing these:

  1. Line-by-line inventory analysis for all content

  2. Thinking about migration / content transformation late

  3. Approaching from a single approach or discipline

  4. Conflating decision and action

Don't do this for all your content, unless you have a small site.

1. WRONG → Line-by-line inventory analysis for all content

This is a typical "migration planning" approach:

  1. Run Screaming Frog or some other spider to generate a spreadsheet of URLs along with other data.

  2. Maybe merge in some data points (like Google Analytics).

  3. Pass off the spreadsheet to stakeholders to review the spreadsheet to decide what to do with the content.

This is extremely common and can feel like progress: those working on the spreadsheet can really feel like they have put in some sweat toward the migration effort. Also, between teams, it's a straightforward way for the team doing the technical work to shift "blame" onto the folks making the decisions if problems come up later.

But this approach is backwards for a ton of reasons, including: 

  • There's no way to revise any decisions that are made! If someone's gone through a thousand lines of a spreadsheet for their section of the site, the last thing they're going to want to do is do that again when they decide to make decisions in a different manner.

  • It is extremely poor for consistency across an enterprise or large organization. One person may be evaluating one way, and someone else in an entirely different manner.

  • It confuses business goals and the concerns of content owners. Sometimes of course these will be aligned, but especially if you are making large-scale and dramatic changes, the various content owners may not yet be in sync with the overall business goals.

  • Inevitably it's tough to keep changes (especially between teams) in sync.

  • Related, it's difficult to see the big picture.

  • It's extremely slow.

  • It's painful.

Instead, take a big picture view (still making decisions about all content)

Very few people are going to understand a website by staring at a spreadsheet. Sure, throwing a huge Excel file at someone may be impressive and show them it's a large site, but even that is probably misleading (since it may contain many URLs that are essentially different addresses for the same content).

We need to take a big-picture view of our site(s) so that all stakeholders can understand the situation and make informed decisions. Also, we should make decisions in a way that's good for the organization overall.

Instead of line-by-line inventory analysis: 

  • Use charts to explore content.

  • Use rules to make decisions about content.

Good practice → Use charts to explore content

The most straightforward chart is one of the most useful: a bar chart showing the relative size (by page count) of website folders. This is something you can do in Excel or other tools.

Bar chart from Content Chimera, showing the relative sizes of website folders

The above example took out redundant URLs for the same content, changed the scales to show useful numbers, exposed the actual numbers above the small bars, and represented the long tail in an "other" section for better summarization, but even a straightforward Excel chart is far better than dumping a raw spreadsheet on someone. 
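If you want a first folder chart without a dedicated tool, the counting behind it is small. Below is a minimal Python sketch, not how Content Chimera does it. The `normalize` and `folder_sizes` names, the de-duplication rules (ignore scheme, query string, fragment, and trailing slash), and the `top_n` cutoff are all assumptions for illustration, and it expects absolute URLs such as those in a crawler export.

```python
from collections import Counter
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Collapse trivially different URLs for the same page (assumed rules)."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    # Assumption: scheme, query string, and fragment do not identify distinct content.
    return parts.netloc.lower() + path


def folder_sizes(urls, top_n=3):
    """Count unique pages per top-level folder, folding the long tail into 'other'."""
    unique = {normalize(u) for u in urls}
    counts = Counter()
    for page in unique:
        path = page.split("/", 1)[1]
        segments = [s for s in path.split("/") if s]
        # Pages that sit directly under the site root share one bucket.
        folder = "/" + segments[0] if len(segments) > 1 else "(root)"
        counts[folder] += 1
    top = counts.most_common(top_n)
    other = sum(counts.values()) - sum(n for _, n in top)
    return top + ([("other", other)] if other else [])
```

Feed the resulting (folder, count) pairs to whatever charting tool you prefer. The point is that the summarization happens before anyone has to read rows.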

Ideally you go further than static charts to charts that you can quickly explore: changing what is displayed, highlighting items, and drilling down into or sampling the underlying inventory (so bridging the detailed world and the overall patterns). In the example below, we see how hovering over legend items changes what's highlighted in a scatter chart showing pageviews vs. content count per content type:

Example scatter chart in Content Chimera, showing the relative effectiveness of content
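The numbers behind a scatter chart like this are one point per content type: how many pages there are and how many pageviews they received. A minimal sketch of that aggregation follows, assuming each inventory row is a dict with hypothetical `content_type` and `pageviews` fields (your export will likely name these differently).

```python
from collections import defaultdict


def summarize_by_type(pages):
    """Return {content_type: (page_count, total_pageviews)}, one scatter point per type."""
    totals = defaultdict(lambda: [0, 0])
    for page in pages:
        entry = totals[page["content_type"]]
        entry[0] += 1                   # x axis: how much content of this type exists
        entry[1] += page["pageviews"]   # y axis: how much that content is used
    return {ctype: (count, views) for ctype, (count, views) in totals.items()}
```

A type with many pages and few pageviews stands out immediately in this view, which is hard to see in a row-per-URL spreadsheet.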

One problem with Excel or Google Sheets specifically is that the data ends up getting divorced from the chart. Business Intelligence tools like Microsoft Power BI, Zoho Analytics, and Tableau provide the ability to more dynamically explore data (such as with drill-downs).


Good practice → Use rules to make decisions about content

When it comes time to make decisions about content, line-by-line analysis is not effective. Instead we should use rules to make decisions. Note that some of the rules may simply amount to "review this section of pages line-by-line," but we aren't reverting to saying that all the content requires this level of analysis. 

In the example below we have defined conditions (a page is in the articles folder and has HTML tables) and what to do with that content (for example, rewrite that content): 

Example simple rules in Content Chimera, using specific conditions to assign treatment

Using and applying rules is tough to do in normal spreadsheets (although for example I have hacked together rules in Google Sheets custom scripts), but to start you could use filters in Excel. 
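To make the idea concrete outside of any particular tool, here is a minimal Python sketch of rules as data. It is an illustration under assumptions, not Content Chimera's rule engine: the `Rule` class, the `folder` and `has_tables` fields, the `/press` rule, the "first matching rule wins" ordering, and the `undecided` default are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # takes one inventory row, returns True on match
    disposition: str                   # what to do with matching content


# Assumption: rules are checked in order and the first match wins.
RULES = [
    Rule("articles with HTML tables",
         lambda p: p["folder"] == "/articles" and p["has_tables"],
         "rewrite"),
    # A rule can simply route a section to manual review.
    Rule("press section",
         lambda p: p["folder"] == "/press",
         "review line-by-line"),
]

DEFAULT = "undecided"


def assign(page, rules=RULES):
    """Return the disposition of the first rule whose condition matches the page."""
    for rule in rules:
        if rule.condition(page):
            return rule.disposition
    return DEFAULT
```

Because the decisions live in the rules rather than in hand-edited rows, changing your mind means editing a rule and re-running `assign` over the inventory, not revisiting a thousand lines.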

Getting back to charting, you can then chart what's going to happen to the different content: 

Sankey diagram in Content Chimera, showing what treatment content is going to get.
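A Sankey diagram of this kind is drawn from a list of flows: where content comes from, which treatment it gets, and how many pages follow that path. A minimal sketch of computing those flows follows, assuming each row already carries hypothetical `folder` and `disposition` fields.

```python
from collections import Counter


def sankey_links(decided_pages):
    """Return sorted (source_folder, disposition, page_count) triples."""
    counts = Counter((p["folder"], p["disposition"]) for p in decided_pages)
    return sorted((src, dst, n) for (src, dst), n in counts.items())
```

Most charting libraries that support Sankey diagrams accept source, target, and value lists along these lines, though the exact input format varies by tool.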

Also see Rules for Content Migration and Large Scale Improvements.

Some common misunderstandings about this approach

These are some things I am NOT saying: 

  • Never look at a spreadsheet. I've probably spent far more time in spreadsheets/tables than most. Sometimes you need to look at the details. For some key content you may need to do line-by-line analysis. When actually executing on content it may make sense to walk through a spreadsheet as work progresses. BUT the spreadsheet should only be used when really necessary (which isn't that often!).

  • Everything can be automated. Certainly some manual evaluation is sometimes needed. BUT for a lot of content on a site this is probably not needed. Also, sometimes you can evaluate a sample of a chunk of content to make a decision about the content overall. 

  • Take away content owner autonomy. The content owner still owns the content. But they need to be engaged with overall goals for the organization, and the decisions need to be aligned with that. Instead of arguing about specific pieces of content, the content owners can work to help form the global organization's goals and ensure their content meets that. 

  • Moving to another model will be easy. Everyone is quite comfortable with the process of scraping content and then working in a spreadsheet. Moving to the model I suggest above will take some learning and change.
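On the point above about evaluating a sample of a chunk of content rather than every page: pulling a reproducible sample is straightforward. This is a minimal sketch, where the `sample_pages` name, the sample size `k`, and the fixed `seed` are assumptions for illustration.

```python
import random


def sample_pages(urls, k=20, seed=0):
    """Return up to k unique URLs, chosen reproducibly for manual review."""
    urls = sorted(set(urls))
    if len(urls) <= k:
        return urls
    # A fixed seed means everyone reviewing the section sees the same sample.
    return sorted(random.Random(seed).sample(urls, k))
```

Run it per folder or per content type, review the sample, and then write the decision as a rule for the whole chunk.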

2. WRONG → Thinking about migration / content transformation late

With respect to content transformation, the approach for developing websites is generally in two steps: 

  1. Do the work of strategizing, designing, and building the infrastructure of the site.

  2. Flesh out the site with content and launch.

This is especially true of multi-site rollouts (see Global Rollout and Migration and The "Build It And They Will Come" Fallacy).

Notably, step 1 does not usually include much on migration. For agencies to reduce their risk, they may cap hours for migration, shifting the risk to the organization that owns the site(s). 

Putting off migration / content transformation planning until late in the process has a couple of significant issues:

  • Radically increases risk to the project, both the probability that a problem occurs and how bad it would be if it did (if migration thinking doesn't start until the end, then any problem is very likely to impact launch).

  • Overlooks opportunities for optimization.

Good practice → Start thinking about migration / content transformation at project inception

Migration planning is a two-way street: 

  • In one direction (the one that people usually concentrate on), we need to figure out how to get content from where it is now to the technical infrastructure that's been set up for the new site.

  • But there's another direction: we need to figure out how to set up the infrastructure so that it accommodates what we need in the content transformation!

Even more broadly, we need to: 

  • Consider how the content got to its current state, especially if you're going to need to do significant rework.

  • Think about target content structures that will better support our content, and will enable easier content transformation.

  • Think about templates in a way that is accommodating of content that may not totally fit the target state (or you may specifically decide to have everything conform to a precise new standard, but that should be specifically decided early rather than late).

  • Decide what types of implementation partners you need for the process. If you don't plan early you may be requesting help from the wrong types of vendors. For instance, if you have a lot of rewriting then you may need to hire writers, but if you need a lot of technical lifting then another type of vendor should be involved. Notably, this should be decided before you reach out to potential vendors so you are clear on what you are requesting of them.

Here are some of the types of migration planning activities you may wish to undertake as part of the larger project:

  • Prior to visioning: doing a quick analysis to understand the texture of existing content (especially after de-duping redundant or irrelevant URLs) and potentially scraping for common migration issues. For organizations, this can be useful to understand their own presence. For agencies, this type of analysis can be useful for preparing a strong bid.

  • During visioning / strategy: consider how the content got to its current state, and how to avoid that going forward. From a migration perspective, make sure you are not proposing a new structure that you could not fit your content into.

  • While assembling teams/partners/vendors: have an idea of what your content transformation is going to look like, so you can get the right type(s) of vendor(s) and clearly articulate what you need for a stronger and more accurate bid. Ideally you get buy-in from stakeholders, clearly communicating what will happen with the content and the big picture implications around effort.

  • During development: more detailed migration planning, and develop in a manner that will better accommodate the migrated / transformed content. Also, you may wish to sequence in such a way that the templates are ready for the "harder" content early so you can get a headstart on that. In general, hopefully you can overlap development and the actual migration efforts.

Also see: Migration As Part of the Larger Project.

3. WRONG → Approaching from a single approach or discipline

Every once in a while someone writes an article that proposes you throw away all content and start from scratch. These are obviously very popular articles, and get everyone excited. Other articles on migration make the assumption that the migration will be a "lift and shift" type of copy of content. This often boils down to key stakeholders' skillsets: people with a writing background see the problem as a writing exercise, and those with a technical background see it as a technical problem. In other words, projects get dictated by the hammer that people see as the solution to the current problem (note that this is one major reason to talk about migration early, so you get the right team members in place to accomplish what you want, including in your RFP(s) for development).

Quite simply, the optimal approach is rarely black-and-white; it usually requires nuance.

Note: sometimes a single approach makes sense, but don't mistake this to mean every situation should be handled in the same way. 

Good practice → Tune your efforts using whatever approaches lead to the best results

The most concrete way to counter the natural bias toward the type of change you most understand is to consider a scale of potential treatments of content (of course using rules to assign content to treatments). For instance, here's a possible scale of dispositions from lower manual effort to higher manual effort:

From the Dispositions Cheat Sheet

In conjunction with early planning and using rules to make decisions, assign your content to more categories than just Drop and Move As Is. The above is just an example scale, and your specific case may require a different blend.

Also see: Delete Better and Dispositions Cheat Sheet.

4. WRONG → Conflating decision and action

A typical approach is for decisions about content (what to do with the piece of content, such as the decision to rewrite a piece of content) to be made at approximately the same time that the action is taken (actually doing what needs to be done, such as actually rewriting a piece of content). 

This leads to many problems:

  • No one has an overall understanding of the migration effort until it's done.

  • The wrong people make decisions, and decisions are not made from an organization-wide perspective.

  • More difficult to track progress.

  • Different people's definition of "best effort" leads to frustration.

Good practice → Decide and act as different steps

Yes, you need to make a decision about every piece of content. But you can make the decision about content separately from taking action on it. 

We can break this down into three aspects: 

  • who understands the decision/impact

  • who makes the decision

  • when the decision is made

Fundamentally, we need to have a way of understanding the big picture of the migration before it happens, some decisions need to be made centrally, and the decisions can be made early.

The diagram below illustrates: 

  • Decisions can be made in phases, but there is a point when decisions have been made about all content.

  • Decisions are made consistently and centrally (or woven in from local decisions, but the decisions are made separately from action regardless).

  • At this point, we can communicate the big picture plan (and breakdown of how much content will get which treatment), and get buy-in or revise the plan before proceeding.

  • Different teams can act on what has already been decided about the content.

A model for separating decision and action.
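One way to picture this separation in data: the decision and the action are two different fields on the same record, filled in at different times by different people. The sketch below is a minimal illustration under assumptions; the `Decision` class and the `plan_summary` and `progress` helpers are hypothetical names, not part of any tool mentioned here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Decision:
    url: str
    disposition: str    # decided up front, before any work starts
    done: bool = False  # flipped later by whichever team acts on it


def plan_summary(decisions):
    """Big-picture plan: how much content gets which treatment."""
    return dict(Counter(d.disposition for d in decisions))


def progress(decisions):
    """Return {disposition: (pages_done, pages_total)} for tracking the work."""
    result = {}
    for d in decisions:
        done, total = result.get(d.disposition, (0, 0))
        result[d.disposition] = (done + int(d.done), total + 1)
    return result
```

`plan_summary` can be shared for buy-in before any work starts, and `progress` only becomes meaningful once teams begin acting on what has already been decided.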

Also see Decide. Don't Inspect.

Are you following good practices?

Strong content migration checklist

  • Take a big picture view (still making decisions about all content).
  • Start thinking about migration / content transformation at project inception.
  • Tune your efforts using whatever approaches lead to the best results.
  • Decide and act as different steps.

As mentioned at the beginning, if you have a small site then perhaps the old way is fine. But generally we should be moving to a more effective way of doing content transformation planning.

Need help moving to more effective transformation planning?

Contact us. We can either get you set up on Content Chimera or provide migration planning as a service.

Thanks to Caryn DiMarco for catching a couple of typos in the original version of this article.
