
Rules for Content Migration and Large Scale Improvements: Panning for Gold

Last updated March 30th, 2019. First published

Key Points

  • Migrations and large-scale content improvements are a great time to sift through your content.
  • Define rules that you can use to make decisions about your content during migration and later.
  • A content rule has a condition (the "if") and the disposition (the "then").
Related resource
Dispositions Cheat Sheet | Use this sheet on your projects.

 One of the primary times that content is handled at a large scale is during the migration from one system to another. But whenever you are making wide-ranging content changes, you should make decisions about your content using rules. 

Anatomy of a rule

Each rule has two components:

  • The conditions (the "if").

  • The dispositions (the "then").

Rules are things like:

  • if the content is a blog post, then move it as-is

  • if the content is a press release, then delete it

  • if the content is a white paper and it hasn't been read in the last three months, then delete it

  • if the content is a white paper and it has been read in the last three months, then reformat it
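One way to picture this structure in code is the minimal Python sketch below. The `Rule` and `ContentItem` classes, the field names, the 90-day stand-in for "three months", and the "needs review" fallback are all assumptions made for illustration, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ContentItem:
    # Hypothetical fields; a real inventory will have its own columns.
    url: str
    kind: str  # e.g. "blog post", "press release", "white paper"
    days_since_last_read: Optional[int] = None  # None means no recorded reads


@dataclass
class Rule:
    condition: Callable[[ContentItem], bool]  # the "if"
    disposition: str                          # the "then"


def read_recently(item: ContentItem, days: int = 90) -> bool:
    # Assumption: "three months" is approximated as 90 days.
    return item.days_since_last_read is not None and item.days_since_last_read <= days


RULES = [
    Rule(lambda i: i.kind == "blog post", "move as-is"),
    Rule(lambda i: i.kind == "press release", "delete"),
    Rule(lambda i: i.kind == "white paper" and not read_recently(i), "delete"),
    Rule(lambda i: i.kind == "white paper" and read_recently(i), "reformat"),
]


def disposition_for(item: ContentItem, rules=RULES) -> str:
    # The first rule whose condition matches decides the disposition.
    for rule in rules:
        if rule.condition(item):
            return rule.disposition
    return "needs review"  # no rule matched; flag it rather than guess


paper = ContentItem("/papers/pricing", "white paper", days_since_last_read=200)
print(disposition_for(paper))  # delete
```

Keeping the condition and the disposition as separate fields means the same list of rules can be read aloud as "if ..., then ..." and also run against every item in an inventory.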

Using rules

Using the rules happens in three steps: 

  1. Get a full list of all your content.

  2. Define the ruleset (a set of rules).

  3. Apply the rules to the content. 

These steps are then iterated. For instance, let's say we had the rules above and 5,000 pieces of content. On the first pass we might get something like this: 

  • 1,000 blog posts will be moved as-is

  • 1,000 press releases and 1,000 white papers will be deleted

  • 2,000 white papers will be reformatted

Perhaps the original assumption was that the white papers would be reformatted by hand, and it would take 1 hour per white paper. Suddenly you're staring at a large number of hours (2,000). There are a lot of ways you could go from here, but for the purposes of illustration let's say we come up with another rule: 

  • if the content is a white paper, it has been read in the last three months, and it was published in the last three months, then move it as-is

  • Otherwise reformat

Then we might discover this would mean only reformatting 100 white papers. As I mentioned, there are a lot of ways of addressing this, but the main point I want to make is that iteration is essential.
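The iteration described above can be sketched as two rulesets run over the same inventory, with the counts compared after each pass. This is a rough illustration only: the five-item inventory, the tuple layout, and the 90-day stand-in for "three months" are hypothetical, and the only figure taken from the text is the estimate of 1 hour per reformatted white paper.

```python
from collections import Counter

# Hypothetical inventory: (type, days since last read, days since published).
inventory = [
    ("blog post", 10, 400),
    ("press release", 300, 900),
    ("white paper", 200, 700),  # not read recently
    ("white paper", 30, 60),    # read recently and published recently
    ("white paper", 30, 500),   # read recently, but older
]

RECENT = 90  # assumption: "three months" approximated as 90 days


def first_pass(kind, since_read, since_published):
    if kind == "blog post":
        return "move as-is"
    if kind == "press release":
        return "delete"
    if kind == "white paper":
        return "reformat" if since_read <= RECENT else "delete"
    return "needs review"


def second_pass(kind, since_read, since_published):
    # Added rule: recently read AND recently published white papers move as-is.
    if kind == "white paper" and since_read <= RECENT and since_published <= RECENT:
        return "move as-is"
    return first_pass(kind, since_read, since_published)  # otherwise unchanged


def tally(ruleset):
    # Count how many items land in each disposition under a ruleset.
    return Counter(ruleset(*item) for item in inventory)


for name, ruleset in [("first pass", first_pass), ("second pass", second_pass)]:
    counts = tally(ruleset)
    hours = counts["reformat"] * 1  # 1 hour per reformatted white paper, as in the text
    print(name, dict(counts), "reformat hours:", hours)
```

Because the second ruleset only adds one rule in front of the first, the difference between the two tallies shows exactly what that rule changed.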

How do you know you are using rules well?

  • You do not review every piece of content to decide what to do with it.
    Yes, you need to make a decision about every piece of content, but that doesn't mean you need to inspect each one.
  • You can tweak and automatically re-run your rules across all your content.
    Decide to keep the last two years of blog posts rather than the last year? Seeing the impact of that should be automatic.
  • You can define rules on arbitrary information about your content.
    When making decisions, sometimes you realize another piece of information about content would help you make a decision.
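The blog post example in the list above (one year versus two) could be sketched as a rule with a tunable parameter, so that changing the threshold and seeing the impact is a single rerun. The post list, the `apply_blog_rule` function, and the fixed reference date are hypothetical and chosen only to make the sketch repeatable.

```python
from datetime import date

# Hypothetical blog inventory: (url, publication date).
posts = [
    ("/blog/a", date(2018, 11, 1)),
    ("/blog/b", date(2017, 9, 15)),
    ("/blog/c", date(2015, 2, 3)),
]

AS_OF = date(2019, 3, 30)  # arbitrary fixed reference date so reruns are repeatable


def apply_blog_rule(posts, keep_years, as_of=AS_OF):
    # Keep posts published within the last `keep_years` years; drop the rest.
    # Note: date.replace would fail for a Feb 29 reference date; fine for this sketch.
    cutoff = as_of.replace(year=as_of.year - keep_years)
    return {url: ("keep" if published >= cutoff else "drop") for url, published in posts}


for years in (1, 2):
    result = apply_blog_rule(posts, keep_years=years)
    kept = sum(1 for d in result.values() if d == "keep")
    print(f"keep last {years} year(s): {kept} of {len(posts)} posts kept")
```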

Why bother with rules? 

Apply in the system on an ongoing basis

Why go through the effort of your migration only to have the content go stale immediately? If you decide that press releases older than four years should be dropped during the migration, then do you really want seven-year-old releases kicking around your system two years after launch? Similarly, if you decide that pages that are regulation-driven and over one year old should be reviewed, then don't you want that in your new system? Of course, there's a whole other area of governance around actually being able to enforce these rules on an ongoing basis, but migration planning may be a good time to discuss this.

Better justify drops

In a simple (no bureaucracy) environment, justification is not really relevant. For instance, if I want to remove a page from this blog, I don't have to argue with anyone about it. But if you are operating in a large organization, reviewing content items on a case-by-case basis could land you in endless discussions going nowhere (or, perhaps worse, lead you to just throw everything into the new CMS again). By contrast, if you are dropping a piece of content because of a rule (for example, any content of any type that has not been viewed more than ten times in the last year will be archived), the decision is much harder to argue with. Of course, you would probably also need a very tight exception policy.

More easily agree first, then have everyone work separately

Related to the above, if you agree on the rules first, then you can apply those rules and everyone can get cracking on whatever they have to do on the content. For example, if a rule states that a page needs to be updated because it mentions a particular highly negative incident, then someone can start updating it rather than spending time talking about whether it needs to be updated. Having some rules in place would also fit better into a web operations management framework, so that existing teams could make high-impact decisions rather than facing the impossible task of getting mired in infinite details.

More easily identify disposition

Let's say you have that spreadsheet of 5,000 pieces of content. How do you start? You could just start at the top and work your way down, but how efficient will that be? If you have a rule like "any content not updated in the last 8 years gets put into an archive site regardless of any other factors," then you can (assuming you have reliable dates) apply that rule and start working through the content in chunks. Note that defining the rules probably means you need to deep dive into your audit, but the point is that with rules in mind you will start to see patterns that let you quickly identify the disposition for different content.
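As a rough sketch of working in chunks, the archive rule above can be applied first and everything else left as "undecided" for later rules. The audit rows and field layout here are hypothetical; only the 8-year threshold comes from the example rule.

```python
from collections import defaultdict

# Hypothetical audit rows: (url, content type, years since last update).
audit = [
    ("/about/history", "page", 11),
    ("/news/launch", "press release", 9),
    ("/blog/tips", "blog post", 2),
    ("/papers/guide", "white paper", 5),
]

ARCHIVE_AFTER_YEARS = 8  # threshold from the example rule in the text


def disposition(row):
    url, kind, years_since_update = row
    # The archive rule is checked first because it applies regardless of other factors.
    if years_since_update > ARCHIVE_AFTER_YEARS:
        return "archive"
    return "undecided"  # left for later rules


# Group URLs by disposition so each chunk can be worked through separately.
chunks = defaultdict(list)
for row in audit:
    chunks[disposition(row)].append(row[0])

for name, urls in chunks.items():
    print(name, urls)
```

Each new rule shrinks the "undecided" chunk, which gives a simple measure of how much of the inventory still needs attention.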

Better move content

By looking for patterns, some commonalities may be relevant to deciding how to move content. For example, you may notice that content over a certain age used an old HTML template you forgot about, but that could be scraped easily. Or you may realize that your old working papers can just be moved in as-is rather than needing any manipulation.

Explain to end users if appropriate

If you have these rules in place, then they may in some cases be relevant to your end users. For example, if certain types of content go into an archive after a certain period, you can indicate this to your users. This also gives you a means for public feedback on your decisions.

Reapply rules once you realize you don't have the resources

While going through this process, hopefully you can also take some wild guesses at the effort it will take to deal with different types of content. If you are working in your content audit spreadsheet, you could keep a running table of the total editing effort. If the effort is high, then you can re-evaluate assumptions, change quality levels, or pursue many other options. One option is to migrate less content. If you have applied rules, then you can tweak the rules and see the impact on the projected effort. If you did your analysis in a piecemeal manner, then you would now need to go back and have all those negotiations again about what's moving and what isn't.
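A running effort table of this kind could be sketched as below. Every number here is a made-up guess for illustration: the hours per item, the five-item inventory, and the drop thresholds are all hypothetical, as are the `ruleset` and `projected_hours` names.

```python
from collections import Counter

# Hypothetical effort guesses, in hours per item, for each disposition.
HOURS_PER_ITEM = {"move as-is": 0.1, "rewrite": 4.0, "drop": 0.0}

# Hypothetical inventory: (content type, years since last update).
inventory = [("guide", 1), ("guide", 3), ("guide", 6), ("faq", 2), ("faq", 7)]


def ruleset(kind, age_years, drop_after_years):
    # Illustrative rules: drop old content, rewrite guides, move the rest as-is.
    if age_years > drop_after_years:
        return "drop"
    return "rewrite" if kind == "guide" else "move as-is"


def projected_hours(drop_after_years):
    counts = Counter(ruleset(kind, age, drop_after_years) for kind, age in inventory)
    return sum(HOURS_PER_ITEM[d] * n for d, n in counts.items())


# Tightening the drop threshold migrates less and lowers the projected effort.
for threshold in (10, 5, 2):
    print(f"drop content older than {threshold} years: {projected_hours(threshold):.1f} hours")
```

The point of the sketch is that the effort total is derived from the rules, so changing a threshold updates the projection without revisiting individual items.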
