9.1.18

Pagination Tunnels – An Experiment in Crawlability and Click Depth

SEO pagination tunnels and click depth visual exploration

We’ve all seen pagination links — those little numbered links at the top and bottom of multi-page content. They are used on blogs, e-commerce sites, webcomics, gallery pages, SERPs, and multi-page articles.

Simple pagination examples

31 Flavors of Pagination

 

From the human visitor’s point of view, pagination is pretty simple. If you are on page one, and you want to see page two, you click “2” (or “next”, or whatever). You don’t really have to think about it. From a web crawler’s point of view, however, it’s a bit more complicated.

A search engine crawler's perception of a small website

Web crawlers find new pages on a site by following links from pages they have already crawled. The crawler doesn’t know about a page until it has found at least one other page that links to it. (This is an over-simplification, but is generally true. There are some exceptions, like XML sitemaps and direct submission, but we’ll ignore those here.)

For sites with a simple tree-like structure, this works pretty well. The crawler reads the home page. Then it follows links from the home page to (for example) each of the top-level category pages. Then it follows links from these to the secondary category pages, and then from these to the content pages. In this simplistic example, the crawler can get from the home page to anywhere else on the site by following at most three links.
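The discovery process described above amounts to a breadth-first walk over the link graph. Here is a minimal sketch in Python, assuming a made-up site: the page names and the `SITE` dictionary are hypothetical and are not taken from the experiment.

```python
from collections import deque

# Hypothetical site: home -> 2 categories -> 2 subcategories each -> 2 content pages each.
SITE = {"home": ["cat-a", "cat-b"]}
for cat in ("cat-a", "cat-b"):
    SITE[cat] = [f"{cat}/sub-{i}" for i in (1, 2)]
    for sub in SITE[cat]:
        SITE[sub] = [f"{sub}/page-{j}" for j in (1, 2)]

def crawl_depths(start, links_on):
    """Breadth-first crawl; returns {page: number of links followed from start}."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links_on(page):
            if target not in depth:          # first time the crawler sees this page
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

depths = crawl_depths("home", lambda page: SITE.get(page, []))
print(len(depths), "pages found, deepest is", max(depths.values()), "links from home")
```

The depth recorded for each page is its click depth: the smallest number of links a crawler has to follow to reach it from the start page.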

A very small website SEO crawl tree

An even smaller website crawl tree

But now let’s look at an example with paginated content:

Simple “Next” Link

Suppose you have a website that contains a sequence of 200 numbered pages. For purposes of this example, it doesn’t really matter what kind of pages they are. They could be product listings, or blog posts, or even a single article split into 200 pages. (Please don’t actually do that.) What matters is that there are 200 of these pages, and they are numbered sequentially from page 1 to page 200.

For this first example, let’s assume these pages are connected by the simplest pagination possible: a single “next page” link at the bottom of each page:

next page

 

This scheme is as simple as it gets. If you are on page 1, and you click this link, you will be taken to page 2. If you click again, you will be taken to page 3, and so on. If you click repeatedly for a very long time, you will eventually get to page 200. This scheme is fairly common in the real world, mostly on blogs (typically with the link text “« Older Posts”). It is not as popular as it used to be, however, for reasons that will become apparent below.

From the crawler’s point of view, this site looks like this:

Crawl Tree: Simple “Next” Link Pagination

This chart shows the discovery path that was followed by a web crawler as it crawled a simulated website. In this case, the simulated website had 200 numbered pages connected with a simple “next” link on each page. (There were also some other, non-numbered pages on this site, but the numbered pages are what matter here.)

Each colored dot represents one page. A connection between two dots means the downstream page (the smaller dot) was discovered on the upstream page (the larger dot).

That long squiggly tail is a “tunnel”: a long connected sequence of pages that the crawler has to walk through one at a time.
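The tunnel is easy to reproduce. This is a small sketch, assuming the crawl starts on page 1 and ignoring the non-numbered pages, so depths are counted from page 1 and not from the home page. The name `next_only` is just illustrative.

```python
from collections import deque

LAST = 200

def next_only(page):
    """Links on a page that carries a single 'next page' link."""
    return [page + 1] if page < LAST else []

def crawl_depths(start, links_on):
    """Breadth-first crawl; returns {page: clicks needed to reach it from start}."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links_on(page):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

depths = crawl_depths(1, next_only)
print("page 200 is", depths[200], "clicks from page 1")
```

Every page sits exactly one click deeper than the page before it, which is what makes the chain a tunnel.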

The main thing to take away from this chart is that this form of pagination is extremely inefficient because it creates a very long pagination tunnel. This is a problem, because:

  • When content is buried hundreds of links deep, it sends a strong message to the search engines that you don’t think the content is important. The pages will probably be crawled less often, if at all, and they probably will not rank very well.
  • If just one page in that long chain returns an error (e.g. because of a temporary server hiccup), the crawler won’t be able to discover any of the other pages downstream.
  • Sequential chains can’t be crawled in parallel. In other words, the crawler can’t request more than one page at a time, because each page can only be discovered after the previous page has loaded. This may slow the crawler down, and may lead to incomplete crawling.
  • Human visitors will probably never reach the deepest pages at all, unless they are extraordinarily patient and persistent. If a human visitor does wish to see your deepest pages (e.g. because they want to read your first blog post), they are likely to give up in frustration long before they get there.
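The second point in the list above can be simulated directly. The sketch below assumes that page 50 fails to load during the crawl (a hypothetical failure, chosen only for illustration) and counts how many of the 200 pages the crawler still finds.

```python
from collections import deque

LAST = 200
BROKEN = {50}   # hypothetical: page 50 returns a server error during the crawl

def next_only(page):
    """Links on a page that carries a single 'next page' link."""
    return [page + 1] if page < LAST else []

def crawl(start, links_on, broken):
    """Breadth-first crawl that cannot read links from pages that fail to load."""
    found = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page in broken:
            continue                 # the URL is known, but its links were never seen
        for target in links_on(page):
            if target not in found:
                found.add(target)
                queue.append(target)
    return found

found = crawl(1, next_only, BROKEN)
missing = LAST - len(found)
print(f"{len(found)} pages discovered, {missing} never found")
```

With a single "next" link there is no second route around the failed page, so everything downstream of it stays invisible until the crawler comes back.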

So, how can we improve this? How about…

Adding “Last” and “Previous” Links

Let’s make a few seemingly minor changes to the pagination links:

first previous next last

 

The important changes here are the “last” and “previous” links. Together, they give the crawler (or human) the option to step through the pages backwards as well as forwards. This scheme is also fairly common on real websites, especially blogs. To the crawler, this new site looks like this:

Crawl Tree: Added “Last” Link to Pagination

This is somewhat better. Now there are two tunnels, each half as long. One starts at page 1 and counts up to page 101, and the other starts at page 200 and counts down to page 102. That cuts the maximum depth in half, but it is still not great.
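A quick check of the halving, again as a sketch that starts the crawl on page 1 and ignores the rest of the site (`first_prev_next_last` is an illustrative name):

```python
from collections import deque

LAST = 200

def first_prev_next_last(page):
    """Targets of the 'first', 'previous', 'next' and 'last' links on a page."""
    candidates = [1, page - 1, page + 1, LAST]
    return [p for p in candidates if 1 <= p <= LAST and p != page]

def crawl_depths(start, links_on):
    """Breadth-first crawl; returns {page: clicks needed to reach it from start}."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links_on(page):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

depths = crawl_depths(1, first_prev_next_last)
deepest = max(depths, key=depths.get)
print("deepest page:", deepest, "at depth", depths[deepest])
```

Page 200 is now one click from page 1, and the two tunnels meet in the middle at page 101, which ends up 100 clicks deep. With only a "next" link the deepest page would be 199 clicks deep.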

Stepping by Two Pages

Let’s try something different. In this test, there is no “last” link, but there is a way to skip ahead (or back) by two pages. For page 1, the pagination would look like this:

1 2 3

For a deeper page, it would look like this:

23 24 25 26 27

If you start on page 1, you can jump to page 3, then to page 5, then to 7, and so on. There is no way to skip to the last page. I have seen this scheme on a couple of real-world websites, both huge online stores. (I’m guessing they chose to omit the “last” link for database performance reasons.) This site looks like this to the crawler:

Crawl Tree: Pagination Stepping by Two Pages

The most interesting thing about this chart is that it looks almost the same as the previous chart, even though the pagination schemes for the two are quite different. As before, numbered pages are split into two tunnels, each around 100 pages long.

The difference is that now the tunnels are split into even-numbered pages and odd-numbered pages. In the previous chart they were split into ascending and descending order. This raises the question: if each of these schemes cuts the maximum depth in half, what happens if we combine them both together?
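The even/odd split can be seen by recording which page each page was discovered on. The sketch below assumes the crawler reads the links on a page from left to right and visits pages in the order it found them. A crawler with a different visiting order could credit some pages to a different parent, though the depths would not change.

```python
from collections import deque

LAST = 200

def step_by_two(page):
    """Numbered links from two pages back to two pages ahead."""
    return [p for p in range(page - 2, page + 3) if 1 <= p <= LAST and p != page]

def crawl_parents(start, links_on):
    """Breadth-first crawl; returns {page: (depth, page it was discovered on)}."""
    seen = {start: (0, None)}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links_on(page):
            if target not in seen:
                seen[target] = (seen[page][0] + 1, page)
                queue.append(target)
    return seen

seen = crawl_parents(1, step_by_two)
max_depth = max(depth for depth, _ in seen.values())
same_parity = all(parent == page - 2
                  for page, (_, parent) in seen.items() if page >= 4)
print("maximum depth:", max_depth)
print("every page from 4 up was found on the page two before it:", same_parity)
```

Pages 2 and 3 are both found on page 1. After that, each even page is found on the previous even page and each odd page on the previous odd page, giving two tunnels of about 100 pages each.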

Which brings us to:

Step by Two, plus “Last” Link

In this scheme, the pagination for page 1 looks like this:

1 2 3 … 200

And for deeper pages, it looks like this:

1 … 23 24 25 26 27 … 200

This is just the last two schemes combined. It allows you to skip ahead two pages at a time, and also to jump to the end and then work backwards. Most real-world websites use something similar to this (like the site you are reading right now, for example).

This produces the following chart:

Crawl Tree: Pagination Stepping by Two, plus “Last” Link

This has cut the maximum depth down to a fourth of what it was originally. This is a significant improvement. But why stop there? What happens if we go crazy?
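The "a fourth" figure can be checked with the same kind of simulation. As before, this is a sketch that counts clicks from page 1 and leaves out the non-numbered pages; `step_two_plus_ends` is an illustrative name.

```python
from collections import deque

LAST = 200

def step_two_plus_ends(page):
    """First page, two pages either side of the current page, and the last page."""
    window = range(max(1, page - 2), min(LAST, page + 2) + 1)
    return sorted(({1, LAST} | set(window)) - {page})

def crawl_depths(start, links_on):
    """Breadth-first crawl; returns {page: clicks needed to reach it from start}."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links_on(page):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

depths = crawl_depths(1, step_two_plus_ends)
max_depth = max(depths.values())
deepest = sorted(p for p, d in depths.items() if d == max_depth)
print("maximum depth:", max_depth, "at pages", deepest)
```

The deepest pages are the four in the middle of the sequence, 50 clicks from page 1, compared with 199 clicks for the plain "next" link.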

Extreme-Skip Nav (Crazy Idea #1)

We’ve seen above that being able to skip ahead by just two pages can cut the maximum crawl depth in half. So why not take this to extremes? Why not allow skipping by, say, eighteen pages?

In this scheme, the pagination for page 1 looks like this:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 … 200

And for deeper pages, it looks like this:

1 … 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 … 200

 

This allows the crawler (or human) to skip ahead by as many as eighteen pages at a time. It also still allows the crawler to jump to the end and work backwards, as before. This should reduce the maximum depth by quite a bit.

Yes, all those numbered links are kind of ugly, and they add a lot of clutter. You probably wouldn’t use this on a real website for that reason. That’s OK though, because this is just an experiment. Let’s just try it and see what happens.

The above scheme produces the following chart:

Crawl Tree: Extreme-Skip Nav Pagination

This brings the maximum depth down to just seven, which is a huge improvement. Unfortunately, this scheme is probably too ugly and cluttered to be a good choice for most real-world sites.
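Here is a sketch of this scheme with the skip distance as a parameter (`RADIUS` and `pagination` are illustrative names). Counting from page 1, it puts the deepest page six clicks away. That would agree with the chart's figure of seven if page 1 itself sits one click below the home page, which is an assumption about the simulated site and is not stated in the article.

```python
from collections import deque

LAST = 200
RADIUS = 18   # how many pages the numbered links reach in each direction

def pagination(page, radius=RADIUS, last=LAST):
    """Page numbers shown on `page`: first, everything within `radius`, last."""
    window = range(max(1, page - radius), min(last, page + radius) + 1)
    return sorted({1, last} | set(window))

def crawl_depths(start, links_on):
    """Breadth-first crawl; returns {page: clicks needed to reach it from start}."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links_on(page):   # a page linking to itself is skipped below
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

depths = crawl_depths(1, pagination)
print(pagination(25))
print("maximum depth from page 1:", max(depths.values()))
```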

We need some way to achieve the same improvement, but with a pagination scheme that is more compact and easy to read. Such as…

Adding Midpoint Link (Crazy Idea #2)

In this scheme, the pagination for page 1 looks like this:

1 2 3 … 101 … 200

And for deeper pages, it looks like this:

1 … 12 … 23 24 25 26 27 … 113 … 200

 

Note that this is exactly the same as the “Step by Two, plus ‘Last’ Link” scheme above, except with up to two additional links inserted.

The “101” in the first example was added because it is the midpoint between 3 and 200 (rounded down). In the second example, “12” is the midpoint between 1 and 23, and “113” is the midpoint between 27 and 200 (again rounded down). In other words, each new link is the average of the numbers immediately to the left and right of a “…” gap in the old scheme. These midpoint links make it possible for a crawler to get from any page to any other page in just a few steps.
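The rule for picking the midpoint links is short enough to write out. This is a sketch with an illustrative function name; rounding the average down is an assumption inferred from the “101” and “113” in the examples.

```python
def midpoint_pagination(page, last=200, step=2):
    """Page numbers to show on `page`, including the midpoint links."""
    lo = max(1, page - step)          # left edge of the numbered window
    hi = min(last, page + step)       # right edge of the numbered window
    numbers = {1, last, *range(lo, hi + 1)}
    numbers.add((1 + lo) // 2)        # midpoint of the gap on the left
    numbers.add((hi + last) // 2)     # midpoint of the gap on the right
    return sorted(numbers)

print(midpoint_pagination(1))    # [1, 2, 3, 101, 200]
print(midpoint_pagination(25))   # [1, 12, 23, 24, 25, 26, 27, 113, 200]
```

Using a set means a midpoint that lands on a number already in the list (for example on page 1, where there is no gap on the left) simply disappears instead of being shown twice.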

This scheme produces the following chart:

Crawl Tree: Adding Midpoint Link to Pagination

This shows the same level of crawlability improvement as the previous chart, but now with a scheme that is much easier to read (if a bit counterintuitive).

But How Do These Pagination Structures Scale?

So far, all of the examples have had a mere 200 numbered pages. This creates simple, easy-to-understand charts, but a real website can easily have tens of thousands of pages. What happens when we scale things up?

Let’s run the last two crawls, with the same two pagination schemes, but with a hundred times as many pages.

Extreme-Skip Nav, with 20,000 Pages

This is Crazy Idea #1, but with a much bigger crawl:

Crawl Tree: Extreme-Skip Nav Pagination, with 20,000 Pages

Yes, it looks kind of pretty, but it’s terrible as far as crawlability and click depth.

The deepest page is at level 557. This scheme does not scale very well at all. The relationship between page count and maximum depth is more-or-less linear. In other words, if you double the number of pages, you double the maximum depth.

Midpoint Link, with 20,000 Pages

This is Crazy Idea #2, again with a much bigger crawl:

Crawl Tree: Midpoint Link Pagination, with 20,000 Pages

The deepest page is now at level 14. This is a dramatic improvement, meaning this scheme scales extremely well.

The relationship between page count and maximum depth is (nearly) logarithmic. In other words, if you double the number of pages, the maximum depth only increases by one.

In general, if a chart is mostly made of long squiggly tentacle-like structures then the relationship will be linear (which is bad), and if the chart has a finely-branched tree-like structure, then the relationship will be logarithmic (which is good).
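The two growth patterns can be compared side by side. The sketch below makes the same assumptions as before: the crawl starts on page 1, so depths are counted from there and will be lower than chart figures counted from the home page, and midpoints are rounded down. The function names are illustrative. It takes a second or two to run, because the largest crawl follows several hundred thousand links.

```python
from collections import deque

def skip_links(page, last, radius=18):
    """Extreme-skip scheme: first, last, and every page within `radius`."""
    return [1, last, *range(max(1, page - radius), min(last, page + radius) + 1)]

def midpoint_links(page, last):
    """Midpoint scheme: first, last, two pages either side, plus the two midpoints."""
    lo, hi = max(1, page - 2), min(last, page + 2)
    return [1, (1 + lo) // 2, *range(lo, hi + 1), (hi + last) // 2, last]

def max_depth(last, links_on):
    """Depth of the deepest page in a breadth-first crawl that starts on page 1."""
    depth = {1: 0}
    queue = deque([1])
    while queue:
        page = queue.popleft()
        for target in links_on(page, last):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return max(depth.values())

sizes = (200, 2_000, 20_000)
skip_depth = {n: max_depth(n, skip_links) for n in sizes}
mid_depth = {n: max_depth(n, midpoint_links) for n in sizes}
for n in sizes:
    print(f"{n:>6} pages: extreme-skip {skip_depth[n]:>3}, midpoint {mid_depth[n]:>2}")
```

In this sketch the extreme-skip depth grows by a factor of ten each time the page count does, because the scheme is still two tunnels that advance eighteen pages per click. The midpoint depth only creeps up, because each midpoint link roughly halves the distance left to cover.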

Which raises the million-dollar, win-both-showcases takeaway question: is this midpoint link pagination worth using?

The answer: a resounding “Possibly. It depends.”

If you’re looking for conclusive advice on the structure of your site, the Portent team will almost always favor user experience over pleasing the search engine overlords. But getting the right, highly specific content in front of a searcher as quickly as possible is absolutely part of user experience. If that long-tail content is hundreds, or even thousands, of clicks away from your homepage today, taking proactive steps to reduce click depth could well be worth it.

If you have many tens of thousands (or even hundreds of thousands) of numbered pages, this midpoint link pagination scheme may help to get your content crawled more thoroughly, and may help the deeper pages to rank better.

On the other hand, it may be a bit confusing to the user and will add some clutter. For smaller sites, the “Step by Two, plus ‘Last’ Link” scheme may be a better choice.

In the end, the point of this experiment and exploration was to shine some light on an often-neglected part of most websites, and to show that small changes to your pagination can have a surprisingly large impact on how a crawler sees your site and all that wonderful content.

The post Pagination Tunnels – An Experiment in Crawlability and Click Depth appeared first on Portent.



from Conversation Marketing: Internet Marketing with a Twist of Lemon http://ift.tt/2FjmPbR
