This guide was last updated: 8 January 2016.
Before proceeding, please understand that changes in faceted navigation or core website architecture are extremely high risk and can negatively impact your organic visibility if not properly handled. There is a huge amount of opportunity and potential gains by implementing these suggestions, but the risks must be mitigated. There is no one ‘best’ architecture – your category system should be a reflection of your customers’ needs and problems. You should consult an expert before implementing the recommendations set out below.
- What are product attributes?
- What is faceted navigation?
- How faceted nav helps you to rank
- How to tell if you have a problem
- Faceted navigation keyword research
- Building a business case
- Functional specification for an ideal system
The ecommerce SEO industry has a problem
Many ecommerce websites have severe issues within their product category system – filtered category pages (also called facets) are often blocked from crawling by robots, or worse still – the entire system is left crawlable without a second thought. This means that they lack the specific pages required to rank for many mid-tail terms (those keywords that a consumer typically searches for when in the consideration phase of their purchase journey). For our industry reports, we monitor around 1,000 UK ecommerce websites and around 70% of them have a variation of this issue, seriously limiting their visibility.
It’s very common to see lower-authority websites outranking industry giants simply because they have a better technical set up. This is obvious in most industries, but a great example is within the “DKNY evening dresses” SERP.
What are product attributes?
All products have different features or attributes. If you’re selling physical products, such as dresses, these attributes are things like the colour, brand, size, type, make, or model of the product. If you’re selling something less tangible – maybe you’re a hotel booking engine, say – these attributes are things like each hotel’s location, facilities (golf course, pool, free wifi), type (hostel, motel, hotel), and anything else that guests might be looking for (pet friendly, romantic).
Product attributes are anything that a potential customer cares about in a product. Not every customer will care about the same attributes, and all customers will prioritise some attributes over others; in a group of customers looking at dresses, some will be searching for a sparkly dress, some will be searching for a size 18 dress, some will be looking for a particular designer, etc.
What is faceted or filtered navigation?
Faceted (also called filtered) navigation is a design element that allows users to select combinations of the attributes that are important to them, to filter a list of products down to the ones that match their needs.
What’s the problem?
Most implementations of category systems fall into one of two (sub-optimal) set-ups:
All facets crawlable, content is not changed
In this set up, every facet (and every combination of selected facets) is crawlable and indexable, and the page content is not changed between facets. For example:
The main Dresses page, /dresses
“Red” colour selected, /dresses?colour=red
“Red” colour and style “Bodycon” selected, /dresses?colour=red&style=bodycon
In this scenario, every crawlable attribute combination within the “Dresses” category is causing duplication and link equity dilution. In effect, every page is competing directly with the main category page for the keyword “Dresses”, making it less likely that the website will rank at all.
No facets crawlable
A knee-jerk reaction that many SEOs have to the above issue is to prevent any crawling or indexation of filtered category pages. Search engines are able to access the “Dresses” page, but they aren’t allowed to crawl the “Red Dresses” page.
While this solves the overindexation issue, it means that the website isn’t able to rank for a huge variety of longer tail terms. Search engines are less likely to rank a generic “Dresses” page if a user searches for a specific type of dress (such as “Red Evening Dresses”) – particularly when competitors have specific, well optimised pages for that term.
Some websites try to overcome this shortcoming by conducting keyword research and then manually creating a lot of categories for longer tail terms, such as “Evening Dresses”, “Bodycon Dresses”, etc. While this can work when your business operates in a small, unchanging niche, it is impossible for a business with a large range of products – important attributes inevitably get missed; and what about second or third level facets? Users absolutely make searches such as “Size 18 Black Evening Dresses” – manually creating every combination to that level is messy, time consuming, and will probably cost the business more than it will help.
Specific pages often outrank generic pages on long-tail queries
The query “DKNY Evening Dresses” is a mid-tail term – the consumer has identified the problem they are trying to solve (they need a dress for an event) and has selected the brand that they are looking for. Net-a-Porter is one of the most authoritative websites in the fashion industry and arguably one of the most suitable websites on the internet for finding these products, but they don’t rank at all for this phrase.
The reason? Net-a-Porter’s category system does not adequately change the content on pages to send relevancy signals to search engines. When a user visits the “Dresses” page and selects the “DKNY” filter, the products change but the page title and H1 remain as “Clothing” – every single filtered page is competing for the same head term (clothing).
Instead, much less authoritative websites rank for the term “DKNY Evening Dresses”. These websites all have pages specifically about that term. In this case, the relevancy of these category pages means that they can outrank websites with much more authority.
This is a huge opportunity for most ecommerce websites. Very few platforms do this correctly off the shelf, and those that do are often configured incorrectly – they may miss popular attributes that users are looking for, or fail to ensure that all important attributes appear within the content of the page.
How to tell if you have an issue
This issue has many different faces, but the underlying concept is that you should have indexable and discoverable pages within your category system for every combination of product attributes that a customer may be looking for.
Common symptoms of a non-optimal category system:
- The main heading of a page does not precisely reflect the filters selected.
- Most pages are non-indexable (they canonicalise elsewhere, carry a noindex, or are disallowed in robots.txt).
- You don’t have filters for attributes which your customers find important – e.g. you may only have a filter for brands, but omit technical or functional attributes (such as the style of dress).
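The first two symptoms can be checked against crawl data. The following is a minimal sketch, assuming a hypothetical `CrawledPage` record exported from whichever crawler you use; the field names and the sample pages are illustrative, not the output format of any real tool.

```python
from dataclasses import dataclass, field


@dataclass
class CrawledPage:
    """One category page as reported by a crawler (hypothetical shape)."""
    url: str
    h1: str
    selected_values: list = field(default_factory=list)  # e.g. ["DKNY"]
    noindex: bool = False
    canonical: str = ""  # empty means no canonical tag was found
    robots_blocked: bool = False


def is_indexable(page):
    """A page is indexable if nothing tells search engines to skip it."""
    canonical_elsewhere = bool(page.canonical) and page.canonical != page.url
    return not (page.noindex or page.robots_blocked or canonical_elsewhere)


def heading_reflects_filters(page):
    """True when every selected filter value appears in the H1."""
    h1 = page.h1.lower()
    return all(value.lower() in h1 for value in page.selected_values)


def audit(pages):
    """Return the share of non-indexable pages and URLs with weak headings."""
    non_indexable = [p for p in pages if not is_indexable(p)]
    weak_headings = [p.url for p in pages if not heading_reflects_filters(p)]
    share = len(non_indexable) / len(pages) if pages else 0.0
    return share, weak_headings


# Illustrative sample data only.
pages = [
    CrawledPage("/dresses", "Dresses"),
    CrawledPage("/dresses?brand=dkny", "Dresses", ["DKNY"], canonical="/dresses"),
    CrawledPage("/dresses?colour=red", "Red Dresses", ["Red"]),
    CrawledPage("/dresses?style=evening", "Dresses", ["Evening"], noindex=True),
]
share, weak = audit(pages)
print(f"{share:.0%} of pages are non-indexable; weak headings: {weak}")
```

A high non-indexable share or a long list of weak headings does not prove there is a problem, but it suggests the category system deserves a closer look.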
If you have identified that you have an issue, you should follow these steps:
- Keyword research – identify all of the keyword phrases your customers may use
- Build a business case – fundamentally changing the way the category system works is a big cost. You need to work out whether the work is going to pay off in the long run, and be able to justify your recommendations to your boss or client.
- Set a functional specification – to make sure that developers are able to code this system correctly the first time, they’ll need some good documentation of what will and will not be acceptable.
- Test, test, and re-test – use a good crawler like DeepCrawl to detect issues, make sure the system is correctly set up, and there are no regressions between versions.
You need to find every single product attribute that a customer may be looking for. Go through every single product and category to extract every product attribute you can. Check manufacturer spec sheets, look at your site search and AdWords data, and use a suggest-scraper like keywordtool.io to find related terms that people use. This is a critical step – if you only use those filters/attributes that you already have on your site, you risk missing an attribute that your users really care about.
Once you’ve found every single product attribute, merge them all to create a keyword list.
For instance, with brand, style, and size attributes, we will get keywords like:
- evening dresses
- dkny evening dresses
- size 14 evening dresses
- size 14 dkny evening dresses
You will also want to play around with the order of attributes – you should have both “dkny evening dresses” and “evening dresses dkny”. This is because the keyword planner will give search volumes on exact match, and you don’t want to miss an important attribute because you’ve put it in the wrong part of the phrase.
- evening dresses
- dkny evening dresses
- evening dkny dresses
- size 14 evening dresses
- evening size 14 dresses
- size 14 dkny evening dresses
- dkny size 14 evening dresses
- dkny evening size 14 dresses
- evening dkny size 14 dresses
You also want to think about product use-cases – as users identify a problem before the product attributes that will solve that problem, use cases are common query attributes.
Finding out the problem or reasons your customers buy products is critical to determining the search queries they use when looking for products to solve those problems.
Generating all combinations and orderings of attribute-modified categories
When you’ve got a couple of product categories, you’ll be able to sit down for an hour and come up with a list of all possible queries related to your products. When you’ve got a thousand products, this is impossible to do manually.
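As a rough illustration of what generating these lists involves, here is a minimal sketch using Python's standard `itertools`. It is not the tool mentioned in this guide. It assumes at most one value per attribute group and always places the category word last; the function name and the sample attributes are made up for the example.

```python
from itertools import combinations, permutations, product


def generate_keywords(category, attributes):
    """Build every attribute combination, in every order, for one category.

    attributes maps a group name to its possible values, for example
    {"brand": ["dkny"], "style": ["evening"]}. At most one value per
    group is used in any single keyword.
    """
    keywords = set()
    groups = list(attributes)
    for size in range(len(groups) + 1):
        for chosen_groups in combinations(groups, size):
            value_lists = [attributes[g] for g in chosen_groups]
            for values in product(*value_lists):
                for ordering in permutations(values):
                    keywords.add(" ".join(ordering + (category,)))
    return sorted(keywords)


keywords = generate_keywords(
    "dresses",
    {"brand": ["dkny"], "style": ["evening"], "size": ["size 14"]},
)
print(len(keywords))
```

The count grows very quickly with the number of groups and values, which is why a maximum number of groups per keyword is usually worth enforcing before fetching search volumes.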
We are currently testing a tool to generate all combinations of faceted product categories. This will be released for public beta in February 2016 – if you’d like to get a preview (and help us test this) before it’s publicly released, contact us.
Get search volumes
To find which keywords and combinations of product attributes are actually searched by users, you will need to run all of these keywords through the Keyword Planner. For anything up to a few thousand queries, this should be relatively easy – segment your list into groups of 800, upload them, and download & merge the results.
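Splitting the list into upload-sized groups is easy to script. A minimal sketch, assuming the keywords are already in a Python list; the batch size of 800 comes from the paragraph above, and the function name is illustrative.

```python
def split_into_batches(keywords, batch_size=800):
    """Split a keyword list into consecutive batches of at most batch_size."""
    return [
        keywords[start:start + batch_size]
        for start in range(0, len(keywords), batch_size)
    ]


# Illustrative data: 2,000 placeholder keywords.
all_keywords = [f"keyword {n}" for n in range(2000)]
batches = split_into_batches(all_keywords)
print([len(batch) for batch in batches])
```

Each batch can then be written to its own file for upload, and the downloaded results merged back together on the keyword column.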
Again, this is impossible for large-scale ecommerce businesses – for instance, in the past we have completed these keyword research projects and ended up with several million possible keywords. Running these through the Keyword Planner manually would take days of non-stop clicking and downloading; and when you have this data, you’ll find that standard business tools are not built for this much data – even Excel is limited to 1,048,576 rows.
Industry search volume APIs and tools such as Grepwords and SEMrush are largely useless for this task – they just don’t track volume on long-tail queries (and ignore most terms outside of the US and UK).
This case will be the topic of our talk at BrightonSEO April 2016, and we will be releasing a tool for accurately getting the search volume for any long tail keyword (or keywords – our tool accepts lists up to 50 million lines long). If you’d like to help us to test this tool before it is released, get in touch.
Building the business case
Unless your platform supports this behaviour out of the box (most don’t), there are going to be time and money costs to redevelop your website so that it behaves in the ideal way. Before proceeding then, you should work out whether it’s even worth making the change. We need to work out how many incremental sales you might get in a best-case scenario.
- Rank check all of the keyword phrases you found during the keyword research – this will give you an idea of the opportunity available. The keywords for which you don’t currently rank on page 1 are that opportunity.
- Take the search volume of these opportunity keywords and apply a CTR – the percentage of total searchers that you expect to click through to your website if you ranked on page 1. There are a huge number of CTR studies, and most of them are fundamentally flawed, but you can assume that for position 5 or above you’ll get between 5% (conservative) and 30% (liberal) of people clicking through to your site.
- Merge this with commerce data about these products – take an average conversion rate and AOV by category.
This should give you enough to work out how much money you’re currently losing by not ranking for these keywords. Obviously it’s very unlikely that you’ll rank at the top of page 1 for every single key phrase, so stick to the conservative end and set a goal like “rank top 5 positions for 10% of these keywords”. If potential revenue is high enough to justify redeveloping the category system, start putting together a functional specification.
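The arithmetic behind these steps can be sketched in a few lines. This is a simplified model rather than a forecast: it assumes page 1 means positions 1 to 10, applies a single CTR, conversion rate, and AOV to every keyword, and uses made-up volumes. All names and figures are illustrative.

```python
def opportunity_keywords(keywords):
    """Keep keywords that do not currently rank on page 1 (assumed 1-10)."""
    return [
        k for k in keywords
        if k["position"] is None or k["position"] > 10
    ]


def estimate_monthly_revenue(keywords, ctr, conversion_rate, aov, share_ranked):
    """Searches -> clicks -> orders -> revenue for the opportunity keywords.

    share_ranked is the fraction of opportunity search volume you expect
    to win a top-5 ranking for (a goal, not a prediction).
    """
    volume = sum(k["volume"] for k in opportunity_keywords(keywords))
    clicks = volume * share_ranked * ctr
    orders = clicks * conversion_rate
    return orders * aov


# Illustrative numbers only.
keywords = [
    {"phrase": "dkny evening dresses", "volume": 1000, "position": None},
    {"phrase": "red evening dresses", "volume": 5000, "position": 14},
    {"phrase": "evening dresses", "volume": 20000, "position": 3},
]
conservative = estimate_monthly_revenue(
    keywords, ctr=0.05, conversion_rate=0.02, aov=150.0, share_ranked=0.10
)
liberal = estimate_monthly_revenue(
    keywords, ctr=0.30, conversion_rate=0.02, aov=150.0, share_ranked=0.10
)
print(round(conservative, 2), round(liberal, 2))
```

With these sample figures the conservative estimate is about 90 per month and the liberal one about 540. The wide gap between the two is a reminder to present a range rather than a single number.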
Automate the business case
A tool we have been using to do this recently is SEOMonitor – this is a rank tracking and market intelligence tool that will do all of the above steps for you, and then monitor your progress against the target over time.
The functional spec is a document that lays out everything the system should and should not do. It’s used by your development team to build the system, and by you to check that they’ve followed your instructions correctly.
The overarching goal of this project is to maximise the number of useful pages that are indexable and discoverable, while minimising the number of useless pages that search engines can crawl and index.
Useful and Useless pages
A useful page is one which a user may be searching Google for. Your keyword research will show which pages need to be indexable and which should not be, but as a general rule useful pages are:
- Those that have one or zero attributes per filter group selected (for instance, one brand is selected but no styles, as in “DKNY dresses”).
- Those that have no more than a maximum number of filter groups selected – e.g. “Size 14 DKNY Evening dresses” contains three groups: size, brand, and style.
- Those with enough products for sale for that specific combination of filters to make a good quality page. There’s no point in creating an indexable page if you only have one or two products – you’ll only be creating a terrible user experience.
Useless pages then are those that:
- Have more than one attribute for a filter group selected – “DKNY or Karen Millen Evening Dresses”, “DKNY Red or Blue Dresses” etc.
- Have more than a maximum number of groups selected – “Size 14 DKNY Red Strapless Silk Evening Dresses” has 6 groups selected. We want these to be non-indexable and non-discoverable because they’re too specific – a user will normally search for the core attributes that they care about. Beyond a certain level, you are unlikely to have enough products to create a decent landing page, and users are unlikely to ever search Google for a term that specific.
- Are sorted or ordered differently from the default view (e.g. price low to high), as these are effectively duplicates of the default view. Do not confuse this with a price attribute – in some sectors attributes such as “under £20” are legitimate and quite popular with users, so should be indexable.
- Have a de facto attribute selected. For instance, in the recruitment industry all searches for jobs are assumed to be full-time by default. Therefore, users are unlikely to ever search for “full time marketing jobs” – pages with this attribute selected are useless, but pages with the “part time” attribute tend to be useful.
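These rules translate fairly directly into a single check that developers can run for every filtered page. The sketch below is one possible reading of them; the thresholds (`max_groups=3`, `min_products=3`), the `FilterState` shape, and the way de facto attributes are listed are all assumptions that your keyword research should set, not fixed values.

```python
from dataclasses import dataclass


@dataclass
class FilterState:
    """Filters applied to a category page (hypothetical shape)."""
    selected: dict            # group name -> list of selected values
    product_count: int        # products matching this combination
    sort_order: str = "default"


def is_useful(state, max_groups=3, min_products=3, de_facto=frozenset()):
    """Return True if the page should be indexable and discoverable."""
    active = {g: vals for g, vals in state.selected.items() if vals}
    if state.sort_order != "default":
        return False  # re-sorted views duplicate the default view
    if any(len(vals) > 1 for vals in active.values()):
        return False  # more than one value within a filter group
    if len(active) > max_groups:
        return False  # too specific to be searched for
    if state.product_count < min_products:
        return False  # too few products for a good landing page
    if any((g, vals[0]) in de_facto for g, vals in active.items()):
        return False  # attribute users assume by default
    return True


three_groups = FilterState(
    {"size": ["14"], "brand": ["dkny"], "style": ["evening"]}, 20
)
two_brands = FilterState({"brand": ["dkny", "karen millen"]}, 20)
print(is_useful(three_groups), is_useful(two_brands))
```

Pages that fail the check can still exist for users; the result only decides whether search engines should be able to reach and index them.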
Content must change
The content on every filtered category page must change to reflect the facets that are selected. At a minimum, this means the page title, meta description, and H1; however you should also include functionality to add custom content (this may be necessary for particularly competitive queries).
Red Dresses – /dresses?colour=red
Red DKNY Dresses – /dresses?colour=red&brand=dkny
Remember that the main content of the page (what a user has come to see) is the product listings themselves. All of the other textual content on the page is there to reassure the user that they have landed on the right page, and to give relevancy signals to search engines so that they know what the page is about.
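One way to make the content follow the filters is to build the title, H1, and meta description from the selected values with a template. A minimal sketch, assuming one value per filter group; the group order, the site name “Example Shop”, and the description wording are placeholders rather than recommendations.

```python
# Assumed display order for attribute groups in headings.
HEADING_ORDER = ("size", "colour", "brand", "style")


def build_heading(category, selected):
    """Join the selected values (in a fixed order) with the category name."""
    parts = [selected[group] for group in HEADING_ORDER if group in selected]
    return " ".join(parts + [category])


def build_page_content(category, selected, site_name="Example Shop"):
    """Return the text elements that should change with the filters."""
    heading = build_heading(category, selected)
    return {
        "h1": heading,
        "title": f"{heading} | {site_name}",
        "meta_description": f"Browse our range of {heading}.",
    }


content = build_page_content("Dresses", {"colour": "Red", "brand": "DKNY"})
print(content["h1"])
print(content["title"])
```

Templates like this cover the minimum; particularly competitive pages would still need a way to override them with custom content.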
Links to other filtered pages
Links to a page’s children and parents should exist. i.e. From the “Red Dresses” page, there should be followable links to the parent (“Dresses”), and children (“Red DKNY Dresses”). Depending on your product, it may also be beneficial to include links to sibling pages (“Blue Dresses”).
However, links to filtered pages with multiple values from the same attribute (or to any other useless page) need to be treated differently. An example is a link from the “Red Dresses” page to “Red AND Blue Dresses”. Because users almost never search Google for two values in this way, we don’t want search engines to crawl or index these pages. There are two ways to handle this:
- Use a nofollow attribute on the links themselves, so that users can see pages filtered with two (or more) values from an attribute, but search engines cannot.
- Completely remove links that add a second value from the same attribute; i.e. show the colour filter, but remove it once a colour has been selected.
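Both options can be expressed as a small rule applied when the filter links are rendered. The sketch below assumes that clicking a filter link adds that value to the current selection; the function name, the `strategy` values, and the returned dictionary are illustrative.

```python
def facet_link_policy(selected, group, strategy="nofollow"):
    """Decide how to render a link that adds a value from `group`.

    selected maps a group name to the set of values already chosen.
    strategy is "nofollow" (keep the link, add rel="nofollow") or
    "hide" (do not render the link at all).
    """
    if not selected.get(group):
        # No value chosen in this group yet: the target is a child page,
        # so the link should be visible and followable.
        return {"show": True, "rel": None}
    # A value from this group is already selected, so the target page
    # would combine two values from the same attribute.
    if strategy == "hide":
        return {"show": False, "rel": None}
    return {"show": True, "rel": "nofollow"}


on_red_dresses = {"colour": {"red"}}
print(facet_link_policy(on_red_dresses, "brand"))
print(facet_link_policy(on_red_dresses, "colour"))
print(facet_link_policy(on_red_dresses, "colour", strategy="hide"))
```

Hiding the filter is simpler to reason about, while the nofollow option keeps multi-select available to users; which one fits depends on how your customers actually shop.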
Strict URL ordering
To allow these pages to be indexed, the URL is going to have to contain the selected attributes – this can either be done within the path (/dresses/red/dkny/), or within the parameters (/dresses?colour=red&brand=dkny). There are SEO arguments to do it either way, but more often than not, the method is determined by the capabilities of the CMS to handle these.
When setting the URL, it is vitally important that attributes are always ordered in the same way, regardless of the order the user selected the attributes. For instance, if a user visits the “Dresses” page and selects the “DKNY” brand, they should be sent to the same URL as if they visited the “DKNY” page and selected the “Dresses” attribute.
The order that parameters are in is not particularly important – the important part is that they are always in that order. Therefore, an easy and computationally-cheap method is to order them alphabetically – either by the attribute name, or the value itself. As alphabetical ordering never changes, a certain combination of attributes or values will always end up in the same order.
Using inconsistent URL ordering will result in duplication issues.
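Alphabetical ordering by attribute name is straightforward to implement with Python's standard `urllib.parse`. This is a minimal sketch for the parameter-based URL style; the function names are illustrative, and a real system would typically redirect any non-canonical ordering to the canonical one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def canonical_url(base_path, selected):
    """Build a URL with parameters sorted by attribute name, then value."""
    query = urlencode(sorted(selected.items()))
    return f"{base_path}?{query}" if query else base_path


def normalise(url):
    """Rewrite an incoming URL so its parameters are in canonical order."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    return parts.path + (f"?{query}" if query else "")


# The same selection made in a different order ends up at one URL.
first = canonical_url("/dresses", {"colour": "red", "brand": "dkny"})
second = canonical_url("/dresses", {"brand": "dkny", "colour": "red"})
print(first)
print(first == second)
print(normalise("/dresses?colour=red&brand=dkny"))
```

The same idea applies to path-based URLs: sort the attributes with one fixed rule before joining them into the path.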
Test, test, and re-test
As mentioned at the beginning of this guide, this is an extremely high-risk change for many sites. If implemented incorrectly, there’s a pretty good chance you’ll only do damage to your organic visibility.
Therefore it’s vital to test at every stage of development – get a baseline, crawl after every revision of the staging site, re-crawl the live site (to see how the baseline changed) before deployment, test after deployment, and again a week later. There are a number of website crawlers on the market that will do this, and maintain historical logs of any symptoms so that you can root out issues. We use DeepCrawl for this job.
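Whatever crawler you use, the comparison between a baseline and a later crawl comes down to a set difference. A minimal sketch, assuming each crawl has been exported as a mapping of URL to whether the page was indexable; this shape is hypothetical and is not the export format of DeepCrawl or any other tool.

```python
def compare_crawls(baseline, current):
    """Compare two crawls, each mapping URL -> True if indexable."""
    base_ok = {url for url, ok in baseline.items() if ok}
    curr_ok = {url for url, ok in current.items() if ok}
    return {
        "lost": sorted(base_ok - curr_ok),    # indexable before, not now
        "gained": sorted(curr_ok - base_ok),  # newly indexable
    }


# Illustrative data only.
baseline = {"/dresses": True, "/dresses?colour=red": False}
current = {
    "/dresses": True,
    "/dresses?colour=red": True,
    "/dresses?colour=blue&colour=red": True,
}
report = compare_crawls(baseline, current)
print(report)
```

In this sample the newly indexable “Red Dresses” page is the intended result, while the indexable page with two colours selected is exactly the kind of regression that a re-crawl should catch before deployment.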
- We did a webinar on this topic recently. Watch it here.
- We originally presented this topic at BrightonSEO April 2015. See the presentation deck.
Get in touch
We’re a specialist SEO agency based in Farringdon, London. We regularly help ecommerce businesses to identify and address critical issues within their site architecture – get in touch to see how we can help.