By Jonathan Griffin. Editor, SEO Consultant, & Developer.
Google’s John Mueller responded to a Reddit thread last Thursday, in which a user was having problems with an hreflang implementation. Mueller indicated that some webmasters create pages for all languages in all countries, but that could be the wrong approach. He continued that you should “limit the number of pages you create to those that are absolutely critical & valuable.”
The Reddit question came from a user whose Russian version of a page still contained English content.
The user asked whether it was advisable to block that hreflang page via the robots.txt file, reasoning that the page was not required because its content was already in English.
Mueller responded to this question saying that:
You definitely shouldn’t block / disallow these in robots.txt – if they’re disallowed from crawling, we wouldn’t be able to canonicalize them at all, or see any of the metadata on them.
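To check whether a robots.txt rule would hide hreflang alternates from crawlers, here is a minimal sketch using Python's standard `urllib.robotparser`. The URLs, the `Disallow` rule, and the helper name `blocked_alternates` are hypothetical. Python's parser does not reproduce Google's matching rules exactly, so treat the result as a rough check rather than a definitive answer.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def blocked_alternates(robots_txt: str, alternate_urls: List[str],
                       user_agent: str = "Googlebot") -> List[str]:
    """Return the alternate URLs that the given robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in alternate_urls
            if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that blocks the Russian section of a site.
robots_txt = """\
User-agent: *
Disallow: /ru/
"""

# Hypothetical hreflang alternates for one page.
alternates = [
    "https://example.com/en/page",
    "https://example.com/ru/page",
]

print(blocked_alternates(robots_txt, alternates))
```

Any URL this returns is one whose hreflang and canonical metadata could not be read, which is the situation Mueller warns against.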
Mueller criticized the approach of creating pages for all languages and all countries, and even different languages for each specific country, such as Swahili in Japan. He implied that this might be the cause of the Redditor’s issues.
He indicated a lot of these pages were created “because [they] can,” and probably get very “little traffic” and “add very little value.” He continued that they “add a significant overhead (crawling, indexing, canonicalization, ranking, maintenance, hreflang, structured data, etc.).”
Mueller then proceeded to provide advice on hreflang implementation, which I shall summarize as follows:
- You should limit the number of language and country pages you create to those that are “absolutely critical & valuable.”
- Focus on pages where you see wrong-language traffic. These are often pages that “get a lot of global, branded queries.” You can check this in Google Analytics by going to Audience -> Geo -> Language. The example below shows a site with a lot of non-English-language traffic that may benefit from hreflang, although India may not:
- Mueller admitted that deciding what counts as “critical & valuable” is not easy. There is a balance between “save effort by thinking” and “just do it everywhere.”
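The "wrong-language traffic" check in the list above can be sketched in a few lines of Python. This assumes you have exported rows of (page, page language, visitor language, sessions) from your analytics tool. That row layout, the `min_sessions` threshold, and the function name are hypothetical and not part of any Google Analytics API.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

Row = Tuple[str, str, str, int]  # (page, page_lang, visitor_lang, sessions)


def wrong_language_share(rows: Iterable[Row],
                         min_sessions: int = 100) -> List[Tuple[str, float]]:
    """Rank pages by the share of sessions whose language differs from the page."""
    totals = defaultdict(int)
    mismatched = defaultdict(int)
    for page, page_lang, visitor_lang, sessions in rows:
        totals[page] += sessions
        # Compare primary subtags only, so "en-gb" counts as a match for "en".
        if visitor_lang.lower().split("-")[0] != page_lang.lower().split("-")[0]:
            mismatched[page] += sessions
    # Skip low-traffic pages, where the share would be mostly noise.
    shares = {page: mismatched[page] / total
              for page, total in totals.items() if total >= min_sessions}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data for illustration only.
rows = [
    ("/", "en", "en-us", 600),
    ("/", "en", "ru-ru", 300),
    ("/", "en", "de-de", 100),
    ("/pricing", "en", "en-gb", 450),
    ("/pricing", "en", "fr-fr", 50),
    ("/about", "en", "ru", 20),
]

for page, share in wrong_language_share(rows):
    print(f"{page}: {share:.0%} of sessions in another language")
```

Pages at the top of this ranking are the candidates most likely to benefit from hreflang. Pages that are filtered out or show a low share are probably not worth the overhead Mueller describes.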
You can read the full reply by John Mueller on Reddit below:
It’s easy to dig into endless pits of complexity with hreflang. “Let’s create all languages! Let’s make pages for all countries! What if someone in Japan wants to read it in Swahili? Let’s make even more pages!” My guess is most of these “pages created because you can” get very little traffic, add very little value, and they add a significant overhead (crawling, indexing, canonicalization, ranking, maintenance, hreflang, structured data, etc.).
My recommendation would be first to limit the number of pages you create to those that are absolutely critical & valuable – maybe that already cuts the pages you’re thinking about. Think big here; if you’re talking about individual pages within a medium-sized site, it’s probably a non-issue. On the other hand, if you’re considering copying your whole site into 20 languages x 10 countries, that’s something else.
Past that, for hreflang, I’d focus first on pages where you’re seeing wrong-language traffic – often these are pages that get a lot of global, branded queries, where it’s hard to determine which language content they want. A search for “google” can match a lot of language pages, hreflang can help to differentiate. On the other hand, a search for “search engine” is pretty clear & matches pages where you write about “search engine” already, so pages like that don’t need as much help being language-targeted. That said, sometimes the balance between “save effort by thinking” and “just do it everywhere” is not that straightforward to determine :).
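For the handful of pages that do warrant language targeting, the annotations themselves are simple `<link rel="alternate" hreflang="...">` elements. Below is a minimal sketch that builds them for one page. The URLs and the helper name `hreflang_links` are hypothetical, and it assumes the same full set of links is placed in the `<head>` of every listed alternate.

```python
from html import escape
from typing import Dict, List, Optional


def hreflang_links(alternates: Dict[str, str],
                   default_lang: Optional[str] = None) -> List[str]:
    """Build hreflang link tags from a mapping of language code to URL."""
    links = [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        for lang, url in sorted(alternates.items())
    ]
    if default_lang is not None:
        # x-default marks the fallback page for users matching no listed language.
        links.append(
            f'<link rel="alternate" hreflang="x-default" '
            f'href="{escape(alternates[default_lang])}" />'
        )
    return links


# Hypothetical alternates for a single high-traffic, branded page.
page_alternates = {
    "en": "https://example.com/en/",
    "ru": "https://example.com/ru/",
}

for tag in hreflang_links(page_alternates, default_lang="en"):
    print(tag)
```

Keeping the mapping small and explicit, rather than generating every language and country combination, is in line with the advice quoted above.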
It is not the first time Mueller has weighed in on the complexities of hreflang. In February 2018, Mueller said that “hreflang is one of the most complex aspects of SEO (if not the most complex one)”:
Mueller continued in the same Twitter thread that learning the intricacies of hreflang requires a “multi-semester course,” rather than one of his usual “5 minutes videos”: