Posts

If you operate a website that frequently adds or changes pages – such as an e-retail or publishing site – you’ve probably noticed it can take Google a while to reflect your new content in search results.

This has led to widespread speculation about just how frequently Google indexes pages and why it seems like some types of websites get indexed more frequently than others.

In a recent Q&A video, Google’s John Mueller took the time to answer this directly. He explains how Googlebot prioritizes the pages it considers more “important” while limiting excessive strain on servers. But, in typical Google fashion, he isn’t giving away everything.

The question posed was:

“How often does Google re-index a website? It seems like it’s much less often than it used to be. We add or remove pages from our site, and it’s weeks before those changes are reflected in Google Search.”

Mueller starts by explaining that Google takes its time to crawl the entirety of a website: continuously crawling entire sites within a short period of time would put unnecessary strain on the server. Because of this, Googlebot has a limit on the number of pages it can crawl each day.

Instead, Googlebot focuses on pages that should be crawled more frequently, such as home pages or high-level category pages. These pages get crawled at least every few days, but it sounds like less-important pages (individual blog posts, perhaps) may take considerably longer to be crawled.

You can watch Mueller’s response below or read the quoted statement underneath.

“Looking at the whole website all at once, or even within a short period of time, can cause a significant load on a website. Googlebot tries to be polite and is limited to a certain number of pages every day. This number is automatically adjusted as we better recognize the limits of a website. Looking at portions of a website means that we have to prioritize how we crawl.

So how does this work? In general, Googlebot tries to crawl important pages more frequently to make sure that most critical pages are covered. Often this will be a website’s home page or maybe higher-level category pages. New content is often mentioned and linked from there, so it’s a great place for us to start. We’ll re-crawl these pages frequently, maybe every few days, maybe even much more frequently depending on the website.”
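To picture how a per-day crawl limit and page-importance prioritization might interact, here is a minimal sketch in Python. It is purely illustrative: the priority scores, the daily budget, and names like `plan_daily_crawl` are assumptions made for the example, not anything Google has published about Googlebot’s internals.

```python
import heapq

# Hypothetical per-day crawl budget and page-type priorities; Google's real
# numbers aren't public, so these values are purely illustrative.
DAILY_CRAWL_BUDGET = 500

PAGE_PRIORITY = {
    "home": 0,      # lower number = crawled sooner
    "category": 1,
    "article": 2,
    "archive": 3,
}

def plan_daily_crawl(pages, budget=DAILY_CRAWL_BUDGET):
    """Choose which pages to crawl today.

    `pages` is a list of (url, page_type, days_since_last_crawl) tuples.
    Important page types go to the front of the queue, with a small nudge
    for pages that haven't been visited in a while; anything past the
    daily budget simply waits for a later day.
    """
    queue = []
    for url, page_type, days_stale in pages:
        priority = PAGE_PRIORITY.get(page_type, 4) - 0.01 * days_stale
        heapq.heappush(queue, (priority, url))

    return [heapq.heappop(queue)[1] for _ in range(min(budget, len(queue)))]

if __name__ == "__main__":
    sample = [
        ("https://example.com/", "home", 2),
        ("https://example.com/widgets/", "category", 3),
        ("https://example.com/blog/an-old-post", "article", 30),
    ]
    print(plan_daily_crawl(sample, budget=2))
```

Run as-is, the example crawls the home page and the category page first and leaves the older blog post for another day, which mirrors the prioritization Mueller describes.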


Google is continuing its efforts to promote privacy in search by prioritizing indexing HTTPS pages over their HTTP equivalents.

In the announcement, Google explains its long-term aim is to eventually direct users to secure webpages over a private connection. The decision to index the HTTPS version of a page when an HTTP equivalent also exists is its most recent move in this process, following the small rankings boost given to HTTPS pages last year.

Unlike the change to Google’s algorithm in August 2014, this move will not have any effect on rankings. Instead, it simply means that Googlebot will index only the HTTPS version of a URL when both an HTTPS and an HTTP version exist.
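For site owners who want to check which of their HTTP URLs have a usable HTTPS counterpart, a rough check might look like the Python sketch below. The helper name and the success criteria (a 200 response that doesn’t redirect back to plain HTTP) are simplifying assumptions for illustration, not Google’s published crawling rules.

```python
import requests  # third-party HTTP client: pip install requests

def https_equivalent(http_url, timeout=10):
    """Return the HTTPS version of an HTTP URL if it appears to be served.

    This is a deliberately rough check: swap the scheme, fetch the page,
    and make sure it answers successfully without bouncing back to HTTP.
    """
    if not http_url.startswith("http://"):
        return None
    candidate = "https://" + http_url[len("http://"):]
    try:
        response = requests.get(candidate, timeout=timeout, allow_redirects=True)
    except requests.RequestException:
        return None  # no usable HTTPS version (TLS error, timeout, etc.)
    if response.status_code == 200 and response.url.startswith("https://"):
        return candidate
    return None

if __name__ == "__main__":
    print(https_equivalent("http://example.com/widgets"))
```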

While Google’s commitment to secure search may lead to more rankings boosts for HTTPS pages in the future, this change is mostly meant to improve the efficiency of Google’s current indexing process. As Google explains in the announcement:

“Browsing the web should be a private experience between the user and the website, and must not be subject to eavesdropping, man-in-the-middle attacks, or data modification. This is why we’ve been strongly promoting HTTPS everywhere.”

Googlebot is Google’s automated program for crawling and indexing content on the web. In the realm of SEO, the first part of good optimization is crafting textual content that is visible and makes sense to Googlebot. After Googlebot indexes a page, the Google algorithm takes the page’s text and ranks it on the search results page according to the terms the user enters into Google search. If your optimized website performs well for the term “electronic widgets,” for example, the algorithm will place your site near or at the top of the results whenever someone searches Google for “electronic widgets.”

But did you know that, in addition to automated components like Googlebot and the algorithm, Google also uses human site raters in the ranking of websites?

Google employs hundreds of site raters who rate a huge number of websites on relevancy. The input collected from this team doesn’t directly influence the search results, but it does guide Google’s engineers as they change the algorithm to serve more relevant results to search engine users.

In this great video, Google senior software engineer Matt Cutts demystifies this process by explaining how human website raters are used to test changes to the Google algorithm. Essentially, after a change to the automatic search ranking is made, Google runs many test queries and evaluates what has changed in the results. The new search results are checked against the results from before the change and then presented to the human raters – in what Cutts calls a “blind taste test” – to determine which set of search results is more relevant and useful. Only after analyzing and evaluating the feedback from the human raters are the new search results tested with a small, carefully selected group of Internet users. Only if this last round of surveys on the algorithm change proves the results more accurate and useful will the updated algorithm be integrated into Google Search for the public. It’s an exhaustive process, but that’s how much Google wants its search engine to be the most relevant on the web.
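The core of that “blind taste test” is easy to picture in code: show a rater the before-and-after rankings for the same query in a random, unlabeled order and record which side they prefer. The Python sketch below is only an illustration of that idea; the function and parameter names are invented for the example and have nothing to do with Google’s internal rater tools.

```python
import random

def blind_comparison(query, results_a, results_b, rater):
    """Show two result sets for the same query in a random, unlabeled order
    and record which one the rater prefers.

    `rater` is any callable taking (query, left_results, right_results) and
    returning "left", "right", or "same". Which algorithm produced which
    column is hidden from the rater and only resolved afterwards.
    """
    sides = [("A", results_a), ("B", results_b)]
    random.shuffle(sides)  # the "blind" part of the taste test
    (left_label, left_results), (right_label, right_results) = sides

    choice = rater(query, left_results, right_results)
    if choice == "left":
        return left_label
    if choice == "right":
        return right_label
    return "same"

if __name__ == "__main__":
    # A trivial rater that always prefers the left column, just to show the flow.
    winner = blind_comparison(
        "electronic widgets",
        ["old-result-1", "old-result-2"],
        ["new-result-1", "new-result-2"],
        rater=lambda query, left, right: "left",
    )
    print("Preferred ranking:", winner)
```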

Watch the video here: