
Most people these days understand the general idea of how search engines work. Search engines like Google send out automated bots to scan or “crawl” all the pages on a website, before using their algorithms to sort through which sites are best for specific search queries. 

What few outside Google knew until recently is that the search engine uses two different methods to crawl websites – one that specifically seeks out new content and another that revisits content already in its search index.

Google Search Advocate John Mueller revealed this recently during one of his regular Search Central SEO office-hours chats on January 7th.

During this session, an SEO professional asked Mueller about the crawling behavior he had observed from Googlebot on his website.

Specifically, the user says Googlebot used to crawl his site daily back when he was publishing content frequently. Since publishing has slowed, he has noticed Googlebot crawling the site less often.

As it turns out, Mueller says this is quite normal and is the result of how Google approaches crawling web pages.

How Google Crawls New vs. Old Content

Mueller acknowledges there are several factors that can contribute to how often Google crawls different pages on a website – including what type of pages they are, how new they are, and how Google understands your site overall.

“It’s not so much that we crawl a website, but we crawl individual pages of a website. And when it comes to crawling, we have two types of crawling roughly.

One is a discovery crawl where we try to discover new pages on your website. And the other is a refresh crawl where we update existing pages that we know about.”

These different types of crawling target different types of pages, so it is reasonable that they also occur more or less frequently depending on the type of content.

“So for the most part, for example, we would refresh crawl the homepage, I don’t know, once a day, or every couple of hours, or something like that.

And if we find new links on their home page then we’ll go off and crawl those with the discovery crawl as well. And because of that you will always see a mix of discover and refresh happening with regard to crawling. And you’ll see some baseline of crawling happening every day.

But if we recognize that individual pages change very rarely, then we realize we don’t have to crawl them all the time.”
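If you want to see this mix of discovery and refresh activity on your own site, the Crawl Stats report in Search Console is the usual place to look, but raw server logs work too. Below is a rough sketch (assuming a standard combined-format access log and a simple user-agent match, rather than full reverse-DNS crawler verification) that counts Googlebot requests per URL per day:

```python
# Rough sketch: count Googlebot requests per (day, URL) from an access log.
import re
from collections import Counter

LOG_LINE = re.compile(r'\[(?P<day>[^:]+):[^\]]+\] "(?:GET|HEAD) (?P<path>\S+)')

crawls = Counter()
with open("access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" not in line:
            continue
        match = LOG_LINE.search(line)
        if match:
            crawls[(match.group("day"), match.group("path"))] += 1

# URLs that show up day after day are likely being refresh crawled;
# brand-new URLs appearing for the first time were just discovered.
for (day, path), hits in sorted(crawls.items()):
    print(f"{day}  {path}  {hits}")
```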

The takeaway here is that Google adapts to your site according to your own publishing habits. Which type of crawling it uses, and how frequently, is not an inherently good or bad indicator of your website’s health, and your focus should be (as always) on providing the smoothest online sales experience for your customers.

Nonetheless, it is interesting to know that Google has made this adjustment to how it crawls content across the web and to speculate about how this might affect its ranking process.

To hear Mueller’s full response (including more details about why Google crawls some sites more often than others), check out the video below:

If your site is offline for more than a couple of days, you could be at risk of having your pages deindexed, according to Google Search Advocate John Mueller.

It should go without saying that the less downtime your website experiences, the better. Still, some downtime is unavoidable thanks to maintenance, updates, redesigns, and other issues which can be entirely out of your hands.

This inevitably raises the question of exactly how long is too long for your site to be offline. At what point does this begin to hurt your rankings?

After years of debate, we finally have an official answer from Google courtesy of John Mueller during the most recent Google Search Central SEO office hours session.

How Long is Too Long to Be Offline?

The topic arose when an SEO specialist named Aakash Singh asked Mueller what can be done to minimize the loss of rankings or search performance while his client’s website undergoes an expected week of downtime.

The bad news is that a week is simply too long for a site to be offline without experiencing negative side effects. In fact, Mueller says sites can start having pages deindexed after being down for just a few days.

John Mueller On How Site Downtime Impacts Rankings

Beginning his response, Mueller explains how Google “sees” sites that are experiencing downtime.

“For an outage of maybe a day or so, using a 503 result code is a great way to tell us that we should check back. But after a couple of days we think this is a permanent result code, and we think your pages are just gone, and we will drop them from the index.”

“And when the pages come back we will crawl them again and we will try to index them again. But it’s essentially during that time we will probably drop a lot of the pages from the website from our index, and there’s a pretty good chance that it’ll come back in a similar way but it’s not always guaranteed.”

The general message is that sites should minimize downtime, even when using the proper redirects or status codes.
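For a planned outage, the 503-plus-Retry-After approach Mueller describes can be wired up at the application level. Here is a minimal sketch, assuming a Flask app and a hypothetical MAINTENANCE flag you flip during the outage window:

```python
# Minimal sketch: during planned maintenance, answer every request with a 503
# and a Retry-After header so crawlers know the outage is temporary.
from flask import Flask, Response

app = Flask(__name__)
MAINTENANCE = True  # hypothetical switch; flip to False when the site is back

@app.before_request
def maintenance_mode():
    if MAINTENANCE:
        resp = Response(
            "<h1>Down for maintenance</h1><p>We'll be back shortly.</p>",
            status=503,
            mimetype="text/html",
        )
        # Retry-After takes seconds (or an HTTP date); here: check back in an hour.
        resp.headers["Retry-After"] = "3600"
        return resp
```

In practice, serving the 503 from the web server or CDN in front of the application is usually preferable, since the whole point is to keep answering even when the backend itself is down.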

Mueller does leave us with a suggestion for avoiding the worst fallout from downtime, but he still emphasizes the importance of getting a site back up as quickly as possible:

“… that could be something like setting up a static version of the website somewhere and just showing that to users for the time being. But especially if you’re doing this in a planned way I would try to find ways to reduce the outage to less than a day if at all possible.”

To hear Mueller’s full explanation, check out the recording from the December 10th SEO office hours session below:

A few weeks ago, Google teased that it planned to refine its PageSpeed Insights tool to make data “more intuitive” and easier to understand. Now, that update has arrived.

What Is The PageSpeed Insights Tool?

If you’re unfamiliar, the PageSpeed Insights tool from Google evaluates your web pages to provide suggestions to improve how quickly content loads. 

The tool has been around in various forms since 2013, when it was a simple API webmasters could use to test their page speeds. Version 5, the most recent major update, arrived in 2018. However, smaller updates like this week’s happen somewhat regularly.

Along with this new update, Google has moved the PageSpeed Insights tool to a new home at https://pagespeed.web.dev/.

What Is New In The PageSpeed Insights Tool?

The biggest focus of the new update is a change to the user interface to be more intuitive by “clearly differentiating between data derived from a synthetic environment and data collected from users in the field.”

To do this, Google has added dedicated sections for each type of data.

Where the tool used to include a label specifying which type of data you were viewing, Google has instead added information about what the data means for you and how it may be used to improve your performance.

Additionally, Google has shifted its emphasis to data collected from real users by moving field data to the top.

The Core Web Vitals assessment has also been expanded, with a label showing whether your site passes the assessment in the field, alongside in-depth metrics from simulated environments.
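That lab/field split also shows up if you pull the data programmatically rather than through the web UI. A minimal sketch, assuming the v5 runPagespeed endpoint and the Python requests library (the response field names shown are our reading of the API and may change):

```python
import requests

# v5 runPagespeed endpoint; url and strategy are the key parameters, and an
# optional "key" parameter accepts an API key for higher quotas.
API = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
params = {"url": "https://www.example.com/", "strategy": "mobile"}  # hypothetical target

data = requests.get(API, params=params, timeout=60).json()

# "lighthouseResult" holds the lab (synthetic) data; "loadingExperience" holds
# field data from real Chrome users and may be absent for low-traffic pages.
lab_score = data["lighthouseResult"]["categories"]["performance"]["score"]
field_metrics = data.get("loadingExperience", {}).get("metrics", {})

print("Lab performance score:", lab_score)
print("Field metrics reported:", sorted(field_metrics))
```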

Importantly, the PageSpeed Insights tool also includes details at the bottom of the page specifying how the data was collected in the field. This information includes:

  • Data collection period
  • Visit durations
  • Devices
  • Network connections
  • Sample size
  • Chrome versions

Lastly, Google has removed the single screenshot that previously showed how Google rendered your content, replacing it with a series of images displaying the full loading sequence.

For more, read the announcement for the update from Google’s Web.Dev blog.

Google’s Page Experience Algorithm update is officially coming to some desktop search results, beginning in February of next year.

Google Search product manager Jeffrey Jose teased this news earlier this year at the annual I/O event. At the time, however, details about when it would be rolled out and how it would be implemented were scarce. Now, we have the full rundown.

What Is Google’s Page Experience Algorithm?

The Page Experience Algorithm was originally rolled out exclusively for searches coming from mobile devices earlier this year, but the search engine confirmed it will be bringing much of the algorithm to desktop searches. This includes the much-talked-about “Core Web Vitals” metrics which are intended to ensure a good user experience on sites.

As the announcement says:

“This means the same three Core Web Vitals metrics: LCP, FID, and CLS, and their associated thresholds will apply for desktop ranking. Other aspects of page experience signals, such as HTTPS security and absence of intrusive interstitials, will remain the same as well.”

However, one notable signal from the mobile Page Experience Algorithm will not be coming to desktop search results for obvious reasons: mobile-friendliness.

To accompany the new search signal, Google says it is working on a new Search Console report dedicated to showing how your desktop pages stack up when this algorithm is applied to them. The report’s release date is not yet known, but most expect it to arrive before or at the same time as the algorithm update.

For more information, read the full announcement here.

It is no secret that Google knows the price you, your competitors, and even the shady third-party companies charge for your products or services. In some cases, you might even directly tell the company how much you charge through Google’s Merchant Center. So, it is reasonable to think that the search engine might also use that information when it is ranking brands or product pages in search results.

In a recent livestream, however, Google Webmaster Trends Analyst John Mueller denied the idea.

What John Mueller Has To Say About Price as a Google Ranking Signal

The question arose during an SEO Office-Hours hangout on October 8, which led to Mueller explaining that while Google can access this information, it does not use it when ranking traditional search results.

As he says in the recording of the discussion:

“Purely from a web search point of view, no, it’s not the case that we would try to recognize the price on a page and use that as a ranking factor.

“So it’s not the case that we would say we’ll take the cheaper one and rank that higher. I don’t think that would really make sense.”

At the same time, Mueller says he can’t speak on how products in shopping results (which may be shown in regular search results) are ranked. 

Within shopping search results, users can manually select to sort their results by price. Whether it is used as a factor the rest of the time isn’t something Mueller can answer:

“A lot of these products also end up in the product search results, which could be because you submit a feed, or maybe because we recognize the product information on these pages, and the product search results I don’t know how they’re ordered.

“It might be that they take the price into account, or things like availability, all of the other factors that kind of come in as attributes in product search.”
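The “product information on these pages” Mueller refers to is typically communicated through structured data. As a purely illustrative sketch (all values are hypothetical), this is roughly what schema.org Product markup carrying price and availability looks like when generated in Python:

```python
# Illustrative sketch: schema.org Product markup with price and availability.
# The output would be embedded in the page inside a
# <script type="application/ld+json"> tag.
import json

product_markup = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",   # hypothetical product name
    "sku": "WIDGET-001",        # hypothetical SKU
    "offers": {
        "@type": "Offer",
        "price": "19.99",       # hypothetical price
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

print(json.dumps(product_markup, indent=2))
```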

Price Is And Isn’t A Ranking Factor

At the end of the day, Mueller doesn’t work in the areas related to product search, so he really can’t say whether price is a ranking factor within those areas of Google. This potentially includes cases where product results are shown within normal search results pages.

What he can say for sure is that within traditional web search results, Google does not use price to rank results:

“So, from a web search point of view, we don’t take price into account. From a product search point of view it’s possible.

“The tricky part, I think, as an SEO, is these different aspects of search are often combined in one search results page. Where you’ll see normal web results, and maybe you’ll see some product review results on the side, or maybe you’ll see some mix of that.”

You can hear Mueller’s full response in the recording from the October 8, 2021, Google SEO Office Hours hangout below:

Google has confirmed that it is sometimes replacing page titles in search results with other copy it finds more relevant. As Danny Sullivan, public liaison for Google Search, explains:

“Last week, we introduced a new system of generating titles for web pages. Before this, titles might change based on the query issued. This generally will no longer happen with our new system. This is because we think our new system is producing titles that work better for documents overall, to describe what they are about, regardless of the particular query.”

In plain English, this means Google is rewriting the title tags accompanying web pages in some search results – often replacing them with other text from your page. This is not the first time Google has adjusted the title tags shown in search results, but it is definitely the most extensive rewriting the search engine has done.

According to Sullivan, the goal of this is to highlight the most relevant content for users and focus on content that users can “visually see”: 

“Also, while we’ve gone beyond HTML text to create titles for over a decade, our new system is making even more use of such text. In particular, we are making use of text that humans can visually see when they arrive at a web page. We consider the main visual title or headline shown on a page, content that site owners often place within <H1> tags, within other header tags, or which is made large and prominent through the use of style treatments.”

Does This Mean HTML Title Tags Don’t Matter?

If Google is going to just replace the tags put on pages, why should we even bother? There are a few reasons.

Firstly, the title tags will still provide their traditional SEO value by helping the search engine understand your page.

Secondly, Google is not rewriting the majority of search results titles. According to Sullivan, Google will show the original HTML title tags in more than 80% of cases. The system will only revise title tags it believes are too long, stuffed with irrelevant keywords, or generic boilerplate.

“In some cases, we may add site names where that is seen as helpful. In other instances, when encountering an extremely long title, we might select the most relevant portion rather than starting at the beginning and truncating more useful parts.”
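While you can’t opt out of the rewriting, you can audit your own pages for the problems Sullivan lists: missing, overlong, or boilerplate titles, and titles that diverge from the visible headline. A rough sketch, assuming the requests and beautifulsoup4 packages and a hypothetical 60-character length cutoff:

```python
# Rough sketch: flag pages whose <title> is missing, overly long, or
# disconnected from the visible <h1> headline.
import requests
from bs4 import BeautifulSoup

def audit_title(url: str, max_length: int = 60) -> dict:
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")

    title = (soup.title.string or "").strip() if soup.title else ""
    h1 = soup.h1.get_text(strip=True) if soup.h1 else ""

    return {
        "url": url,
        "title": title,
        "h1": h1,
        "title_missing": not title,
        "title_too_long": len(title) > max_length,  # rough proxy for display width
        "title_matches_h1": bool(title) and bool(h1) and (
            h1.lower() in title.lower() or title.lower() in h1.lower()
        ),
    }

print(audit_title("https://www.example.com/"))  # hypothetical page to check
```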

What This Means For You

Since there is no way of opting out of this system, there is nothing for brands to change moving forward. 

The biggest changes from this will instead be in reporting, where some pages may see increased or decreased click-through rates due to changed titles in search results. 

For more, read the full statement from Google and Danny Sullivan here.

One of the most frustrating aspects of search engine optimization is the time it takes to see results. In some cases, you can see changes start to appear in Google’s search results in just a few hours. In others, you can spend weeks waiting for new content to be indexed with no indication of when Google will get around to your pages.

In a recent AskGooglebot session, Google’s John Mueller said this huge variation in the time it takes for pages to be indexed is to be expected for a number of reasons. However, he also provides some tips for speeding up the process so you can start seeing the fruits of your labor as soon as possible.

Why Indexing Can Take So Long

In most cases, Mueller says sites that produce consistently high quality content should expect to see their new pages get indexed within a few hours to a week. In some situations, though, even high quality pages can take longer to be indexed due to a variety of factors.

Technical issues can pop up which can delay Google’s ability to spot your new pages or prevent indexing entirely. Additionally, there is always the chance that Google’s systems are just tied up elsewhere and need time to get to your new content.

Why Google May Not Index Your Page

It is important to note that Google does not index everything. In fact, there are plenty of reasons the search engine might not index your new content.

For starters, you can just tell Google not to index a page or your entire site. It might be that you want to prioritize another version of your site or that your site isn’t ready yet. 
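If you do want to keep a page or an entire environment (for example, a staging copy of your site) out of the index deliberately, the robots meta tag and the X-Robots-Tag response header are the standard tools. A minimal sketch, assuming Flask and a hypothetical STAGING flag:

```python
# Minimal sketch: keep a not-yet-ready version of the site out of the index
# by adding an X-Robots-Tag header to every response.
from flask import Flask

app = Flask(__name__)
STAGING = True  # hypothetical switch for the staging environment

@app.after_request
def add_noindex_header(response):
    if STAGING:
        # Applies to every response, keeping the whole staging site unindexed.
        response.headers["X-Robots-Tag"] = "noindex, nofollow"
    return response
```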

The search engine also excludes content that doesn’t bring sufficient value. This includes duplicate content, malicious or spammy pages, and websites which mirror other existing sites.

How To Speed Up Indexing

Thankfully, Mueller says there are ways to help speed up indexing your content.

  • Prevent server overloading by ensuring your server can handle the traffic coming to it. This ensures Google can get to your site in a timely manner. 
  • Use prominent internal links to help Google’s systems navigate your site and understand what pages are most important.
  • Avoid unnecessary URLs to keep your site well organized and easy for Google to spot new content.
  • Publish consistently high-quality content that provides real value for users. The more important Google thinks your site is to people online, the higher priority your new pages will be for indexing and ranking.

For more about how Google indexes web pages and how to speed up the process, check out the full AskGooglebot video below:

In a Google Search Central SEO session recently, Google’s John Mueller shed light on a way the search engine’s systems can go astray – keeping pages on your site from being indexed and appearing in search. 

Essentially the issue comes from Google’s predictive approach to identifying duplicate content based on URL patterns, which has the potential to incorrectly identify duplicate content based on the URL alone. 

Google uses the predictive system to increase the efficiency of its crawling and indexing by skipping over content that is just a copy of another page. By leaving these pages out of the index, Google is less likely to show repetitive content in its search results and can reach other, more unique content more quickly.

Obviously the problem is that content creators could unintentionally trigger these predictive systems when publishing unique content on similar topics, leaving quality content out of the search engine. 

John Mueller Explains How Google Could Misidentify Duplicate Content

In a response to a question from a user whose pages were not being indexed correctly, Mueller explained that Google uses multiple layers of filters to weed out duplicate content:

“What tends to happen on our side is we have multiple levels of trying to understand when there is duplicate content on a site. And one is when we look at the page’s content directly and we kind of see, well, this page has this content, this page has different content, we should treat them as separate pages.

The other thing is kind of a broader predictive approach that we have where we look at the URL structure of a website where we see, well, in the past, when we’ve looked at URLs that look like this, we’ve seen they have the same content as URLs like this. And then we’ll essentially learn that pattern and say, URLs that look like this are the same as URLs that look like this.”

He also explained how these systems can sometimes go too far and Google could incorrectly filter out unique content based on URL patterns on a site:

“Even without looking at the individual URLs we can sometimes say, well, we’ll save ourselves some crawling and indexing and just focus on these assumed or very likely duplication cases. And I have seen that happen with things like cities.

I have seen that happen with things like, I don’t know, automobiles is another one where we saw that happen, where essentially our systems recognize that what you specify as a city name is something that is not so relevant for the actual URLs. And usually we learn that kind of pattern when a site provides a lot of the same content with alternate names.”

How Can You Protect Your Site From This?

While Google’s John Mueller wasn’t able to provide a foolproof solution or prevention for this issue, he did offer some advice for sites that have been affected:

“So what I would try to do in a case like this is to see if you have this kind of situations where you have strong overlaps of content and to try to find ways to limit that as much as possible.

And that could be by using something like a rel canonical on the page and saying, well, this small city that is right outside the big city, I’ll set the canonical to the big city because it shows exactly the same content.

So that really every URL that we crawl on your website and index, we can see, well, this URL and its content are unique and it’s important for us to keep all of these URLs indexed.

Or we see clear information that this URL you know is supposed to be the same as this other one, you have maybe set up a redirect or you have a rel canonical set up there, and we can just focus on those main URLs and still understand that the city aspect there is critical for your individual pages.”
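In practice, Mueller’s rel=canonical suggestion boils down to telling Google which of your near-duplicate location pages is the “main” one. A minimal sketch of that idea, assuming a hypothetical mapping from small-city URLs to the larger-city page that carries the same content:

```python
# Minimal sketch: emit a rel=canonical tag for near-duplicate location pages.
# CANONICAL_MAP and the URLs are hypothetical; the generated tag belongs in
# each page's <head>.
CANONICAL_MAP = {
    "/service-area/springfield-heights/": "/service-area/springfield/",
    "/service-area/north-springfield/": "/service-area/springfield/",
}

def canonical_tag(path: str, base_url: str = "https://www.example.com") -> str:
    # Pages without an entry canonicalize to themselves (they are unique).
    target = CANONICAL_MAP.get(path, path)
    return f'<link rel="canonical" href="{base_url}{target}">'

print(canonical_tag("/service-area/springfield-heights/"))
print(canonical_tag("/service-area/springfield/"))
```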

It should be clarified that duplicate content or pages impacted by this problem will not hurt the overall SEO of your site. So, for example, having several pages tagged as being duplicate content won’t prevent your home page from appearing for relevant searches. 

Still, the issue has the potential to gradually decrease the efficiency of your SEO efforts, not to mention making it harder for people to find the valuable information you are providing. 

To see Mueller’s full explanation, watch the video below:

With the announcement that Google will begin including the “Core Web Vitals” (CWV) metrics in its search engine algorithm starting next year, many are scrambling to make sense of what exactly these metrics measure and how they work.

Unlike metrics such as “loading speed” or “dwell time” which are direct and simple to understand, Core Web Vitals combine a number of factors which can get very technical.

To help you prepare for the introduction of Core Web Vitals as a ranking signal next year, Google is sharing a comprehensive guide to what the CWV metrics measure and how they can affect your website.

What Are Core Web Vitals?

The first thing to understand is what exactly Core Web Vitals are. Simply put, CWV are a combination of three specific metrics assessing your page’s loading speed, interactivity, and visual stability. These metrics appear very technical at first, but the gist is that your site needs to load quickly and provide a stable, easy-to-use experience. As for the specifics, Core Web Vitals include the three metrics below (a quick pass/fail sketch follows the list):

  • Largest Contentful Paint (LCP): Measures loading performance. To provide a good user experience, sites should strive to have LCP occur within the first 2.5 seconds of the page starting to load.
  • First Input Delay (FID): Measures interactivity. To provide a good user experience, sites should strive to have an FID of less than 100 milliseconds.
  • Cumulative Layout Shift (CLS): Measures visual stability. To provide a good user experience, sites should strive to have a CLS score of less than 0.1.
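Expressed as a quick check, those thresholds amount to the following (the metric values here are hypothetical; in practice they would come from field data such as the Chrome UX Report or the PageSpeed Insights API):

```python
# Quick pass/fail check against Google's published Core Web Vitals thresholds.
def passes_core_web_vitals(lcp_seconds: float, fid_ms: float, cls: float) -> dict:
    results = {
        "LCP": lcp_seconds <= 2.5,  # loading: within 2.5 seconds
        "FID": fid_ms <= 100,       # interactivity: under 100 milliseconds
        "CLS": cls <= 0.1,          # visual stability: score under 0.1
    }
    results["passes"] = all(results.values())
    return results

# Hypothetical example values for a reasonably fast, stable page:
print(passes_core_web_vitals(lcp_seconds=2.1, fid_ms=80, cls=0.05))
```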

Importantly, in the new guide, Google reaffirmed its intention to start using Core Web Vitals as a ranking signal in 2021. 

“Starting May 2021, Core Web vitals will be included in page experience signals together with existing search signals including mobile-friendliness, safe-browsing, HTTPS-security, and intrusive interstitial guidelines.”

Does Every Page Need To Meet CWV Standards?

In the help document, Google explains that the Core Web Vitals standards it set out should be seen as a mark to aim for, but not necessarily a requirement for good ranking. 

Q: Is Google recommending that all my pages hit these thresholds? What’s the benefit?

A: We recommend that websites use these three thresholds as a guidepost for optimal user experience across all pages. Core Web Vitals thresholds are assessed at the per-page level, and you might find that some pages are above and others below these thresholds. The immediate benefit will be a better experience for users that visit your site, but in the long-term we believe that working towards a shared set of user experience metrics and thresholds across all websites, will be critical in order to sustain a healthy web ecosystem.

Will Core Web Vitals Make or Break Your Site?

It is unclear exactly how strongly Core Web Vitals metrics will be able to affect your site when they are implemented, but Google’s current stance suggests they will be a significant part of your ranking.

Q: How does Google determine which pages are affected by the assessment of Page Experience and usage as a ranking signal?

A: Page experience is just one of many signals that are used to rank pages. Keep in mind that intent of the search query is still a very strong signal, so a page with a subpar page experience may still rank highly if it has great, relevant content.

Other Details

Among the Q&A, Google also gives a few important details on the scope and impact of Core Web Vitals.

Q: Is there a difference between desktop and mobile ranking? 

A: At this time, using page experience as a signal for ranking will apply only to mobile Search.

Q: What can site owners expect to happen to their traffic if they don’t hit Core Web Vitals performance metrics?

A: It’s difficult to make any kind of general prediction. We may have more to share in the future when we formally announce the changes are coming into effect. Keep in mind that the content itself and its match to the kind of information a user is seeking remains a very strong signal as well.

The full document covers a wide range of technical issues which will be relevant for any web designer or site manager, but the big picture remains the same. Google has been prioritizing sites with the best user experience for years, and the introduction of Core Web Vitals only advances that effort. 

Find out more about Core Web Vitals here.

Google is adding a new set of ranking signals to its search engine algorithm in the coming year, according to an announcement this week. 

The search engine says it will begin factoring “Core Web Vitals” as a ranking signal starting in May 2021, combining with already existing user experience-related ranking signals. 

Google has been measuring Core Web Vitals since earlier this year, assessing the speed, responsiveness, and stability of web pages. 

These factors are what Google calls the Core Web Vitals:

  • Largest Contentful Paint (LCP): Measures loading performance. To provide a good user experience, sites should strive to have LCP occur within the first 2.5 seconds of the page starting to load.
  • First Input Delay (FID): Measures interactivity. To provide a good user experience, sites should strive to have an FID of less than 100 milliseconds.
  • Cumulative Layout Shift (CLS): Measures visual stability. To provide a good user experience, sites should strive to have a CLS score of less than 0.1.

These signals will be joining the already announced page experience signals:

  • Mobile-friendliness
  • Safe-browsing
  • HTTPS-security
  • Intrusive interstitial guidelines

As the announcement explains:

“These signals measure how users perceive the experience of interacting with a web page and contribute to our ongoing work to ensure people get the most helpful and enjoyable experiences from the web.”

Based on recent data assessments, this should concern the majority of websites out there. A study published in August suggests less than 15% of all websites would pass a Core Web Vitals assessment if the signals were implemented today. 

The search engine has also hinted at the potential to introduce new labels in search results, highlighting pages with the best user experience. Though nothing is set in stone, this would provide even more motivation for pages trying to maintain the best place in search results. 

For more information about updating your site for Core Web Vitals, you can explore Google’s resources and tools here.