In August 2017, Google’s John Mueller said:
Some developers use JavaScript to create menus, grab attention, pull in products or services, or validate forms.
Googlebot crawls, renders, and indexes web content so that it can appear on search engine results pages (SERPs).
In the crawling phase, Googlebot discovers URLs by following links on websites.
Before crawling a page, Googlebot first reads the site’s robots.txt file to check whether crawling is allowed. If the file disallows the page, Googlebot skips making an HTTP request to it.
After that, Googlebot parses the HTML in the response to find new URLs.
This process is known as “crawling,” and it’s how Googlebot finds new pages to index.
The third and final phase is indexing, where Googlebot adds the rendered content to its search index so that it can appear in relevant search results.
This can be done in many ways, such as:
- Ensure that Googlebot has access to all the website’s resources, such as CSS and image files
- Make sure the website’s structure is easy for Googlebot to understand
- Use proper techniques to lazy-load content
It’s primarily concerned with:
- Ensuring that all website content is accessible to Googlebot and other search engine crawlers
- Creating a map of the website for Googlebot (and other crawlers) to follow
- Providing Googlebot with resources it needs to crawl and index the website, such as CSS and image files
- Making sure your web pages are discoverable by creating and submitting a sitemap
- Ensuring that your website’s structure is easy for Googlebot to understand
- Rendered content
- Lazy-loaded images
- Page load times
This is an app shell model.
The app shell model operates as a template and is the foundation of all Progressive Web Applications (PWAs).
The two lookup tools operate as browser extensions.
You can install either of them, load up the page, and then click on the extension to see what technology that particular website or application is built on.
You can also use “View Source” or “Inspect Element” in the browser to check for JS code.
Here are some popular JS frameworks you might find:
- Angular (Created and owned by Google)
- React (Created and owned by Facebook)
- Vue (Created by former Google employee Evan You)
Here’s the default template for Angular, produced by Google:
The template looks like a typical web page.
It has text, images, and links.
Now, let’s see what’s under the hood:
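The original snippet isn’t reproduced here, but a minimal sketch of what an Angular CLI index.html typically looks like (file names vary by build) is:

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>AngularApp</title>
</head>
<body>
  <!-- The only visible element: Angular injects the entire UI in here -->
  <app-root></app-root>
  <script src="runtime.js" defer></script>
  <script src="polyfills.js" defer></script>
  <script src="main.js" defer></script>
</body>
</html>
```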
As you can see in this HTML document, the code is almost devoid of any content.
We only see the app root and a few script tags.
That’s because the main content is dynamically injected into the DOM via JS.
So, What Are the SEO Issues to Fix?
The content is rendered to the user but not to search engine bots, as you can see.
That means search engine bots can’t fully crawl your content. This puts the website at risk of being overlooked in favour of your competitors.
We’ll be discussing this in detail in the later sections of this article.
Common Reasons Google Finds it Hard to Index JS Content
- The content is not rendered on the initial page load
- The pages time out before Googlebot has had enough time to crawl and index them
We’ll be going through each of these problems and uncovering possible solutions.
Developers love the language and its frameworks because they allow them to build interactive web pages that users enjoy.
And yes, both sides are right.
The good thing is that there’s always a middle ground and a way the two can work together productively to create websites that are user-friendly and easy to crawl and index by search engine bots.
Have you ever wondered how Google discovers new links?
They crawl the links they find on web pages.
Google recommends using HTML tags with HREF attributes to link web pages as a best practice. They also recommend using descriptive anchor texts for the hyperlinks:
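For example, a plain anchor with an href attribute and descriptive anchor text (URL illustrative):

```html
<!-- Crawlable: a standard <a> tag with an href and descriptive anchor text -->
<a href="/products/running-shoes">Lightweight running shoes for marathons</a>
```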
Of course, there are other ways to hyperlink text.
But Google doesn’t recommend them – like div or span or JS event handlers.
These are what’s popularly referred to as “pseudo links.”
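Illustrative examples of such pseudo links (the goTo handler is hypothetical):

```html
<!-- Pseudo links: no crawlable href for Googlebot to follow -->
<span onclick="goTo('running-shoes')">Lightweight running shoes</span>
<a onclick="goTo('running-shoes')">Lightweight running shoes</a>
<div class="link" data-url="/products/running-shoes">Lightweight running shoes</div>
```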
And Google makes it clear in their official guidelines that these links can’t be crawled.
However, an independent third-party study seems to suggest otherwise. The study suggests Googlebots may be able to crawl these links.
Still, we would recommend you keep your links as static HTML elements.
The SEO Issues to Fix:
If Google can’t crawl a link and follow it to your key pages, those pages will likely miss out on valuable internal links.
Search engines rely on internal links to crawl and index your pages efficiently.
So, should it turn out the links are implemented incorrectly, Google will have a hard time discovering the new pages on your site (outside your XML sitemap).
Check the example below.
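Here is an illustrative sketch of scroll-event lazy loading, the pattern at issue (class and attribute names are hypothetical):

```html
<img class="lazy" data-src="product.jpg" alt="Product photo">
<script>
  // Lazy loading driven by scroll events: nothing loads until the user scrolls.
  window.addEventListener('scroll', () => {
    document.querySelectorAll('img.lazy').forEach((img) => {
      if (img.getBoundingClientRect().top < window.innerHeight) {
        img.src = img.dataset.src;   // swap in the real image
        img.classList.remove('lazy');
      }
    });
  });
</script>
```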
Googlebot is advanced enough to support lazy loading. But it’s slow and doesn’t scroll like a human user.
When crawling your web content, Googlebot resizes its virtual viewport to be very tall instead of scrolling. As a result, the scroll event listener is never triggered, and the crawler never gets to render the lazy-loaded content.
Here’s a piece of content we’d consider SEO-friendly:
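A sketch of the Intersection Observer approach (selectors and class names are illustrative):

```html
<img class="lazy" data-src="product.jpg" alt="Product photo">
<script>
  // Load each image only when it actually enters the viewport.
  const observer = new IntersectionObserver((entries, obs) => {
    for (const entry of entries) {
      if (entry.isIntersecting) {
        const img = entry.target;
        img.src = img.dataset.src; // swap in the real image
        obs.unobserve(img);        // stop watching once loaded
      }
    }
  });
  document.querySelectorAll('img.lazy').forEach((img) => observer.observe(img));
</script>
```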
The Intersection Observer API triggers a callback in this code whenever the observed element becomes visible.
As you can also see, the code is more flexible and robust compared to the on-scroll event listener.
The modern Googlebot also supports it.
And it works because of how Googlebot resizes its viewport to view your content.
Alternatively, try using the browser’s native lazy loading (the loading="lazy" attribute). Google Chrome supports this, though the feature is still in the experimental stage.
Googlebot may ignore it, but the image will still load anyway.
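With native lazy loading, the markup needs no scripting at all (attribute values are illustrative):

```html
<!-- Browser-native lazy loading: the browser decides when to fetch the image -->
<img src="product.jpg" loading="lazy" alt="Product photo" width="640" height="480">
```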
The SEO Issues to Fix: The idea is to get Google to see all the content on your page, images included.
For instance, on an ecommerce website with multiple rows and columns of product listings, lazy-loading images can provide faster experiences for both bots and users.
We all know that a slow-loading page can potentially affect your search engine ranking.
Fortunately for you, there’s a way to mitigate the issue and move your site a few notches up the search engine ranks:
The SEO Issues to Fix: It’s not just about search engines. A slow-loading website is bad for the user experience, and that’s bad for SEO too.
Googlebot will defer loading JS to save resources.
You want to make sure any content you serve to the client is coded and delivered efficiently if you want to rank.
For Single Page Applications (SPAs) that utilise router packages, such as Vue Router or React Router, it helps to take extra steps to handle things like changing meta-tags when navigating between different router views.
Usually, this is handled by an npm package such as react-meta-tags or vue-meta.
What’s a router view?
Here’s how linking between webpages works in a React single-page app, in five easy steps:
- Whenever a user visits a React application, the browser sends a GET request to the server for the index.html file.
- The server will respond by sending the requested index.html file to the user/client. The HTML file sent will contain the React scripts and Router to launch.
- The application will then be loaded on the client’s browser.
- Should the user click on a link to go to a new page, another request will be sent to the server to get the new URL.
- But before the request reaches the server, it will be intercepted by the React Router. It will then proceed to handle the page changes itself. This happens locally, changing the client-side URL and updating the rendered React components.
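The interception in step 5 can be sketched in plain JavaScript (a schematic, not React Router’s actual implementation; element names are hypothetical):

```html
<a href="/products" data-spa-link>Products</a>
<div id="app-root"></div>
<script>
  // Step 5 in miniature: intercept the click, update the URL locally with the
  // History API, and re-render -- no new HTML document is requested.
  document.addEventListener('click', (event) => {
    const link = event.target.closest('a[data-spa-link]');
    if (!link) return;
    event.preventDefault();                               // stop the full page load
    history.pushState({}, '', link.getAttribute('href')); // change the client-side URL
    renderView(location.pathname);                        // swap the rendered components
  });

  function renderView(path) {
    document.getElementById('app-root').textContent = 'View for ' + path;
  }
</script>
```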
Here’s the difference:
When a user or search engine bot follows a link on a React website or app, they won’t be served with multiple static HTML files.
Rather, they’ll be served React components, such as the header, body, and footer, mounted inside the single index.html file.
All these components will be re-organised to display different content, hence the name Single Page Applications (SPAs).
The SEO Issues to Fix: Use a package like React Helmet to ensure the user gets served with unique metadata for each page or “view” when browsing the SPA.
Otherwise, search engine bots may crawl the same metadata for all the pages. Or worse, they might crawl nothing at all.
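A sketch of how React Helmet is typically used per view (the ProductPage component and its props are hypothetical):

```jsx
import React from 'react';
import { Helmet } from 'react-helmet';

// Each route renders its own <Helmet>, so every "page" of the SPA
// gets a unique title, description, and canonical tag.
function ProductPage({ product }) {
  return (
    <div>
      <Helmet>
        <title>{product.name} | Example Store</title>
        <meta name="description" content={product.summary} />
        <link rel="canonical" href={`https://example.com/products/${product.slug}`} />
      </Helmet>
      <h1>{product.name}</h1>
    </div>
  );
}

export default ProductPage;
```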
Ecommerce sites are the perfect examples of dynamic content injected via JS. For example, an online store will load products on its category pages.
That makes a lot of sense, considering inventories aren’t static. They’re in a constant state of flux whenever you make a sale.
But can Googlebot see this content?
Your online conversions may depend on it. Not having your products indexed by Google bots could prove disastrous.
- Use Google’s webmaster tools to view and visualise your pages as Googlebot sees them. Do this for every page on your website.
- Use Google Chrome’s built-in DevTools to debug the page. Compare and contrast what users see (the rendered code) and what Google bots see (the source code).
The two must align. If not, do something about it.
Note that there are third-party plugins to help you out with this. We’ll be diving into them in the later section of this article.
Google Webmaster Tools You Can Use to Determine if Google is Experiencing Technical Difficulties When Attempting to Render Your Webpages
Compare and contrast the content being served on your web browser and what’s being displayed in the tools.
Note that the two tools mentioned above both use the same evergreen Chromium rendering engine as Google Search. That means they’ll give you an accurate visual representation of what Googlebot sees when crawling your website.
Alternatively, you can use a third-party technical SEO tool like Merkle fetch-render.
Unlike the two Google tools we’ve mentioned, this tool will give you a full-size screenshot of how Googlebots view your web application or website.
Use Google’s Site Search Operator
Alternatively, just run the standard site search operator.
Simply copy-paste a line or two of text from the page you suspect Google hasn’t indexed, place it after the site: operator and your domain name, and press Enter. For example: site:example.com "a sentence from the page".
If the page doesn’t show up in the Google search result pages, then the page has yet to be indexed.
If the page shows up on Google search, that should mean that Google can find, crawl, render, and index your page content.
Here’s what the results should look like in the SERPs:
An even easier way would be to copy the page’s URL and Google search it.
The page should show up in the SERPs.
Google Chrome Dev Tools
You can also test and debug your web application for SEO issues using the built-in dev functionality in the Chrome web browser.
You can begin by right-clicking anywhere on the webpage.
An options menu will pop up. Head to the bottom of the menu and click on “View Page Source.” A static HTML document will load in a new tab.
Also, after right-clicking on the page, you can click on “Inspect Element” to view the content that’s loaded in the DOM, including JS.
You can compare the two views to see if your core content is only loaded in the DOM (inspect element) but not coded in the source code (view page source).
These implementations include:
Server-Side Rendering (SSR)
This rendering process happens in real-time, with users and search engine bots receiving the same treatment.
Pros and Cons:
- It provides a fast FCP (First Contentful Paint)
- Slow TTFB (Time to First Byte): The server renders the content on the fly instead of pre-rendering them and only loading them on request.
Hybrid rendering is exactly as the name suggests. It combines client-side and server-side rendering.
It will render the core content on the server-side before sending it to the client. It will then proceed to offload any additional resources to the client.
We suggest you use an open-source software solution called Rendertron to implement dynamic rendering.
This is only a workaround and not something you want to adopt permanently. It might sound like cloaking, but Google doesn’t consider it so, as long as the dynamically rendered content served to bots matches what users see.
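A minimal sketch of the routing decision behind dynamic rendering (the user-agent list and the render.example.com endpoint are illustrative, not a maintained crawler list):

```javascript
// Crude user-agent check -- real setups use a maintained list of crawler UAs.
const BOT_PATTERN = /googlebot|bingbot|baiduspider|yandex|twitterbot|facebookexternalhit/i;

function isBot(userAgent) {
  return BOT_PATTERN.test(userAgent || '');
}

// Decide where a request should go: crawlers get the pre-rendered page from a
// Rendertron instance; everyone else gets the normal client-side app (null).
function renderTargetFor(userAgent, path) {
  const RENDERTRON_URL = 'https://render.example.com/render/'; // hypothetical deployment
  return isBot(userAgent)
    ? RENDERTRON_URL + encodeURIComponent('https://example.com' + path)
    : null;
}
```

In practice this check sits in server or CDN middleware, in front of the app’s normal response.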
- It’s easier and faster to implement
- All the elements search engine bots need to crawl your website or app are available on the first request
- Complicates debugging issues
Debugging becomes a little complicated once you’ve allowed your website to load content dynamically.
Incremental Static Regeneration
In this workaround, static content is updated after the site has already been deployed. You can use a framework like Nuxt.js for Vue or Next.js for React.
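A sketch of the Next.js flavour of this idea (the data fetcher is a stand-in; in a real Next.js page the function would be written as export async function getStaticProps):

```javascript
// Hypothetical data fetcher -- stands in for a real database or API call.
async function fetchProduct(slug) {
  return { slug, name: 'Demo product' };
}

// Incremental Static Regeneration: `revalidate: 60` asks the framework to
// re-build this page's static HTML in the background at most once every
// 60 seconds after deployment, so crawlers always get full static HTML.
async function getStaticProps({ params }) {
  const product = await fetchProduct(params.slug);
  return {
    props: { product },
    revalidate: 60, // seconds between background re-renders
  };
}

module.exports = { getStaticProps };
```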
In so doing, your site gets all the SEO benefits of server-side rendering without the need for server management.
However, some of these solutions aren’t easy to implement once you’ve built your web infrastructure.
Note, for websites running on a content management system such as WordPress, Joomla, or Shopify, this shouldn’t be an issue as most of them are already designed with SEO in mind.
| | Server-side Rendering | Static SSR | SSR with Rehydration | CSR with Prerendering | Full CSR |
|---|---|---|---|---|---|
| Overview | A web application where the input is a navigation request and the output is HTML | An application built as a SPA (Single Page Application), but with pages pre-rendered as static HTML at build time | An application built as a SPA; the server pre-renders each page but also boots the full app on the client | A SPA whose initial skeleton/shell is pre-rendered to static HTML at build time | A SPA in which all logic, rendering, and booting happen on the client; the HTML is essentially just script and style tags |
| Authoring | Entirely server-side | Built to mimic the client side; everything happens as if it were client-side | Built as client-side | Client-side | Client-side |
| Server role | Controls every aspect | Only delivers static HTML | Renders the page | Only delivers static HTML | Only delivers static HTML |
| SEO pros | All the required SEO elements are available on the initial request | The required SEO elements are available on the initial request | All the required SEO elements can be available on the initial request | The initial shell may contain the required SEO elements | None: the required SEO elements only appear after client-side rendering |
| SEO cons | None | Make sure outdated content isn’t cached | Make sure outdated content isn’t cached | The implementation might forget to populate the required SEO elements | Everything |
Of course, each method comes with its share of pros and cons.
The thing with SEO is that you don’t want to take risks with anything. Search engine crawlers have come a long way, but unless you make an effort to simplify things, they won’t see a fully optimised site on their initial crawl.
Even the most advanced search engines like Google can’t handle it well enough.
Things get even worse with smaller search engines like Bing, Baidu, Naver, etc.
Let’s pick the best setup for bots and for the groups of users who struggle most with heavy JavaScript (users with low-powered phones, for example).
An example setup would be to serve most users the client-side rendered version, while sending the fully rendered static HTML version to search engine bots, social media crawlers, and users on slow mobile phones, old browsers, or without JS.
The Minimal Requirements for Initial HTML Response
Here’s a simple list of SEO basics:
- Your application’s title and metatags
- Structured data markup
- Crawling and indexing directives, hreflang annotations, and canonical references
- All textual content, including a semantic heading structure (H1–H6)
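Put together, a minimal initial HTML response might look like this (all names and URLs are illustrative):

```html
<!doctype html>
<html lang="en">
<head>
  <title>Lightweight Running Shoes | Example Store</title>
  <meta name="description" content="Breathable, lightweight running shoes.">
  <meta name="robots" content="index, follow">
  <link rel="canonical" href="https://example.com/products/running-shoes">
  <link rel="alternate" hreflang="de" href="https://example.com/de/products/running-shoes">
  <script type="application/ld+json">
    { "@context": "https://schema.org", "@type": "Product", "name": "Lightweight Running Shoes" }
  </script>
</head>
<body>
  <h1>Lightweight Running Shoes</h1>
  <h2>Why runners love them</h2>
  <p>All core textual content should already be present here.</p>
</body>
</html>
```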
Lazy Loading: Lazy loading is one of the best practices in modern performance optimization, but for the Google Discover feed and SERP thumbnails, Googlebot prefers a noscript version of your images.
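A common pattern is to pair the lazy-loaded image with a noscript fallback (paths and class names illustrative):

```html
<img class="lazy" data-src="hero.jpg" alt="Hero image">
<noscript>
  <!-- Fallback crawlers can always see, even if the lazy-loading script never runs -->
  <img src="hero.jpg" alt="Hero image">
</noscript>
```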
Monitoring Your Third-Party Rendering Setup
While third-party rendering tools like prerender.io might help fix some, if not all, of these SEO issues, they’re not perfect either.
They might break, and when they do, all your SEO effort will come crashing down.
It doesn’t take much: when Amazon’s infrastructure goes down, most third-party rendering tools go offline with it.
We suggest you also start considering where these third-party tools host their services.
Another trick you might use to ensure Google bots don’t index empty pages is to start serving 503 headers. You then load the page and have it send a signal back to the server once it finishes loading, updating the status code.
You have to be careful with this approach so as not to ruin your page rank completely.
It’s more of a band-aid workaround than a permanent solution to your SEO problem.
You also want to go through their troubleshooting guide before proceeding with anything else.