SPAs – single-page applications – are awesome, especially for mobile users. They let us give web users the feel and fluidity of a native app, with the ease of updating and deploying a website. They even scale better: since much of the content rendering and template compositing happens on the client side, your server has less work to do.
They are the best of all worlds. Except for SEO.
In Part 1, we looked at the history of SEO and JavaScript and the slow but steady development of SPAs. In this, Part 2, we’ll offer specific guidance to ensure your SPA site is search-friendly. We used these techniques to build the incredibly fast, user-friendly SPA product search for our client, Anvil International. And by design it is search-friendly.
Principles of Good SEO for SPAs
Set the URL using pushState
This part of the HTML5 History API has been supported by all major browsers for over five years now, so you aren’t risking much in the way of compatibility problems.
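Here is a minimal sketch of the idea. The product object, renderProduct(), and the /products/ path are placeholders for whatever your SPA already does; the only essential part is the call to history.pushState.

```js
// Minimal sketch: when the app swaps in a new product view, update the
// address bar so the page has a real, shareable URL.
// product, renderProduct(), and the URL pattern are illustrative.
function showProduct(product) {
  renderProduct(product);
  history.pushState(
    { productId: product.id },                       // state object, returned on popstate
    '',                                              // title argument (ignored by most browsers)
    '/products/' + encodeURIComponent(product.slug)  // the clean, crawlable URL
  );
}
```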
Parse correctly any URLs you generate by pushState
This is an easy one to forget. Half the reason for rewriting the URL is sharing and bookmarking. Make sure your SPA can fully reconstruct a page when handed one of those URLs. If it can't, all you have done is create broken links. Your future self will not thank you.
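A rough sketch of what that means in practice, assuming the same hypothetical /products/ URLs as above: route from the URL alone on a cold load, and again whenever the back/forward buttons fire popstate.

```js
// Minimal sketch: rebuild the correct view from the URL alone, so a
// bookmarked or shared link loads the same page the user originally saw.
// loadAndRenderProduct() and renderHome() stand in for your own render logic.
function routeToView(pathname) {
  const match = pathname.match(/^\/products\/([^/]+)$/);
  if (match) {
    loadAndRenderProduct(decodeURIComponent(match[1])); // hypothetical: fetch data, render view
  } else {
    renderHome();                                       // hypothetical fallback view
  }
}

// Handle a cold start on a deep link...
window.addEventListener('DOMContentLoaded', () => routeToView(location.pathname));
// ...and the back/forward buttons after pushState.
window.addEventListener('popstate', () => routeToView(location.pathname));
```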
Don’t block resources with robots.txt
Don’t block your scripts and style sheets. While this is mostly an issue for existing sites that are adding SPA features, check your robots directives just in case. Don’t forget your .htaccess files.
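As an illustration (the paths here are made up for the example), the thing to check is that no Disallow rule covers your script or stylesheet directories:

```
# Example robots.txt -- paths are illustrative.
# Script and style directories must NOT be disallowed,
# or Google can't render the page.
User-agent: *
Disallow: /admin/
Allow: /js/
Allow: /css/

Sitemap: https://www.example.com/sitemap.xml
```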
Don’t use long-running scripts
Google will time-out scripts that run too long, resulting in an incomplete page rendering and missing content. How long is “too long”? Google doesn’t explicitly say; they just warn against scripts that are “too complex or arcane”. A good rule of thumb is that if, as a site user, you ever notice a slight delay in an action, the script is taking too long. Optimize it.
(Trivia: GoogleBot uses an accelerated JavaScript timer. If any of your scripts are time-dependent, be aware they may malfunction for Google.)
Corollary: Profile your code and parse times. You may be surprised how much time a browser spends just getting your libraries ready to run. JavaScript is expensive.
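A rough-and-ready way to start is the User Timing API: wrap your bootstrap code in marks and see how long it really takes. initApp() here is a placeholder for whatever your SPA does to parse templates, wire up routes, and paint the first view.

```js
// Sketch: measure how long app startup takes from script start to first render.
performance.mark('app-init-start');

initApp();   // hypothetical: your SPA's bootstrap code

performance.mark('app-init-end');
performance.measure('app-init', 'app-init-start', 'app-init-end');

const [measure] = performance.getEntriesByName('app-init');
console.log('App init took ' + measure.duration.toFixed(1) + ' ms');
```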
Verify your site works using Google Search Console’s “Fetch as Google” tool
One of the hidden gems of Search Console is its ability to show you your page as Google sees it. As part of your QA process, use Fetch as Google to make sure your page looks correct and is content-complete. This will help you spot pages whose scripts are too slow or complex for GoogleBot to process.
Use a sitemap.xml to provide a URL list of all content pages on your site
This is really a fallback for any spiders that either do not run JavaScript, or that can’t run your site’s scripts. (This is the equivalent of using both a belt and suspenders to keep your pants from falling down in public.)
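A minimal sitemap looks like this (URLs are illustrative). Ideally, generate it from the same data source your router uses, so it never drifts out of date:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal sitemap listing every content page the SPA can render. -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/products/widget-one</loc>
  </url>
  <url>
    <loc>https://www.example.com/products/widget-two</loc>
  </url>
</urlset>
```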
Be careful not to create duplicate pages
This can happen unexpectedly when you’re building a front-end to a large database of content. For example, let’s say I create a streaming music website (because that’s never been done, right?). The same album page might be reachable through several paths (these URLs are purely illustrative): /artists/some-artist/albums/some-album, /albums/some-album, and /search?album=some-album could all render identical content, which Google may index as competing duplicate pages.
Test in Chrome – If it works in Chrome, it’s reasonable to expect GoogleBot can index it. Verify this with Google Search Console’s “Fetch as Google” tool.
Avoid using hash (#) in URLs – Google doesn’t like to crawl them, even when they’re generated in scripts, and they can create what appear to be duplicate pages.
Bing has supported and recommended pushState since 2013. Bing’s developers haven’t explicitly stated that BingBot runs JavaScript, but since it can’t even discover pushState-generated URLs without running JS, we can safely assume Bing is SPA-friendly. And they got there years before Google. Go figure.
Go Forth and Be JavaScripty
You can do some pretty amazing stuff with JavaScript in the browser. And so long as you follow these guidelines, it’ll work just fine for Google and Bing.