AWS: Optimizing SEO for a SPA with Lambda@Edge and Prerender.io

Single-Page Applications (SPAs) are popular for their fast and seamless user experiences. However, SPAs often struggle with SEO because search engine crawlers can’t easily render JavaScript-heavy pages. This can result in poor indexing and missed traffic opportunities. While frameworks like Next.js offer a built-in solution with server-side rendering (SSR) to address this issue, not all projects are built with these frameworks. Fortunately, services like Prerender.io and AWS Lambda@Edge provide a flexible and powerful alternative to achieve similar results, even without SSR.

In this guide, I’ll walk you through how I used Lambda@Edge to intercept bot requests for my SPA and route them to Prerender.io to generate static, SEO-friendly content.


The Challenge with SPAs

SPAs rely heavily on client-side JavaScript, so crawlers like Googlebot or Twitterbot may see little more than a blank page when they visit. While modern crawlers can render JavaScript to some extent, the results are inconsistent and slow. To fix this, we can serve pre-rendered content from Prerender.io whenever a bot visits the site.

In my case, I was building social preview cards (using OpenGraph meta tags) dynamically for my website. On the front-end, I used the react-helmet library to inject these meta tags for different pages.


What is Prerender.io?

Prerender.io is a service that renders your JavaScript SPA as a static HTML snapshot. When crawlers visit your site, you can serve these static pages, ensuring the bots can easily read your content and index your site correctly.


What is Lambda@Edge?

Lambda@Edge is a feature of Amazon CloudFront that lets you run code at AWS edge locations in response to CloudFront events, such as incoming HTTP requests. This makes it a good fit for intercepting requests from bots and rerouting them to Prerender.io with minimal added latency for regular users. Only crawlers see the pre-rendered pages, while real users still get the dynamic, JavaScript-driven SPA.


Setup

First, ensure that you have deployed your website content to S3 and have it served through CloudFront.

Create an account on Prerender.io and grab the API Key. Since CloudFront integration is not officially supported, you can skip the setup wizard and go straight to AWS to create Lambda@Edge functions. There are two functions to set up: one for a viewer request and one for an origin request.

Viewer Request

A Viewer Request event occurs when CloudFront receives a request from the viewer, such as a browser or bot/crawler. By attaching a Lambda@Edge function to the Viewer Request on our CloudFront Distribution, we can modify the incoming request before it is sent to the origin (like a web server or S3 for the SPA).

Our viewer request lambda function detects whether the request is coming from a bot/crawler based on request headers and adds the following headers:

  • X-Prerender-Token: Set to the API key from Prerender.io
  • X-Prerender-Host: Set to the domain for your website
  • X-Prerender-Injected-Data: In my case, used to inject OpenGraph meta tags via react-helmet (set to true)

The host is also updated to ensure it points to your website, not S3. Make sure to go to your CloudFront distribution and add these custom headers under Cache key and origin requests > Legacy cache settings > Headers.

Lambda@Edge functions must be created in the us-east-1 (N. Virginia) region, so create a new Node.js Lambda function there. The viewer request function looks as follows:

// Viewer request: tag crawler requests so the origin request function can reroute them.
exports.handler = async (event) => {
    const request = event.Records[0].cf.request;
    const headers = request.headers;
    const userAgent = headers['user-agent'];
    const host = headers['host'];
    // The "i" flag makes the match case-insensitive, so mixed-case names
    // such as "Facebot" or "vkShare" are detected as well.
    const crawlers = /baiduspider|Facebot|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator/i;
    if (userAgent && host && crawlers.test(userAgent[0].value)) {
        headers['x-prerender-token'] = [{ key: 'X-Prerender-Token', value: '<Your Prerender.io API Key Here>' }];
        headers['x-prerender-host'] = [{ key: 'X-Prerender-Host', value: host[0].value }];
        headers['host'] = [{ key: 'Host', value: '<Your Domain Here>' }];
        headers['x-prerender-injected-data'] = [{ key: 'X-Prerender-Injected-Data', value: 'true' }];
    }
    // An async handler returns the (possibly modified) request instead of using a callback.
    return request;
};
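
The bot check can also be pulled out into a small pure function, which makes it easy to test locally before publishing a new Lambda version. The following is a minimal sketch, not part of my deployed function: the name isCrawler and the sample user-agent strings are assumptions for illustration.

```javascript
// Hypothetical helper: the same crawler list as the viewer request function,
// matched case-insensitively.
const CRAWLER_PATTERN = /baiduspider|facebot|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkshare|w3c_validator/i;

// Returns true when the User-Agent string looks like one of the listed crawlers.
function isCrawler(userAgent) {
    return typeof userAgent === 'string' && CRAWLER_PATTERN.test(userAgent);
}

// Sample user-agent strings (illustrative only).
console.log(isCrawler('Twitterbot/1.0'));                            // true
console.log(isCrawler('Mozilla/5.0 (Windows NT 10.0; Win64; x64)')); // false
```

Keeping the pattern in one place also makes it simpler to add crawlers later without touching the header logic.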

CloudFront references Lambda@Edge functions by a specific published version, not by $LATEST. After publishing a new version, you need to update the function ARN (including the version number) on the CloudFront behavior. Alternatively, you can use the "Deploy to Lambda@Edge" action in the Lambda console, although this option may be buggy.



Origin Request

An Origin Request event happens right before CloudFront forwards a request to the origin (S3, web server, etc.). We can further modify the request here before it reaches the server.

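This Lambda@Edge function runs at a different stage from the Viewer Request function: it only executes when CloudFront actually forwards the request to the origin (a cache miss). It checks for the headers added by the Viewer Request function and, if they are present, switches the origin to Prerender.io: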

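// Origin request: send tagged crawler requests to Prerender.io instead of the default origin.
export const handler = async (event) => {
    const request = event.Records[0].cf.request;
    const token = request.headers['x-prerender-token'];
    const prerenderHost = request.headers['x-prerender-host'];
    // Both headers are only present when the viewer request function detected a crawler.
    if (token && prerenderHost) {
        request.origin = {
            custom: {
                domainName: 'service.prerender.io',
                port: 443,
                protocol: 'https',
                readTimeout: 20,
                keepaliveTimeout: 5,
                customHeaders: {},
                // TLSv1.2 is included because many HTTPS endpoints no longer accept older versions.
                sslProtocols: ['TLSv1', 'TLSv1.1', 'TLSv1.2'],
                // Prerender.io expects the URL-encoded site URL as the path prefix.
                path: '/https%3A%2F%2F' + prerenderHost[0].value
            }
        };
    }
    return request;
};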
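
To check the origin switch without deploying, the same decision can be written as a pure function and run against a hand-built request object. This is a rough sketch under stated assumptions: rewriteOriginForPrerender, the example.com host, the bucket name, and the trimmed-down request shape are all hypothetical, and the real CloudFront event carries more fields.

```javascript
// Hypothetical pure version of the origin request logic, for local testing only.
function rewriteOriginForPrerender(request) {
    const token = request.headers['x-prerender-token'];
    const host = request.headers['x-prerender-host'];
    if (token && host) {
        request.origin = {
            custom: {
                domainName: 'service.prerender.io',
                port: 443,
                protocol: 'https',
                readTimeout: 20,
                keepaliveTimeout: 5,
                customHeaders: {},
                sslProtocols: ['TLSv1.2'], // assumption: only TLS 1.2 is needed
                path: '/https%3A%2F%2F' + host[0].value,
            },
        };
    }
    return request;
}

// Trimmed-down request objects (placeholder values).
const crawlerRequest = {
    uri: '/blog/my-post',
    headers: {
        'x-prerender-token': [{ key: 'X-Prerender-Token', value: 'placeholder-token' }],
        'x-prerender-host': [{ key: 'X-Prerender-Host', value: 'example.com' }],
    },
    origin: { s3: { domainName: 'my-bucket.s3.amazonaws.com' } },
};
const browserRequest = {
    uri: '/blog/my-post',
    headers: {},
    origin: { s3: { domainName: 'my-bucket.s3.amazonaws.com' } },
};

console.log(rewriteOriginForPrerender(crawlerRequest).origin.custom.path); // /https%3A%2F%2Fexample.com
console.log(Object.keys(rewriteOriginForPrerender(browserRequest).origin)); // [ 's3' ]
```

A request without the two Prerender headers passes through unchanged, which is what keeps regular users on the S3 origin.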

Testing

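Since Lambda@Edge functions execute in different regions depending on the client's location, finding logs can be tricky. Logs are written to CloudWatch in the region closest to the edge location that served the request, in a log group named /aws/lambda/us-east-1.<function name>, so you may need to check several regions in CloudWatch to find the logs for a specific request.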

You can verify the association of your Lambda@Edge functions with your CloudFront Distribution by navigating to Function Associations under the default behavior.


To test, simulate a bot request using a command like:

curl -v -A "twitterbot" <URL to test>        
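
The same check can be scripted so that it also confirms the OpenGraph tags made it into the pre-rendered HTML. The sketch below is only a smoke test: extractOpenGraphTags and checkPreview are hypothetical helper names, the regex is not a full HTML parser (it assumes property appears before content in each meta tag), and the built-in fetch requires Node.js 18 or later.

```javascript
// Hypothetical helper: collect <meta property="og:..." content="..."> pairs
// from an HTML string. Good enough for a smoke test, not a real HTML parser.
function extractOpenGraphTags(html) {
    const tags = {};
    const pattern = /<meta\s+[^>]*property=["'](og:[^"']+)["'][^>]*content=(["'])(.*?)\2[^>]*>/gi;
    for (const match of html.matchAll(pattern)) {
        tags[match[1]] = match[3];
    }
    return tags;
}

// Fetch a page with a crawler User-Agent and print the status and tags found.
async function checkPreview(url) {
    const response = await fetch(url, { headers: { 'User-Agent': 'twitterbot' } });
    const html = await response.text();
    console.log(response.status, extractOpenGraphTags(html));
}

// Usage (placeholder URL):
// checkPreview('https://example.com/some/page');
```

If the printed object is empty for a crawler request but the tags show up in a browser, the request is most likely still being served from the S3 origin rather than Prerender.io.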

If you see a 404 error, it may be because CloudFront is pointing to S3 for a dynamic route. You can set up a custom 404 response to return a 200 status code and use the pre-rendered content as the response body.

For testing social previews, you can use these tools:


By combining AWS Lambda@Edge and Prerender.io, you can keep your SPA fast for users while making it SEO-friendly for crawlers. In my case, I was able to generate dynamic social preview cards. This solution is scalable, low-maintenance, and can noticeably improve your website's discoverability.

