Running Playwright on AWS Lambda: Setup and Best Practices

Introduction

Playwright is a powerful framework for automating web browsers, commonly used for end-to-end testing and web scraping. Running Playwright in a serverless environment like AWS Lambda offers enticing benefits: you only pay when your code runs, it scales automatically, and you avoid managing servers (source).

This guide will walk you through how to run Playwright on AWS Lambda, with step-by-step setup instructions, code examples, troubleshooting tips, and performance best practices to keep your serverless automation running smoothly.

Imagine an architecture where a Lambda function launches a headless browser (via Playwright) to perform actions, returning results like a screenshot or scraped data either directly or by storing them, for example, in an S3 bucket. This type of serverless pipeline is ideal for periodic tests or on-demand scraping. Let’s dive in.


Why Run Playwright on AWS Lambda?

Benefits

  • No servers to manage: AWS Lambda is serverless, so you avoid maintaining EC2 instances or containers.
  • Cost efficiency: Pay only for the compute time you use. For infrequent tasks such as nightly tests or occasional scraping, this is often cheaper than keeping a server running (source).
  • Automatic scaling: Lambda can easily scale to run tasks in parallel, helping reduce execution times (source).
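  • Isolation: Concurrent invocations run in separate execution environments, so one browser session cannot interfere with another running at the same time. A warm environment can be reused by later invocations, so close pages and contexts you no longer need.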

Challenges

Headless browsers depend on native libraries that the Lambda runtime does not include. The browser binaries that Playwright downloads by default are also incompatible with AWS Lambda’s Amazon Linux environment (source). The setup steps below address both problems.


Step-by-Step Setup: Running Playwright on AWS Lambda

Step 1: Create an AWS Lambda Function

  1. Choose a runtime: Select a supported Node.js runtime (e.g., Node.js 18 or later).
  2. Set memory and timeout: Start with at least 512 MB of memory and raise the timeout to 1–3 minutes. Launching Chromium is slow and memory-hungry, and Lambda's default 3-second timeout is too short for it.
  3. Create an execution role: Attach policies for any services the function calls. Saving screenshots to S3, for example, requires s3:PutObject on the target bucket.
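
To make the s3:PutObject permission concrete, here is a minimal sketch of the kind of upload call that needs it. It assumes the AWS SDK for JavaScript v3 (@aws-sdk/client-s3), which the Node.js 18 and later Lambda runtimes include. The helper names, the bucket name, and the key layout are hypothetical and chosen only for illustration.

```javascript
// Builds the PutObject parameters for a screenshot (pure function, easy to test).
// The "screenshots/<host>/<timestamp>.png" key layout is an illustrative assumption.
function buildScreenshotUpload(bucket, targetUrl, buffer, now = new Date()) {
  const host = new URL(targetUrl).hostname;
  // ISO timestamps contain ":" and ".", which are awkward in object keys.
  const stamp = now.toISOString().replace(/[:.]/g, '-');
  return {
    Bucket: bucket,
    Key: `screenshots/${host}/${stamp}.png`,
    Body: buffer,
    ContentType: 'image/png',
  };
}

// Uploads the screenshot and returns the object key.
// The execution role needs s3:PutObject on the bucket for this call to succeed.
async function uploadScreenshot(bucket, targetUrl, buffer) {
  // Required lazily so the helper above can be used without the SDK installed.
  const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
  const s3 = new S3Client({});
  const params = buildScreenshotUpload(bucket, targetUrl, buffer);
  await s3.send(new PutObjectCommand(params));
  return params.Key;
}
```

Inside a handler this could be called as uploadScreenshot('my-screenshot-bucket', targetUrl, screenshotBuffer), where the bucket name is a placeholder for one you own.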

Step 2: Bundle Playwright and Headless Chromium for Lambda

There are two popular approaches for including Playwright and a headless browser.

Option A: Use playwright-aws-lambda via Lambda Layer

  1. Install playwright-core and playwright-aws-lambda.
  2. Package these dependencies as a Lambda Layer and upload your function code as a ZIP that uses the Layer.
  3. Launch the browser through playwright-aws-lambda (for example, launchChromium()), which ships a Lambda-compatible Chromium build.

Option B: Use a Container Image

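  1. Create a Dockerfile using Microsoft’s Playwright base image (mcr.microsoft.com/playwright:<version>-focal).
  2. Copy your Playwright scripts into the container and bake in any dependencies. Because this is not an AWS-provided base image, also install the Lambda runtime interface client (the aws-lambda-ric npm package for Node.js) and set the entry point and CMD so Lambda can invoke your handler.
  3. Push the image to AWS ECR and use it to create your Lambda function.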

Choosing the Method:

  • Use Option A for simplicity and experimentation.
  • Opt for Option B if you need custom libraries, specific browser versions, or more control over the environment.

Step 3: Write the Lambda Handler Code with Playwright

Here’s an example Lambda handler using Option A (playwright-aws-lambda) to capture a screenshot of a webpage:

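// handler.js
const playwright = require('playwright-aws-lambda');

exports.handler = async (event = {}) => {
  const targetUrl = event.url || 'https://www.example.com';
  let browser;

  try {
    // Launch the Lambda-compatible Chromium build shipped with playwright-aws-lambda.
    browser = await playwright.launchChromium();
    const context = await browser.newContext();
    const page = await context.newPage();

    await page.goto(targetUrl, { waitUntil: 'networkidle' });
    const screenshotBuffer = await page.screenshot();

    return {
      statusCode: 200,
      headers: { 'Content-Type': 'image/png' },
      // Marks the body as base64-encoded binary for API Gateway or function URLs.
      isBase64Encoded: true,
      body: screenshotBuffer.toString('base64'),
    };
  } finally {
    // Always close the browser, even if navigation or the screenshot fails.
    if (browser) {
      await browser.close();
    }
  }
};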

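For Option B (container-based), import chromium from the playwright package and replace the launch call inside the handler:

const { chromium } = require('playwright');

// Inside the async handler:
// --no-sandbox: Chromium's sandbox typically cannot start inside Lambda.
// --disable-dev-shm-usage: write shared memory files to /tmp instead of the small /dev/shm.
const browser = await chromium.launch({ args: ['--no-sandbox', '--disable-dev-shm-usage'] });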
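
Since the Option B snippet is only a fragment, here is one way it might fit into a complete handler. This is a sketch under stated assumptions, not the only layout: the makeHandler factory is a hypothetical helper that takes the browser type as an argument so the handler can be exercised locally with a stub, and returning the page title instead of a screenshot is an illustrative choice.

```javascript
// Hypothetical factory: takes a Playwright browser type (e.g. chromium)
// and returns a Lambda handler that uses it.
function makeHandler(chromium) {
  return async (event = {}) => {
    const targetUrl = event.url || 'https://www.example.com';
    const browser = await chromium.launch({
      args: ['--no-sandbox', '--disable-dev-shm-usage'],
    });
    try {
      const page = await browser.newPage();
      await page.goto(targetUrl, { waitUntil: 'domcontentloaded' });
      const title = await page.title();
      return {
        statusCode: 200,
        body: JSON.stringify({ url: targetUrl, title }),
      };
    } finally {
      // Runs on success and on failure, so no browser process is left behind.
      await browser.close();
    }
  };
}

// In the container image (assumes the playwright package is installed),
// the export would look like this; it is commented out so the sketch
// runs without Playwright present:
// exports.handler = makeHandler(require('playwright').chromium);
```

Passing the browser type in, rather than requiring it at the top of the file, keeps the handler logic testable without downloading a browser.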

Step 4: Deploy and Test

For ZIP + Layer Approach:

  • Package your code into a ZIP.
  • Attach the Playwright Layer.
  • Test it through the Lambda Console or AWS CLI using an event payload like:
    { "url": "https://www.google.com" }
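
If you would rather test from a script than from the console, the following sketch invokes the function and saves the returned screenshot. It assumes the AWS SDK for JavaScript v3 (@aws-sdk/client-lambda) is installed and credentials are configured. The function name, output path, and helper names are placeholders.

```javascript
// Converts the handler's JSON response (statusCode + base64 body) into PNG bytes.
function decodeScreenshot(payloadText) {
  const response = JSON.parse(payloadText);
  if (response.statusCode !== 200) {
    throw new Error(`Unexpected status: ${response.statusCode}`);
  }
  return Buffer.from(response.body, 'base64');
}

// Invokes the function synchronously and writes the screenshot to disk.
// Returns the number of bytes written.
async function invokeAndSave(functionName, url, outFile) {
  // Required lazily so decodeScreenshot can be used without the SDK installed.
  const { LambdaClient, InvokeCommand } = require('@aws-sdk/client-lambda');
  const fs = require('fs/promises');

  const client = new LambdaClient({});
  const result = await client.send(new InvokeCommand({
    FunctionName: functionName,
    Payload: Buffer.from(JSON.stringify({ url })),
  }));

  const payloadText = Buffer.from(result.Payload).toString('utf8');
  if (result.FunctionError) {
    // The payload then holds the error details instead of a screenshot.
    throw new Error(`Function failed: ${payloadText}`);
  }

  const png = decodeScreenshot(payloadText);
  await fs.writeFile(outFile, png);
  return png.length;
}
```

A call such as invokeAndSave('playwright-screenshot', 'https://www.google.com', 'out.png') would then exercise the deployed function end to end, with 'playwright-screenshot' standing in for your function's name.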
    

For Container-Based Approach:

  • Push your container image to ECR.
  • Deploy the Lambda function using this container.
  • Test as described above.

Troubleshooting Common Issues

Browser or Launch Errors

  1. Executable not found: Ensure a compatible Chromium binary is included. For playwright-aws-lambda, verify the Layer is attached.
  2. Sandbox errors: Add --no-sandbox to Playwright launch arguments.
  3. Memory or timeout errors: Increase memory allocation or timeout duration in the Lambda configuration.

Performance Issues

  1. Lambda timeouts: Increase the function timeout, or make navigation fail fast with an explicit timeout and a lighter waitUntil condition:
    await page.goto(url, { timeout: 10000, waitUntil: 'domcontentloaded' });
    
  2. Out-of-memory crashes: Allocate more memory to your Lambda or optimize navigation to exclude unnecessary page resources.
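
Rather than hard-coding a navigation timeout, one option is to derive it from the time the invocation has left. The sketch below uses getRemainingTimeInMillis(), which Lambda provides on the handler's context object. The helper name and the default margins (5 seconds reserved, 30 seconds maximum, 1 second minimum) are illustrative assumptions.

```javascript
// Computes a navigation timeout that leaves `reserveMs` for cleanup
// (closing the browser, returning a response) before Lambda's own timeout.
// `lambdaContext` is the second argument Lambda passes to the handler.
function navigationTimeoutMs(lambdaContext, reserveMs = 5000, maxMs = 30000) {
  const remaining = lambdaContext.getRemainingTimeInMillis() - reserveMs;
  // Never go below 1 second, never above maxMs.
  return Math.max(1000, Math.min(maxMs, remaining));
}

// Usage inside a handler (event, lambdaContext):
//   await page.goto(url, {
//     timeout: navigationTimeoutMs(lambdaContext),
//     waitUntil: 'domcontentloaded',
//   });
```

With this in place, a slow page produces a Playwright timeout error the handler can catch and report, instead of the function being cut off mid-run.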

Best Practices for Playwright on Lambda

  1. Increase memory: Lambda allocates CPU in proportion to configured memory, so try 1024 MB or higher for heavy tasks.
  2. Browser reuse: Keep the browser instance in module scope so warm invocations can reuse it instead of launching a new one.
  3. Optimize page navigation: Use network interception to block images, CSS, or ads that the task does not need.
  4. Match versions: Each Playwright release targets specific browser builds, so keep your Playwright version aligned with the Chromium binary in your environment.
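
Browser reuse and request blocking can be sketched together as follows. This is a minimal illustration, not a drop-in implementation: getBrowser, shouldBlock, and openLeanPage are hypothetical helper names, the list of blocked resource types is an assumption to adjust per task, and the launch function is passed in so either option from Step 2 can supply it.

```javascript
// Module scope survives between invocations while the environment stays warm.
let cachedBrowser = null;

// `launch` is any async function that returns a Playwright Browser, e.g.
// () => require('playwright-aws-lambda').launchChromium()
async function getBrowser(launch) {
  if (cachedBrowser && cachedBrowser.isConnected()) {
    return cachedBrowser; // warm start: skip the expensive launch
  }
  cachedBrowser = await launch();
  return cachedBrowser;
}

// Resource types to skip. Blocking stylesheets can change page layout,
// so drop 'stylesheet' if you need accurate screenshots.
const BLOCKED_TYPES = new Set(['image', 'media', 'font', 'stylesheet']);

function shouldBlock(resourceType) {
  return BLOCKED_TYPES.has(resourceType);
}

// Opens a fresh context and page with request blocking installed.
async function openLeanPage(browser) {
  const context = await browser.newContext();
  const page = await context.newPage();
  await page.route('**/*', (route) =>
    shouldBlock(route.request().resourceType())
      ? route.abort()
      : route.continue()
  );
  return { context, page };
}
```

When reusing a browser this way, close the context (not the browser) at the end of each invocation so cookies and pages from one request do not carry over to the next.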

Considering Alternatives: Hosted Browser Services

Setting up Playwright on Lambda can be complex. If you prefer to avoid the overhead, consider a managed service like BrowserCat. Hosted services manage browser deployments, updates, and scaling, so you can focus on writing scripts instead of infrastructure.

BrowserCat offers Hosted Playwright, handling browser lifecycle management and scaling seamlessly. It’s ideal for teams looking to save time while maintaining robust automation capabilities.


Conclusion

Running Playwright on AWS Lambda combines serverless scalability and efficient browser automation. This guide covered setup, troubleshooting, and optimization techniques to help you manage Playwright and Chromium in a serverless environment.

While you now have the knowledge to deploy Playwright on Lambda, also consider managed services like BrowserCat for a simpler solution.

Happy automating! 🐾
