Examples

Learn by example! This section contains working scripts demonstrating various web scraping techniques.

Available Examples

Simple Navigation

Basic example showing page navigation and data extraction.

What you'll learn:

  - Navigating to a URL
  - Waiting for page load
  - Extracting page title and heading
  - Using scrape_data()
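As a quick sketch of this pattern, the helper below navigates and pulls the title and first heading. It assumes the Playwright `page` object your `main()` receives; the function name and the `h1` selector are illustrative, not part of the tool's API.

```python
async def get_title_and_heading(page, url):
    """Navigate to url and return the page title and first h1 text."""
    await page.goto(url)
    # Wait until network activity settles before reading the DOM
    await page.wait_for_load_state('networkidle')
    title = await page.title()
    heading = await page.locator('h1').first.text_content()
    return {'title': title, 'heading': heading}
```

Inside `main()` you would pass the result to `scrape_data()`.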

Clicking Links by Text

Navigate to 9to5Linux and click on a link containing specific text.

What you'll learn:

  - Finding elements by text content
  - Clicking links
  - Handling navigation after clicks
  - Conditional element checks
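A minimal sketch of the click-by-text pattern, including the conditional check: the function name is illustrative, and it assumes Playwright's `:has-text()` selector on the `page` object your script receives.

```python
async def click_link_by_text(page, text):
    """Click the first link containing `text`; return whether one was found."""
    link = page.locator(f'a:has-text("{text}")')
    if await link.count() == 0:
        # Conditional check: bail out instead of raising on a missing link
        return False
    await link.first.click()
    # Handle the navigation triggered by the click
    await page.wait_for_load_state('networkidle')
    return True
```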

Manual Screenshots

Take screenshots at specific points in your workflow.

What you'll learn:

  - Using capture_screenshot()
  - Screenshot before/after actions
  - Debugging with screenshots
  - Error state documentation
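The before/after pattern can be sketched as a small wrapper. Here `capture` stands in for the `capture_screenshot()` helper (passed in so the sketch is self-contained); the function name and selector are illustrative.

```python
async def click_with_screenshots(page, selector, capture):
    """Capture a screenshot before and after clicking `selector`."""
    await capture(f"Before clicking {selector}")
    await page.locator(selector).click()
    await capture(f"After clicking {selector}")
```

In a real script you would call `capture_screenshot()` directly at each point instead of passing it in.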

Debug Logging

Track script execution with detailed logging.

What you'll learn:

  - Using debug_log()
  - Tracking progress
  - Identifying where scripts hang
  - Best practices for logging
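To see why timestamps help you find where a script hangs, here is a minimal stand-in for `debug_log()` (the real helper is provided by the tool; this sketch only illustrates the idea).

```python
import datetime

def timestamped_log(message, _log=print):
    """Prefix each message with a timestamp so gaps between log lines
    reveal where the script spends time or hangs."""
    line = f"[{datetime.datetime.now().isoformat(timespec='seconds')}] {message}"
    _log(line)
    return line
```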

Downloading Files

Download files from web pages during script execution.

What you'll learn:

  - Using the download_file() function
  - Downloading from URLs
  - Downloading by clicking links
  - Downloading multiple files
  - Error handling for downloads
  - Combining downloads with data scraping
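The error-handling pattern for multiple downloads can be sketched independently of the browser. Here `download` stands in for the `download_file()` helper; the function name is illustrative.

```python
def download_all(download, urls):
    """Try each URL with the given `download` callable, collecting
    failures instead of aborting on the first error."""
    failures = {}
    for url in urls:
        try:
            download(url)
        except Exception as exc:
            # Record the failure and keep going with the remaining URLs
            failures[url] = str(exc)
    return failures
```

The returned failures dict can be passed to `scrape_data()` so the run records what could not be fetched.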

Importing Data

Use imported data to make scripts reusable with different inputs.

What you'll learn:

  - Using the Import Data feature
  - Accessing the imported_data variable
  - Parameterizing scripts
  - Processing multiple URLs
  - Configuration via imported data
  - Error handling with imported data
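A common defensive step is normalizing whatever arrives in `imported_data` before looping over it. This sketch assumes imports are either a single URL string or a list; the function name is illustrative.

```python
def urls_from_import(imported_data):
    """Accept a single URL string or a list of URLs from imported_data,
    returning a clean list and skipping blank or non-string entries."""
    if isinstance(imported_data, str):
        imported_data = [imported_data]
    return [u.strip() for u in imported_data
            if isinstance(u, str) and u.strip()]
```

Inside `main()` you would then loop over the result, calling `page.goto(url)` and `scrape_data()` per URL.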

Running Examples

All examples are located in the /examples directory of the repository.

To run an example:

  1. Go to the Editor
  2. Copy the example code
  3. Paste into the editor
  4. Click "Run"

Example Template

Use this template as a starting point for your scripts:

import datetime

async def main(page):
    """
    Description of what this script does
    """

    # Step 1: Navigate
    debug_log("Starting navigation")
    await page.goto('https://example.com')
    await page.wait_for_load_state('networkidle')

    # Step 2: Interact
    debug_log("Looking for elements")
    await capture_screenshot("Initial state")

    # Your scraping logic here

    # Step 3: Extract data
    debug_log("Extracting data")
    scrape_data({
        'scraped_at': datetime.datetime.now().isoformat(),
        # Your data here
    })

    debug_log("Complete!")

Common Use Cases

E-commerce Price Monitoring

async def main(page):
    await page.goto('https://shop.example.com/product/123')

    price = await page.locator('.price').text_content()
    title = await page.locator('h1.product-title').text_content()

    scrape_data({
        'product': title,
        'price': price,
        'url': page.url
    })

Form Submission

async def main(page):
    await page.goto('https://example.com/search')

    # Fill and submit form
    await page.locator('input[name="q"]').fill('search term')
    await page.locator('button[type="submit"]').click()

    # Wait for results
    await page.wait_for_selector('.search-results')
    await capture_screenshot("Search results")

    # Extract results
    results = await page.locator('.result-item').all()
    for result in results:
        title = await result.locator('.title').text_content()
        scrape_data({'result': title})

Login and Navigate

async def main(page):
    # Login
    await page.goto('https://example.com/login')
    await page.locator('#username').fill('user@example.com')
    await page.locator('#password').fill('password')
    await page.locator('button[type="submit"]').click()

    # Wait for dashboard
    await page.wait_for_url('**/dashboard')
    await capture_screenshot("Logged in")

    # Navigate to data page
    await page.goto('https://example.com/data')
    # ... extract data

Pagination

async def main(page):
    await page.goto('https://example.com/listings')

    page_num = 1
    while True:
        debug_log(f"Processing page {page_num}")

        # Extract items on current page
        items = await page.locator('.item').all()
        for item in items:
            title = await item.locator('.title').text_content()
            scrape_data({'title': title, 'page': page_num})

        # Check for next button
        next_button = page.locator('a.next-page')
        if await next_button.count() == 0:
            break

        await next_button.click()
        await page.wait_for_load_state('networkidle')
        page_num += 1

Tips for Writing Examples

  1. Add comments - Explain what each section does
  2. Use debug_log() - Make execution flow clear
  3. Take screenshots - Capture important states
  4. Handle errors - Use try/except for robustness
  5. Extract meaningful data - Show real-world use cases

Contributing Examples

Have a useful script? Share it with the community!

See Contributing for guidelines.