Writing Scripts

Comprehensive guide to writing Playwright scripts in Scrapazoid.

Script Structure

All scripts must define an async main(page) function:

async def main(page):
    # Your script logic here
    pass

The page parameter is a Playwright Page object that's automatically provided.

Basic Workflow

A typical script follows this pattern:

  1. Navigate - Go to a URL
  2. Wait - Wait for page to load
  3. Interact - Click, fill forms, scroll
  4. Extract - Get data from elements
  5. Store - Save data with scrape_data()
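
The five steps above can be sketched as a single script. This is a minimal example, not a definitive template: the URL and the `h1` selector are placeholders, and `scrape_data()` is assumed to be injected by the Scrapazoid runtime as described below.

```python
async def main(page):
    # 1. Navigate
    await page.goto('https://example.com')
    # 2. Wait until network activity settles
    await page.wait_for_load_state('networkidle')
    # 3. Interact - scroll down to trigger any lazy-loaded content
    await page.mouse.wheel(0, 1000)
    # 4. Extract
    title = await page.title()
    heading = await page.text_content('h1')
    # 5. Store - scrape_data() is provided by the runtime, not imported
    scrape_data({'title': title, 'heading': heading})
```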

Simple Navigation

await page.goto('https://example.com')

With Wait

await page.goto('https://example.com')
await page.wait_for_load_state('networkidle')

Load States

  • 'domcontentloaded' - DOM parsed; images, stylesheets, and subframes may still be loading (fastest)
  • 'load' - the load event has fired; the page's resources have loaded
  • 'networkidle' - no network requests for 500ms (most complete, but slowest)
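
A load state can also be passed directly to goto() via its wait_until parameter, which combines the navigation and the wait into one call:

```python
async def main(page):
    # Equivalent to goto() followed by wait_for_load_state('networkidle')
    await page.goto('https://example.com', wait_until='networkidle')

    # Or settle for the parsed DOM only, which returns sooner on heavy pages:
    # await page.goto('https://example.com', wait_until='domcontentloaded')
```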

Timeouts

Script Timeout

All scripts are terminated after 5 minutes (300 seconds). Waits inside a script take their own, much shorter timeouts, specified in milliseconds:

# Fails after 10 seconds if '.slow-element' never appears -
# well within the 5-minute script limit
await page.wait_for_selector('.slow-element', timeout=10000)

Element Waits

Always pass an explicit timeout so a missing element fails fast instead of stalling the script:

# Good - fails after 5 seconds if the button never appears
await page.wait_for_selector('button', timeout=5000)

# Risky - silently falls back to the 30-second default
await page.wait_for_selector('button')
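
When an element is genuinely optional, the timeout can be caught rather than allowed to fail the script. A sketch, with a hypothetical '.cookie-banner' selector; the broad except clause is used because Playwright's TimeoutError is an Exception subclass:

```python
async def main(page):
    await page.goto('https://example.com')
    try:
        # Timeout is in milliseconds, as everywhere in Playwright
        await page.wait_for_selector('.cookie-banner', timeout=3000)
        banner_present = True
    except Exception:
        # Playwright raises TimeoutError when the selector
        # never matches within the timeout
        banner_present = False
    # Record the outcome either way instead of crashing the run
    scrape_data({'banner_present': banner_present})
```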

Importing Data

You can import external data into your scripts to make them reusable with different inputs.

How to Import Data

  1. Click the "Import Data" button in the editor toolbar
  2. Paste your JSON data in the modal
  3. Click "Import" - the button will show a green checkmark
  4. The data is available in your script as the imported_data variable
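
For instance, to drive the URL loop in the example below, the pasted JSON might look like this (the "urls" key is whatever key your script reads from imported_data):

```json
{
  "urls": [
    "https://example.com",
    "https://example.org"
  ]
}
```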

Example Usage

async def main(page):
    if imported_data:
        # Process multiple URLs from imported data
        for url in imported_data['urls']:
            await page.goto(url)
            title = await page.title()
            scrape_data({'url': url, 'title': title})
    else:
        # Fallback if no data imported
        await page.goto('https://example.com')

Benefits

  • Reusable Scripts: Same script, different inputs
  • Testing: Test with various datasets
  • Parameterization: Configure scraping behavior dynamically
  • Batch Processing: Process multiple items in one run

See Available Functions - imported_data for more details.

See Full Documentation

For the complete guide, see:

  • Available Functions
  • Debug Logging
  • Screenshots