Available Functions

This page documents all the functions and objects available to your scripts.

Core Objects

page

The Playwright Page object for browser automation.

# Navigate to a URL
await page.goto('https://example.com')

# Wait for load state
await page.wait_for_load_state('networkidle')

# Find elements
button = page.locator('button.submit')
await button.click()

# Get page title
title = await page.title()

See Playwright Page API for full documentation.

asyncio

Standard Python asyncio module for async operations.

# Wait for a duration
await asyncio.sleep(1)

# Run tasks concurrently
results = await asyncio.gather(
    page.locator('h1').text_content(),
    page.locator('p').text_content()
)
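
The semantics of gather are easy to demonstrate outside the browser with plain coroutines. A minimal sketch (fetch_text is a stand-in for an awaitable like page.locator(...).text_content(), not part of the sandbox), showing return_exceptions=True so one failing task does not abort the others:

```python
import asyncio

async def fetch_text(name, delay, fail=False):
    # Stand-in for an awaitable such as page.locator(...).text_content()
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} not found")
    return f"text from {name}"

async def main():
    # return_exceptions=True collects errors as results instead of
    # cancelling the remaining tasks
    return await asyncio.gather(
        fetch_text("h1", 0.01),
        fetch_text("p", 0.01, fail=True),
        return_exceptions=True,
    )

results = asyncio.run(main())
```

Without return_exceptions=True, the first exception propagates out of gather, which is usually what you want when every result is required.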

imported_data

Data imported by the user through the "Import Data" button in the editor. This variable is None if no data has been imported.

Type: dict | list | str | int | float | bool | None

How to Import Data:

  1. Click the "Import Data" button in the editor toolbar
  2. Paste your JSON data in the modal
  3. Click "Import"
  4. The data will be available in your script as imported_data

Example:

async def main(page):
    # Check if data was imported
    if imported_data:
        debug_log(f"Processing {len(imported_data)} items")

        # Example: Loop through URLs
        for url in imported_data['urls']:
            await page.goto(url)
            title = await page.title()
            scrape_data({'url': url, 'title': title})
    else:
        debug_log("No data imported, using default URL")
        await page.goto('https://example.com')

Common Use Cases:

# Scraping multiple URLs
imported_data = {
    "urls": [
        "https://example.com/page1",
        "https://example.com/page2",
        "https://example.com/page3"
    ]
}

# Search queries
imported_data = {
    "queries": ["playwright python", "web scraping"],
    "max_results": 10
}

# Product IDs to scrape
imported_data = {
    "product_ids": ["12345", "67890", "11111"],
    "store": "example-store"
}

# Configuration parameters
imported_data = {
    "wait_time": 5,
    "screenshots": True,
    "max_pages": 20
}

Benefits:

  • Run the same script with different inputs without editing code
  • Test scripts with various datasets
  • Parameterize scraping behavior
  • Reuse scripts for different use cases
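
Because imported_data can be any JSON type (or None), it can help to normalize it before use. A small sketch, where get_urls is a hypothetical helper you would define in your own script, not something the sandbox provides:

```python
def get_urls(imported_data, default=("https://example.com",)):
    """Normalize imported_data into a list of URLs.

    Handles the common shapes: None (nothing imported), a dict with a
    'urls' key, a bare list of URLs, or a single URL string.
    """
    if imported_data is None:
        return list(default)
    if isinstance(imported_data, dict):
        return list(imported_data.get("urls", default))
    if isinstance(imported_data, list):
        return list(imported_data)
    # A single string (or other scalar) is treated as one URL
    return [str(imported_data)]
```

Your main() can then loop over get_urls(imported_data) without branching on every possible input shape.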

Data Functions

scrape_data(dict)

Store extracted data from your script. Data is saved to the execution record and viewable in the execution history.

Parameters:

  • dict: Dictionary of data to store

Example:

scrape_data({
    'title': 'Example Page',
    'price': '$19.99',
    'url': page.url
})

# Data is merged, so you can call it multiple times
scrape_data({'additional_field': 'value'})

Logging Functions

print(*args)

Standard Python print function. Output appears in execution logs.

print("Starting navigation...")
print(f"Found {count} items")

debug_log(message)

Enhanced logging with [Script] prefix for better visibility.

Parameters:

  • message (str): Message to log

Example:

debug_log("About to click submit button")
await button.click()
debug_log("Button clicked successfully")

Best Practices:

  • Log before and after important operations
  • Log variable values and state
  • Logging helps identify where a script hangs

Screenshot Functions

capture_screenshot(description)

Manually capture a screenshot with a custom description.

Parameters:

  • description (str): Description for the screenshot (default: "Manual screenshot")

Example:

# Before an action
await capture_screenshot("Before login")
await login_button.click()

# After an action
await capture_screenshot("After login")

# In error handling
try:
    await risky_operation()
except Exception as e:
    await capture_screenshot("Error state")
    debug_log(f"Error: {e}")

Automatic Screenshots:

Screenshots are also captured automatically on:

  • Page load events
  • Page scroll events

File Download Functions

download_file(url_or_selector, description="", filename=None)

Download files from web pages. Files are stored securely and accessible from the execution detail page.

Parameters:

  • url_or_selector (str, required): URL to download from, or a CSS selector to click
  • description (str, optional): Human-readable description for the UI
  • filename (str, optional): Custom filename (auto-detected if not provided)

Returns:

{
    'success': True,
    'filename': 'report.pdf',
    'file_size': 1234567,
    'mime_type': 'application/pdf'
}

Examples:

# Download from URL
result = await download_file(
    'https://example.com/files/report.pdf',
    description='Monthly sales report',
    filename='sales_jan_2024.pdf'
)

if result['success']:
    debug_log(f"Downloaded: {result['filename']} ({result['file_size']} bytes)")
else:
    debug_log(f"Failed: {result['error']}")

# Download by clicking a link
result = await download_file(
    'a[href*="invoice.pdf"]',
    description='Invoice #12345'
)

# Download multiple files
csv_links = await page.query_selector_all('a[href$=".csv"]')
for i, link in enumerate(csv_links):
    href = await link.get_attribute('href')
    result = await download_file(href, description=f'Dataset {i+1}')
    debug_log(f"Downloaded {i+1}: {result['filename']}")

Security Features:

  • File size limits (default: 50MB)
  • Download count limits (default: 20 per execution)
  • File type validation (blocks executables)
  • Filename sanitization
  • Scripts cannot access downloaded files

Allowed File Types:

  • Documents: PDF, CSV, JSON, TXT, HTML
  • Office: Excel (.xlsx), Word (.docx)
  • Images: PNG, JPEG
  • Archives: ZIP

Blocked Types:

  • Executables and scripts (.exe, .sh, .py, .js, .php)
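
When looping over many links, you can skip URLs that download_file() would reject anyway. A sketch of a client-side pre-check mirroring the allowed-types list above (looks_downloadable and the exact extension set are hypothetical; the server-side validation remains authoritative):

```python
# Extensions corresponding to the allowed-types list above (assumption:
# .jpg/.jpeg both map to JPEG). The real validation happens server-side.
ALLOWED_EXTENSIONS = {
    ".pdf", ".csv", ".json", ".txt", ".html",
    ".xlsx", ".docx", ".png", ".jpg", ".jpeg", ".zip",
}

def looks_downloadable(url):
    # Drop any query string, then compare the extension case-insensitively
    path = url.split("?", 1)[0].lower()
    return any(path.endswith(ext) for ext in ALLOWED_EXTENSIONS)
```

Filtering first keeps you from burning through the per-execution download limit on links that would fail validation.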

See Downloading Files for comprehensive guide.

Allowed Imports

You can import these standard library modules:

Data & Serialization

  • json - JSON encoding/decoding
  • csv - CSV file handling
  • base64 - Base64 encoding
  • xml - XML processing
  • html - HTML utilities

Date & Time

  • datetime - Date and time handling
  • time - Time functions

Text Processing

  • re - Regular expressions
  • string - String utilities

Collections & Iteration

  • collections - Specialized container datatypes
  • itertools - Iterator functions
  • functools - Higher-order functions

Math & Numbers

  • math - Mathematical functions
  • random - Random number generation
  • decimal - Decimal arithmetic
  • fractions - Rational numbers

Type Hints & Data Classes

  • typing - Type hints
  • dataclasses - Data classes
  • enum - Enumerations

Async

  • asyncio - Asynchronous I/O

Utilities

  • hashlib - Hashing algorithms
  • uuid - UUID generation
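
Several of these modules combine naturally when post-processing scraped records. A sketch using only allowed imports (record_fingerprint and record_id are hypothetical helpers, useful for de-duplicating results across pages):

```python
import hashlib
import json
import uuid

def record_fingerprint(record):
    """Stable SHA-256 fingerprint of a scraped record.

    Serializing with sorted keys makes the hash independent of dict
    insertion order, so identical records hash identically.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def record_id():
    # Random unique ID to attach to each scraped record
    return str(uuid.uuid4())
```

Keep a set of fingerprints seen so far and skip scrape_data() calls for records whose fingerprint is already in the set.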

Forbidden Operations

The following are NOT allowed, for security reasons:

  ❌ File system access (open, pathlib). Note: downloaded files are stored securely and are not accessible to scripts.
  ❌ Network requests (requests, urllib). Use download_file() for downloading files.
  ❌ Process execution (subprocess, os.system)
  ❌ Bypassing import restrictions (__import__, importlib)
  ❌ Code execution (eval, exec, compile)

Example Script

import datetime
import json

async def main(page):
    # Navigation
    debug_log("Navigating to example.com")
    await page.goto('https://example.com')

    # Screenshot
    await capture_screenshot("Initial load")

    # Extract data
    title = await page.title()
    heading = await page.locator('h1').text_content()

    # Log progress
    debug_log(f"Title: {title}")
    debug_log(f"Heading: {heading}")

    # Store data
    scrape_data({
        'title': title,
        'heading': heading,
        'scraped_at': datetime.datetime.now().isoformat()
    })

    debug_log("Script complete!")

See Also