# Available Functions
This page documents all the functions and objects available to your scripts.
## Core Objects

### page
The Playwright Page object for browser automation.
```python
# Navigate to a URL
await page.goto('https://example.com')

# Wait for load state
await page.wait_for_load_state('networkidle')

# Find elements
button = page.locator('button.submit')
await button.click()

# Get page title
title = await page.title()
```
See the Playwright Page API for full documentation.
### asyncio
Standard Python asyncio module for async operations.
```python
# Wait for a duration
await asyncio.sleep(1)

# Run tasks concurrently
results = await asyncio.gather(
    page.locator('h1').text_content(),
    page.locator('p').text_content()
)
```
### imported_data
Data imported by the user through the "Import Data" button in the editor. This variable is `None` if no data has been imported.

Type: `dict | list | str | int | float | bool | None`
How to Import Data:

1. Click the "Import Data" button in the editor toolbar
2. Paste your JSON data in the modal
3. Click "Import"
4. The data will be available in your script as `imported_data`
Example:

```python
async def main(page):
    # Check if data was imported
    if imported_data:
        debug_log(f"Processing {len(imported_data['urls'])} URLs")

        # Example: Loop through URLs
        for url in imported_data['urls']:
            await page.goto(url)
            title = await page.title()
            scrape_data({'url': url, 'title': title})
    else:
        debug_log("No data imported, using default URL")
        await page.goto('https://example.com')
```
Common Use Cases:

```python
# Scraping multiple URLs
imported_data = {
    "urls": [
        "https://example.com/page1",
        "https://example.com/page2",
        "https://example.com/page3"
    ]
}

# Search queries
imported_data = {
    "queries": ["playwright python", "web scraping"],
    "max_results": 10
}

# Product IDs to scrape
imported_data = {
    "product_ids": ["12345", "67890", "11111"],
    "store": "example-store"
}

# Configuration parameters
imported_data = {
    "wait_time": 5,
    "screenshots": True,
    "max_pages": 20
}
```
Benefits:

- Run the same script with different inputs without editing code
- Test scripts with various datasets
- Parameterize scraping behavior
- Reuse scripts for different use cases
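Because `imported_data` can arrive as any of the types above (or `None`), it helps to normalize it before use. A minimal sketch of such a normalizer (`normalize_urls` is a hypothetical helper written for this page, not part of the runtime):

```python
def normalize_urls(imported_data, default="https://example.com"):
    """Coerce the possible imported_data shapes into a list of URLs.

    Hypothetical helper for illustration; the runtime only guarantees
    that imported_data is dict | list | str | int | float | bool | None.
    """
    if imported_data is None:
        return [default]                      # nothing imported: fall back
    if isinstance(imported_data, str):
        return [imported_data]                # a single URL string
    if isinstance(imported_data, list):
        return list(imported_data)            # already a list of URLs
    if isinstance(imported_data, dict):
        return imported_data.get("urls", [default])  # the {"urls": [...]} shape
    return [default]                          # numbers/bools carry no URLs
```

With a guard like this in place, the rest of the script can loop over `normalize_urls(imported_data)` without caring which shape the user pasted.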
## Data Functions

### scrape_data(dict)
Store extracted data from your script. Data is saved to the execution record and viewable in the execution history.
Parameters:

- `dict`: Dictionary of data to store
Example:

```python
scrape_data({
    'title': 'Example Page',
    'price': '$19.99',
    'url': page.url
})

# Data is merged, so you can call it multiple times
scrape_data({'additional_field': 'value'})
```
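The merge behaviour can be pictured as one dictionary that every call updates. The sketch below is a mental model only, assuming merging works like `dict.update` (the service's actual implementation is not shown in this documentation):

```python
scraped = {}  # model of the execution's accumulated data

def scrape_data_model(data):
    """Model of merge-on-each-call semantics: later calls overwrite
    earlier values for the same key, like dict.update."""
    scraped.update(data)

scrape_data_model({"title": "Example Page", "price": "$19.99"})
scrape_data_model({"additional_field": "value"})
# scraped now holds all three keys in a single record
```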
## Logging Functions

### print(*args)
Standard Python print function. Output appears in execution logs.
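Anything the builtin `print` accepts works here, so ordinary debugging habits carry over:

```python
items = ["a", "b", "c"]

print("Scraped", len(items), "items")   # multiple args are space-separated
print(f"First item: {items[0]}")        # f-strings appear in the logs as-is
```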
### debug_log(message)

Enhanced logging with a `[Script]` prefix for better visibility.

Parameters:

- `message` (str): Message to log
Example:

```python
debug_log("About to click submit button")
await button.click()
debug_log("Button clicked successfully")
```
Best Practices:

- Use before/after important operations
- Log variables and state
- Helps identify where scripts hang
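One way to apply all of these practices at once is a small step-logging helper. The sketch below stubs out `debug_log` so it reads on its own (in a real script the runtime provides `debug_log` and you would not define it):

```python
import time
from contextlib import contextmanager

messages = []

def debug_log(message):
    # Stub for illustration only; the real debug_log is injected by the runtime
    messages.append(f"[Script] {message}")
    print(messages[-1])

@contextmanager
def logged_step(name):
    """Log entry, exit, and duration of a step, so hangs are easy to locate."""
    debug_log(f"{name}: starting")
    start = time.monotonic()
    try:
        yield
    finally:
        debug_log(f"{name}: finished in {time.monotonic() - start:.2f}s")

with logged_step("click submit"):
    time.sleep(0.01)  # stand-in for await button.click()
```

If a script hangs, the last "starting" message without a matching "finished" line points straight at the stuck step.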
## Screenshot Functions

### capture_screenshot(description)
Manually capture a screenshot with a custom description.
Parameters:

- `description` (str): Description for the screenshot (default: `"Manual screenshot"`)
Example:

```python
# Before an action
await capture_screenshot("Before login")
await login_button.click()

# After an action
await capture_screenshot("After login")

# In error handling
try:
    await risky_operation()
except Exception as e:
    await capture_screenshot("Error state")
    debug_log(f"Error: {e}")
```
Automatic Screenshots:

Screenshots are also automatically captured on:

- Page load events
- Page scroll events
## File Download Functions

### download_file(url_or_selector, description="", filename=None)
Download files from web pages. Files are stored securely and accessible from the execution detail page.
Parameters:

- `url_or_selector` (str, required): URL to download from, or CSS selector to click
- `description` (str, optional): Human-readable description for the UI
- `filename` (str, optional): Custom filename (auto-detected if not provided)

Returns:

A dict describing the outcome of the download; the examples below read its `success`, `filename`, `file_size`, and `error` keys.
Examples:

```python
# Download from URL
result = await download_file(
    'https://example.com/files/report.pdf',
    description='Monthly sales report',
    filename='sales_jan_2024.pdf'
)

if result['success']:
    debug_log(f"Downloaded: {result['filename']} ({result['file_size']} bytes)")
else:
    debug_log(f"Failed: {result['error']}")

# Download by clicking a link
result = await download_file(
    'a[href*="invoice.pdf"]',
    description='Invoice #12345'
)

# Download multiple files
csv_links = await page.query_selector_all('a[href$=".csv"]')
for i, link in enumerate(csv_links):
    href = await link.get_attribute('href')
    result = await download_file(href, description=f'Dataset {i+1}')
    debug_log(f"Downloaded {i+1}: {result['filename']}")
```
Security Features:

- File size limits (default: 50MB)
- Download count limits (default: 20 per execution)
- File type validation (blocks executables)
- Filename sanitization
- Scripts cannot access downloaded files

Allowed File Types:

- Documents: PDF, CSV, JSON, TXT, HTML
- Office: Excel (.xlsx), Word (.docx)
- Images: PNG, JPEG
- Archives: ZIP

Blocked Types:

- Executables (.exe, .sh, .py, .js, .php)
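The service's actual validation logic is internal; purely as an illustration of extension-based filtering over the lists above, a check might look like this (`is_allowed_filename` is hypothetical, and the real checks also cover size limits, download counts, and sanitization):

```python
ALLOWED_EXTENSIONS = {
    ".pdf", ".csv", ".json", ".txt", ".html",  # documents
    ".xlsx", ".docx",                          # office
    ".png", ".jpeg", ".jpg",                   # images
    ".zip",                                    # archives
}
BLOCKED_EXTENSIONS = {".exe", ".sh", ".py", ".js", ".php"}

def is_allowed_filename(filename):
    """Sketch of extension-based validation only."""
    if "." not in filename:
        return False                           # no extension: reject
    ext = "." + filename.rsplit(".", 1)[-1].lower()
    return ext in ALLOWED_EXTENSIONS and ext not in BLOCKED_EXTENSIONS
```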
See Downloading Files for a comprehensive guide.
## Allowed Imports
You can import these standard library modules:
### Data & Serialization

- `json` - JSON encoding/decoding
- `csv` - CSV file handling
- `base64` - Base64 encoding
- `xml` - XML processing
- `html` - HTML utilities

### Date & Time

- `datetime` - Date and time handling
- `time` - Time functions

### Text Processing

- `re` - Regular expressions
- `string` - String utilities

### Collections & Iteration

- `collections` - Specialized container datatypes
- `itertools` - Iterator functions
- `functools` - Higher-order functions

### Math & Numbers

- `math` - Mathematical functions
- `random` - Random number generation
- `decimal` - Decimal arithmetic
- `fractions` - Rational numbers

### Type Hints & Data Classes

- `typing` - Type hints
- `dataclasses` - Data classes
- `enum` - Enumerations

### Async

- `asyncio` - Asynchronous I/O

### Utilities

- `hashlib` - Hashing algorithms
- `uuid` - UUID generation
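These modules can be combined freely inside a script. For example, a sketch that fingerprints a scraped record for de-duplication, using only modules from the allowed list:

```python
import datetime
import hashlib
import json

record = {"title": "Example Page", "price": "$19.99"}

# Stable fingerprint for de-duplication: hash the canonical JSON form
fingerprint = hashlib.sha256(
    json.dumps(record, sort_keys=True).encode("utf-8")
).hexdigest()

record["fingerprint"] = fingerprint[:12]
record["scraped_at"] = datetime.datetime.now().isoformat()
```

The fingerprint is stable across runs because `sort_keys=True` canonicalizes the JSON before hashing.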
## Forbidden Operations
The following are NOT allowed, for security reasons:

- ❌ File system access (`open`, `pathlib`)
  - Note: Downloaded files are stored securely and not accessible to scripts
- ❌ Network requests (`requests`, `urllib`)
  - Use `download_file()` for downloading files
- ❌ Process execution (`subprocess`, `os.system`)
- ❌ Import restrictions bypass (`__import__`, `importlib`)
- ❌ Code execution (`eval`, `exec`, `compile`)
## Example Script
```python
import datetime
import json

async def main(page):
    # Navigation
    debug_log("Navigating to example.com")
    await page.goto('https://example.com')

    # Screenshot
    await capture_screenshot("Initial load")

    # Extract data
    title = await page.title()
    heading = await page.locator('h1').text_content()

    # Log progress
    debug_log(f"Title: {title}")
    debug_log(f"Heading: {heading}")

    # Store data
    scrape_data({
        'title': title,
        'heading': heading,
        'scraped_at': datetime.datetime.now().isoformat()
    })

    debug_log("Script complete!")
```