Scrapazoid Documentation

Welcome to Scrapazoid - a Flask-based web platform for automated web scraping using Playwright.

Overview

Scrapazoid provides a user-friendly web interface for writing, executing, and monitoring Playwright-based web scraping scripts. It includes real-time execution monitoring, screenshot capture, and comprehensive logging.

Key Features

  • 🎯 Web-Based Editor - Monaco code editor with Python syntax highlighting
  • 🚀 Real-Time Execution - Watch your scripts run with live logs and screenshots
  • 📸 Screenshot Capture - Automatic and manual screenshot capabilities
  • 📥 File Downloads - Download files during script execution with security controls
  • 📦 Data Import - Import JSON data to parameterize scripts without editing code
  • 🔍 Debug Logging - Comprehensive logging at every step
  • 📊 Execution History - Review past executions with full details
  • 📝 Script Versioning - Track every change and see what code produced which results
  • 🛡️ Sandboxed Execution - Secure script execution environment
  • ⏱️ Automatic Timeouts - Scripts automatically time out after 5 minutes
  • 🧹 Cleanup System - Background cleanup for stuck executions and old files
  • 👥 Multi-User Support - Secure authentication with isolated workspaces
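
To make the Automatic Timeouts feature concrete, here is a minimal sketch of how a 5-minute limit could be enforced around an async script with the standard library's `asyncio.wait_for`. It is an illustration under stated assumptions, not Scrapazoid's actual implementation: `SCRIPT_TIMEOUT_SECONDS`, `run_with_timeout`, and the two demo scripts are hypothetical names.

```python
import asyncio

# Assumption: 300 seconds mirrors the documented 5-minute limit.
# The constant name is hypothetical, not a Scrapazoid setting.
SCRIPT_TIMEOUT_SECONDS = 300


async def run_with_timeout(script, page, timeout=SCRIPT_TIMEOUT_SECONDS):
    """Run a script coroutine and report whether it finished in time."""
    try:
        await asyncio.wait_for(script(page), timeout=timeout)
        return "completed"
    except asyncio.TimeoutError:
        # wait_for cancels the script before raising the timeout error.
        return "timeout"


async def slow_script(page):
    await asyncio.sleep(1)  # stands in for a script that hangs


async def fast_script(page):
    await asyncio.sleep(0)  # stands in for a script that returns promptly


# Demo with a very short limit so the example finishes quickly.
print(asyncio.run(run_with_timeout(slow_script, page=None, timeout=0.05)))
print(asyncio.run(run_with_timeout(fast_script, page=None, timeout=0.05)))
```

With the short demo limit, the first call reports `timeout` and the second reports `completed`.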

Quick Start

async def main(page):
    # Navigate to a webpage
    await page.goto('https://example.com')

    # Get page title
    title = await page.title()
    print(f'Page title: {title}')

    # Pass the extracted data to scrape_data
    scrape_data({'title': title})
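
The script above calls `page` and `scrape_data` without importing either, so both appear to be supplied by the platform at run time. To dry-run the same logic locally without a browser, a stand-in page object can be passed in instead. This is only a sketch: `FakePage`, the `collected` list, and the local `scrape_data` below are hypothetical stand-ins and not part of Scrapazoid or Playwright.

```python
import asyncio

collected = []  # hypothetical local store for scraped records


def scrape_data(record):
    """Local stand-in for the platform's scrape_data; it only collects records."""
    collected.append(record)


class FakePage:
    """Hypothetical stub exposing only the two page methods the script uses."""

    def __init__(self, title):
        self._title = title
        self.url = None

    async def goto(self, url):
        self.url = url  # record the navigation instead of loading anything

    async def title(self):
        return self._title


async def main(page):
    await page.goto('https://example.com')
    title = await page.title()
    print(f'Page title: {title}')
    scrape_data({'title': title})


asyncio.run(main(FakePage('Example Domain')))
print(collected)
```

Because `main` receives the page as an argument, the same function body can run against the stub here and against a real Playwright page inside the platform.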

Architecture

Scrapazoid consists of several key components:

  • Flask Application - Web server and API
  • SocketIO - Real-time communication for live updates
  • Playwright Executor - Sandboxed script execution engine
  • Execution Monitor - Tracks and stores execution results
  • Cleanup System - Automatically handles stuck executions
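
As an illustration of what the Cleanup System's handling of stuck executions might involve, the sketch below flags runs that are still marked as running after a cutoff. Everything in it is an assumption for illustration: the `Execution` record, the status strings, the `STUCK_AFTER` threshold, and `mark_stuck` are hypothetical and do not describe Scrapazoid's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: a run still "running" past this age counts as stuck.
STUCK_AFTER = timedelta(minutes=5)


@dataclass
class Execution:  # hypothetical record, not Scrapazoid's actual model
    id: int
    status: str
    started_at: datetime


def mark_stuck(executions, now, stuck_after=STUCK_AFTER):
    """Flag running executions older than stuck_after and return their ids."""
    stuck_ids = []
    for ex in executions:
        if ex.status == "running" and now - ex.started_at > stuck_after:
            ex.status = "timeout"
            stuck_ids.append(ex.id)
    return stuck_ids


now = datetime(2024, 1, 1, 12, 0, 0)
runs = [
    Execution(1, "running", now - timedelta(minutes=10)),
    Execution(2, "running", now - timedelta(minutes=1)),
    Execution(3, "completed", now - timedelta(hours=1)),
]
print(mark_stuck(runs, now))  # [1]
```

Passing `now` in as a parameter keeps the sweep deterministic, which makes it straightforward to test from a background job or a unit test.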

Getting Started

  1. Installation Guide - Set up Scrapazoid
  2. Quick Start - Run your first script
  3. Configuration - Configure your instance

Support

License

See the LICENSE file in the repository for details.