Simulating Realistic Load Testing with Playwright and Async Python

Repo: https://github.com/FarrelPandhita/git-viewer

When it comes to testing web applications, traditional synchronous approaches often hit performance bottlenecks. Whether you're a developer needing to validate your infrastructure, a DevOps engineer stress-testing a new deployment, or a researcher analyzing web performance under load, you need tools that can simulate realistic traffic patterns efficiently. Today, I'm sharing git-viewer, an async-first web visitor tool that demonstrates how to build scalable testing solutions using Playwright and Python's asyncio.

The Challenge: Testing at Scale

Web testing isn't just about checking whether a page loads; it's about understanding how your application behaves when thousands of real users visit simultaneously. Traditional approaches often rely on sequential page visits: if each visit takes 2 seconds, 10,000 visits take 20,000 seconds, or roughly five and a half hours, before you have a complete result.

This is where asynchronous programming shines. By leveraging Python's async/await syntax and Playwright's non-blocking browser automation, we can dramatically increase throughput while maintaining realistic user behavior patterns.
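To make the difference concrete, here is a minimal sketch that uses only the standard library. `fake_visit` is a hypothetical stand-in for a page visit (the sleep plays the role of waiting on the network), so the numbers illustrate the scheduling effect rather than real browser performance.

```python
import asyncio
import time


async def fake_visit(visit_id: int, seconds: float = 0.05) -> int:
    """Stand-in for one page visit; the sleep plays the role of network wait."""
    await asyncio.sleep(seconds)
    return visit_id


async def run_sequential(n: int) -> float:
    """Await visits one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for i in range(n):
        await fake_visit(i)
    return time.perf_counter() - start


async def run_concurrent(n: int) -> float:
    """Start all visits at once with gather and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_visit(i) for i in range(n)))
    return time.perf_counter() - start


async def compare(n: int = 10) -> tuple[float, float]:
    """Return (sequential_seconds, concurrent_seconds) for n simulated visits."""
    return await run_sequential(n), await run_concurrent(n)


if __name__ == "__main__":
    seq, conc = asyncio.run(compare())
    print(f"sequential: {seq:.2f}s, concurrent: {conc:.2f}s")
```

The sequential run takes about n times the per-visit wait, while the concurrent run takes about one wait in total, because the event loop switches to another visit whenever one is idle.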

Introducing git-viewer

git-viewer is a lightweight but powerful async web visitor that can simulate thousands of page visits with minimal overhead. Here's what makes it stand out:

Key Features

Asynchronous Architecture: git-viewer drives a single headless browser through Playwright's async API and gives every visit its own browser context. Each step of a visit is awaited rather than blocking the thread, so the event loop stays free for other work and visits can be overlapped when you need to simulate traffic at scale.

Realistic User Simulation: The tool rotates through multiple user agents and simulates natural browsing behaviors like page scrolling and varied viewing durations. This helps ensure your tests reflect actual user interactions rather than artificial bot patterns.

Flexible Configuration: Every aspect is customizable. Need to test 100 visits or 100,000? Want to simulate fast readers or users spending more time on content? Simply adjust the parameters to match your testing scenario.

Comprehensive Logging: Each visit is logged with timestamps and response codes, giving you visibility into success rates, timing patterns, and errors. This data is invaluable for performance analysis and debugging.

Network Awareness: The tool waits for the network idle state rather than just DOM ready, ensuring all page resources have fully loaded before recording the visit. This provides more accurate real-world testing.

How It Works

Let's walk through the architecture:

Browser Launch (Headless)
    ↓
Loop VISIT_COUNT times:
    ├─ Select random user agent
    ├─ Create new browser context
    ├─ Navigate to URL
    ├─ Wait for network idle
    ├─ Simulate viewing time (0.5-2 seconds)
    ├─ Scroll to simulate activity
    ├─ Log results
    ├─ Close context
    └─ Random delay (0.05-0.25 seconds)

The Core Flow

User Agent Rotation: By randomly selecting from a pool of realistic browser strings, we avoid appearing as a coordinated bot attack. Each context gets a different identity, mimicking diverse traffic sources.

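Page Navigation: We navigate to your target URL and wait for networkidle, which Playwright reaches once there have been no network connections for at least 500 ms. This is a key difference from naive approaches that only check for DOM ready: stylesheets, images, and scripts have usually finished loading by then, although pages that poll or stream continuously may never go idle and will hit the timeout instead.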

Activity Simulation: We introduce randomized viewing times and simulate scrolling, making traffic patterns look natural to any basic analytics tracking.

Error Resilience: Each visit is wrapped in a try/except block, so individual failures don't crash the entire testing run. Failed visits are logged without stopping progress.

Context Isolation: Creating a new browser context for each visit ensures complete isolation—no cookie contamination, cache conflicts, or state leakage between visits.
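The flow and the five points above can be sketched roughly as follows. This is not the repository's code: `visit_once`, the shortened `USER_AGENTS` strings, the 30-second timeout, and the 800-pixel scroll are assumptions made for illustration. Only the Playwright calls themselves (`new_context`, `new_page`, `goto`, `mouse.wheel`, `close`) come from Playwright's async API.

```python
import asyncio
import logging
import random

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("visits")

# Hypothetical, shortened pool; the real strings live in the repository's script.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleDesktop/1.0",
    "Mozilla/5.0 (Linux; Android 13) ExampleMobile/1.0",
]


async def visit_once(browser, url, min_view=0.5, max_view=2.0):
    """Run one isolated visit; return the HTTP status, or None on failure."""
    context = await browser.new_context(user_agent=random.choice(USER_AGENTS))
    try:
        page = await context.new_page()
        response = await page.goto(url, wait_until="networkidle", timeout=30_000)
        await asyncio.sleep(random.uniform(min_view, max_view))  # viewing time
        await page.mouse.wheel(0, 800)  # scroll down to simulate activity
        status = response.status if response else None
        log.info("visited %s -> %s", url, status)
        return status
    except Exception as exc:  # one failed visit must not stop the run
        log.warning("visit to %s failed: %s", url, exc)
        return None
    finally:
        await context.close()  # always drop cookies, cache, and page state


async def main(url="https://example.com", visit_count=3):
    # Imported here so visit_once can be exercised without a browser installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            for _ in range(visit_count):
                await visit_once(browser, url)
                await asyncio.sleep(random.uniform(0.05, 0.25))  # inter-visit delay
        finally:
            await browser.close()
```

Putting `context.close()` in a `finally` clause is the design choice that matters most here: a context that is never closed keeps its pages and memory alive for the rest of the run.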

Getting Started

Installation

# Clone the repository
git clone https://github.com/FarrelPandhita/git-viewer.git
cd git-viewer

# Install dependencies
pip install playwright

# Download browser binaries
playwright install chromium

Basic Configuration

Edit play_visitors_async_safe.py to set your target URL and testing parameters:

URL = "https://example.com"        # Your test target
VISIT_COUNT = 10000                # Number of visits
MIN_VIEW_SECONDS = 0.5             # Minimum viewing time
MAX_VIEW_SECONDS = 2.0             # Maximum viewing time
MIN_INTER_DELAY = 0.05             # Min delay between visits
MAX_INTER_DELAY = 0.25             # Max delay between visits
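If you change these values often, it may help to validate them before a long run starts. The sketch below is one possible way to do that; `VisitConfig` and its methods are hypothetical names, not part of the script, which uses the plain module-level constants shown above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class VisitConfig:
    """Hypothetical wrapper around the script's constants, with sanity checks."""
    url: str = "https://example.com"
    visit_count: int = 10000
    min_view_seconds: float = 0.5
    max_view_seconds: float = 2.0
    min_inter_delay: float = 0.05
    max_inter_delay: float = 0.25

    def __post_init__(self):
        # Fail fast on values that would otherwise surface hours into a run.
        if self.visit_count < 1:
            raise ValueError("visit_count must be at least 1")
        if not 0 <= self.min_view_seconds <= self.max_view_seconds:
            raise ValueError("view seconds must satisfy 0 <= min <= max")
        if not 0 <= self.min_inter_delay <= self.max_inter_delay:
            raise ValueError("inter delay must satisfy 0 <= min <= max")

    def view_time(self) -> float:
        """Random viewing time within the configured range."""
        return random.uniform(self.min_view_seconds, self.max_view_seconds)

    def inter_delay(self) -> float:
        """Random pause between visits within the configured range."""
        return random.uniform(self.min_inter_delay, self.max_inter_delay)
```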

Running the Tests

In Jupyter/Colab:

await main()

As a standalone script:

python play_visitors_async_safe.py
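The two invocations differ because Jupyter and Colab already run an event loop, so a cell can `await` a coroutine directly, whereas `asyncio.run()` raises a `RuntimeError` when called from a running loop. A plain script has no loop yet and needs `asyncio.run()`. A minimal sketch of that distinction, with a placeholder `main()` and hypothetical helper names:

```python
import asyncio


async def main() -> str:
    """Placeholder for the script's real main(); it only yields once here."""
    await asyncio.sleep(0)
    return "done"


def loop_is_running() -> bool:
    """True inside Jupyter/Colab cells, where `await main()` is the right call."""
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False


def run_from_script() -> str:
    """Entry point for a standalone script: start a fresh event loop."""
    return asyncio.run(main())


if __name__ == "__main__":
    if not loop_is_running():
        print(run_from_script())
```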

Real-World Scenarios

Pre-Deployment Testing: Run git-viewer against a staging environment before going live to ensure your infrastructure can handle launch day traffic.

Performance Regression Testing: Establish baseline metrics, then re-run the tests after code changes to catch performance degradations early.

Capacity Planning: Gradually increase VISIT_COUNT to find your application's breaking point and plan infrastructure accordingly.

Analytics Validation: Verify that your analytics and monitoring systems correctly track high-traffic scenarios.

Cache Strategy Testing: Observe how your caching layers respond to repeated visits with different user agents.

Customization Examples

Simulating Mobile Traffic

Add mobile user agents to your rotation:

USER_AGENTS = [
    # Desktop browsers
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...",
    # Mobile browsers
    "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0) AppleWebKit/605.1.15...",
    "Mozilla/5.0 (Linux; Android 13) AppleWebKit/537.36...",
]
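A flat list gives every string the same chance of being picked. If you want to control the share of mobile traffic, one option is to keep two pools and choose between them with a probability. The sketch below assumes hypothetical, shortened agent strings and a hypothetical `pick_user_agent` helper.

```python
import random

# Hypothetical, shortened strings for illustration; use full real ones in practice.
DESKTOP_AGENTS = ["ExampleDesktop/1.0 (Windows)", "ExampleDesktop/1.0 (macOS)"]
MOBILE_AGENTS = ["ExampleMobile/1.0 (iPhone)", "ExampleMobile/1.0 (Android)"]


def pick_user_agent(mobile_share=0.6, rng=None):
    """Return a mobile agent with probability mobile_share, else a desktop one."""
    rng = rng or random  # pass a seeded random.Random for reproducible runs
    pool = MOBILE_AGENTS if rng.random() < mobile_share else DESKTOP_AGENTS
    return rng.choice(pool)
```

Note that a mobile user agent alone does not change the page size. Playwright's `new_context` also accepts `viewport`, `is_mobile`, and `has_touch` if the target serves a different layout to phones.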

Adding Custom Headers

Simulate specific client types by including custom headers:

context = await browser.new_context(
    user_agent=ua,
    extra_http_headers={
        "Accept-Language": "en-US,en;q=0.9",
        "X-Custom-Header": "testing"
    }
)

Longer Timeouts for Heavy Pages

Adjust for slower or content-heavy pages:

# Playwright timeouts are in milliseconds: 60000 ms = 60 s (the default is 30 s)
response = await page.goto(URL, wait_until="networkidle", timeout=60000)

Tracking Response Data

Collect HTTP status codes for analysis:

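from collections import Counter

status_codes = Counter()  # visits per HTTP status code
# Later, after a visit (page.goto can return None, so guard before reading .status)
status = response.status if response else None
status_codes[status] += 1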
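Once the statuses are collected, a small summary makes the run easier to judge. The sketch below assumes you append each visit's status to a list, using `None` for visits that failed before a response arrived; `summarize` is a hypothetical helper.

```python
from collections import Counter


def summarize(statuses):
    """Count statuses (None = failed visit) and compute the share of 2xx responses."""
    counts = Counter(statuses)
    total = sum(counts.values())
    ok = sum(n for code, n in counts.items() if code is not None and 200 <= code < 300)
    return {
        "total": total,
        "counts": dict(counts),
        "success_rate": ok / total if total else 0.0,
    }


if __name__ == "__main__":
    print(summarize([200, 200, 404, None]))
```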

Performance Considerations

Memory Management: Each browser context consumes memory. For very large test runs (50,000+ visits), consider running multiple smaller batches rather than one massive batch.
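One way to batch a large run is sketched below. `batch_sizes` and `run_in_batches` are hypothetical helpers, and `run_batch` is assumed to be your own coroutine that launches a browser, performs that many visits, and closes the browser again so memory is released between batches.

```python
import asyncio
from typing import Awaitable, Callable


def batch_sizes(total_visits: int, batch_size: int) -> list[int]:
    """Split total_visits into chunks of at most batch_size visits each."""
    if total_visits < 0 or batch_size <= 0:
        raise ValueError("total_visits must be >= 0 and batch_size must be > 0")
    full, rest = divmod(total_visits, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


async def run_in_batches(
    total_visits: int,
    batch_size: int,
    run_batch: Callable[[int], Awaitable[None]],
) -> int:
    """Call run_batch(n) once per batch, one after another; return visits done."""
    done = 0
    for n in batch_sizes(total_visits, batch_size):
        await run_batch(n)  # expected to open and close its own browser
        done += n
    return done
```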

CPU Scaling: The headless browser still renders pages. Monitor CPU usage and adjust timing parameters if the system becomes bottlenecked.

Rate Limiting: Server rate limiting may kick in during testing. If you see 429 errors, increase MAX_INTER_DELAY to space out requests more generously.
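Instead of editing MAX_INTER_DELAY by hand, the delay ceiling could also adapt during the run. The following is a sketch of that idea, not something the script does: `adjust_max_delay`, the doubling factor, the 0.9 relaxation factor, and the 30-second ceiling are all assumptions.

```python
def adjust_max_delay(status, current_max, floor=0.25, ceiling=30.0):
    """Double the delay ceiling after a 429; otherwise relax it slowly toward floor."""
    if status == 429:
        return min(current_max * 2, ceiling)
    return max(current_max * 0.9, floor)


if __name__ == "__main__":
    max_delay = 0.25
    for status in (200, 429, 429, 200):
        max_delay = adjust_max_delay(status, max_delay)
        print(status, round(max_delay, 3))
```

After each visit you would pass the observed status in and use the returned value as the upper bound for the next random delay.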

Network Bandwidth: High-traffic testing can saturate your connection. Run tests during maintenance windows if possible.

Troubleshooting Guide

Browser crashes or hangs: Increase viewing times and reduce concurrent load. Check available system memory.

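Timeout errors: Increase the timeout parameter, since some pages just load slowly. Verify the URL is correct and accessible. If a page keeps connections open (polling, streaming, or live analytics), it may never reach networkidle and will time out regardless of the limit.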

High failure rate: Check if your target server is rate-limiting or blocking the testing patterns. Increase delays between visits or reduce VISIT_COUNT.

Memory bloat: Add periodic garbage collection or run multiple smaller batches instead of one enormous test.

Important: Ethical Use

This tool is powerful, which comes with responsibility. Only run git-viewer against:

Your own applications and infrastructure
Third-party services with explicit written permission
Your staging/testing environments

Unauthorized load testing is unethical and may violate terms of service or local laws. Always be a responsible member of the web community.

Why Async Matters

Traditional single-threaded testing tools would require complex multiprocessing or threading to achieve what git-viewer does with simple async/await. Python's asyncio makes this elegant:

Simpler code: No thread locks, race conditions, or complex synchronization
Better scalability: Thousands of concurrent operations without thousands of threads
Lower overhead: Switching between coroutines is cheaper than switching between OS-level threads
Easier debugging: Stack traces and error handling are more straightforward

This is why async/await has become a common choice for I/O-bound work like web scraping, API testing, and load testing.
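The points above about scalability without threads can be illustrated with a bounded-concurrency sketch. `run_bounded` and `ConcurrencyTracker` are hypothetical names, and the tracker's sleep stands in for a real visit. The semaphore is the part that matters: it caps how many visits are in flight so the browser and the target are not overwhelmed.

```python
import asyncio


async def run_bounded(jobs, limit: int):
    """Run zero-argument coroutine functions concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:  # waits here while `limit` jobs are already running
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


class ConcurrencyTracker:
    """Records how many simulated visits overlap at the busiest moment."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    async def visit(self):
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # stand-in for page load and viewing time
        self.active -= 1
        return "ok"


if __name__ == "__main__":
    tracker = ConcurrencyTracker()
    results = asyncio.run(run_bounded([tracker.visit] * 20, limit=5))
    print(len(results), "visits, peak concurrency", tracker.peak)
```

No locks are needed around `active` and `peak` because coroutines only switch at `await` points, which is the "simpler code" benefit in practice.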

Conclusion

git-viewer demonstrates that building professional-grade load testing tools doesn't require complex frameworks or expensive platforms. With just Python, Playwright, and async/await, you can create a tool that's both powerful and maintainable.

Whether you're validating infrastructure before a product launch, analyzing performance metrics, or learning how to build async applications, git-viewer is a solid foundation to build upon.

The full source code is available on GitHub at FarrelPandhita/git-viewer. Feel free to fork it, customize it for your needs, and contribute improvements back to the community.

Happy testing!

Have you built async tools in Python? Share your experiences in the comments below. What use cases would you add to git-viewer?