Automating Browser Agent Video Demos with Shot-Scraper
Author: Nino, Senior Tech Editor

The evolution of Large Language Models (LLMs) has shifted from simple text generation to 'Actionable Agents' or 'Computer Use' models. As these agents become more capable of navigating the web, clicking buttons, and filling out forms, a new challenge emerges: observability. How do we verify what an agent did without watching it in real time? Simon Willison recently added a feature to his shot-scraper tool that addresses this by letting developers record video demos of agents at work. Paired with the low-latency LLM endpoints from n1n.ai, this lets developers build, test, and document these agents without recording sessions by hand.
The Need for Visual Proof in Agentic Workflows
When you deploy an agent powered by Claude 3.5 Sonnet or GPT-4o to perform a task—such as booking a flight or scraping a dynamic dashboard—relying solely on text logs is insufficient. Logs might tell you that a button was clicked, but they won't show you if the UI shifted, if a pop-up blocked the view, or if the agent struggled with a specific CSS selector.
Visual recording provides:
- Debugging Clarity: See exactly where the agent failed in the DOM.
- Stakeholder Trust: Show non-technical users exactly how the AI performs the task.
- Training Data: High-quality video logs can be used to further fine-tune vision-based models.
To achieve this smoothly, the underlying LLM must respond with low latency. Using the optimized API routes at n1n.ai ensures that the agent's 'thinking' time doesn't lead to excessively long, boring videos with minutes of static frames.
Introducing shot-scraper video
shot-scraper is a CLI tool built on top of Playwright, designed to take screenshots and, now, to record videos of web pages. The video command is particularly useful for capturing the autonomous actions of an AI agent.
Installation
To get started, you need Python and the Playwright dependencies:
pip install shot-scraper
shot-scraper install
If you want to record videos, make sure ffmpeg is installed on your system as well. Playwright's built-in recorder produces WebM, so an MP4 output file implies an ffmpeg encoding step.
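Before starting a long recording run, it can help to confirm that the required executables are actually on PATH. The following is a minimal sketch of such a preflight check; the helper name `missing_tools` is hypothetical and not part of shot-scraper.

```python
import shutil


def missing_tools(tools=("shot-scraper", "ffmpeg")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# Example usage:
# missing = missing_tools()
# if missing:
#     raise SystemExit(f"Install these before recording: {', '.join(missing)}")
```

Failing early like this is cheaper than discovering a missing encoder after an agent has already completed its task.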
Practical Implementation: Recording an Agent
Let's look at a scenario where an agent needs to navigate a complex site. We can use shot-scraper to record the session while the agent is driven by a script.
Basic Command Line Usage
The simplest way to record a page for a set duration is:
shot-scraper video https://n1n.ai --duration 10 --output n1n-demo.mp4
However, for agents, we often need more control. We might want the agent to perform actions and only stop recording when the task is complete.
Advanced Scripting with Python
You can combine shot-scraper logic with an agent loop. Here is a conceptual implementation using a high-performance model accessed via n1n.ai:
import subprocess


def record_agent_session(url, output_file):
    """Start a shot-scraper recording in the background and return the process."""
    # The CLI is used here for simplicity.
    cmd = [
        "shot-scraper", "video", url,
        "--output", output_file,
        "--width", "1280",
        "--height", "720",
        "--wait", "5000",  # Wait 5 seconds for the initial load
    ]
    print(f"Starting recording for {url}...")
    # Popen does not block, so the agent loop can run while recording continues.
    # In a real scenario, you would interact with the browser via Playwright directly.
    process = subprocess.Popen(cmd)
    # Simulate agent logic powered by n1n.ai
    # agent.perform_tasks(url)
    return process


# Example usage
# session = record_agent_session("https://example.com", "agent_work.mp4")
# session.wait()  # Block until the recording process exits
Deep Dive: How it Works Under the Hood
shot-scraper video utilizes the Playwright browser.new_context(record_video_dir="...") API. When the context is closed, Playwright automatically saves the recording.
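To make the mechanism concrete, here is a minimal sketch that uses Playwright's Python API directly rather than shot-scraper. It assumes Playwright and a Chromium build are installed; the function names `video_context_options` and `record_page` are illustrative, not part of either library. Playwright's recorder writes WebM files into the given directory.

```python
from pathlib import Path


def video_context_options(video_dir, width=1280, height=720):
    """Build keyword arguments for browser.new_context() that enable recording."""
    return {
        "record_video_dir": str(video_dir),
        "record_video_size": {"width": width, "height": height},
        "viewport": {"width": width, "height": height},
    }


def record_page(url, video_dir="videos", hold_ms=3000):
    """Open `url`, keep the page open for `hold_ms`, and return the video path."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    Path(video_dir).mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context(**video_context_options(video_dir))
        page = context.new_page()
        page.goto(url)
        page.wait_for_timeout(hold_ms)  # An agent loop would act on `page` here
        video_path = page.video.path()
        context.close()  # Closing the context finalizes the recording on disk
        browser.close()
    return video_path


# Example usage:
# print(record_page("https://example.com"))
```

Because the recording is tied to the context's lifetime, an agent loop can simply close the context when its task is done instead of guessing a fixed duration.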
A common tip from the community is to set the viewport size explicitly, because agents often behave differently on mobile and desktop layouts. shot-scraper lets you define the size on the command line:
shot-scraper video https://github.com/simonw/shot-scraper \
--viewport 800 600 \
--output github.mp4
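To compare agent behavior across layouts, one option is to generate one recording command per viewport. The sketch below only builds and prints the commands. The preset names and sizes are hypothetical, and it uses the `--width` and `--height` flags from the Python example in this article; since the article shows more than one way to set the size, check `shot-scraper video --help` for the form your version accepts.

```python
import shlex

# Hypothetical viewport presets; the names and sizes are illustrative only.
VIEWPORTS = {
    "mobile": (390, 844),
    "desktop": (1280, 720),
}


def video_commands(url, stem):
    """Return one shot-scraper command (as an argument list) per preset."""
    commands = []
    for name, (width, height) in VIEWPORTS.items():
        commands.append([
            "shot-scraper", "video", url,
            "--width", str(width),
            "--height", str(height),
            "--output", f"{stem}-{name}.mp4",
        ])
    return commands


# Print the commands so they can be reviewed before being run.
for cmd in video_commands("https://github.com/simonw/shot-scraper", "github"):
    print(shlex.join(cmd))
```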
Comparison of Recording Methods
| Feature | Manual OBS Recording | Playwright Native | shot-scraper video |
|---|---|---|---|
| Automation | None | High (requires code) | Very High (CLI-first) |
| Headless Support | No | Yes | Yes |
| Ease of Use | Low | Medium | High |
| Resource Overhead | High | Low | Low |
| Integration | Difficult | Native Python/JS | Shell/Python |
Optimizing for Speed with n1n.ai
Recording a browser session is resource-intensive. If your LLM latency is high, the browser sits idle, and the resulting video is bloated with inactivity. By using n1n.ai, you benefit from:
- Aggregated Throughput: Access the fastest instances of GPT-4o and Claude 3.5 Sonnet.
- Reduced TTFT (Time to First Token): The faster the agent receives instructions, the more fluid the video recording will be.
- Reliability: If one provider goes down, n1n.ai routes your request to another, so a single provider outage is less likely to break your automated recording pipeline.
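To check how much of a recording is spent waiting on the model, you can time each LLM call and compare the total against the session length. This is a minimal sketch; `IdleTracker` is a hypothetical helper and makes no assumptions about which API client you use.

```python
import time


class IdleTracker:
    """Accumulate time spent waiting on the model versus total session time."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()
        self.waiting = 0.0

    def timed_call(self, fn, *args, **kwargs):
        """Call `fn` and add its wall-clock duration to the waiting total."""
        began = self._clock()
        try:
            return fn(*args, **kwargs)
        finally:
            self.waiting += self._clock() - began

    def idle_fraction(self):
        """Return waiting time divided by elapsed session time (0.0 if none)."""
        elapsed = self._clock() - self._start
        return self.waiting / elapsed if elapsed > 0 else 0.0


# Example usage (call_model is a placeholder for your own LLM request function):
# tracker = IdleTracker()
# action = tracker.timed_call(call_model, prompt)
# print(f"{tracker.idle_fraction():.0%} of the session was model latency")
```

A high idle fraction suggests the video will contain long static stretches, which is a signal to reduce latency or trim the recording afterwards.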
Handling Dynamic Content and Authentication
One common hurdle is recording pages behind a login. shot-scraper handles this gracefully through the auth.json file. You can use the shot-scraper auth command to create a session file, which the video command will then use to record as an authenticated user.
# First, create the authentication file
shot-scraper auth https://example.com/login auth.json
# Then, record using that session
shot-scraper video https://example.com/dashboard -a auth.json --output private.mp4
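Saved sessions expire, and a recording made with a stale session will usually just show the login page. The sketch below checks a session file for expired cookies before recording. It assumes auth.json follows Playwright's storage-state layout (a top-level `cookies` list whose entries have `name` and `expires` fields, with `expires` in Unix seconds); the helper name is hypothetical.

```python
import json
import time


def expired_cookies(auth_path, now=None):
    """Return names of cookies in a storage-state file that have expired."""
    now = time.time() if now is None else now
    with open(auth_path, encoding="utf-8") as fh:
        state = json.load(fh)
    expired = []
    for cookie in state.get("cookies", []):
        expires = cookie.get("expires", -1)
        # A value of -1 marks a session cookie with no expiry timestamp.
        if expires != -1 and expires < now:
            expired.append(cookie["name"])
    return expired


# Example usage:
# stale = expired_cookies("auth.json")
# if stale:
#     print("Re-run shot-scraper auth; expired cookies:", stale)
```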
Technical Considerations for Developers
When implementing this at scale, keep in mind the following:
- Frame Rates: By default, Playwright records at 25 FPS. For documentation, you might want to downsample this using ffmpeg to save space.
- Timeout Management: Agents can get stuck. Ensure your shot-scraper command has a --duration or a hard timeout in your Python wrapper to prevent infinite recording.
- Storage: MP4 files can grow quickly. Consider an automated workflow that uploads these to an S3 bucket and generates a signed URL for your debugging dashboard.
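The first two points can be sketched with the standard library alone. The helpers below are illustrative: `run_with_timeout` enforces a hard limit on any command, and `downsample_cmd` builds an ffmpeg invocation that uses the `fps` video filter. The default of 10 FPS is an arbitrary choice, not a recommendation from shot-scraper or Playwright.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run `cmd`; return True if it finished in time, False if it was killed."""
    try:
        subprocess.run(cmd, check=False, timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired.
        return False


def downsample_cmd(src, dst, fps=10):
    """Build an ffmpeg command that re-encodes `src` at a lower frame rate."""
    return ["ffmpeg", "-y", "-i", src, "-vf", f"fps={fps}", dst]


# Example usage:
# finished = run_with_timeout(
#     ["shot-scraper", "video", "https://example.com", "--output", "raw.mp4"],
#     timeout_s=120,
# )
# if finished:
#     run_with_timeout(downsample_cmd("raw.mp4", "small.mp4"), timeout_s=300)
```

Returning a boolean rather than raising keeps the wrapper easy to use inside a batch job, where one stuck agent should not stop the remaining recordings.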
Conclusion
Combining shot-scraper with robust AI agents opens up new possibilities for automated QA, visual documentation, and transparent AI operations. By providing a clear window into the 'mind' of the agent, developers can build more reliable systems. To power these agents with the industry's most stable and high-speed infrastructure, always choose n1n.ai for your API needs.