Claude computer use
Claude can control computers by looking at screens, moving cursors, clicking buttons, and typing text. This “Computer Use” capability launched in October 2024 as a public beta, available through Anthropic’s API, Amazon Bedrock, and Google Cloud Vertex AI.
Start small with AI agent tasks
Put your step-by-step instructions in the Tallyfy task description. Start with short, mundane tasks. Don’t ask an AI agent to handle huge, decision-driven jobs - they’re prone to unpredictable behavior and hallucination, and costs add up fast.
Computer Use vs MCP integration
This article covers Claude Computer Use - where Claude sees and controls screens through screenshots, mouse movements, and keyboard actions. That’s different from Claude’s MCP integration, which gives text-based chat access to data sources and APIs.
When to use each:
- Computer Use (this article): Automating visual UI tasks - clicking buttons, filling forms, working through menus
- MCP Integration: Data queries, API-based workflow management, text-based automation
Both can complement each other in automation workflows.
Rather than building thousands of app-specific integrations, Anthropic gave Claude general computer skills. Claude uses an API to see and interact with any application inside a sandboxed environment.
What to notice:
- Tallyfy provides the task description and expected outputs that guide Claude’s actions
- Claude loops through screenshot-analyze-act cycles until the task is done
- Results, logs, and screenshots get captured back into Tallyfy fields
Models with computer use support:
- Claude Sonnet 4.6 - Best balance of performance and cost for most automation
- Claude Opus 4.6 - Flagship model for the most demanding tasks
- Claude Haiku 4.5 - Lighter option for simpler, faster automation
Performance benchmarks (OSWorld):
- Sonnet 4.6 scores 72.5% - now matching human-level performance (72.4%)
- Rapid improvement from earlier models (Sonnet 4.5 scored 61.4%)
- Still experimental, so expect some errors on tricky UI interactions
Here’s how Tallyfy coordinates Claude’s computer use through an iterative loop - Claude perceives, acts, and gets feedback until your task is done.
What to notice:
- Tallyfy triggers your intermediary app via webhook with task data
- The loop between Claude and the sandbox continues until the task is done
- All tool execution happens in an isolated sandbox for security
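The perceive-act loop above can be sketched as a small function. This is a minimal illustration, not the Anthropic SDK: `create_message` stands in for a call to Claude's Messages API, and `execute_tool` stands in for your sandbox's tool runner; both names are hypothetical.

```python
def run_agent_loop(create_message, execute_tool, task, max_turns=20):
    """Minimal perceive-act loop: ask the model, run any requested
    tools in the sandbox, feed the results back, and repeat until
    the model stops requesting tools."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        response = create_message(messages)  # call Claude with history
        tool_uses = [b for b in response["content"] if b["type"] == "tool_use"]
        if not tool_uses:  # no more actions requested: task is done
            return response
        messages.append({"role": "assistant", "content": response["content"]})
        results = [{"type": "tool_result",
                    "tool_use_id": b["id"],
                    "content": execute_tool(b["name"], b["input"])}
                   for b in tool_uses]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent loop did not converge")
```

The loop terminates when a response contains no `tool_use` blocks; the `max_turns` cap guards against runaway tasks.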
Sandboxed environment: The Docker container typically includes:
- A virtual X11 display server (like Xvfb) for rendering the desktop
- A lightweight Linux desktop environment
- Pre-installed apps (Firefox, LibreOffice, text editors)
- Your implementations of Anthropic’s defined tools
Three core tools (Anthropic-defined, you execute them):
- computer: Mouse/keyboard actions (clicks, typing, scrolling, cursor movement) and taking screenshots
- text_editor: View, create, and edit files
- bash: Run shell commands in the sandbox
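Since Anthropic defines these tools but you execute them, your code needs a dispatcher that routes each `tool_use` block to the right sandbox implementation. A sketch, where the `sandbox` object and its method names are hypothetical:

```python
def execute_tool(name, tool_input, sandbox):
    """Dispatch an Anthropic-defined tool call to your own sandbox
    implementation (sandbox API here is illustrative)."""
    if name == "computer":
        # clicks, typing, scrolling, cursor movement, screenshots
        return sandbox.do_computer_action(tool_input)
    if name == "text_editor":
        # view, create, and edit files
        return sandbox.edit_file(tool_input)
    if name == "bash":
        # run a shell command inside the sandbox
        return sandbox.run_shell(tool_input["command"])
    raise ValueError(f"unknown tool: {name}")
```

Whatever the dispatcher returns becomes the `tool_result` content sent back to Claude on the next turn.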
API pricing (verify current rates at Anthropic’s pricing page ↗):
- Claude Sonnet 4.6: $3 per million input tokens, $15 per million output tokens
- Claude Haiku 4.5: $1 per million input tokens, $5 per million output tokens
- Computer use adds extra input tokens per request (tool definitions and an extended system prompt), on top of your own prompt
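Using the rates listed above, per-task cost is simple arithmetic. A hypothetical helper (verify the rates against Anthropic's pricing page before relying on them):

```python
# USD per million tokens: (input_rate, output_rate); figures from this
# article - confirm current rates on Anthropic's pricing page.
RATES = {
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-haiku-4-5": (1.00, 5.00),
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the API cost in USD for one task's token usage."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
```

For example, a task that consumes 500K input tokens and 100K output tokens on Haiku 4.5 would cost roughly $1.00. Remember that screenshot-heavy agent loops accumulate input tokens quickly, since each turn resends the conversation history.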
Access requirements:
- Anthropic API key with sufficient credits
- Available through Anthropic API, Amazon Bedrock, or Google Cloud Vertex AI
- Docker needed for the reference implementation
Computer Use works well for specific automation scenarios. Early adopters include Asana, Canva, Replit, and DoorDash.
Good applications:
- Form filling across desktop apps
- Extracting data from legacy systems without APIs
- QA testing with synthetic test case generation
- Multi-step workflows spanning multiple applications
- Desktop file management tasks
Claude’s computer use is still developing. Anthropic acknowledges these constraints:
Technical:
- Latency: Tasks with dozens or hundreds of steps can be slow
- Error-prone: Scrolling, dragging, and zooming remain challenging
- Resolution: May struggle above 1024x768 or 1280x800 due to image scaling
- Reliability: Some actions people do effortlessly are still hard for Claude
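One practical mitigation for the resolution limit is to downscale screenshots before sending them, then scale Claude's reported click coordinates back up. A sketch of the dimension math (the 1280x800 target comes from the guidance above):

```python
def scale_for_model(width, height, max_size=(1280, 800)):
    """Compute downscaled screenshot dimensions that fit the
    recommended resolution while preserving aspect ratio. Returns
    (new_width, new_height, factor); divide model-reported click
    coordinates by `factor` to map them back to the real screen."""
    max_w, max_h = max_size
    factor = min(max_w / width, max_h / height, 1.0)  # never upscale
    return round(width * factor), round(height * factor), factor
```

A 2560x1600 display would be sent as 1280x800 with factor 0.5, so a click Claude reports at (640, 400) maps to (1280, 800) on the real screen.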
Safety:
- Claude may follow instructions found on-screen, even if they conflict with yours
- Risk of prompt injection from webpages or images
- Potential for misuse if not properly isolated
Rate limits:
- API rate limits apply based on your tier
- Processing time varies with task complexity
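Long agent loops will eventually hit a rate limit mid-task, so the intermediary app should retry transient failures rather than abandon the run. A generic exponential-backoff sketch (the retryable-error check is passed in because the exact exception type depends on your SDK):

```python
import random
import time

def with_backoff(call, is_retryable, max_retries=5, base=1.0):
    """Retry a rate-limited API call with exponential backoff plus
    jitter; re-raise immediately if the error is not retryable or
    retries are exhausted."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            if not is_retryable(exc) or attempt == max_retries - 1:
                raise
            # 1s, 2s, 4s, ... plus random jitter to avoid thundering herd
            time.sleep(base * 2 ** attempt + random.uniform(0, base))
```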
You’ll need to build an intermediary app that connects Tallyfy to the Anthropic API. Anthropic provides a reference implementation with Docker.
1. Get Anthropic API access:
- Get an API key from the Anthropic Console
- Review the API docs on “Tool Use” and “Computer Use”
2. Install Docker:
- Install the latest version of Docker
- Required for the sandboxed environment
3. Pull the reference implementation:
- Anthropic provides a Docker-based reference implementation with containerized environment, tool implementations, and agent loop
- Pull:
docker pull ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
4. Configure the environment:
- Run the Docker container with proper security settings
- Container runs with minimal privileges (1 CPU, 2GB RAM default)
- Access the interface at http://localhost:8080
- Never run Computer Use unattended - always monitor sessions
5. Build the intermediary app:
- Receive webhook requests from Tallyfy
- Build prompts and tool lists for the Claude API
- Manage the agent loop between Claude and your sandbox
- Send results back to Tallyfy
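The intermediary app's core logic is glue between those four responsibilities. A sketch with stubbed dependencies; the payload field names (`task_id`, `description`) are hypothetical, so map them to whatever your Tallyfy webhook actually sends:

```python
def handle_tallyfy_webhook(payload, run_task, post_result):
    """Glue logic for the intermediary app: extract the task from
    the Tallyfy webhook payload, run the Claude agent loop, and
    send the outcome back to Tallyfy. `run_task` wraps the agent
    loop; `post_result` wraps the Tallyfy API call."""
    task_id = payload["task_id"]          # hypothetical field name
    instructions = payload["description"]  # hypothetical field name
    result = run_task(instructions)
    post_result(task_id, {"status": "complete", "output": result})
    return result
```

Keeping the Claude loop and the Tallyfy callback behind injected functions makes this handler easy to test without network access.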
6. Prompt tips:
- Keep tasks simple and well-defined
- Tell Claude to verify outcomes with screenshots after each step
- Suggest keyboard shortcuts for tricky UI elements
- Provide examples of successful interactions when you have them
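The tips above can be folded into a small prompt builder so every task prompt gets the same structure. A sketch; the layout is one reasonable convention, not an Anthropic requirement:

```python
def build_task_prompt(steps, verify=True):
    """Compose a simple, well-scoped task prompt from explicit
    numbered steps, optionally instructing Claude to verify each
    step with a screenshot before moving on."""
    lines = ["Complete the following steps one at a time:"]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, 1)]
    if verify:
        lines.append("After each step, take a screenshot and confirm "
                     "the expected result before continuing.")
    return "\n".join(lines)
```

For example, `build_task_prompt(["Open Firefox", "Navigate to the login page"])` yields a numbered list plus the verification instruction.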
A simplified example of the integration flow:
from anthropic import Anthropic
import os

client = Anthropic(api_key=os.environ['ANTHROPIC_API_KEY'])

# Basic computer use request
response = client.messages.create(
    model="claude-sonnet-4-6-20260220",
    max_tokens=1024,
    tools=[
        {
            "type": "computer_20250124",
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
        },
        {
            "type": "text_editor_20250124",
            "name": "text_editor",
        },
        {
            "type": "bash_20250124",
            "name": "bash",
        },
    ],
    messages=[
        {
            "role": "user",
            "content": "Open the file manager and navigate to Documents folder"
        }
    ]
)

# Handle tool use requests in the response
# Execute tools in your sandbox
# Return results to Claude
# Continue loop until task complete

Note: This is simplified. Real implementations need full agent loop handling, tool execution in a Docker sandbox, and result processing.
Key measures:
- Run Computer Use in a dedicated container or VM with minimal privileges
- Limit internet access to approved domains only
- Never give access to sensitive data or credentials
- Keep Claude isolated from production systems
- Require human confirmation for critical actions
- Enable audit logging
Known risks:
- Prompt injection - Claude may follow on-screen instructions
- Code execution risks if not properly sandboxed
- Information theft if given access to sensitive data
Good fit:
- Desktop app automation (Excel, legacy software)
- Data extraction from systems without APIs
- Automated testing of desktop apps
- Form filling across multiple apps
- Low-risk, repetitive UI tasks
Poor fit:
- Real-time or time-critical operations
- Tasks needing creative judgment
- Social media content creation (restricted by Anthropic)
- High-security environments without proper isolation
Tips for success:
- Start simple and well-defined
- Set strong security boundaries
- Monitor closely and keep humans in the loop
- Test with low-risk data first
Advantages:
- Works with any desktop or web app
- No app-specific APIs or integrations needed
- Adapts when UIs change
Disadvantages:
- Slower than traditional RPA for simple tasks
- Still experimental with some error-prone execution
- Requires Docker and sandbox infrastructure
- Higher latency than direct API calls
Alternatives to consider:
- Traditional RPA for stable, high-volume workflows
- Direct API integrations when available
- Browser-only automation tools for web tasks
- Identify repetitive desktop tasks worth automating
- Document exact steps with screenshots
- Set up Anthropic API access with credits
- Install Docker and pull the reference implementation
- Create a Tallyfy process with clear task instructions
- Test with low-risk, non-sensitive data first
- Set up security isolation and monitoring
- Refine prompts based on success rates
- Scale gradually with proven workflows