A Significant Increase in Digital Labor Automation | CAIS
A Significant Increase in Digital Labor Automation
The newest frontier models automate substantially more real freelance work than their predecessors.
July 1, 2026
Author: Mantas Mazeika
RLI was jointly developed by the Center for AI Safety and Scale Labs · Leaderboard (CAIS) · Leaderboard (Scale)

The Remote Labor Index (RLI) measures how often AI agents can complete real, economically valuable freelance projects (3D & CAD, architecture, graphic design, video and animation, audio, data analysis, web apps, and more) at a quality a paying client would actually accept. Every deliverable is judged by human evaluators against a gold-standard deliverable produced by a paid professional. The headline metric, the automation rate, is the share of projects where the AI's work is judged as good as (or better than) the human's.

At the benchmark's release, the best AI agent automated just 2.5% of projects. Today we're publishing results for three newer models, paired with stronger agent scaffolding. Automation rate is rapidly increasing.

1. New Automation Rates

Fable 5 reaches the highest automation rate measured so far, 16.1%, roughly double Opus 4.8 at 8.3%. GPT‑5.5 reaches 6.3%. All three score above every previously evaluated model.
GPT‑5.5: 6.3%
Opus 4.8: 8.3%
Fable 5: 16.1%* (highest)
For context, the previous published leader sat at 4.17% (Opus 4.6 with the Claude Cowork scaffold), and the field topped out at 2.5% when RLI was released. The frontier has more than quadrupled in under eight months, a concrete signal of how quickly economically capable AI agents are advancing.
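As a quick check on the size of that jump, the sketch below divides the new top rate by the two earlier reference points quoted above. It uses only the three percentages from the text; the constant and function names are illustrative and not part of RLI.

```python
# Automation rates (percent) quoted in the paragraph above.
BEST_AT_RELEASE = 2.5    # best agent when RLI was released
PREVIOUS_LEADER = 4.17   # Opus 4.6 with the Claude Cowork scaffold
NEW_LEADER = 16.1        # Fable 5


def fold_increase(new_rate: float, old_rate: float) -> float:
    """Return how many times larger new_rate is than old_rate."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return new_rate / old_rate


vs_release = fold_increase(NEW_LEADER, BEST_AT_RELEASE)
vs_previous = fold_increase(NEW_LEADER, PREVIOUS_LEADER)

print(f"vs. release-time best: {vs_release:.2f}x")
print(f"vs. previous leader:   {vs_previous:.2f}x")
```

Measured against the 2.5% release-time best, the new leader is about 6.4 times higher; measured against the previous 4.17% leader, it is about 3.9 times higher.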
[Figure: bar chart of automation rate (%) by model]
Gemini 3 Pro: 1.25%
Grok 4: 2.08%
GPT-5.2: 2.50%
Manus 1.6 Max: 2.92%
Opus 4.6: 4.17%
GPT‑5.5: 6.3%
Opus 4.8: 8.3%
Fable 5: 16.1%
Full-automation rate on the Remote Labor Index: the share of projects where each model's deliverable was judged at least as good as the professional's. The three newly evaluated models (Fable 5, Opus 4.8, GPT‑5.5) score above every previously evaluated model.
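The asterisk on Fable 5's 16.1% marks a partial evaluation: 218 of RLI's 240 projects. A minimal sketch of the worst-case arithmetic follows, assuming the reported 16.1% is computed over the 218 evaluated projects. The success count is inferred from that rate by rounding and is not stated in the post, and the function and constant names are illustrative.

```python
TOTAL_PROJECTS = 240    # projects in RLI
EVALUATED = 218         # projects Fable 5 was evaluated on
REPORTED_RATE = 16.1    # percent, assumed to be over the evaluated projects
RUNNER_UP_RATE = 8.3    # Opus 4.8, percent


def automation_rate(successes: int, projects: int) -> float:
    """Share of projects (in percent) judged at least as good as the human's."""
    return 100.0 * successes / projects


# Inferred, not published: the whole number of successes closest to 16.1% of 218.
successes = round(REPORTED_RATE / 100 * EVALUATED)

# Worst case: every unevaluated project is counted as a failure.
worst_case = automation_rate(successes, TOTAL_PROJECTS)

print(f"inferred successes: {successes}")
print(f"rate on evaluated projects: {automation_rate(successes, EVALUATED):.1f}%")
print(f"worst-case rate on all projects: {worst_case:.1f}%")
print(f"still above runner-up: {worst_case > RUNNER_UP_RATE}")
```

Under these assumptions the inferred count is 35 successes, which gives 16.1% over 218 projects and 14.6% over all 240, still above Opus 4.8 at 8.3%.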
* We were able to evaluate 218 of RLI's 240 projects before access to Fable 5 was restricted by the U.S. government. We will update the results on the remaining projects shortly. The 22 unevaluated projects are spread uniformly across the benchmark, not concentrated in any sector or difficulty band. Even under the worst-case assumption that Fable 5 failed every missing project, its automation rate would still be 14.6%, higher than any other model.

2. What the Work Looks Like

Automation rate is a single number, but RLI projects are concrete pieces of commissioned work, each with a client brief, input files, and a professional deliverable.
Ring Design
3D & CAD
The brief: Re-create the client's existing engagement ring with its emerald-cut center stone swapped for a marquise cut, delivering an updated 3D model plus photorealistic rose- and yellow-gold renders.
Input files: ringdetails.png
Deliverables compared (GPT‑5.5, Opus 4.8, Fable 5, Human): photorealistic render; underlying 3D model.
Fable 5's ring design is qualitatively much better than deliverables from previous AIs, although on closer examination it still falls short of professional quality (a low-effort, rounded prong design).
Advertisement Video
Video & Animation
The brief: Produce a ~60-second flat-design 2D animated advertisement for "Skyline Tree Services," set to the provided voiceover, that walks viewers through the company's tree-care process and builds trust in the brand.
Input files: VoiceOver.wav (raw voiceover audio, the only input file)
Deliverables compared (Human, Fable 5, Opus 4.8, GPT‑5.5): three frames per video (intro · consultation · safety).
In 2D vector-graphics animation, visual quality improves noticeably with the newer models, and the animation and its synchronization with the audio become smoother.
Floor Plan & Renders
Architecture
The brief: From a scanned cadastral plan, site photos, and measurements, produce a clean dimensioned floor plan, furniture-layout options, and photorealistic renders of the redesigned bathroom.
Input files: cadastral floor plan.jpg · +11 more
Deliverables compared (GPT‑5.5, Opus 4.8, Fable 5, Human): dimensioned floor plan; bathroom render (panels tagged "Actual 3D model" or "Image-gen “render”").
The floor plans get visibly more accurate and the 3D models more detailed with the newer models,...