A Significant Increase in Digital Labor Automation


The newest frontier models automate substantially more real freelance work than their predecessors.


July 1, 2026


Author: Mantas Mazeika


RLI was jointly developed by the Center for AI Safety and Scale Labs (Leaderboards: CAIS, Scale).

The Remote Labor Index (RLI) measures how often AI agents can complete real, economically valuable freelance projects (3D & CAD, architecture, graphic design, video and animation, audio, data analysis, web apps, and more) at a quality a paying client would actually accept. Every deliverable is judged by human evaluators against a gold-standard deliverable produced by a paid professional. The headline metric, the automation rate, is the share of projects where the AI's work is judged as good as (or better than) the human's.

At the benchmark's release, the best AI agent automated just 2.5% of projects. Today we're publishing results for three newer models, paired with stronger agent scaffolding. Automation rate is rapidly increasing.

1. New Automation Rates

Fable 5 reaches the highest automation rate measured so far, 16.1%, roughly double Opus 4.8 at 8.3%. GPT‑5.5 reaches 6.3%. All three score above every previously evaluated model.

GPT‑5.5: 6.3%
Opus 4.8: 8.3%
Fable 5 (highest): 16.1%*

For context, the previous published leader sat at 4.17% (Opus 4.6 with the Claude Cowork scaffold), and the field topped out at 2.5% when RLI was released. The frontier has more than quadrupled in under eight months, a concrete signal of how quickly economically capable AI agents are advancing.
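As a quick check on these comparisons, the short Python sketch below computes how many times larger the new frontier rate is than each earlier reference point. It uses only the percentages quoted in this post. The variable and function names are illustrative, and the rounded multiples are our own arithmetic rather than figures reported by RLI.

```python
# Automation rates (%) quoted in the post.
RELEASE_BEST = 2.5       # best agent when RLI was released
PREVIOUS_LEADER = 4.17   # Opus 4.6 with the Claude Cowork scaffold
NEW_RATES = {"GPT-5.5": 6.3, "Opus 4.8": 8.3, "Fable 5": 16.1}


def multiple(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


# The frontier is the highest rate among the newly evaluated models.
frontier = max(NEW_RATES.values())

vs_release = multiple(frontier, RELEASE_BEST)
vs_previous = multiple(frontier, PREVIOUS_LEADER)
vs_opus_48 = multiple(frontier, NEW_RATES["Opus 4.8"])

print(f"Frontier vs. release-time best: {vs_release:.1f}x")
print(f"Frontier vs. previous leader:   {vs_previous:.1f}x")
print(f"Fable 5 vs. Opus 4.8:           {vs_opus_48:.1f}x")
```

The multiple depends on the baseline: measured from the 2.5% release-time best it comes to roughly 6.4x, while measured from the 4.17% previous leader it is a little under 4x.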

Figure: Automation Rate (%) by model

Gemini 3 Pro: 1.25%
Grok 4: 2.08%
GPT-5.2: 2.50%
Manus 1.6 Max: 2.92%
Opus 4.6: 4.17%
GPT-5.5: 6.3%
Opus 4.8: 8.3%
Fable 5: 16.1%

Full-automation rate on the Remote Labor Index: the share of projects where each model's deliverable was judged at least as good as the professional's. The three newly evaluated models (Fable 5, Opus 4.8, GPT‑5.5) score above every previously evaluated model.
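Fable 5's 16.1% is marked with an asterisk because it covers 218 of RLI's 240 projects, and the note that follows quotes a worst-case rate of 14.6%. The Python sketch below reproduces that bound. The number of successful projects is not stated in the post, so it is inferred here by rounding 16.1% of 218. Treat that count, and the best-case figure derived from it, as assumptions rather than reported results.

```python
TOTAL_PROJECTS = 240   # size of RLI, from the post
EVALUATED = 218        # projects evaluated before access was restricted
REPORTED_RATE = 16.1   # percent, measured on the evaluated subset

# Assumption: the success count is not published; infer it by rounding.
successes = round(REPORTED_RATE / 100 * EVALUATED)

missing = TOTAL_PROJECTS - EVALUATED

# Rate on the evaluated subset (should round back to the reported figure).
observed = 100 * successes / EVALUATED

# Worst case: every unevaluated project counts as a failure.
worst_case = 100 * successes / TOTAL_PROJECTS

# Best case (not reported in the post): every unevaluated project succeeds.
best_case = 100 * (successes + missing) / TOTAL_PROJECTS

print(f"Inferred successes: {successes} of {EVALUATED}")
print(f"Observed rate:      {observed:.1f}%")
print(f"Worst-case rate:    {worst_case:.1f}%")
print(f"Best-case rate:     {best_case:.2f}%")
```

Under the inferred count of 35 successes, the worst case still exceeds the next-highest model's 8.3%, which is why the ranking does not depend on the 22 missing projects.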

* We were able to evaluate 218 of RLI's 240 projects before access to Fable 5 was restricted by the U.S. government. We will update the results on the remaining projects shortly. The 22 unevaluated projects are spread uniformly across the benchmark, not concentrated in any sector or difficulty band. Even under the worst-case assumption that Fable 5 failed every missing project, its automation rate would still be 14.6%, higher than any other model.

2. What the Work Looks Like

Automation rate is a single number, but RLI projects are concrete pieces of commissioned work, each with a client brief, input files, and a professional deliverable.

Ring Design

3D & CAD

The brief

Re-create the client's existing engagement ring with its emerald-cut center stone swapped for a marquise cut, delivering an updated 3D model plus photorealistic rose- and yellow-gold renders.

Input Files

ringdetails.png

[Images: photorealistic render and underlying 3D model, each shown for GPT‑5.5, Opus 4.8, Fable 5, and the human professional]

Fable 5's ring design is qualitatively much better than deliverables from previous models, although on closer examination it still falls short of professional quality (for example, a low-effort rounded prong design).

Advertisement Video

Video & Animation

The brief

Produce a ~60-second flat-design 2D animated advertisement for "Skyline Tree Services," set to the provided voiceover, that walks viewers through the company's tree-care process and builds trust in the brand.

Input Files

VoiceOver.wav

Raw voiceover audio (the only input file)

[Images: three frames per video (intro · consultation · safety) from each deliverable, shown for the human professional, Fable 5, Opus 4.8, and GPT‑5.5]

For 2D vector-graphics animation, visual quality improves noticeably with the newer models, and both the animation and its synchronization with the audio become smoother.

Floor Plan & Renders

Architecture

The brief

From a scanned cadastral plan, site photos, and measurements, produce a clean dimensioned floor plan, furniture-layout options, and photorealistic renders of the redesigned bathroom.

Input Files

cadastral floor plan.jpg · +11 more

[Images: dimensioned floor plan and bathroom render, each shown for GPT‑5.5, Opus 4.8, Fable 5, and the human professional; a legend distinguishes renders of an actual 3D model from image-generated “renders”]

The floor plans get visibly more accurate and the 3D models more detailed with the newer models,...
