VMAF v1: Good Is Not Good Enough

Netflix Technology Blog | Netflix TechBlog | Medium | Jun, 2026


By Christos G. Bampis, Zhi Li, Kyle Swanson, Nil Fons Miret and Pavan Madhusudanarao

Will this encode look good to Netflix members? Does switching to a new codec improve quality at the same bitrate and by how much? What is the best way to encode a movie title given a target bitrate budget? For years, VMAF has reliably helped us answer those questions and deliver an optimized quality of experience to our members.

But good is not good enough. If VMAF misjudges quality, that may lead to loss of detail for a suspenseful close-up or banding for a stunning wide-angle sky shot. That’s a lot of trust to put in one number, so we strive to make sure it earns it.

Over time, we collected feedback from VMAF users, both internally and externally. A few years ago, we embarked on a journey to develop a new version of VMAF to address some of its known limitations. Today, we are happy to announce that we are open-sourcing a new version of VMAF, with version number v1. By using VMAF v1 we can more accurately assess visual quality and hence efficiently deliver higher quality for Netflix members worldwide. In this post we share how v1 addresses the previous version’s (called VMAF v0) limitations and some of the challenges we faced along the way.

What is VMAF and why improve it?

VMAF (Video Multimethod Assessment Fusion) is a video quality metric that Netflix developed with university partners and open-sourced on GitHub. It has become a de facto standard for encoding evaluation and optimization for the video industry. VMAF combines elementary quality-aware features and fuses them with a support-vector regressor (SVR) trained on subjective data. For background, see our first, second and third VMAF tech blogs.

Despite its accuracy and wide adoption, we have identified room to improve the core of the algorithm. That’s central to our mission of delivering the best possible visual quality to our members no matter where and how they watch Netflix. As new codecs, like AV2, are developed and use cases like live streaming and cloud gaming emerge, we strive to continue to improve VMAF to serve these business needs. We describe each key improvement below.

Improving sensitivity to compression artifacts

As discussed in our first VMAF tech blog [1], a typical encoding pipeline introduces both compression and scaling artifacts. Intuitively, when more bits are available, higher resolutions are preferable. VMAF quantifies the tradeoff between compression and scaling and determines the optimal resolution to use given a bitrate budget. This can be demonstrated by a VMAF vs. bitrate curve.
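To make the resolution tradeoff concrete, here is a minimal Python sketch of how a set of VMAF vs. bitrate curves can be used to pick a resolution for a given bitrate budget. The curve points, the `CURVES` table and the helper names are all hypothetical and chosen only for illustration; they are not measurements, and this is not the selection logic used in any production pipeline.

```python
# Hypothetical (bitrate_kbps, vmaf) points per encoding resolution.
# The numbers are illustrative only, not real measurements.
CURVES = {
    "540p": [(500, 62.0), (1000, 74.0), (2000, 80.0), (4000, 83.0)],
    "720p": [(500, 55.0), (1000, 75.0), (2000, 86.0), (4000, 90.0)],
    "1080p": [(500, 45.0), (1000, 70.0), (2000, 88.0), (4000, 95.0)],
}


def interpolate(curve, bitrate):
    """Linearly interpolate a score at `bitrate`, clamping outside the curve."""
    points = sorted(curve)
    if bitrate <= points[0][0]:
        return points[0][1]
    if bitrate >= points[-1][0]:
        return points[-1][1]
    for (b0, v0), (b1, v1) in zip(points, points[1:]):
        if b0 <= bitrate <= b1:
            t = (bitrate - b0) / (b1 - b0)
            return v0 + t * (v1 - v0)


def best_resolution(curves, bitrate):
    """Return (resolution, score) with the highest interpolated score."""
    scores = {res: interpolate(curve, bitrate) for res, curve in curves.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    for budget in (500, 1000, 1500, 2000, 4000):
        print(budget, best_resolution(CURVES, budget))
```

With these made-up curves, the lowest resolution wins at the smallest budget and the highest resolution wins once enough bits are available. The crossover points are what a quality metric has to place correctly, and they shift if the metric under-penalizes either compression or scaling artifacts.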

In practice, we observed that VMAF v0 tends to favor switching to a higher resolution at lower bitrates, preferring compression artifacts over scaling artifacts, which can be visually annoying. This can be partially attributed to the DLM (Detail Loss Metric) feature, which penalizes contrast/detail loss but may be less sensitive to distracting artifacts, like blockiness [3]. In VMAF v1, to complement DLM, we added the AIM (additive impairments) component [3] from the original ADM formulation with minor modifications to improve accuracy. These two elementary metrics are linearly combined, similar to the original implementation in [3].

One VMAF model to rule them all

A first-order effect that influences quality perception is the visibility of artifacts and its relationship to viewing distance and canvas size. Put simply, the same encoded video looks better when displayed on a smaller canvas or viewed from further away.

The standard VMAF model assumes that viewers sit in front of a 1920×1080 display, in a living-room-like environment, with a normalized viewing distance of approximately 3× the screen height (3H). This means that the standard 1080p@3H VMAF model corresponds to an angular resolution of approximately 60 pixels per degree.
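The pixels-per-degree figure can be checked with a little trigonometry. The sketch below assumes square pixels and measures the density at the center of the screen; the function name `pixels_per_degree` is ours, not part of any VMAF tooling.

```python
import math


def pixels_per_degree(vertical_pixels, distance_in_heights):
    """Pixels covered by one degree of visual angle at the screen center.

    `distance_in_heights` is the viewing distance expressed in screen
    heights (3.0 for the 3H condition).
    """
    # Extent, in units of screen height, spanned by a one-degree angle.
    one_degree_extent = 2.0 * distance_in_heights * math.tan(math.radians(0.5))
    # One screen height holds `vertical_pixels` pixels.
    return vertical_pixels * one_degree_extent


if __name__ == "__main__":
    for distance in (3.0, 4.0, 5.0):
        print(f"1080 lines @ {distance}H: {pixels_per_degree(1080, distance):.1f} ppd")
```

For 1080 lines at 3H this gives a value in the mid-to-high 50s, in line with the "approximately 60" figure above. Because the result scales linearly with distance, moving the viewer further back raises the pixel density per degree, which is why artifacts become harder to see.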

For phone viewing, given a smaller screen and a longer natural viewing distance relative to the screen height (typical phone viewing can be approximated as 4 to 5H), we expect artifacts to become less visible. The phone model of VMAF v0 captures this by post-processing the standard (TV/laptop) VMAF score with a second-order polynomial mapping. This mapping was estimated using subjective data. One drawback of the above mapping is that it is hard to generalize predictions for the myriad of viewing conditions that materially differ from the original subjective...
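As an illustration of what a second-order polynomial post-processing step looks like, here is a small Python sketch. The coefficients `P0`, `P1` and `P2` are hypothetical: they were picked only so that the curve is monotonic, maps 0 to 0 and 100 to 100, and never lowers a score. They are not the coefficients fitted for the actual VMAF v0 phone model.

```python
# Hypothetical coefficients, NOT the values fitted for the real phone model.
P0, P1, P2 = 0.0, 1.5, -0.005


def phone_score(standard_score):
    """Map a standard (TV/laptop) score to an illustrative phone score."""
    mapped = P0 + P1 * standard_score + P2 * standard_score ** 2
    # Keep the result on the usual 0-100 scale.
    return min(100.0, max(0.0, mapped))


if __name__ == "__main__":
    for score in (20, 50, 80, 100):
        print(score, "->", phone_score(score))
```

Under these assumed coefficients a mid-range standard score is lifted noticeably while scores near 100 barely move, matching the intuition that artifacts are less visible on a phone. The sketch also shows the limitation: a single fixed curve encodes one viewing condition, so every new condition would need its own fitted mapping.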
