ecodis :: Efficient Video Codecs
Video codecs which I worked on:
MPEG-I VVC / ITU-T H.266
Royalty-based broadcast & VR codec
In signal processing, data compression, source coding, or bit-rate reduction involves encoding information using fewer bits than the original representation. Compression can be either lossy or lossless. [...] The process of reducing the size of a data file is often referred to as data compression. In the context of data transmission, it is called source coding (encoding done at the source of the data before it is stored or transmitted) in opposition to channel coding.
Wikipedia page on data compression, 2017
Storing or transmitting contemporary ultra-high-definition (UHD) digital video content in uncompressed form is virtually impossible due to the extremely high data rates; only one second of HDR video with 3840×2160 pixels at 50 frames per second would fit onto a CD. Therefore, efficient lossy coding with very good visual quality even at very low data rates is even more important than in audio applications. This implies that a maximum of redundant and irrelevant information must be removed during the coding.
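As a rough sanity check of the data-rate claim above, the following sketch computes the uncompressed rate of such a UHD stream. Only the resolution and frame rate come from the text; the 10-bit sample depth, the 4:2:0 chroma subsampling, and the 700 MB CD capacity are assumptions made here for illustration.

```python
def raw_video_rate(width, height, fps, bit_depth=10, chroma_samples_per_pixel=0.5):
    """Return the uncompressed data rate in bytes per second.

    chroma_samples_per_pixel: 0.5 for 4:2:0, 1.0 for 4:2:2, 2.0 for 4:4:4.
    """
    samples_per_pixel = 1.0 + chroma_samples_per_pixel  # one luma sample plus chroma
    bits_per_frame = width * height * samples_per_pixel * bit_depth
    return bits_per_frame * fps / 8


CD_CAPACITY_BYTES = 700 * 10**6  # assumed nominal capacity of a data CD

rate = raw_video_rate(3840, 2160, 50)
print(f"raw rate: {rate / 1e6:.1f} MB/s")
print(f"seconds of video per CD: {CD_CAPACITY_BYTES / rate:.2f}")
```

Under these assumptions the raw rate is 777.6 MB/s, so a CD holds roughly 0.9 seconds of video. Less aggressive chroma subsampling (4:2:2 or 4:4:4) would shorten that further.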
On this page, the three most efficient newest-generation video coding standards are introduced. The first one, whose specification I am involved in, is still being developed.
MPEG-I Versatile Video Coding (VVC), to be standardized as ITU-T H.266
VVC, also known as H.266, is a flexible general-purpose video coding specification currently standardized by ISO/IEC and ITU-T [ source]. Developed by the Joint Video Experts Team (JVET), the final VVC version is intended to exceed all existing standards (most notably, the three mentioned below) in compression performance at the same subjective reconstruction quality, with only a moderate increase in decoding workload.
Since «serious» work on the VVC standard has just begun (in 2018), I cannot present any comparisons between VVC and other state-of-the-art video codecs. However, I can report from the April 2018 JVET meeting in San Diego that some proposals submitted for standardization in February 2018 achieved notable compression efficiency gains near 40 % over VVC's predecessor (see below) at acceptable decoder complexities of 3–4 times that of the state of the art [ source]. Therefore, I expect the final version of the standard, due in late 2020, to provide bit-rate savings of at least one third at a decoding workload even closer to that of current video coding standards (factor of 2).
Update June 2019: A recently performed subjective test indicates that, for HD and UHD standard dynamic range content, even in its current unfinished state, VVC already achieves the same visual coding quality as its predecessor at a 36–40 % lower bit-rate.
During the first stages of the VVC development, I proposed an encoder optimization algorithm improving the subjective video coding quality for some content and presented my work in Macau, Ljubljana, and Marrakech. At the Ljubljana meeting, I also suggested adding support for a new 10- and 12-bit packed YUV/RGB image and video storage format to the VVC code base [ report], a description of which is given below. Later during the standardization, I contributed to better in-loop filtering and the joint transform coding of the chroma components in color images and videos [ reports].
The working draft of the VVC reference encoding/decoding software is hosted here, and the draft specification text (currently version 7) is freely available via this link. Since May 2019, VVC achieves objective efficiency gains (in terms of Bjøntegaard delta rate over its predecessor, the HM reference software) of about 24 % for still pictures (1.8x decoder runtime of HM) and more than 34 % for random-access videos (1.7x decoder runtime of HM) [ table]. The final version is likely to improve upon this only by another 2–3 percent but may end up encoding and decoding a bit faster [ source].
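For readers unfamiliar with the Bjøntegaard delta rate quoted above, here is a minimal sketch of how such a figure can be computed. It passes a cubic polynomial through four (PSNR, log bit-rate) points per codec, integrates both curves over their common PSNR range, and converts the mean log-rate difference into a percentage. The rate points are invented purely for illustration, and the pure-Python interpolation is a simplification; this is not the tool used by JVET.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial passing through all (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def simpson(f, lo, hi):
    """Simpson's rule on one interval; exact for polynomials up to degree 3."""
    mid = 0.5 * (lo + hi)
    return (hi - lo) / 6.0 * (f(lo) + 4.0 * f(mid) + f(hi))


def bd_rate(anchor, test):
    """Average bit-rate difference in percent of `test` relative to `anchor`.

    Both arguments are lists of four (bitrate, psnr) pairs.
    Negative results mean that `test` needs fewer bits for the same PSNR.
    """
    a_psnr = [p for _, p in anchor]
    a_logr = [math.log(r) for r, _ in anchor]
    t_psnr = [p for _, p in test]
    t_logr = [math.log(r) for r, _ in test]
    # integrate only over the PSNR range covered by both curves
    lo = max(min(a_psnr), min(t_psnr))
    hi = min(max(a_psnr), max(t_psnr))
    int_a = simpson(lambda x: lagrange_eval(a_psnr, a_logr, x), lo, hi)
    int_t = simpson(lambda x: lagrange_eval(t_psnr, t_logr, x), lo, hi)
    mean_log_diff = (int_t - int_a) / (hi - lo)
    return (math.exp(mean_log_diff) - 1.0) * 100.0


# hypothetical rate points (kbit/s, dB), invented purely for illustration
anchor_points = [(1000, 32.0), (2000, 35.0), (4000, 38.0), (8000, 41.0)]
test_points = [(r / 2, p) for r, p in anchor_points]  # half the rate at equal PSNR

print(f"BD-rate: {bd_rate(anchor_points, test_points):.1f} %")
```

Because the hypothetical test codec needs exactly half the bit-rate at every PSNR value, the sketch reports a BD-rate of -50.0 %. Published implementations may differ in details such as the interpolation method.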
MPEG-5 Essential Video Coding (EVC), an alternative 4K video codec
Around the April 2018 JVET meeting in San Diego, where the first draft of the VVC reference software and specification text was agreed upon (see above), the Moving Picture Experts Group (MPEG) decided to initiate work on a separate, more constrained (in terms of development duration and included technology) video coding solution, to be completed and standardized in mid-2020 under the name MPEG-5 Essential Video Coding (EVC).
More details and the motivation behind this approach are given here.
In November 2018, two solutions were submitted to MPEG in response to its Call for Proposals (CfP) on new video coding technology with «simplified coding structure and an accelerated development time of 12 months» [ source]. Two months later, at its January 2019 meeting, MPEG evaluated both proposals [ report] and selected the one by Samsung, Huawei, and Qualcomm (a description of which is provided here) as the starting point of the EVC standardization. The relevant EVC documents are provided on the following web pages. Note that an MPEG user name and password are required to access these pages, which implies that this standardization is essentially closed-source.
In its first revision, MPEG-5 EVC roughly matches the joint MPEG-H/ITU-T HEVC in both objective (PSNR) and subjective (visual quality) performance when operated in the royalty-free Baseline mode, at least according to the CfP evaluation. Its Main profile configuration, however, was verified to already deliver 24 % better coding efficiency than HEVC, which may increase by a few percent until the end of EVC's development.
(Update Oct. 2019: ETM 3.0 provides about 26 % rate reduction over HEVC [source], which should be very close to the final performance that this coding standard will offer.) Note that this value is still roughly 10 percentage points short of the current results for MPEG-I VVC.
The next few years will tell which of these two codecs will achieve a wider market adoption. I will update this section once/if the EVC codec software has been published.
MPEG-H High Efficiency Video Coding, also standardized as ITU-T H.265
Using the High Efficiency Video Coding (HEVC) specification standardized in ISO/IEC 23008-2 and ITU-T H.265 currently is the most efficient way to compress moving pictures, especially high-resolution HD and UHD video. Developed mainly between 2010 and 2013, with some screen content and 3D coding extensions added after 2013, HEVC achieves an increase of about 50 % in perceptual compression efficiency over previous coding standards like H.264 [ source]. In other words, averaged across several coded video sequences, HEVC provides roughly the same subjective video quality as the older coding formats, and it does so using encoded bit-streams which are only half as large. This performance boost is what allowed, for the first time, the delivery of high-quality UHD video to consumers via broadcasting, streaming, and UHD Blu-Ray disc.
As of 2018, hardware-based HEVC decoding is supported by most TVs, set-top boxes, video players, computers, tablets, and even smartphones. The best freely and publicly available HEVC encoder is maintained by the x265 project team and is located here:
HEVC, as described above, is the predecessor of the VVC standard, and most of its underlying technology can still be found in the current draft of the VVC specification. In fact, all visual codecs discussed on this page use the exact same algorithmic building blocks which define a modern hybrid block transform video codec like HEVC. These are
a partitioner segmenting each component of the input into nonoverlapping blocks,
a prediction stage attempting intra- or inter-picture prediction of each input block,
a residual transform converting the prediction error into a spectral representation,
a quantizer mapping the residual transform values to a smaller set of coefficients,
an entropy coder applying lossless compression to the quantized coefficients, and
a few postfilters reducing blocking, noise, and ringing artifacts upon decoding.
Some codecs also add encoder-side prefilters to complement the decoder's postfilters. Note that modern audio codecs employ the same elements and that, in both audio and video coding, a second prediction stage may be used before or after the quantizer.
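The building blocks listed above can be sketched in a toy example. The code below is a deliberately simplified one-dimensional illustration, not an excerpt from any of the codecs discussed: the block size, the flat prediction from the previously reconstructed sample, the floating-point DCT, and the use of zlib as a stand-in for a real entropy coder are all assumptions made for brevity. Postfilters are omitted, and the input length is assumed to be a multiple of the block size.

```python
import math
import struct
import zlib

N = 8  # block size (assumption for this toy example)


def dct(block):
    """Orthonormal DCT-II of a list of N samples."""
    out = []
    for k in range(N):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
        out.append(scale * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                               for n, x in enumerate(block)))
    return out


def idct(coefs):
    """Inverse of dct() (orthonormal DCT-III)."""
    out = []
    for n in range(N):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / N) * c
                       * math.cos(math.pi * (n + 0.5) * k / N)
                       for k, c in enumerate(coefs)))
    return out


def reconstruct(levels, pred, step):
    """Shared by encoder and decoder: dequantize, inverse transform, add prediction."""
    residual = idct([lv * step for lv in levels])
    return [min(255, max(0, round(pred + r))) for r in residual]


def encode(samples, step):
    levels_all, pred = [], 128  # first block is predicted from mid-gray
    for start in range(0, len(samples), N):           # 1. partitioner
        block = samples[start:start + N]
        residual = [s - pred for s in block]           # 2. prediction
        coefs = dct(residual)                          # 3. residual transform
        levels = [round(c / step) for c in coefs]      # 4. quantizer
        levels_all += levels
        pred = reconstruct(levels, pred, step)[-1]     # closed-loop prediction
    packed = struct.pack(f"<{len(levels_all)}h", *levels_all)
    return zlib.compress(packed)                       # 5. entropy coder (stand-in)


def decode(bitstream, step):
    packed = zlib.decompress(bitstream)
    levels_all = struct.unpack(f"<{len(packed) // 2}h", packed)
    samples, pred = [], 128
    for start in range(0, len(levels_all), N):
        block = reconstruct(levels_all[start:start + N], pred, step)
        samples += block
        pred = block[-1]
    return samples


# one row of 8-bit samples, invented for illustration
signal = [round(128 + 100 * math.sin(i / 10.0)) for i in range(64)]
stream = encode(signal, step=4.0)
decoded = decode(stream, step=4.0)
max_err = max(abs(a - b) for a, b in zip(signal, decoded))
print(len(signal), "samples ->", len(stream), "bytes, max error", max_err)
```

Note that the encoder predicts from its own reconstruction rather than from the original samples, so that encoder and decoder stay in sync despite the lossy quantizer. A larger quantizer step size trades reconstruction accuracy for smaller coefficient levels, which the entropy coder can represent more compactly.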
AOMedia AV1: A Freely Available Open-Source UHD Video Codec
The AV1 video codec jointly developed by the Alliance for Open Media (AOMedia) between 2015 and 2018 is a general-purpose open-source coding format based on well-known technology. The video compression capability of AV1 is realized primarily with coding techniques derived from VP9/VP10, Daala, and Thor [source]. Inside a WebM container, audio compression support is added through the Opus codec [source].
The IETF is expected to adopt AV1 as the Internet Video Codec (NetVC) standard in late 2018 [source] alongside the Opus codec, which was already standardized as RFC 6716 in 2012. I anticipate broad hardware decoding support for AV1 in late 2020. Note that software decoding is already provided, even on Windows [source], and work on a BSD-licensed optimized decoder called dav1d has progressed as well. The current versions of the AV1 specification and software are available at these pages:
Surprisingly, the subjective performance of the AV1 codec in comparison to its latest competitors, VVC and HEVC, is relatively inconsistent. In some independent tests, AV1 matched the coding efficiency of HEVC [ source], while in others, the codec was objectively and subjectively inferior to HEVC [ source]. This can be attributed to the use of different encoder speed presets in the evaluations: for HEVC-like performance, a very slow AV1 encoder preset must be used [ source]. Moreover, the default encoder configuration for random-access scenarios is a bit different from that of other codecs, making direct codec comparisons difficult. However, since there is clear evidence that the precursor to VVC, known as Joint Exploration Model ( JEM), outperforms both AV1 and HEVC and also encodes faster than AV1 [ source 1, source 2], it is obvious that VVC will, indeed, become the most efficient video codec during the next few years.
Figure 1. Outcome of BBC's subjective comparison tests of HEVC (HM software encoder), AV1, and JEM as experimental ancestor of VVC. (Fig. copyright BBC, 2018, picture taken from R&D blog post)
Some online documentation of AV1's most interesting and innovative coding tools can be found here and here. An overview of all coding tools is given in this paper.
Update Nov. 2018: As this image indicates, the speed of the AV1 reference encoder has recently been increased by at least a factor of 60 without a significant reduction in coding efficiency [1.6 %, source], so it seems that a runtime and coding gain similar to that of the HEVC reference encoder will, indeed, become possible with AV1 soon. This observation is also supported by a May 2019 test by the BBC, summarized here. Still, the efficiency of the VVC reference software clearly remains out of reach for AV1. This shortcoming will be addressed by AV2, whose standardization will begin in a few years when support for AV1 has been widely deployed to consumer devices [source].
Summary: More Coding Gain Possible but Hard to Implement Efficiently
The current VVC standardization (see above) indicates that further gains in image and video coding performance are still possible. However, given the order of magnitude increases in encoding runtime of both VVC and AV1, I feel that we are rapidly leaving the path of reasonable gain-efficiency tradeoff followed for so long: with next-generation standards, coding of a single 4K image on one processor takes up to half an hour, and the hardware requirements especially for moving-picture coding are substantial. For this reason, most of the often promising but experimental coding tools in JEM (like FRUC, for example) won't make it into VVC: their algorithmic complexity and/or fast memory demands are just too high for usable implementations in both hardware and software. It's true that encoding can now be performed highly parallelized in the cloud, but spending a year of aggregate computing power on one hour of UHD video is clearly not efficient and, so far, not environmentally friendly [ article]. Don't forget the countless coding-decoding iterations performed by the various participating companies during experiments towards a codec's standardization itself (more and more of which need to be run due to the vanishing potential for further coding gain)! And remember that the final HEVC/H.265 reference encoder was only three times slower than its AVC/H.264 ancestor [ source]!
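To put the figure of a year of aggregate computing power per hour of UHD video into perspective, the short sketch below converts it into a per-frame budget. The frame rate of 50 frames per second is an assumption borrowed from the UHD example at the top of this page.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def seconds_per_frame(compute_seconds, video_seconds, fps):
    """Average computation time spent on each frame of the video."""
    return compute_seconds / (video_seconds * fps)


# one year of aggregate computing time for one hour of video at 50 frames/s
budget = seconds_per_frame(SECONDS_PER_YEAR, 3600, 50)
print(f"{budget:.1f} s of computation per frame")
```

Under this assumption, the budget amounts to roughly three minutes of processor time per frame.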
Therefore, aside from working on speeding up the new-generation video encoders by at least an order of magnitude, I believe it's time to reconsider the current approach in video coding development. Some experts at, e.g., Netflix share this vision and call for innovation «beyond block-based hybrid video coding» as outlined above. If that means using more extreme measures like large 4D-DCTs or CNNs, I disagree since, in my view, the computational burden for a competitive level of performance will likely be even more problematic than in today's codecs. If, however, the idea is to refrain from squeezing 9% more coding gain (i.e., statistical redundancy) out of existing block-based schemes, and instead to further exploit the inaccuracies of human vision in the design of image and video codecs (using parametric tools as in audio, which we still hardly do), then I fully support that approach. In fact, with my current work I already do. I hope you do too.
HEIF/MIAF and AVIF: The Two Latest-Generation Still-Image Codecs
Still-image coding, like video coding, has come a very long way since the early days of T.81/JPEG and H.262/MPEG-2 about a quarter of a century ago. Recently, two additions to the long list of image coding specifications have emerged, namely, single-picture constrained variants of HEVC, called High Efficiency Image File Format (HEIF), and of AV1, named AV1 Still Image File Format (AVIF). An extension of the HEIF standard known as Multi-Image Application Format (MIAF) is currently being finalized as well, and as if that weren't enough, the JPEG committee is also working on a novel still-picture coding standard, referred to as JPEG XL (the L stands for long-term), to be finalized in late 2019 [source].
(Update Oct. 2019: This milestone has been moved to April 2020 [source].)
All these contenders have in common that they provide efficient support for high-resolution, HDR, and wide color gamut (WCG) as well as (partially) transparent image content. JPEG XL also provides means for lossless transcoding to and from legacy JPEG, PNG, and GIF compressed images, which is a very useful feature in my opinion. I will update this page with further details on each coding specification and comparative demos once evaluation software for all of these coding standards has become publicly available. For now, I can recommend this interactive, recently updated picture coding demo by Thomas Daede. See also here and here.
Figure 2. Basic evolution of still image coding in the age of the Internet. Top left: JPEG (1992), top right: JPEG 2000 [...], bottom left: HEIF (2015) and bottom right: [...] in visual quality by modern image codecs like AVIF, JPEG XL or VVC). [...] was chosen for all [...]. Notice how blocking and blurriness vanish from top left to lower right. (Fig. Lena image)
page last modified in November 2019, removed some text as requested by my employer, Fraunhofer HHI