Math can be truly awe-inspiring, as in this example of the unexpected places that π can show up. The proof is nothing short of elegant – be sure to watch parts 2 and 3. Astonishing!

*–agc*

AGC Systems

## Delivering technology solutions that work for you

## Category: Explanation

## Happy Π Day

## Do I Really Need a 4K (or 8K!) TV?

## There’s No Such Thing as RMS Power!

### Additional thoughts on root-mean-square

## Industry Analysis

## HEVC and VP9 Comparison for IP Video: Which Will Win the Day?

### VP9 Web Video Format Improves on VP8

### Comparing HEVC and VP9

*This article was originally published in* Video Edge Magazine.

The short answer is, no and yes. Some analysts will have you believe that “8K TV blows 4K away,” and that might suggest that you at least want a 4K TV. The reality, as it comes to electronics and perception, is more complicated.

One might assume that higher resolution always makes a picture better, because the pixels get smaller and smaller, to the point where you don’t see them anymore. But the human visual system — your eyes — has a finite capacity, and once you exceed this, any other “improvement” is wasted, because it just won’t be seen.

Here’s why (warning, geometry involved):

The term “20/20 vision” is defined as the ability to just distinguish features that subtend one arc-minute of angle (one-sixtieth of a degree). In other words, two objects at a given distance can be resolved as separate only if they are at least a certain minimum distance apart.

Using trigonometry, this works out to be about 1/32″ as the smallest separation a person with 20/20 vision can see at a distance of ten feet. We can use the same math to show that the “optimum” distance from which to observe an HD (1080-line) display (i.e., where a 20/20 observer can just resolve the pixels) is about 3 times the picture height.

On a 1080-line monitor with a 15” diagonal, this works out to an optimum viewing distance of just under two feet; with a 42” display, it’s about five-and-a-half feet. Sitting closer than this means the pixels will become visible; sitting further means that the resolution is “wasted.” Keep in mind, also, that most people sit about 9 feet away from the TV, what is sometimes called the “Lechner distance,” after a well-known TV systems researcher.
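As a rough check on the numbers above, here is a minimal Python sketch of the geometry. It assumes a 16:9 picture (the aspect ratio is our assumption, not stated above) and treats one scan line as the feature that must subtend one arc-minute; the function names are purely illustrative.

```python
import math

ARC_MINUTE = math.radians(1 / 60)  # the 20/20 acuity limit used in the text


def optimum_distance_in_picture_heights(lines: int) -> float:
    """Distance, in picture heights, at which one line subtends one arc-minute."""
    return 1.0 / (lines * math.tan(ARC_MINUTE))


def picture_height(diagonal: float, aspect_w: int = 16, aspect_h: int = 9) -> float:
    """Picture height for a given diagonal (same units), assuming 16:9 by default."""
    return diagonal * aspect_h / math.hypot(aspect_w, aspect_h)


def optimum_distance_feet(diagonal_in: float, lines: int = 1080) -> float:
    """Optimum viewing distance in feet for a display diagonal given in inches."""
    heights = optimum_distance_in_picture_heights(lines)
    return picture_height(diagonal_in) * heights / 12.0


print(round(optimum_distance_in_picture_heights(1080), 2))  # about 3.18
print(round(optimum_distance_feet(15), 2))                  # about 1.95 ft
print(round(optimum_distance_feet(42), 2))                  # about 5.46 ft
```

The results land on the “about three picture heights,” “just under two feet” and “about five-and-a-half feet” figures quoted above.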

Of course, these numbers (and others produced by various respectable organizations) are based on subjective evaluation of the human visual system, and different observers will show different results, especially when the target applications vary. Nonetheless, the “three picture heights” rule has survived critical scrutiny for several decades, and we haven’t seen a significant deviation in practice.

At 4K, the optimum distance becomes 1.6 picture-heights: at the same 1080-display viewing distance of 5.5 feet, one needs an 84”-diagonal display (7 feet), which is already available. For these reasons, some broadcasters believe that 4K is not a practical viewing format, since displaying 4K images would require viewing at about 1.6 picture-heights to match normal human visual acuity.
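The 84” figure can be reproduced by turning the same one-arc-minute geometry around: fix the viewing distance and solve for the diagonal. This is only a sketch; the 16:9 aspect ratio and the function name are our assumptions.

```python
import math


def required_diagonal_in(distance_ft: float, lines: int,
                         aspect_w: int = 16, aspect_h: int = 9) -> float:
    """Diagonal (inches) at which a 20/20 viewer just resolves the lines."""
    # Optimum distance expressed in picture heights for this line count.
    heights = 1.0 / (lines * math.tan(math.radians(1 / 60)))
    # Picture height that puts the viewer at exactly that many picture heights.
    height_in = distance_ft * 12.0 / heights
    return height_in * math.hypot(aspect_w, aspect_h) / aspect_h


print(round(required_diagonal_in(5.5, 1080), 1))  # about 42.3 in
print(round(required_diagonal_in(5.5, 2160), 1))  # about 84.6 in
```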

At 8K, the numbers become absurd for the typical viewer: 0.7 picture heights, or a 195″ diagonal (16 feet) at a 5.5-foot distance. With a smaller display, or at a larger distance, the increased resolution is completely invisible to the viewer: that means wasted pixels (and money). Such a display is very large (and thus very expensive), although the 105-degree viewing angle it would subtend at the above viewing distance approaches a truly immersive and lifelike experience for a viewer. But how many people would put such a beast in their home?
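The viewing angle quoted above follows from the picture width and the viewing distance. A small sketch, again assuming a 16:9 picture viewed on-axis (both assumptions are ours):

```python
import math


def horizontal_viewing_angle_deg(diagonal_in: float, distance_ft: float,
                                 aspect_w: int = 16, aspect_h: int = 9) -> float:
    """Horizontal angle (degrees) subtended by the picture for a centered viewer."""
    width_in = diagonal_in * aspect_w / math.hypot(aspect_w, aspect_h)
    half_angle = math.atan((width_in / 2.0) / (distance_ft * 12.0))
    return 2.0 * math.degrees(half_angle)


print(round(horizontal_viewing_angle_deg(195, 5.5)))  # about 104 degrees
print(round(horizontal_viewing_angle_deg(42, 5.5)))   # about 31 degrees
```

For comparison, the 42” HD display at the same distance fills only about 31 degrees of the field of view.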

From a production perspective, 4K does make some sense, because an environment that captures all content in 4K, and then processes this content in a 1080p workflow for eventual distribution, will produce archived material at a very high intrinsic quality. Of course, there’s a cost associated with that, too.

But there *are* two other reasons why one might be persuaded to upgrade their HDTV: HDR (High Dynamic Range) and HFR (High Frame Rate). Briefly, HDR increases the dynamic range of video from about 6 stops (64:1) to more than 200,000:1 or 17.6 stops, making the detail and contrast appear closer to that of reality. HFR increases the frame rate from the currently-typical 24, 30 or 60 fps to 120 fps. And these other features make a much more recognizable improvement in pictures — at almost any level of eyesight. But that’s another story.
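Since each stop is a doubling of light level, the stop and contrast-ratio figures above are related by a base-2 logarithm. A quick sketch of the conversion (the function names are ours):

```python
import math


def stops_to_ratio(stops: float) -> float:
    """Contrast ratio corresponding to a dynamic range given in stops."""
    return 2.0 ** stops


def ratio_to_stops(ratio: float) -> float:
    """Dynamic range in stops corresponding to a contrast ratio."""
    return math.log2(ratio)


print(stops_to_ratio(6))                    # 64.0
print(round(ratio_to_stops(200_000), 1))    # 17.6
```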

— *agc*

This is one of my engineering pet peeves: I keep running into students and (false) advertisements that describe a power output in “RMS watts.” The fact is, such a construct, while mathematically possible, has no meaning or relevance in engineering. Power is measured in watts, and while the concepts of average and peak watts are tenable, “RMS power” is a fallacy. Here’s why.

The power dissipated by a resistive load is equal to the square of the voltage across the load, divided by the resistance of the load. Mathematically, this is expressed as [Eq.1]:

\large P=\frac{V^{2}}{R}

where *P* is the power in watts, *V* is the voltage in volts, and *R* is the resistance in ohms. When we have a DC signal, calculating the power in the load is straightforward. The complication arises when we have a time-varying signal, such as an alternating current (AC), e.g., an audio signal or an RF signal. In the case of power, the most elementary time-varying function involved is the *sine* function.

When measuring the power dissipated in a load carrying an AC signal, we have different ways of measuring that power. One is the instantaneous or time-varying power, which is Equation 1 applied all along the sinusoid as a time-varying function. (We will take *R* = 1 here, as a way of simplifying the discussion; in practice, we would use an appropriate value, e.g., 50Ω in the case of an RF load.)

In Figure 1, the dotted line (green) trace is our 1-volt (peak) sinusoid. (The horizontal axis is in degrees.) The square of this function (the power as a function of time) is the dark blue trace, which is essentially a “raised cosine” function. Since the square is always a positive number, we see that the power as a function of time rises and falls as a sinusoid, at twice the frequency of the original voltage. This function itself has relatively little use in most applications.

Another quantity is the peak power, which is simply Equation 1 above, where *V* is taken to be the peak value of the sinusoid, in this case, 1. This is also known as peak instantaneous power (not to be confused with peak envelope power, or PEP). The peak instantaneous power is useful to understand certain limitations of electronic devices, and is expressed as follows:

\large P_{pk}=\frac{V^{2}_{pk}}{R}

A more useful quantity is the average power, which will provide the equivalent heating factor in a resistive device. This is calculated by taking the mean of the square of the voltage signal, divided by the resistance. Since the sinusoidal power function is symmetric about its vertical midpoint, simple inspection (see Figure 1 again) tells us that the mean value is equal to one-half of the peak power [Eq.2]:

\large P_{avg}=\frac{P_{pk}}{2}=\frac{V^{2}_{pk}/R}{2}

which in this case is equal to 0.5. We can see this in Figure 1, where the average of the blue trace is the dashed red trace. Thus, our example of a one-volt-peak sinusoid across a one-ohm resistor will result in an average power of 0.5 watts.
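For readers who would rather check this numerically than by inspection, here is a minimal sketch that samples the instantaneous power over one full cycle and averages it. The sample count and function names are arbitrary choices of ours.

```python
import math


def instantaneous_power(v_peak: float, resistance: float, samples: int = 3600):
    """Instantaneous power V(t)^2 / R sampled uniformly over one full cycle."""
    return [
        (v_peak * math.sin(2.0 * math.pi * k / samples)) ** 2 / resistance
        for k in range(samples)
    ]


def peak_power(v_peak: float, resistance: float) -> float:
    return max(instantaneous_power(v_peak, resistance))


def average_power(v_peak: float, resistance: float) -> float:
    p = instantaneous_power(v_peak, resistance)
    return sum(p) / len(p)


print(round(peak_power(1.0, 1.0), 4))     # 1.0
print(round(average_power(1.0, 1.0), 4))  # 0.5
```

With a 1-volt peak across 1 Ω, the average comes out at one-half of the peak, as Eq. 2 says.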

Now the concept of “RMS” comes in, which stands for “root-mean-square,” i.e., the **square-root of the mean of the square** of a function. (The “mean” is simply the average.) The purpose of RMS is to present a particular statistical property of that function. In our case, we want to associate a “constant” value with a time-varying function, one that provides a way of describing the “DC-equivalent heating factor” of a sinusoidal signal.

Taking the square root of the mean of *V*^{2}, which for our sinusoid is *V*_{pk}^{2}/2, gives the RMS voltage:

\large V_{rms}=\sqrt{\frac{V^{2}_{pk}}{2}}=\frac{V_{pk}}{\sqrt{2}}\approx 0.7071

Thus, if we applied a DC voltage of 0.7071 volts across a 1Ω resistor, it would consume the same power (i.e., dissipate the same heat) as an AC voltage of 1 volt peak. (Note that the RMS voltage does not depend on the value of the resistance; it is simply related to the peak voltage of the sinusoidal signal.) Plugging this back into Eq. 2 then gives us:

\large P_{avg}=\frac{V^{2}_{rms}}{R}

Note that the *RMS* voltage is used to calculate the *average* power. As a rule, then, we can calculate the RMS voltage of a sinusoid this way:

\large V_{rms} \approx 0.7071 \cdot V_{pk}
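To see why “RMS power” is the wrong quantity, one can go ahead and compute it anyway. The sketch below (our own illustration, using the same 1-volt-peak, 1-ohm example) compares the average power, the power computed from the RMS voltage, and the root-mean-square of the power waveform itself.

```python
import math

SAMPLES = 3600
V_PEAK = 1.0   # volts, as in the running example
R = 1.0        # ohms

voltage = [V_PEAK * math.sin(2.0 * math.pi * k / SAMPLES) for k in range(SAMPLES)]
power = [v * v / R for v in voltage]

v_rms = math.sqrt(sum(v * v for v in voltage) / SAMPLES)
p_avg = sum(power) / SAMPLES
p_from_v_rms = v_rms ** 2 / R
# The "RMS power": mathematically possible, but not the heating value.
p_rms = math.sqrt(sum(p * p for p in power) / SAMPLES)

print(round(v_rms, 4))         # 0.7071
print(round(p_avg, 4))         # 0.5
print(round(p_from_v_rms, 4))  # 0.5
print(round(p_rms, 4))         # 0.6124
```

The RMS of the power waveform comes out near 0.61 watts, which matches neither the average power nor anything a resistor would actually dissipate as heat.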

Graphically, we can see this in Figure 2:

The astute observer will note that 0.7071 is the value of sin(45°) to four places. This is not a coincidence, but we leave it to the reader to figure out why. Note that for more complex signals, the 0.7071 factor no longer holds. A triangle wave, for example, yields *V*_{rms} ≈ 0.5774 · *V*_{pk}, where 0.5774 is the value of tan(30°) to four places.

For those familiar with calculus, the root-mean-square of an arbitrary function is defined as:

\large F_{rms} = \sqrt{\frac{1}{T_{2}-T_{1}}\int_{T_{1}}^{T_{2}}[f(t)]^{2}\, dt}

Replacing *f*(*t*) with sin(*t*) (or an appropriate function for a triangle wave) will produce the numerical results we derived above.
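The integral above can also be approximated numerically. The following sketch uses a simple midpoint rule; the step count and the particular unit-peak triangle wave (period 1) are our own choices for illustration.

```python
import math
from typing import Callable


def rms(f: Callable[[float], float], t1: float, t2: float,
        steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the root-mean-square of f over [t1, t2]."""
    dt = (t2 - t1) / steps
    total = sum(f(t1 + (k + 0.5) * dt) ** 2 for k in range(steps))
    return math.sqrt(total * dt / (t2 - t1))


def triangle(t: float) -> float:
    """Unit-peak triangle wave with period 1."""
    return 2.0 * abs(2.0 * (t - math.floor(t + 0.5))) - 1.0


print(round(rms(math.sin, 0.0, 2.0 * math.pi), 4))  # 0.7071
print(round(rms(triangle, 0.0, 1.0), 4))            # 0.5774
```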

Because of the squaring function, one may get the sense that RMS is only relevant for functions that go positive and negative, but this is not true.

RMS can be applied to any set of distributed values, including only-positive ones. Take, for example, the RMS of a rectified sine wave (the absolute value of a sine). As before, *V*_{rms} ≈ 0.7071 · *V*_{pk}, i.e., the RMS is the same as for the unrectified sine wave. However, *V*_{avg} ≈ 0.6366 · *V*_{pk} for the rectified wave, where 0.6366 is the value of 2/π to four places, whereas the average of the unrectified sine wave is, of course, zero. So, we can take the RMS of a positive-only function, and it can be different from the average of that function.
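A short numerical comparison of the two cases, as a sketch (the helper name and sample count are ours):

```python
import math
from typing import Callable, Tuple


def mean_and_rms(f: Callable[[float], float], period: float,
                 steps: int = 100_000) -> Tuple[float, float]:
    """Mean and RMS of f over one period, sampled at interval midpoints."""
    dt = period / steps
    samples = [f((k + 0.5) * dt) for k in range(steps)]
    mean = sum(samples) / steps
    root_mean_square = math.sqrt(sum(s * s for s in samples) / steps)
    return mean, root_mean_square


sine_mean, sine_rms = mean_and_rms(math.sin, 2.0 * math.pi)
rect_mean, rect_rms = mean_and_rms(lambda t: abs(math.sin(t)), 2.0 * math.pi)

print(round(abs(sine_mean), 4), round(sine_rms, 4))  # 0.0 0.7071
print(round(rect_mean, 4), round(rect_rms, 4))       # 0.6366 0.7071
```

Rectifying the wave changes its average from zero to 2/π of the peak, but leaves its RMS untouched.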

The general purpose of the RMS function is to calculate a statistical property of a set of data (such as a time-varying signal). So the application is not just to positive-going data, but to any data that varies over the set.

—*agc*

Internet video has become a practical medium for the delivery of video content to consumers. What has made this possible is the development of video compression, which lowers the enormous amount of bandwidth required to transport video to levels practical with most Internet connections. In this article, we’ll examine some of the technical and business issues associated with two video codec frontrunners: HEVC and VP9.

HEVC (High Efficiency Video Coding, also called MPEG-H Part 2 and ITU-T H.265) is a state-of-the-art video compression standard that provides about a 50 percent bit rate savings over H.264/MPEG-4 AVC, which in turn provided a similar efficiency over its MPEG-2 predecessor.
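To put those generational savings in perspective, here is a back-of-the-envelope sketch. The 8 Mb/s MPEG-2 starting point is a purely hypothetical figure chosen for illustration, and “about 50 percent” is treated as exactly 50 percent.

```python
def bitrate_after_savings(bitrate_mbps: float, savings: float) -> float:
    """Bit rate needed after a fractional savings (0.5 means 50 percent)."""
    return bitrate_mbps * (1.0 - savings)


mpeg2_mbps = 8.0                                    # hypothetical starting rate
avc_mbps = bitrate_after_savings(mpeg2_mbps, 0.5)   # one generation later
hevc_mbps = bitrate_after_savings(avc_mbps, 0.5)    # two generations later
cumulative_savings = 1.0 - hevc_mbps / mpeg2_mbps

print(avc_mbps, hevc_mbps, cumulative_savings)  # 4.0 2.0 0.75
```

Two successive halvings compound to roughly a 75 percent reduction relative to MPEG-2 at comparable quality.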

AVC solutions have already become widespread in many professional and consumer devices. HEVC, having been ratified by ISO and ITU in 2013, is similarly growing in the same applications, and would appear to be on the road to replacing the earlier codecs. But while MPEG and HEVC have been developed by standards committees representing a legion of strong industrial players, other forces have sought to displace their primacy, most notably, Google, with its VP9. First, let’s look at the toolkit of each codec.

HEVC incorporates numerous improvements over AVC, including a new prediction block structure and updates that include intra-prediction, inverse transforms, motion compensation, loop filtering and entropy coding. HEVC uses a new concept called coding units (CUs), which sub-partition a picture into arbitrary rectangular regions. The CU replaces the macroblock structure of previous video coding standards, which had been used to break pictures down into areas that could be coded using transform functions. CUs can contain one or more transform units (TUs, the basic unit for transform and quantization), but can also add prediction units (PUs, the elementary unit for intra- and inter-prediction).

HEVC divides video frames into a hierarchical quadtree coding structure that uses coding units, prediction units and transform units. CUs, TUs and PUs are grouped in a treelike structure, with the individual branches having different depths for different portions of a picture, all of which form a generic quadtree segmentation structure of large coding units.

While AVC improved on MPEG-2 by allowing multiple block sizes for transform coding and motion compensation, HEVC coding tree blocks can be either 64×64, 32×32, 16×16 or 8×8 pixel regions, and the coding units can now be hierarchically subdivided, all the way down to 4×4 sized units. The use of tree blocks allows parallel processors to decode and predict using data from multiple partitions—called wavefront parallel processing (WPP), which supports multi-threaded decode.
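The hierarchical subdivision described above can be illustrated with a toy quadtree. This is only a sketch of the partitioning idea: a real HEVC encoder decides splits by rate-distortion cost, whereas this example uses a simple pixel-variance threshold as a stand-in, and the `min_size` and `threshold` values are arbitrary assumptions.

```python
from typing import List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square region


def variance(frame: List[List[int]], x: int, y: int, size: int) -> float:
    """Pixel variance inside the square region at (x, y)."""
    pixels = [frame[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def quadtree_split(frame: List[List[int]], x: int, y: int, size: int,
                   min_size: int = 8, threshold: float = 100.0) -> List[Block]:
    """Recursively split a block into four quadrants while it is 'busy'."""
    if size <= min_size or variance(frame, x, y, size) <= threshold:
        return [(x, y, size)]  # smooth enough (or smallest size): keep as a leaf
    half = size // 2
    leaves: List[Block] = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_split(frame, x + dx, y + dy, half,
                                     min_size, threshold)
    return leaves


# Demo: a flat 64x64 block with a bright 8x8 patch in the top-left corner.
frame = [[0] * 64 for _ in range(64)]
for r in range(8):
    for c in range(8):
        frame[r][c] = 255

leaves = quadtree_split(frame, 0, 0, 64)
print(len(leaves))  # 10 leaves: three 32x32, three 16x16 and four 8x8
```

Only the corner containing detail is cut into small units; the flat remainder stays in large blocks, which is where the coding efficiency comes from.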

Because this new coding structure avoids the repetitive blocks of AVC, HEVC is better at reducing blocking artifacts, while at the same time providing a more efficient coding of picture details. HEVC also specifies several planar and DC modes, which reconstruct smooth regions or directional structures in a way that hides artifacts better. An internal bit depth increase allows encoding of video pictures by processing them with a color depth higher than 8 bits.

Motion compensation is provided with two new methods, and luma and chroma motion vectors are calculated to quarter- and eighth-pixel accuracy, respectively. A new deblocking filter is also provided, which operates only on edges that are on the block grid. After the deblocking filter, HEVC provides two new optional filters, designed to minimize coding artifacts.

With YouTube carrying so much video content, it stands to reason that the service’s parent, Google, has a vested interest in not just the technology behind video compression, but also in some of the market considerations attached to it. To that end, VP9 has been developed to provide a royalty-free alternative to HEVC.

Many of the tools used in VP9 (and its predecessor, VP8) are similar to those used in HEVC—but ostensibly avoid the intellectual property used in the latter. VP9 supports the image format used for many web videos: 4:2:0 color sampling, 8 bits-per-channel color depth, progressive scan, and image dimensions up to 16,383×16,383 pixels; it can go well past these specs, however, supporting 4:4:4 chroma and up to 12 bits per sample.

VP9 supports superblocks that can be recursively partitioned into rectangular blocks. The Chromium, Chrome, Firefox and Opera browsers now all support playing VP9 video in the HTML5 video tag. Both VP8 and VP9 video are usually encapsulated in a format called WebM, a Matroska-based container also supported by Google, which can carry Vorbis or Opus audio.

VP8 uses a 4×4 block-based discrete cosine transform (DCT) for all luma and chroma residual pixels. The DC coefficients from 16×16 macroblocks can then undergo a 4×4 Walsh-Hadamard transform. Three reference frames are used for inter-prediction, limiting the buffer size requirement to three frame buffers, while storing a “golden reference frame” from an arbitrary point in the past.
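To give a feel for that second-stage transform, here is a sketch of a generic 4×4 Walsh-Hadamard transform applied to a 4×4 array of DC coefficients. This is the textbook transform, not the exact integer arithmetic of the VP8 bitstream, and the sample values are made up.

```python
from typing import List

Matrix = List[List[float]]

# Sequency-ordered 4x4 Hadamard matrix; its rows are mutually orthogonal.
H: Matrix = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def wht_forward(block: Matrix) -> Matrix:
    """2-D transform: rows and columns are both projected onto H."""
    return matmul(matmul(H, block), transpose(H))


def wht_inverse(coeffs: Matrix) -> Matrix:
    """Inverse transform; H times its transpose is 4*I, hence the 1/16 scale."""
    scaled = matmul(matmul(transpose(H), coeffs), H)
    return [[v / 16 for v in row] for row in scaled]


# Sixteen identical DC values (hypothetical) collapse into one coefficient.
flat_dc = [[10] * 4 for _ in range(4)]
coeffs = wht_forward(flat_dc)
print(coeffs[0][0])  # 160; every other coefficient is 0
```

When neighboring blocks have similar DC values, almost all of the energy ends up in a single coefficient, which is the point of the extra transform stage.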

VP9 augments these tools by adding 32×32 and 64×64 superblocks, which can be recursively partitioned into rectangular blocks, with enhanced intra and inter modes, allowing for more efficient coding of arbitrary block patterns within a macroblock. VP9 introduces the larger 8×8 and 16×16 DCTs, as well as the asymmetric DST (discrete sine transform), both of which provide more coding options.

Like HEVC, VP9 supports sub-pixel interpolation and adaptive in-loop deblocking filtering, where the type of filtering can be adjusted depending on other coding parameters, as well as data partitioning to allow parallel processing.

As you would expect, performance depends on who you ask. Google says VP9 delivers a 50 percent gain in compression levels over VP8 and H.264 standards while maintaining the same video quality. HEVC supporters make the same claim, which would put VP9 close to HEVC in quality. But some academic studies show that HEVC can provide a bit rate savings of over 43 percent compared to VP9. Why the disparity? One likely reason is that using different tools within each codec can yield widely varying results, depending on the video material. The other is that, despite some labs having developed objective tools to rate image quality, the best metric is still the human visual system, which means that double-blind subjective testing must be done, and that will always have statistical anomalies.

But another important factor must be considered as well, and that’s complexity. While both HEVC and VP9 demand more computational power at the decoder, the required encoding horsepower has been shown to be higher (sometimes more than 10 times) for HEVC in the experiments where it outperformed VP9 on bit rate.

There’s a strong motivation for advancing an alternative to HEVC: VP9 is a free codec, unencumbered by license fees. Licenses for HEVC and AVC are administered by MPEG LA, a private firm that oversees “essential patents” owned by numerous companies participating in a patent pool.

Earlier this year, MPEG LA announced that a group of 25 companies agreed on HEVC license terms; an AVC Patent Portfolio License already provides coverage for devices that decode and encode AVC video, AVC video sold to end users for a fee on a title or subscription basis, and free television video services. Earlier, MPEG LA announced that its AVC Patent Portfolio License will not charge royalties for Internet video that is free to end users (known as “Internet Broadcast AVC Video”) during the entire life of the license; presumably, this means the life of the patents.

Last year, Google and MPEG LA announced that they had entered into agreements granting Google a license to techniques that may be essential to VP8 and earlier-generation VPx video compression technologies under patents owned by 11 patent holders. The agreements also grant Google the right to sublicense those techniques to any user of VP8, whether the VP8 implementation is by Google or another entity. They further provide for sublicensing those VP8 techniques in one next-generation VPx video codec.

So, while there is no license fee required to use VP8, there are other terms imposed—a so-called FRAND-zero license—and users may need a license to fully benefit from the Google-MPEG-LA agreement. One result of the agreements is that MPEG LA decided to discontinue its effort to form a VP8 patent pool.

Apparently, VP9 is a further attempt to provide a shield against the MPEG patent owners, by using elements thought to evade existing granted patents. But HEVC has already made inroads into commercial hardware and software, following on the heels of the already widespread MPEG-4/AVC rollout, and this could make an uptake of VP9 difficult. And even the best intentions of the VP8/VP9 developers can be subverted: it’s always possible that a “submarine patent” could emerge, with its owner claiming infringement.

This has already happened, with smartphone maker Nokia suing HTC over its use of the VP8 codec. In this particular case, a court in Mannheim, Germany, ruled that VP8 does not infringe Nokia’s patent—but the possibility always exists of another challenge. While the specter of another contest could be enough to give some manufacturers pause, tilting support toward the “safer” HEVC, it could just as well be subject to some other submarine patent.

A final note: Google has announced that development on VP10 has begun, and that after the release of VP10, they plan to have an 18-month gap between releases of video standards.