Published 2 February 2013

N.Z FREEVIEW AUDIO LOUDNESS
Are broadcasters conforming to their own standards?

Sound loudness has traditionally been badly controlled on TV. Ads have sounded louder than programmes since the start of TV. Until recently, TV stations controlled peak sound levels only, which had very little to do with actual loudness as perceived by human listeners. TVNZ started talking about implementing better control of sound levels at the start of 2010, when the ITU presented its draft specification for improved sound loudness control for TV. The EBU R128 standard, which uses ITU BS1770-2, is now ratified and, although it has taken three years of talk, this is the specification that TVNZ, MediaWorks and, hopefully, SKY (Prime) will adopt. Since 1 Jan 2013, a voluntary TV standard based on the EBU guidelines has been in place for the major broadcasters using the Freeview platform.

In the U.S, the government has forced TV stations by law to control commercials to the same average loudness as programmes, through the 'CALM' (Commercial Advertisement Loudness Mitigation) Act. From December 2012, U.S viewers can actually complain to the FCC (Federal Communications Commission) if they think an ad is too loud, and the commission will take action. Sustained non-compliance with the law would result in monetary penalties for the station concerned and, potentially, the station can be forced to cease broadcasting. Other countries are introducing similar laws. Here in N.Z, our governments do not generally like enacting such laws, mainly because of the ongoing costs of monitoring and enforcement. They prefer that industries introduce voluntary codes in the hope such industries will 'do the right thing'. During 2011, the N.Z Labour opposition party did propose a law similar to the U.S CALM Act, but such regulation was not seen as necessary by the incumbent National government.

Complaints about the loudness of ads on TV have been around since the start of TV and it has taken broadcasters a little over 50 years (in N.Z) to do something. This article describes some measurements of sound loudness taken from some of the FTA broadcast channels via Freeview terrestrial during December 2012 and January 2013.

About the new loudness standard

The EBU standard for TV loudness builds on ITU BS1770, which is available from the ITU. Essentially, it specifies how loudness should be measured and controlled. Instead of specifying a peak level only, the standard requires an average level (centre-of-gravity) to be defined, which is a specific number of decibels below the maximum possible level. This is measured with an averaging time of typically 3 seconds.

The measurement is frequency weighted to give more emphasis to the parts of the spectrum where human hearing is more sensitive. The weighting method is called K-weighting and it is significantly different from the ubiquitous "A-weighting" used (often inappropriately) in industry sound level measurements. Compared with A-weighting, more emphasis is given to low frequencies, so it is closer to the historical C-weighting curve. The units of measurement are still decibels (dB) but are referenced to digital full-scale (FS). The form of expression for the loudness unit becomes dB(LKFS), which is simply Loudness, K-weighted, relative to Full-Scale. The Europeans have settled on a long-term loudness value of -23dB LKFS plus a maximum true-peak of -1dBFS; the U.S ATSC recommendation is close to this at -24dB LKFS.

Embodied in the standard is the need to accurately measure peak levels as well, so that the loudness value is always referenced correctly. Peak levels have been misunderstood in the digital realm: by traditional methods, peaks can be under-estimated, particularly where bit-rate reduction or psycho-acoustic coding is involved. Meters that measure loudness also have a true-peak (TP) indicator, calculated by over-sampling the digital audio.
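For readers who want to see what the loudness measurement amounts to, here is a minimal Python sketch of a K-weighted, ungated reading for one mono channel. It is an illustration only, not the meter used for this article: it assumes 48 kHz audio as floating-point samples with full scale at 1.0, it leaves out the gating and multi-channel summing that BS1770-2 adds, and the filter coefficients are the 48 kHz values as I understand them from the ITU tables, so check them against the standard before relying on them. The function names are my own.

```python
import math

# K-weighting for 48 kHz audio: a high-shelf "pre-filter" followed by a
# high-pass (RLB) stage. Coefficients assumed from the BS1770 tables.
PRE_B = (1.53512485958697, -2.69169618940638, 1.19839281085285)
PRE_A = (1.0, -1.69065929318241, 0.73248077421585)
RLB_B = (1.0, -2.0, 1.0)
RLB_A = (1.0, -1.99004745483398, 0.99007225036621)


def biquad(samples, b, a):
    """Run a second-order IIR filter (direct form I) over a list of samples."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def loudness_lkfs(samples):
    """Ungated loudness of one mono channel, in dB LKFS."""
    weighted = biquad(biquad(samples, PRE_B, PRE_A), RLB_B, RLB_A)
    mean_square = sum(v * v for v in weighted) / len(weighted)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite loudness
    return -0.691 + 10.0 * math.log10(mean_square)


def sine(freq_hz, amplitude, seconds, rate=48000):
    """Test tone generator (hypothetical helper for the example)."""
    count = int(seconds * rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / rate) for n in range(count)]


if __name__ == "__main__":
    print(round(loudness_lkfs(sine(997.0, 1.0, 1.0)), 2))
```

A full-scale 997 Hz tone should read close to -3dB LKFS with this sketch, and halving the amplitude should drop the reading by about 6dB.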

The voluntary N.Z standard, promulgated by TVNZ, recommends a long-term loudness of -24dB LKFS, with a maximum true-peak of -2dB on surround-sound tracks, and -9dB for stereo tracks. This difference for peak levels exists currently to protect analogue transmissions, and may be reviewed when these transmissions cease at the end of this year. The minor differences from EBU recommendations will not be significant; of most importance is the variation of loudness between programmes (and ads) and between channels.
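The true-peak ceilings only mean something if peaks that fall between samples are caught. The following Python sketch shows the idea of a true-peak estimate by over-sampling: it interpolates four points per sample with a Hann-windowed sinc and compares the result with the -2dB and -9dB ceilings quoted above. The interpolator, its length and the function names are my own simplifications, not the filter specified in BS1770, so treat the numbers as approximate.

```python
import math

# Ceilings from the voluntary N.Z standard as described in this article (dBFS).
NZ_TRUE_PEAK_CEILING_DB = {"surround": -2.0, "stereo": -9.0}


def true_peak_db(samples, oversample=4, half_taps=16):
    """Estimate the true peak in dBFS (full scale = 1.0) by interpolating
    `oversample` points per sample with a Hann-windowed sinc kernel."""
    count = len(samples)
    peak = 0.0
    for i in range(count):
        for k in range(oversample):
            t = i + k / oversample  # time of the interpolated point, in samples
            acc = 0.0
            for j in range(i - half_taps + 1, i + half_taps + 1):
                if 0 <= j < count:
                    x = t - j
                    sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
                    window = 0.5 * (1.0 + math.cos(math.pi * x / half_taps))
                    acc += samples[j] * sinc * window
            peak = max(peak, abs(acc))
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")


def within_ceiling(samples, track_type):
    """True if the estimated true peak is at or below the ceiling for the track type."""
    return true_peak_db(samples) <= NZ_TRUE_PEAK_CEILING_DB[track_type]


if __name__ == "__main__":
    # A tone at a quarter of the sample rate, phased so every sample misses the crest.
    tone = [math.sin(math.pi * n / 2 + math.pi / 4) for n in range(64)]
    sample_peak_db = 20.0 * math.log10(max(abs(v) for v in tone))
    print(round(sample_peak_db, 2), round(true_peak_db(tone), 2))
```

The test tone is the classic case: every stored sample sits about 3dB below the real crest of the waveform, so a sample-peak meter under-reads while the over-sampled estimate lands near 0dBFS.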

What matters to the TV listener

There are three aspects of loudness control for the viewer (listener).

  1. Loudness between different programmes on any service.
    Of course, and most importantly, that includes the ads. The U.S document A/85:2011, "ATSC Recommended Practice: Techniques for Establishing and Maintaining Audio Loudness for Digital Television", notes that several studies have shown the loudness "comfort range" for typical television listening to be from +2dB to -5dB (relative to the long-term average). Beyond this range, a viewer is likely to become annoyed, eventually reaching for the remote control to change volume (or worse from the broadcaster's point of view, to mute a commercial).
  2. Loudness difference between services.
    It makes for poor user-friendliness if you have to reach for the volume control when going between services, such as between TV3 and Choice TV for example.
  3. Loudness difference between audio tracks.
    TV1, TV2 and TV3 each broadcast Dolby AC3 surround sound (FreeviewHD only) in addition to their stereo track.
    All receivers allow the listener to choose which track to play, but not all allow the user to set a default for one or the other. This is unfortunate, since such a receiver prefers the surround track over the stereo track whenever a surround track is present. If you change from a stereo-only service such as ChoiceTV to, say, TV3, the receiver will jump from stereo to surround, often with a big change of volume.
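The comfort range mentioned under item 1 is easy to express in code. This is a small sketch, with function names of my own choosing, that classifies a short-term reading against a long-term average using the +2dB and -5dB limits quoted from the ATSC document.

```python
def comfort_limits(long_term_lkfs, upper_db=2.0, lower_db=5.0):
    """Return the (lowest, highest) comfortable short-term loudness in dB LKFS."""
    return long_term_lkfs - lower_db, long_term_lkfs + upper_db


def comfort_status(reading_lkfs, long_term_lkfs, upper_db=2.0, lower_db=5.0):
    """Classify one reading relative to the long-term average."""
    offset = reading_lkfs - long_term_lkfs
    if offset > upper_db:
        return "too loud"
    if offset < -lower_db:
        return "too quiet"
    return "comfortable"


if __name__ == "__main__":
    print(comfort_limits(-24.0))
    print(comfort_status(-21.5, -24.0))
```

With the N.Z target of -24dB LKFS as the long-term average, the limits work out to -29dB and -22dB.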

The following tests look at (1) and (2) but not part (3) since I could not establish a consistent response from various Dolby codecs that were checked. This will be the subject of future work.

Results from December 2012

Recordings (AAC stereo only) were made from TV1, TV2, TV3, C4, Prime and Choice during the evenings. The following are plots of levels against elapsed time. The upper (red) trace is the true peak level and the lower (green) trace is the loudness based on 2.5 second averaging. The black trendline is a running average of the loudness over several samples. Following the six plots are histograms of loudness, a data table and notes.
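A running-average trendline of the kind described above can be sketched as follows. The window of 5 readings is an assumption for illustration (the plots say only "several samples"), and averaging the dB values directly is a simplification that suits a visual trendline rather than a formal loudness figure.

```python
def running_average(values, window=5):
    """Trailing moving average; early points average whatever is available so far."""
    out = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            total -= values[i - window]  # drop the reading that left the window
        out.append(total / min(i + 1, window))
    return out


if __name__ == "__main__":
    readings = [-24.0, -23.0, -22.0, -26.0, -25.0, -24.0]
    print(running_average(readings, window=3))
```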

[Plots of level against elapsed time: TV1, TV2, TV3, C4, Prime, Choice]

While it is possible to discern the trends and approximate consistency of loudness from each station in the above plots, a better way is to produce histograms of loudness values for each. The histograms which follow show loudness level in 0.25dB bins as a percentage of the whole recording.
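As a rough illustration of the binning just described, this Python sketch counts readings into 0.25dB bins and expresses each bin as a percentage of the whole recording. The function name and the choice to label each bin by its lower edge are mine; the article's histograms were produced in a spreadsheet.

```python
import math
from collections import Counter


def loudness_histogram(readings, bin_width=0.25):
    """Map each bin's lower edge (dB LKFS) to its share of the readings in percent.
    Non-finite readings (for example digital silence) are ignored."""
    finite = [r for r in readings if math.isfinite(r)]
    counts = Counter(math.floor(r / bin_width) for r in finite)
    total = len(finite)
    return {index * bin_width: 100.0 * n / total for index, n in sorted(counts.items())}


if __name__ == "__main__":
    sample = [-24.0, -24.1, -23.9, -22.0]
    for edge, percent in loudness_histogram(sample).items():
        print(f"{edge:7.2f} dB  {percent:5.1f}%")
```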

[Loudness histograms: TV1, TV2, TV3, C4, Prime, Choice]

These make it clear that some stations are better than others at controlling the loudness. The ideal histogram shape would be a narrow peak centred at -24dB, without too much above that level. Loudness values below the ideal of -24dB are not quite the same problem, but nonetheless, most loudness values should be contained between -29dB and -22dB to conform to the ATSC guidelines. Looking at the TV1 histogram, the loudness values peak at -22dB and -21dB, which are too high but are at least contained in a narrow range. TV2 has a wider range of loudness than TV1 but the median value is 1dB lower. TV3's range of loudness is rather too wide. C4 is interesting. C4 is well controlled to a median of -23dB, with quite a narrow loudness range. I suspect this is achievable for C4 without too much effort, since all their source material is of a similar nature, probably comes from the one source anyway and is mostly heavily compressed. Both Prime and Choice are too loud overall, but Prime especially has too wide a range of loudness.

Now, here are the data stats for the above in table form.

[Data table: December 2012 results]

Strictly, all these stations are too loud based on their own voluntary standard, which states the median value should be -24dB LKFS. It wouldn't matter a lot if they were all centred on -23dB, the EBU standard, but they should all be closer together. Of more concern is the loudness range. All would get a knock on the door from the FCC if they were in the USA. TV1 might get away with it, but TV2, TV3, Prime and Choice have a bit of work to do yet. The worst sin is exceeding 2dB above the long term median. TV3 had 16.7% of readings more than 2dB higher than the median value. This might not appear too bad, but that is 20 minutes in a 2 hour period where you would be wanting to hit the mute button.

TV3 has a different audio control method from the others. If you look at the TV3 plot of levels against time, there are periods where the peak levels and the loudness all drop. The peaks drop by up to 6dB. Curiously, these blocks of lower levels are the ad breaks. TV3 has, either by accident or design, taken to dropping the output level of their ad playout machines. They may be using metadata to do this, but either way, they have gone too far, since the loudness within the ad breaks is lower than in the programme segments in between. If they got that right, then the overall range of loudness would be smaller. This is not saying that the approach is wrong, since it may turn out that some ads need to be lower than average in order to be less annoying.
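The "more than 2dB above the median" figure quoted for TV3 is simple to reproduce from a list of readings. The sketch below, with made-up readings and function names of my own, computes the median, the share of readings above median + 2dB, and what that share means in minutes of viewing.

```python
import statistics


def share_above_median(readings, margin_db=2.0):
    """Return (median, percent of readings more than margin_db above the median)."""
    median = statistics.median(readings)
    above = sum(1 for r in readings if r > median + margin_db)
    return median, 100.0 * above / len(readings)


def minutes_affected(percent, duration_minutes):
    """Convert a percentage of readings into minutes of the recording."""
    return percent / 100.0 * duration_minutes


if __name__ == "__main__":
    # Made-up readings for illustration only, not measured data.
    example = [-24.0, -24.0, -24.0, -24.0, -24.0, -21.0]
    print(share_above_median(example))
    # The TV3 figure from the table: 16.7% of a 2 hour recording.
    print(round(minutes_affected(16.7, 120), 1))
```

Applied to the TV3 figure, 16.7% of 120 minutes comes to about 20 minutes.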

The big problem for the TV stations, and the one that makes controlling loudness difficult, is that they might have material from 20 or more sources in any given hour. Each programme will have a certain loudness, then each and every ad and promo will have a certain loudness. Getting the loudness matched by each supplier of material must be a nightmare. So, the station then has to have a means of levelling the audio overall, which makes for a problem at transitions. Of course, these transitions are always the programme/ad interface.

Results from January 2013

To get an idea of whether there was any consistency, I repeated the exercise for some stations during January 2013. Below are the histograms obtained.

[Loudness histograms, January 2013: TV1; TV3 (16 Jan); TV2 (11 Jan); TV2 (LOTR); Prime (4 Jan); Prime (13 Jan); Choice]

And a data table for the January results.

[Data table: January 2013 results]

Again, only TV1 shows a good distribution of loudness. Ideally, the histogram peak should be at -24dB LKFS, but that single peak is the proper shape, indicating good loudness consistency. Only 2.3% of TV1 samples went higher than 2dB above median.

The Choice data is not too bad either, except it is around 4dB too loud overall. Note that my vertical scales vary in this batch; however, the spread of data within the plots is the most important factor.

TV3 nailed the correct long term loudness in this recording; the spread is higher than optimum, but it is skewed towards lower, rather than higher volume.

There are two plots for Prime taken on different days. The first one has a median that is about 2dB high with a concentration up to 2dB higher still. The second one has a median only 0.5dB high overall and there is less material on the high side. Prime appears not to have consistent audio levelling from one day to the next.

There are also two plots for TV2 taken on different days. On the 11th, they showed one of the Lord of the Rings movies. This movie has a lot of quiet bits amongst the loud parts. The many quiet bits show in the histogram as the rising peak below -34dB LKFS. The nature of the movie produced a low median value of -25.6dB LKFS. Of course, there are ad breaks, and these are responsible for much of the skew to the high side of the median. This demonstrates very well how difficult it is for a station to manage loudness with disparate source material. TV2 appears to have elected not to compress the LOTR movie in order to make it better match the loudness of ads and promos. In these circumstances they could have chosen not to insert ad breaks when the movie is quiet, to reduce the impact of the loudness change. Otherwise, they could have taken the TV3 approach and dropped the output of the ad playout machines to better suit the movie.

Summarising the test

Conforming to the voluntary standard

There is more work to do. None of the stations consistently nails the long-term loudness value of -24dB LKFS. Only TV1 appears to practise audio levelling based on loudness. TV3 has taken to dropping the playout level of ads, possibly based on embedded metadata. C4 has, probably accidentally, got reasonable loudness control due to the nature of its programming. Prime, Choice and TV2 have not shown consistent enough results in these tests to say that they are yet practising levelling based on loudness.

The EBU R128 standard

The concept of levelling audio volume by loudness instead of peak levels is a great step forward. It is just a pity it has taken fifty years to come about. Stations around the world have largely agreed to use the EBU R128 standard, which makes use of the ITU BS1770-2 loudness measurement methodology. The essential question about BS1770 is whether it does, in fact, accurately represent loudness as a human would hear it. There are problems with BS1770. The main one is that for highly compressed material (ads, for example) it has been shown to over-indicate loudness. Another issue is that BS1770 is normally used on the 'full mix' instead of being used to normalise dialogue only. If the mix contains a lot of bass energy, the loudness algorithm may get it wrong and 'pumping' effects may occur when using a BS1770-2 based device to control levels. In fact, for surround 5.1 tracks, BS1770 excludes the .1 (subwoofer) track because early tests showed it did not correlate well with listening tests. The ITU has acknowledged this problem and is working on a v.3 of the standard. Despite these difficulties, loudness levelling is a much better basis than peak normalising for harmonising audio levels in a broadcast environment.

Loudness and annoyance

Listeners complain about ads. They get annoyed. The annoyance has a lot to do with the ads being louder than the programmes slotted between. So, assuming that stations level the audio so that loudness of ads is the same as loudness of the programmes, will people still get annoyed with the ads? I suspect they will, at least for some ads. The ones that sound like a demented race commentator for example. You would have to drop their volume a long way to stop being annoyed by them. Take note Harvey Norman, Noel Leeming, Bond and Bond, Big Save, Rebel Sports to name but a few. I see the CarpetOne lady has been toned down in recent times. Loudness is not the only factor governing annoyance and a complete technical solution may not even be possible. If I operated a TV station, I would not want to alienate viewers at all by playing annoying ads and I would exercise my executive right to refuse ads that are guaranteed to annoy my viewers. It would upset ad-makers who would say their creative right is sacrosanct, but their egos will subside in due course.

About these tests

To make these measurements, I used a Hauppauge WinTV HVR-3300 tuner, which receives local DVB-T stations. The tuner produces a transport stream output, which can be saved as a .ts file. The file can be played back by VLC and other players. Audio measurements on playback were made with the Orban Loudness Meter V2.0.8 using the WASAPI interface on a Windows 7 PC. I set the short term BS1770 averaging time to 2.2 seconds instead of the default 3 seconds. This is because many spot ad inserts and promos are only 15 seconds long, so 3 second averaging is not as definitive. In practice, averaging periods between 1.5 and 3 seconds made little difference to the results. The EBU provides some calibration files for testing any loudness meter. The CSV log from the Orban meter was imported into Excel to do the summaries and produce the plots and histograms.
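One way to see why a shorter averaging time suits short spots is to work out how much of a 15 second spot yields readings that contain audio from that spot alone, rather than a blend with whatever played before it. The sketch below is my own back-of-envelope model of a sliding averaging window, not something taken from the meter's documentation.

```python
def clean_reading_share(spot_seconds, window_seconds):
    """Percentage of a spot's duration during which a sliding averaging window
    holds audio from that spot only (nothing from the preceding item)."""
    if window_seconds >= spot_seconds:
        return 0.0
    return 100.0 * (spot_seconds - window_seconds) / spot_seconds


if __name__ == "__main__":
    for window in (3.0, 2.2, 1.5):
        share = clean_reading_share(15.0, window)
        print(f"{window:.1f} s window: {share:.1f}% of a 15 s spot")
```

On this model a 3 second window gives clean readings for 80% of a 15 second spot and a 2.2 second window for about 85%, which fits the observation that the choice made little difference in practice.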

Axino-tech Consulting & Services , February 2013.

Comments

From: Graham Knights 24 July 2013
Another very useful explanation of complex things in words that people (who know enough to be dangerous) can understand :o) I have long been frustrated by the 'loudness' problem on TV and mute ads as a matter of course, or better still record the few things I want to watch, and watch them when it suits me (fast-forwarding the ads). By the way, I still pay a lot for TV, but am unable to buy ad-free TV. Will it ever be available, at any price?

From: Peter Fleury (peter.fleury@xtra.co.nz) 3 February 2014
Interesting, though I dispute the results as they apply now. Currently, TV1 has offensively loud adverts and promos compared to Prime for example. I have broadcasting experience and have worked with film optical sound techniques, so I have an understanding of loudness, compression and anticipatory level control. With film sound, the soundtrack has a limited range, so when placing a soundtrack on film, a delay is used and a loud wavefront is observed and adjusted before it hits the track. Though this is redundant technology now, by simply delaying all content, adjustments can be done on the fly. With the appropriate software used, levels can be levelled out. By this I mean a measurement of programme material is made, and a loudness level measured and then the promos and adverts adjusted to a similar level. It is not rocket science, it should be simple, but I feel that the reluctance of "those who yell to get their point across" will fight changes. The insidious nature of the insistent advertisers, and the station's promos have a vested interest in their point being in front (human nature, or the greed aspect anyway). Only tough penalties will quell the yell.

From: Axino 5 February 2014
Hi Peter I do agree with your comments above. From my own observations, things are different now. I am contemplating doing another set of measurements... '... a year on' kind of thing. Unfortunately a snapshot over a couple of hours provides only a limited picture in any case and also takes a fair time commitment to ensure correctness. I don't know whether the broadcasters all have levelling equipment at the output of the studio like they once had. I think they rely heavily on the sources to get it right. Dolby tracks can come with embedded metadata relating to loudness and it is possible that MediaWorks at least use that to assist with loudness control. Last year I had contact with Michael Orton of Ebus. They are an organisation which offers to 'fix' commercial products both in terms of video and audio quality before the product is sent to the broadcaster. Their info sheet is at http://officialblog.ebus.tv/2013/03/loudness-hows-it-measured-by-ebus.html . I have no idea just how many ad-makers send their offerings through the ebus process but since it costs money, perhaps the answer is not many. Yes it is a shame the government didn't adopt similar legal requirements like the U.S, I suppose they saw costs for monitoring and legal proceedings.
