All posts by john

University of Surrey Tonmeister Lecture

Thank you, Institute of Sound Recording! I was very touched by your welcome yesterday, though I have never been introduced as a “Legendary Tonmeister” before. To be honest, that description is better owned by graduates such as Francis Rumsey, Mike Hatch or Jim Abbiss, to name but three.

A full house of 2nd-year and final-year students, along with distinguished staff and alumni, came to hear my stories of music production, laughed in (some of) the right places, and asked a few challenging questions. If you were there and didn’t manage to speak up in the time allocated, please make contact through this blog or through the department.

Interest was expressed in being able to hear or see again the extracts of music and film that were critiqued, so I shall upload them in a way that might be useful to you in the near future.

The IoSR kindly organised some decent playback kit; my inability to see any of my lecture notes was my fault alone, so some of the material below wasn’t used in the lecture. Nevertheless, when it is written-up, it may possibly make sense.

This isn’t a blog version of my talk (you must come to the lecture for that), but you might find it helpful to have notes of the recordings I used.

Each and every extract of a recording is accompanied by a critique of the performance or technique exhibited, so can be shown publicly in this context under the doctrine of Fair Use (in the USA) or Fair Dealing (in the UK, Europe and many Commonwealth countries).

Some recordings, e.g. the Stokowski, Stravinsky and Delibes early stereo examples, and the critique of the Elgar and Duke Ellington “accidental stereo” recordings, are still to be added, as is the tape of the Walter Gieseking Beethoven concerto performance recorded in Berlin in January 1945 where you can hear the bombs falling in the slow movement, again in stereo.

Index | Description | Date/Location | Medium
1 | Preussische Staatskapelle Berlin cond. Herbert von Karajan - Anton Bruckner, Symphony No. 8 in C minor, WAB 108 (III: Finale) | 29 September 1944; Berlin | Stereo tape
2 | Michael Flanders & Donald Swann - A Song Of Reproduction (At The Drop Of A Hat) | 2 May 1959; The Fortune Theatre, London | Stereo tape
3 | Paid in Full performed by Eric B. & Rakim, written by Eric Barrier and Rakim Allah | 1985; Powerplay Studios, New York City | Stereo tape
4 | A Journey Into Sound - Train sequence, narrator: Geoffrey Sumner | 1957, London | Stereo tape via LP Decca SKL 4001
5 | Under The Bridges of Paris played by Edmundo Ros and His Orchestra; Ping, Pong demonstration | 1957, London | Stereo tape via LP Decca SKL 4001
6 | Cincinnati Pops Orchestra cond. Erich Kunzel; The Year 1812 (Festival Overture), by P.I.Tchaikovsky | 1978; The Music Hall, Cincinnati, Ohio | Soundstream digital + video
7 | London Symphony Orchestra cond. André Previn: Images for Orchestra (I: Gigues) | July 1979; No. 1 Studio, Abbey Road | Prototype 14-bit stereo digital recorder
8 | Something played by Steve Marcus (tenor saxophone), J. Inagaki & Soul Media | 1970/71, Tokyo | NHK 13-bit digital recorder
9 | USA TV ad: Ronco Record Vacuum | Unknown and best forgotten | Unknown
10 | Latin lesson | April 1938; Eltham College, Mottingham, London | BBC disc 870625
11 | Let’s Begin played by Paul Whiteman and His Orchestra | January 1933, New York City | Victor, shellac disc 24453
12 | Corner Pocket played by Harry James and His Orchestra | 1976, Wylie Chapel, Hollywood | LP disc
13 | Money from the album Jazz Side Of The Moon | September 11–12 2007, St Peter’s Episcopal Church, New York City | 24-bit stereo digital recording
14 | I Can’t Quit You Baby performed by Led Zeppelin on the album Led Zeppelin | Olympic Studios, London, 1968 | Stereo tape
15 | I Got A Woman performed by Ray Charles | 18 November 1954, WGST, Atlanta, Georgia | Mono tape
16 | Alan Blumlein’s first stereo test | 14 December 1933, EMI auditorium, Hayes, Middlesex | British Library shelf mark 9TS0003378 two-track disc
17 | Alan Blumlein’s first stereo test film | 1933, EMI auditorium, Hayes, Middlesex | Film, and two-track disc
18 | Channel 9 (Australia) Today programme: Blattnerphone Restored | 1992, Telstra Labs, Melbourne | VHS off-air video / Blattnerphone tape
19 | Preussische Staatskapelle Berlin cond. Herbert von Karajan - Anton Bruckner, Symphony No. 8 in C minor, WAB 108 (III: Finale) | 29 September 1944; Berlin | Stereo tape
20 | Boston Symphony Orchestra cond. Pierre Monteux; Delibes: Coppelia suite | December 1953, Manhattan Centre, New York City | Stereo tape
21 | Norelco 150 Cassette Recorder demo | 1964, United States | Duplicated Compact Cassette
22 | Una furtiva lagrima from Donizetti: L’elisir D’amore | February 1st 1904, Room 826, Carnegie Hall, New York City | Single-sided shellac Victor 85021
23 | Thomas A. Edison: Electricity and Progress for the opening of the New York Electrical Show | October 3 1908 | Edison Gold Moulded cylinder (unissued), NPS object catalog number: EDIS 39835

BBC Local Radio — Light in the Darkness

The British radio listening figures, the RAJAR survey, are out for the fourth quarter of 2014. They make generally disappointing reading for the management and staff of BBC local radio stations, showing drops in listenership of up to 62%. Many other types of local and regional radio have lost listeners, too. On the other hand, the cultural beacon that is BBC Radio 3 has recovered from two rather bad quarters’ results.

The whole table is here.

But let’s not worry about all BBC local radio. Someone should be knocking on the doors of at least one station, BBC Surrey and Sussex, and asking what they’re doing. Their figures have gone way up this RAJAR, measured year-on-year, which is remarkable given what the rest of the country has seen. Three out of the six quarter-on-quarter measures show a drop, but that is common in quarterly results that show merely seasonal changes.

Something I remember from my earlier career was that a station’s presence among its listeners was paramount and, within that, locally relevant and well-curated speech generated many returns to the station’s programmes.

Until the 1990s, the BBC local station kit for each of the (often smaller) areas contained a radio car, liveried reporters’ vehicles and plenty of Glensound outside broadcast kit or even a multi-track recording van. And the station’s engineers would often create other portable transmitting apparatus, e.g. BBC Radio York’s “Radio Shoes”, a back-pack transmitter that could be cycled or walked into the city’s pedestrian centre. These devices, fully branded, allowed properly-trained reporters and broadcasters to be both visible and audible to large numbers of people at public events, shopping centres, transport hubs, etc.

On a typical Monday-to-Saturday breakfast show in the 1980s and 1990s, three stations that I personally know of would think nothing of getting four different locations on the air from the radio car with properly-researched reports or colour pieces, those stations being BBC Radios York, Shropshire, and Hereford & Worcester. Now that regions are generally larger, the smaller staff surely cannot maintain the same quantity and quality of physical presence?

In my opinion, solutions will be much more difficult than my harking back to the olden days. Budgets are massively stretched, and the market fragmented to an extent we never imagined. Yet someone at BBC Surrey and Sussex likes to get the local issues on the air, and follow them seriously, as I have observed. And the figures go up from one year to the next.

An influential voice becomes quieter

If you grew up in the English Midlands, you or your parents might have listened to Ed Doolan, a radio presenter who came from Australia to join the big Birmingham independent station BRMB in 1974. Some years later, when independent radio was changing its tactics, he was recruited by, and became popular on, the BBC station in the same city.

His programmes contained plenty of campaigning, endless local relevance, and listener involvement in countless forms. Ed Doolan’s voice, opinions and style have become very familiar to me over the last forty years and clearly influenced my own much smaller and less successful career in front of radio microphones.

I no longer live in the Midlands, but always tuned the car radio to BBC WM when in the area to hear his conversations.

Lately, Ed Doolan has gradually reduced his radio workload. He went on air on BBC WM the other day to explain to the station’s host Caroline Martin why he had retired from live broadcasting.

To hear this fellow, only 23 years older than me, and familiar over four decades, say “I’ve got dementia” simply halted all I was doing this afternoon. His full interview is here:

Real-time visual pitch display

Here’s a hypnotic (or nausea-inducing) way of watching and listening to BBC radio programmes. You’ll need a modern version of FFplay, the multi-media player that’s part of the FFmpeg suite, and the open-source “get_iplayer” program. The filter that does the work is called “showcqt”.

For this example, I’m using BBC Radio 3. You will, no doubt, see how the command line can be modified to accept any audio source.

Just type this. This is from a Cygwin command line, rather similar to Unix. Windows won’t be much different.

get_iplayer --stream --type=liveradio "BBC Radio 3" | ffplay -f lavfi "amovie='pipe\:0',asplit[a][b];[a]showcqt=fullhd=0:timeclamp=0.3:fps=30[out0]; [b]anull[out1]"

Or, as another example, here’s one of my favourite on-line streams, “The Departure Lounge”:

ffplay.exe -f lavfi "amovie='http\://',asplit[a][b];[a]showcqt=fullhd=0:timeclamp=0.3:fps=30[out0]; [b]anull[out1]"

…and, after waiting a few seconds for buffering, you’ll get this:

Audio spectrum of a fragment of a song for soprano and piano, with turntable rumble visible in the lower frequencies

The backslash in the “pipe\:0” is because colons must be escaped with a backslash in FFmpeg/FFplay filters.
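As a rough illustration of that escaping rule, here is a minimal Python sketch that builds the same filter graph for any audio source. The function names are hypothetical, and it handles only colons, which is all the examples above need; other special characters in FFmpeg filter syntax are not covered.

```python
def escape_colons(source: str) -> str:
    """Put a backslash before each colon, as done for pipe:0 above."""
    return source.replace(":", "\\:")


def showcqt_graph(source: str) -> str:
    """Build the lavfi graph used above for an arbitrary audio source."""
    return (
        f"amovie='{escape_colons(source)}',asplit[a][b];"
        "[a]showcqt=fullhd=0:timeclamp=0.3:fps=30[out0]; "
        "[b]anull[out1]"
    )


if __name__ == "__main__":
    # Prints the graph string to hand to: ffplay -f lavfi "<graph>"
    print(showcqt_graph("pipe:0"))
```

The printed string can be pasted between the double quotes of the ffplay command line.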

Just out of interest, I have a Python project that outputs a handy video and audio scope that needs a little refinement, but you can download it here:

The scope’s on-screen display includes a waveform monitor showing superimposed YUV levels with 16-235 markers to check BT601/709 broadcast limits, an EBU R128 loudness chart, a stereo audio sum/difference display, a colour vectorscope, a full-range video check monitor and timecode.

This is the kind of output it gives:

Screenshot of FFmpeg scope

Recorded music podcasts from the UK? No.

Is it truly impossible to send out a British-made podcast where recorded music is played?

It would seem so. Phonographic Performance Limited (PPL), who license nearly all record labels’ recorded music for public performance, do not offer an Internet-only licence to include recorded music in on-line podcasts. Broadcasts are fine, where you can’t skip forward in a show: but not podcasts that can be manipulated on demand.

If something on-line is merely replicating an already broadcast radio programme, it’s fine. But Internet-only radio from the UK, using music on most record labels, is still not allowed.

Isn’t that a curious anomaly? The Performing Right Society, and the Mechanical Copyright Protection Society, who license the music, are fine about it. But the record labels, represented by PPL, are not.

One of the greatest powers of radio is to introduce music that is new to an audience, by allowing an expert curator to showcase records they have chosen. John Peel, late of BBC Radio 1, is an example that comes immediately to mind; likewise Lucie Skeaping or Andrew McGregor, both of BBC Radio 3. But with an increasing number of young people turning exclusively to on-line sources, why can’t the Internet be allowed to broaden the range of curators (presenters, if you like) to include those without current BBC or independent radio contracts?

A discussion about this is going on right now, on the “Radio Today” website. Perhaps PPL will join in? Or maybe I’ll just phone them for a chat and report back?

A Government Falls

For some reason this morning, while watching the featureless sky outside this window and waiting for Prime Minister’s Questions to start, I’m reminded of a turning point in British political history.

Thanks to the UK Parliamentary Recording Unit, you can hear the exact moment in 1979 when James Callaghan’s Labour government was challenged, by Margaret Thatcher’s Conservative opposition, to a vote of no confidence. As everyone knows, the vote was carried and thus an election was forced leading to a succession of Conservative governments.

The speeches surrounding this motion, by the two party leaders, can be heard in longer form by clicking this link.

Converting video for DVD with FFmpeg

Here’s another quick command line for FFmpeg. It converts interlaced video and audio into deinterlaced DVD-ready files. Your output will be a VOB file, ready to be split into a file of the correct size by any DVD authoring program (e.g. DVDStyler) without any further recoding.

The command line you see below was written for a recent film show, where interlaced material had been supplied on DVD, where the projector would not resize interlaced video correctly, and where the only replay device was a standard DVD player.

This command line is careful to apply the appropriate flags to the bitstream to signal that the video uses broadcast levels, and encodes colour according to ITU Rec.601, the standard for European (PAL) SD television.

Two filterchains are in use. The video filterchain first de-interlaces the incoming video, then applies noise-reduction because the files given to me were already noisy and, therefore, would waste bandwidth after encoding. The audio filterchain delays the sound by just over a frame: I found this to be necessary, possibly because of delays introduced by the video coder and the video filter.

ffmpeg -i VIDEO_INPUT -target pal-dvd -vf "w3fdif, hqdn3d" -af "adelay=50|50" -color_range 1 -colorspace 5 -color_primaries 5 -color_trc 5 VIDEO_OUTPUT.VOB
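The “just over a frame” figure can be checked with a little arithmetic. This is only a sketch: it assumes that the adelay values are in milliseconds (the filter's default unit) and that the material runs at the PAL rate of 25 frames per second; the function names are made up for illustration.

```python
PAL_FPS = 25  # PAL DVD frame rate, assumed for this example


def frame_duration_ms(fps: float) -> float:
    """Length of one video frame in milliseconds."""
    return 1000.0 / fps


def delay_in_frames(delay_ms: float, fps: float) -> float:
    """Express an audio delay in milliseconds as a number of video frames."""
    return delay_ms / frame_duration_ms(fps)


if __name__ == "__main__":
    print(frame_duration_ms(PAL_FPS))    # one frame, in ms
    print(delay_in_frames(50, PAL_FPS))  # the adelay=50|50 setting, in frames
```

At 25fps a frame lasts 40ms, so the 50ms delay is one and a quarter frames.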

MXF Op-Atom files for Avid


This post shows how to convert almost any kind of video and audio into native Avid Op-Atom MXF files, suitable for placement directly in Avid’s MXF media files directory. The method is fast, and uses only open source software. Crucially, conversion takes place on any machine, not just an Avid-equipped computer.

A side note regarding AMA: it’s sometimes (?) a little flaky when linking to files that aren’t from a small subset of QuickTime, or that have their own manufacturer-tested plugins.

In this example, I am importing footage into a 25fps HD project. The Avid codec is its own DNxHD, running at 145MBit/s.

Use FFmpeg to convert your incoming footage into uncompressed audio files, and into Avid’s native video format. Note that the video is not encapsulated beyond the raw DNxHD format: but this format contains almost enough information about the file to enable import to take place. Frame rate, for example, seems to be missing.

So, convert the incoming video into DNxHD and uncompressed audio with FFmpeg like this:

ffmpeg -i "bach.flv" -vcodec dnxhd -b:v 145M -an -sws_flags lanczos -vf "scale=1920:1080, smartblur=1.0:-1.0" bach-video.dnxhd -vn -ar 48000 -acodec pcm_s16le bach-audio.wav

I have scaled the video to the correct size using what I consider to be the best scaling algorithm (Lanczos), and have added a little crispness to avoid too much softening. Obviously, you will not want to do this to footage that is already the correct dimensions and does not need restoration.

Now, we must prepare these files for Avid, in the same way that Avid itself imports files. They must be encapsulated as Avid-flavour MXFs (Op-Atom). Here, the BBC and EBU-supported raw2bmx utility, from bmxlib, comes into play. Again, this is open source software, and this is a very simple command line. Much more metadata can be included, and you’ll need to think about this if you’re going to reconform the project at any stage.

On this command line, I instruct raw2bmx to wrap both the video file and the stereo audio file into MXF. The project name is given, as is a tape name. The output file location together with the file prefix is given.

You will also need to specify the frame rate, using the ‘-f’ option, if your footage is not 25fps. The rates acceptable are: 23976, 24, 25, 2997, 30, 50, 5994 and 60. The incoming DNxHD is specified by “--vc3_1080p_1237”, naming the codec, picture size and flavour. All such flavours are listed in the help for raw2bmx.

raw2bmx -t avid -f 25 --project BACH --clip "BACH001" -o "I:\Avid MediaFiles\MXF\1\BACH001" --vc3_1080p_1237 bach-video.dnxhd --wave bach-audio.wav

In your Avid MediaFiles directory, a number of MXF files will appear: Avid’s Media Tool will pick these up as clips with combined video and audio (if that’s what you’re converting), and you can drag the clips to whichever bin you wish. Note that the raw2bmx tool is terse in its progress reporting: it prints nothing until the end of the wrapping process.
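If you have many clips to wrap, the two commands can be assembled by a small script. The following Python sketch only builds the argument lists shown above; the function names and the “stem” naming scheme are hypothetical, and the DNxHD flavour is hard-wired to the 1080p, 145Mbit/s example in this post.

```python
import shlex

# Frame-rate values listed above for raw2bmx's -f option.
RAW2BMX_RATES = {"23976", "24", "25", "2997", "30", "50", "5994", "60"}


def ffmpeg_args(source: str, stem: str) -> list:
    """Arguments for the FFmpeg step: raw DNxHD video plus a WAV file."""
    return [
        "ffmpeg", "-i", source,
        "-vcodec", "dnxhd", "-b:v", "145M", "-an",
        "-sws_flags", "lanczos",
        "-vf", "scale=1920:1080, smartblur=1.0:-1.0",
        f"{stem}-video.dnxhd",
        "-vn", "-ar", "48000", "-acodec", "pcm_s16le",
        f"{stem}-audio.wav",
    ]


def raw2bmx_args(stem: str, project: str, clip: str,
                 out_prefix: str, rate: str = "25") -> list:
    """Arguments for the raw2bmx step that wraps both files as Avid MXF."""
    if rate not in RAW2BMX_RATES:
        raise ValueError(f"unsupported frame rate: {rate}")
    return [
        "raw2bmx", "-t", "avid", "-f", rate,
        "--project", project, "--clip", clip,
        "-o", out_prefix,
        "--vc3_1080p_1237", f"{stem}-video.dnxhd",
        "--wave", f"{stem}-audio.wav",
    ]


if __name__ == "__main__":
    # Print the commands rather than running them.
    print(shlex.join(ffmpeg_args("bach.flv", "bach")))
    print(shlex.join(raw2bmx_args("bach", "BACH", "BACH001", "BACH001")))
```

Each list can be passed to subprocess.run(..., check=True) once you are happy with what it prints.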

Recent builds of FFmpeg can be downloaded here, and the bmxlib project is on Sourceforge here.

A Lot To Learn

Today, Thursday 21st August, the GCSE exam results come out. In my schooldays, we went through the same results procedures for our O-levels and CSEs, although coursework generally wasn’t assessed. This was the first time we’d ever experienced result nerves, as the staff rifled through sealed envelopes until the correct name was found.

It was considered normal at my large, good, comprehensive school to take somewhere between four and ten exams. Today, teenagers regularly sit many more than this, and marvellous alternative qualifications are available for young people whose examination skills don’t match their real-world virtuosity.

We had most of the benefits that modern times bring: safe food and water, the National Health Service, easy transport with much cheaper petrol, luxuries spread around more classes than in our parents’ time, and lots of entertainment on record and cassette tape.

But we didn’t have the Internet with the immense, often anonymous, social pressures it brings to young minds.

A sixteen-year-old today can debate directly over Twitter with, for example, Richard Dawkins, Buzz Aldrin or Lily Allen; but he or she is also subject to anonymous and permanent criticism or attack on any aspect of their life, real or imagined, from any corner of the globe. In our day, by contrast, almost every media outlet was heavily edited: we had newspapers, radio and TV, but zines and self-published information were much more scarce than they are today. Blogs or instant social networks, outside radio hams and CBers, were just a dream. Now, teenagers must think editorially from their earliest exposure to the Internet, or be misled.

For sixteen-year-olds today, it seems to me that there’s much more to learn, and to refute, than there was for us in 1980, thirty-four years ago.

Timecode overlay with FFmpeg

This post describes how to use FFmpeg, a free and open-source program, to burn filename and timecode automatically into any number of video files, and then save them in a form suitable for network viewing.

In the olden days, video rushes would be burned to DVD through a VTR or, latterly, timecode plugin on Avid so as to give everyone timecoded copies.

Today, the free and open-source FFmpeg program can complete this task in the background on almost any modern computer. In this office, it’s making burnt-in timecode visible on three machines: a Mac PowerBook G4 made in 2004 (PowerPC processor) now running Debian Jessie GNU/Linux, a Seagate GoFlex Home caddy with an ARM5-compatible processor running GNU/Linux, and a Windows PC.

Here is a command line for a Windows machine that downconverts all files of a certain extension in a directory, and burns timecode onto them, along with the filename. At the moment, the timecode just starts at zero for each clip: it is no trouble to write an extra routine to read any embedded timecode and use that instead. The command also slaps a simple autolevel on the soundtracks (because this is for off-line logging) and also adds a slow-acting video AGC to make shots palatable if they need severe grading. The font I use looks clear on screen: it is a free font, downloadable from a number of sources. You could use arial.ttf instead, because it is already on all Windows machines.

In case you’re not familiar with command lines, the backslashes are escape characters, that rob the following character of any special meaning. For example, a colon : has a special meaning to FFmpeg, but preceding it with a backslash, \:, causes the colon to be treated as an ordinary printable character.

This line is for 25fps material, reducing the footage to the size 512×288.

ffmpeg -i <VIDEOFILE> -y -acodec libfdk_aac -b:a 40k ^
-profile:a aac_he_v2 -vcodec libx264 -crf 22 ^
-vf "yadif, colormatrix=bt709:bt601, pp=al, scale=512:288, smartblur=1.0:-1.0, drawtext=fontfile='c\:\\windows\\fonts\\LiberationSans-Bold.ttf':text='<VIDEOFILE>\ \ \ \ \ \ ':x=120:y=h-lh-1:fontsize=16:fontcolor=white:shadowcolor=black:shadowx=1:shadowy=1:timecode='00\:00\:00\:00':timecode_rate=25" ^
-x264opts colorprim=bt470bg:fullrange=off:transfer=bt470bg:colormatrix=bt470bg ^
-af "compand=0.0|0.0:0.8|0.8:-90/-40|0/0:6:0:-30:0" ^


Here’s a step-by-step explanation of this command.

ffmpeg
FFmpeg command name
-i <VIDEOFILE>
Use your VIDEOFILE as input
-y
Overwrite existing files without question (FFmpeg’s -n flag does the opposite, and refuses to overwrite). I use this because the command is run from a FOR…DO loop in a batch file, and may be making revisions of many earlier files.
-acodec libfdk_aac -b:a 40k -profile:a aac_he_v2
Choose Fraunhofer’s AAC codec for audio, instruct the coder to use the HEAAC V2 flavour of the codec, and use 40kbit/s as the bitrate. AAC is the successor codec to MP3, used very widely in applications such as iTunes; HE means the “High Efficiency” version of the codec, which uses spectral band replication to code the upper frequencies of its input; and Version 2 adds a more efficient form of representing the difference between two stereo channels. The Fraunhofer codec is the best codec in the market, and its source code has been released primarily for use in Android development. However, its licence is not compatible with FFmpeg’s licence, and so it must be compiled into FFmpeg by hand, as I do.
-vcodec libx264 -crf 22
Use x264 as the video codec. This open source project is widely regarded as the most accurate H.264 codec. The coder is instructed to use a constant quality, represented by the “constant rate factor” or crf parameter. 22 is a fair trade-off between bandwidth and quality for our purposes, and is suitable for distribution over a LAN or good ADSL line.
-vf "yadif, colormatrix=bt709:bt601, pp=al, scale=512:288, smartblur=1.0:-1.0, drawtext=fontfile='c\:\\windows\\fonts\\LiberationSans-Bold.ttf':text='<VIDEOFILE>\ \ \ \ \ \ ':x=120:y=h-lh-1:fontsize=16:fontcolor=white:shadowcolor=black:shadowx=1:shadowy=1:timecode='00\:00\:00\:00':timecode_rate=25"

This is a video filter chain. It does several jobs. In order, they are:

  1. De-interlace the video. My incoming source is interlaced, but the eventual film, and web, destinations demand progressive-scan video
  2. Change the colour matrix from the HD standard to the SD (and below) standard. Many external sources will confirm how YUV representations of real-world colour pictures are calculated, and how the international standards differ in this representation between high-definition and standard-definition pictures.
  3. Add the auto-level filter from the post-production-processing library (libpostproc). This is a simple, slow-acting automatic gain control for the video. It goes before the scaling process in case of overshoots caused by the change in size or any sharpening.
  4. Scaling then takes place: a raster size of 512 x 288 gives sufficient detail for logging, but does not eat up too much bandwidth after coding
  5. The smartblur filter is a semi-intelligent edge-detection algorithm that, in this case, is asked to work on neighbouring pixels, and sharpen them. The negative number literally means “the opposite of blur”
  6. The drawtext filter writes on the video. This command (with appropriate escape characters):
    1. chooses a font by pointing to its filename;
    2. colours it white;
    3. positions the line of text appropriately (“lh” means “line height”);
    4. adds a black shadow for clarity;
    5. includes the filename;
    6. tells the timecode counter where to start (timecode display is implicit when a start frame is given);
    7. instructs the counter to count 25 frames per second.
  7. The filter chain ends here. Its output is now fed to the x264 coder.
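To make the counting concrete, here is a small Python sketch of how a frame number maps to the displayed timecode. It is not how drawtext is implemented internally, only an illustration of the arithmetic, and it assumes non-drop-frame counting at a whole number of frames per second (25 here); the function name is hypothetical.

```python
def frames_to_timecode(frame: int, fps: int = 25) -> str:
    """Convert a frame count to HH:MM:SS:FF, non-drop-frame."""
    if frame < 0:
        raise ValueError("frame count must not be negative")
    seconds, ff = divmod(frame, fps)   # leftover frames within the second
    minutes, ss = divmod(seconds, 60)  # leftover seconds within the minute
    hh, mm = divmod(minutes, 60)       # leftover minutes within the hour
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


if __name__ == "__main__":
    for n in (0, 24, 25, 1500):
        print(n, frames_to_timecode(n))
```

Starting from 00:00:00:00, the 25th frame after the first rolls the seconds counter over to 00:00:01:00.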

-x264opts colorprim=bt470bg:fullrange=off:transfer=bt470bg:colormatrix=bt470bg

These are private options for the x264 coder.

  • The three options specifying “bt470bg” instruct the coder exactly how to interpret, and tell the decoder how to interpret, the conversion from YUV to RGB for display. In this case, I have chosen “ITU Recommendation BT.470 systems B and G”, the standard colour encoding for PAL and SECAM standard definition video. I specify this exactly because some displays assume colour matrices wrongly if they are not explicitly given instructions. Some others get it wrong anyway, but we must try.
  • The fullrange instruction tells the x264 coder that the incoming video is at studio levels (16 <= Y <= 235), and the coder includes this information in its instructions to the decoder and display. Again, it ought to be assumed that YUV-encoded video is already at studio levels, but sometimes decoders get this wrong.
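For readers unfamiliar with studio levels, this short Python sketch shows the relationship between full-range and studio-range 8-bit luma values. It illustrates the numbers only and is not something the x264 coder does for you: the flag merely signals which range is in use. The function names are hypothetical, and only luma is covered (8-bit chroma uses 16 to 240).

```python
def full_to_studio_luma(y_full: int) -> int:
    """Map an 8-bit full-range luma value (0..255) into 16..235."""
    if not 0 <= y_full <= 255:
        raise ValueError("expected an 8-bit value")
    # 219 is the width of the studio luma range (235 - 16).
    return round(16 + y_full * 219 / 255)


def is_studio_legal(y: int) -> bool:
    """True if an 8-bit luma value lies within studio levels."""
    return 16 <= y <= 235


if __name__ == "__main__":
    print(full_to_studio_luma(0), full_to_studio_luma(255))
    print(is_studio_legal(240))
```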
-af "compand=0.0|0.0:0.8|0.8:-90/-40|0/0:6:0:-30:0"
An audio filter is described here. compand is a general audio level alteration process.
0.0|0.0
These two zeroes describe the channel attack times for the level-detection algorithm (0.0 seconds, instant)
0.8|0.8
These are the decay time constants for the stereo channels (0.8s)
-90/-40|0/0
A simple curve for gain amplification is described: raise sounds at -90dB up to -40dB, then alter levels proportionally until 0dB (full scale) remains at 0dB
6
This defines the softness of the knee down at -40dB: the curve is 6dB wide
0
This zero instructs the filter to apply 0dB gain make-up
-30
-30dB is the initial volume that the filter should assume, to avoid very loud audio at the start of each file
0
This final zero instructs the filter to act without delaying the side-chain, thus disabling its look-ahead function. This has been included for simplicity’s sake: I didn’t have much time to tweak this figure.
The output filename: same as the input filename but with the extension .mp4
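The gain curve given to compand can be pictured as a straight line between its two points. The Python sketch below is a simplification for illustration only: it ignores the 6dB soft knee and the attack and decay times, and it simply clamps inputs outside the -90dB to 0dB range, which is an assumption and not necessarily what the real filter does. The function name is hypothetical.

```python
def compand_curve_db(level_db: float) -> float:
    """Output level in dB for the points -90/-40|0/0, without the soft knee."""
    in_lo, out_lo = -90.0, -40.0  # first point: -90dB in gives -40dB out
    in_hi, out_hi = 0.0, 0.0      # second point: full scale stays at full scale
    # Assumption for this sketch: clamp inputs to the described range.
    level_db = max(in_lo, min(in_hi, level_db))
    slope = (out_hi - out_lo) / (in_hi - in_lo)
    return out_lo + (level_db - in_lo) * slope


if __name__ == "__main__":
    for level in (-90, -60, -30, 0):
        print(level, round(compand_curve_db(level), 2))
```

Quiet material is lifted the most, and the lift shrinks steadily to nothing at full scale, which is the behaviour wanted for off-line logging copies.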

The files this command line produces vary in bit-rate between 2Mbit/s and 500kbit/s, suitable for low quality LAN or Wi-fi use. A further shell batch command (this time run on my little ARM5 Seagate GoFlex Home) further downconverts them to around 150kbit/s, suitable for ADSL streaming over my slow line.
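To see what those bit-rates mean for storage, here is a small Python calculation. It is a sketch that treats each figure as a constant average rate and uses decimal megabytes; the function name is hypothetical.

```python
def megabytes_per_hour(bits_per_second: float) -> float:
    """Approximate file size in (decimal) megabytes for one hour of material."""
    return bits_per_second * 3600 / 8 / 1_000_000


if __name__ == "__main__":
    for label, rate in (("2Mbit/s", 2_000_000),
                        ("500kbit/s", 500_000),
                        ("150kbit/s", 150_000)):
        print(label, megabytes_per_hour(rate), "MB per hour")
```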

This may seem a complex command, but it does a lot of time-saving work in a single pass.
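For anyone who prefers a script to a batch-file loop, the file-selection part might be sketched in Python as below. This is not the batch file used here: the function name is hypothetical, and it only pairs each input with an output name of the same stem and the extension .mp4, leaving the FFmpeg call itself to you.

```python
from pathlib import PurePath


def plan_conversions(filenames, extension):
    """Pair each file that has the given extension with an .mp4 output name."""
    plan = []
    for name in sorted(filenames):
        path = PurePath(name)
        # Compare extensions case-insensitively (.MOV and .mov both match).
        if path.suffix.lower() == extension.lower():
            plan.append((name, str(path.with_suffix(".mp4"))))
    return plan


if __name__ == "__main__":
    import os

    for source, output in plan_conversions(os.listdir("."), ".mov"):
        print(f"{source} -> {output}")
```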