Category Archives: Computing

Metadata for Culture and Heritage

As part of my efforts with the ICOMOS-UK digital committee, I’ve started to collect metadata specifications relevant to heritage and culture.

My aim is to produce a superset with copious documentation and guides to subsets, so that all data is interchangeable. After all, if the Digital Production Partnership can join European and North American delivery standards for television in this way, isn’t anything possible?

Work begins at the link below. Suggestions are most welcome.

Audiovisual Archive Metadata & Preservation

Showing Stream Structure with FFmpeg

Ever wondered what the frame structure of your H.264 or other motion-predicted video stream looks like? With FFmpeg and a Unix-like command line (e.g. Linux, BSD, Cygwin), you can find out.

ffprobe -show_frames -select_streams v:0 YOURFILE.EXT | grep pict_type | sed 's/pict_type=//' | tr -d '\n'

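The output is a single unbroken line of picture-type letters, one per frame: I for intra-coded frames, P for predicted frames and B for bidirectionally predicted frames.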
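The text-processing half of that pipeline can be tried without a video file. The following is a minimal sketch rather than part of the original command: frame_types and gop_lengths are hypothetical helper names, and the sample input is made up to mimic the pict_type= lines that ffprobe -show_frames prints. The GOP lengths it reports are only a rough guide, because they are counted in the order in which ffprobe reports the frames.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of FFmpeg.

# Read "ffprobe -show_frames" output on stdin and print the picture
# types as one unbroken line.
frame_types() {
    grep '^pict_type=' | sed 's/^pict_type=//' | tr -d '\n'
}

# Read a picture-type string on stdin and print the length of each run
# of frames that starts at an I-frame.
gop_lengths() {
    awk '{
        n = split($0, runs, "I")
        # runs[1] holds any frames before the first I-frame; skip it.
        for (i = 2; i <= n; i++) print length(runs[i]) + 1
    }'
}

# Made-up sample: eleven frames, with I-frames at positions 1 and 8.
sample_types=$(printf 'pict_type=%s\n' I B B P B B P I B B P | frame_types)
echo "$sample_types"                 # IBBPBBPIBBP
echo "$sample_types" | gop_lengths   # 7, then 4

# With a real file (requires ffprobe):
# ffprobe -show_frames -select_streams v:0 YOURFILE.EXT | frame_types | gop_lengths
```

Run against a real file, a long list of identical numbers suggests a fixed GOP length, while varying numbers suggest that the encoder placed I-frames adaptively.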


Open Source MPEG2 Video Encoding for DVD

An open-source codec project, x262, is attempting to bring the best of x264’s coding techniques to the encoding of MPEG2 video. x264 itself is a very popular, world-class open-source H.264 encoder.

The open-source multimedia utility FFmpeg has its own MPEG2 video encoder, but its quality falls far behind that of x264. However, whilst it is possible to incorporate x262 within FFmpeg, one then loses the ability to compile-in the most up-to-date x264 encoder.

This is because the x262 project generates a library and a command-line utility that replace x264 and extend its capabilities. Unfortunately, x262 isn’t tracking the latest x264 encoder right now, so it is best compiled as a separate utility.

In my own compilation of FFmpeg and associated utilities, I have kept x262 and x264 separate. This blog post therefore shows how to encode for DVD using x262 and FFmpeg together, fully retaining FFmpeg’s x264 capabilities, and encoding MPEG2 video at greater quality than FFmpeg’s native MPEG2 encoder.

The command line, explained piece by piece below and given in full at the end of this post, is set up for a 25fps production converting the video from a 24fps cinema file. This is for video only. You’ll use a separate command line for audio, then combine the two files, again using FFmpeg.

To explain:

-r 25
Placed before the input file (-i 24FPS-FILM-FILE), this tells FFmpeg to interpret the file as 25fps, so we get the 24->25 speed-up necessary when showing a cinema film on European television.
-vf scale=720:576:lanczos,smartblur=1.0:-0.4,colormatrix=bt709:bt601,setdar=16/9,setsar=64/45
This is a video filter, and it does quite a lot. First, we scale the film to PAL DVD spec, that is 720 x 576 pixels, and we use the lanczos algorithm for best quality. Then, using FFmpeg’s smartblur algorithm, we add a little inverse blur, to sharpen the image without increasing noise too much. Next, the colour matrix is converted, because we’re coming out of Rec.709 colourspace (standard for HD television), and going into Rec.601 colourspace, for standard definition television. Finally come two filters that signal the display aspect ratio, and the sample aspect ratio. These are the standard widescreen aspect ratios for the screen and each pixel in 576i television.
-an
This ensures that there is no attempt to process audio.
-pix_fmt yuv420p -f yuv4mpegpipe - |
The video’s pixel format is changed to yuv420p if it was not in this format to begin with, and we pipe the video out using the yuv4mpeg format, which carries a simple header to instruct the program at the end of the pipe to interpret the video correctly.
x262 --fps 25 --demuxer y4m --mpeg2
These set up the x262 encoder to interpret the pipe’s input correctly using the y4m format, and insist that the frame-rate is 25fps. Then, we instruct x262 to behave as an MPEG2 encoder. This is necessary because the x262 binary also contains an H.264 encoder.
--preset placebo
This preset sets up some of x262’s slowest, and most careful, encoding options. It slows the encoder down to around 7fps on an ancient laptop (the test machine here), but this is not a particular problem for my purpose.
--open-gop
This permits each group of pictures (GOP) to contain B-frames that refer to P-frames outside their own GOP, which slightly increases the efficiency of encoding.
--tune film
This sets some tunings that result in less apparent distortion to the picture when encoding from a film or film-like source.
--keyint 12
The DVD specification for PAL discs requires that each group of pictures be 15 frames long or less, and this parameter ensures that this is the case. MPEG2 itself allows much longer GOPs, but discs encoded with GOPs of more than 12 frames might not play on all players, hence the more conservative value of 12 used here.
--fake-interlaced
Interlaced encoding is less efficient than progressive encoding. If your incoming source is progressive, this option instructs the encoder to encode it as progressive footage, but still signal that it is interlaced in order to stay within what most DVD players expect.
--vbv-maxrate 8800 --vbv-bufsize 1835 --crf 1
Here is our bit-rate control. DVDs must never exceed 9,800kbit/s, so we allow up to 1,000kbit/s for audio and other overheads, leaving 8,800kbit/s for video. Then we allow buffering up to 1,835kbit, which is the DVD player specification; and finally, with --crf 1, we instruct the encoder to use the highest-quality variable bit-rate within these parameters.
--range tv --colorprim bt470bg --transfer bt470bg --colormatrix bt470bg
These define, and signal in the encoder’s output bitstream, the colour parameters of the video. In particular, these state that the signal’s range is that used for television (16-235 for luminance (Y), 16-240 for Cr and Cb in 8-bit systems), and that the colour descriptions fit the Rec.601 standard, which is an update of BT.470BG.
--sar 16:9
SAR means “Sample Aspect Ratio”: in other words, the pixel’s aspect ratio. In television, this should really be called “Display Aspect Ratio”, because it describes the playback display aspect ratio, not the pixel. But it is so named in x262.
-o FILENAME.vob
Here, the output filename is given. We give it a VOB extension, because it is a Video OBject, as required by the DVD specification. Of course, at this stage, it contains video only, until we add an MPEG stream containing sound in the next step, which is another blog post.
-
Do not forget this dash! It instructs x262 to receive its input from a pipe and not a file.
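
Several of the numbers in the options above follow from simple arithmetic, which can be checked in the shell. This is only a sketch: the variable names are illustrative, and the 120-minute running time is an invented example rather than a figure from any real film.

```shell
#!/bin/sh
# Display size implied by a 720x576 frame whose samples are 64:45.
width=720
height=576
sar_num=64
sar_den=45
display_width=$(( width * sar_num / sar_den ))
echo "display size: ${display_width}x${height}"    # 1024x576

# Cross-multiply to confirm that 1024x576 is a 16:9 picture.
if [ $(( display_width * 9 )) -eq $(( height * 16 )) ]; then
    echo "display aspect ratio is 16:9"
fi

# Video ceiling: the DVD limit minus the allowance for audio and overheads.
dvd_max_kbit=9800
audio_overhead_kbit=1000
echo "vbv-maxrate: $(( dvd_max_kbit - audio_overhead_kbit ))"    # 8800

# 24->25 speed-up: the running time shrinks by a factor of 24/25.
film_minutes=120    # invented example
awk -v m="$film_minutes" \
    'BEGIN { printf "running time at 25fps: %.1f minutes\n", m * 24 / 25 }'    # 115.2
```

The first check shows why setsar=64/45 and setdar=16/9 agree with each other for a 720 x 576 frame.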

Here is the complete command line just described. The output is an MPEG2 encoded video file, noticeably better in quality than a file of the same bandwidth produced by FFmpeg’s native MPEG2 encoder.

ffmpeg -r 25 -i 24FPS-FILM-FILE -vf scale=720:576:lanczos,smartblur=1.0:-0.4,colormatrix=bt709:bt601,setdar=16/9,setsar=64/45 -an -pix_fmt yuv420p -f yuv4mpegpipe - | x262 --fps 25 --demuxer y4m --mpeg2 --preset placebo --open-gop --tune film --keyint 12 --fake-interlaced --vbv-maxrate 8800 --vbv-bufsize 1835 --range tv --colorprim bt470bg --transfer bt470bg --colormatrix bt470bg --sar 16:9 --crf 1 -o FILENAME.vob -
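
For repeated use, the same pipeline may be easier to maintain as a small script. The following is a sketch only: the function name encode_dvd_video and the DRY_RUN variable are hypothetical, and the options are the same as those in the command line above, merely split across lines.

```shell
#!/bin/sh
# Hypothetical wrapper around the command line above. encode_dvd_video and
# DRY_RUN are illustrative names, not part of FFmpeg or x262.

# x262 options, in the same order as the one-line command.
x262_opts='--fps 25 --demuxer y4m --mpeg2 --preset placebo --open-gop
    --tune film --keyint 12 --fake-interlaced
    --vbv-maxrate 8800 --vbv-bufsize 1835
    --range tv --colorprim bt470bg --transfer bt470bg --colormatrix bt470bg
    --sar 16:9 --crf 1'

# FFmpeg filter chain: scale, sharpen, convert colour matrix, flag aspect ratios.
filters='scale=720:576:lanczos,smartblur=1.0:-0.4,colormatrix=bt709:bt601,setdar=16/9,setsar=64/45'

# Usage: encode_dvd_video INPUT OUTPUT.vob
encode_dvd_video() {
    input=$1
    output=$2

    if [ -n "${DRY_RUN:-}" ]; then
        # Print the x262 half of the pipeline instead of encoding.
        # $x262_opts is deliberately unquoted so that it splits into words.
        echo x262 $x262_opts -o "$output" -
        return 0
    fi

    ffmpeg -r 25 -i "$input" -vf "$filters" -an \
        -pix_fmt yuv420p -f yuv4mpegpipe - |
        x262 $x262_opts -o "$output" -
}

# Example (requires ffmpeg and x262):
# encode_dvd_video 24FPS-FILM-FILE FILENAME.vob
```

Setting DRY_RUN=1 before calling the function prints the x262 command instead of running it, which is a cheap way to check the options before committing to a slow placebo encode.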