Manipulate Ogg Theora Files

If you are a Linux user, you can use the Ogg Video Tools (http://dev.streamnik.de/oggvideotools.html) to manipulate Ogg/Theora video files.

Ogg Video Tools is a toolbox for manipulating Ogg video files, which usually consist of a video stream (Theora) and an audio stream (Vorbis). Included in Ogg Video Tools are a number of handy command line tools for manipulating these video files. 

The following tools are available:

  • oggSplit
oggSplit separates the media streams of an Ogg file. This operation is often called demultiplexing.
  • oggJoin
oggJoin rejoins separated Ogg media streams. This operation is often called multiplexing.
  • oggDump
oggDump is intended mainly for developers and prints information about how the streams are packaged in an Ogg file.
  • oggCut
oggCut cuts out the part of an Ogg file between a given start and end time.
  • oggCat
oggCat concatenates Ogg video files into one new Ogg file that can be played by any player that supports Ogg/Theora/Vorbis (vlc, mplayer, cortado etc.).
  • oggLength
oggLength returns the length of an Ogg file.
  • oggScroll
oggScroll shows each video frame in a separate X window.

What are Ogg, Theora and Vorbis?

Ogg is a container format, like AVI, that defines the overall structure of a multimedia file.

Theora is a patent-free video compression format.

Vorbis is a patent-free audio compression format.

An Ogg video file normally consists of at least one video stream and one audio stream. Sometimes there is more than one audio stream, e.g. for different languages, and there may be additional streams, e.g. for subtitles.

Using Ogg Video Tools

The following shows some basic examples of how to use the tools.

oggSplit

Demultiplexing an Ogg video file is quite easy. You would use something like this:

oggSplit myfile.ogg

For example, if your file is called 'myfile.ogg' and you run this command, the files in your directory will look similar to this once oggSplit has finished:

myfile.ogg

theora_6f1634f6.ogg

vorbis_41bf6b07.ogg 

The number after theora_ and vorbis_ is the stream ID, which the Ogg container uses internally. Every stream in an Ogg file needs such an ID. It is set when the stream is created and should be expected to be a "random" number. However, sometimes (e.g. when ffmpeg creates Ogg files) the streams are numbered as an ascending series.

The new files are fully functional ogg files and can be played with vlc, mplayer etc. The 'theora_*' file contains only the video part of the file and the 'vorbis_*' file contains the audio part. 

In some cases, an Ogg file contains streams that cannot be interpreted. These streams are also extracted and written to files named unkown_<ID>.ogg.
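When a script has to handle the extracted files, it can help to take the stream type and the stream ID apart again. The following is a minimal sketch that relies only on the naming scheme shown above (type, underscore, ID, ".ogg"). The function name stream_info is hypothetical and not part of the Ogg Video Tools.

```shell
# Hypothetical helper: split a name such as "theora_6f1634f6.ogg" into
# its stream type and stream ID, based only on the naming scheme above.
stream_info() {
    name=$(basename "$1" .ogg)   # strip directory and suffix, e.g. theora_6f1634f6
    type=${name%%_*}             # text before the first underscore
    id=${name#*_}                # text after the first underscore
    echo "$type $id"
}
```

Called as stream_info theora_6f1634f6.ogg, the helper prints the type and the ID separated by a space.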

oggJoin

Multiplexing an Ogg file is as easy as demultiplexing. Referring to the example above, you can write:

oggJoin myNEWfile.ogg theora_6f1634f6.ogg vorbis_41bf6b07.ogg 

This command will create the file "myNEWfile.ogg" which consists of the theora_* stream and the vorbis_* stream.

As oggJoin uses its own timestamp creation method, both streams start exactly at time '0'. This is always the case, even if the original streams started at a different time (due to internal timing information). So the video and audio streams are always synchronized.

For stream types other than Theora or Vorbis, there is currently (as of version 0.4) no timing interpreter available, so you cannot use these streams for multiplexing.
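If the split files sit in their own directory, the oggJoin command line can be assembled from the file names instead of typing the stream IDs by hand. This is only a sketch: the function join_split_streams and the RUN variable are hypothetical, it assumes exactly one theora_* and one vorbis_* file in the directory, and it uses oggJoin only in the form shown above.

```shell
# Hypothetical helper: join the theora_* and vorbis_* files found in a
# directory into one output file, using "oggJoin <output> <streams...>".
# Set RUN=echo to print the command instead of executing it.
join_split_streams() {
    out=$1
    dir=${2:-.}                                   # default: current directory
    # Collect the split files; unknown stream types are left out on
    # purpose, since they cannot be multiplexed.
    set -- "$dir"/theora_*.ogg "$dir"/vorbis_*.ogg
    ${RUN:-} oggJoin "$out" "$@"
}
```

Running RUN=echo join_split_streams myNEWfile.ogg . first shows the command that would be executed, which is a cheap way to check the file list before any file is written.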

oggDump

This tool is mainly meant for developers who wish to analyse Ogg video files. To use it, you need to know a bit more about the Ogg container format and the streams within the container.

In short: Ogg files physically consist of Ogg pages, which should have a defined length (e.g. 4096 bytes). Each page consists of a header with framing information and a body with the data, and belongs to exactly one stream. Every Ogg page carries a timestamp, which should increase from page to page within a file.

From the stream point of view, every video or audio stream consists of successive packets (e.g. a frame or a block of audio samples). These packets are placed into the physical pages.
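To make the page structure a little more concrete, the sketch below looks at the first page header of a file without using oggDump. It relies on two facts from the Ogg specification: every page starts with the four bytes "OggS", and the serial number of the stream the page belongs to is stored at byte offset 14 as a 32-bit little-endian value. The function names is_ogg and page_serial are hypothetical, and whether oggSplit renders the ID in its file names in exactly this form is an assumption.

```shell
# Hypothetical helper: succeed if the file starts with the Ogg capture
# pattern "OggS", i.e. if it begins with an Ogg page.
is_ogg() {
    [ "$(head -c 4 "$1")" = "OggS" ]
}

# Hypothetical helper: print the stream serial number of the first page
# as hexadecimal. The four bytes at offset 14 are little-endian, so they
# are printed in reverse order.
page_serial() {
    set -- $(od -An -tx1 -j14 -N4 "$1")   # four bytes as two-digit hex words
    printf '%s%s%s%s\n' "$4" "$3" "$2" "$1"
}
```

A quick check such as is_ogg myFile.ogg && page_serial myFile.ogg shows which stream the very first page belongs to. For everything beyond that, oggDump is the better tool.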

To print out the pages of an Ogg file, use the "-g" command line option:

oggDump -g myFile.ogg

To print out the packets of an Ogg file, use the "-p" command line option:

oggDump -p myFile.ogg

If you don't want to dump all the information about a file, for example because you are only interested in the Ogg page headers, you can select a different dump level with the "-l" command line option:

oggDump -g -l1 myFile.ogg

In this case, only the header information is printed. To increase the level of detail, just increase the number after the -l option. "-l 5" is a full dump with all available information.

oggCut

oggCut extracts parts of an ogg file. The usage is quite easy:

oggCut -i inputFile.ogg -o outputFile.ogg -s 2000 -e 60000 

This command creates a new Ogg file named 'outputFile.ogg', which contains a part of the original 'inputFile.ogg'. The new file starts at millisecond 2000 (2 seconds) of the original file and ends at millisecond 60000 (60 seconds).
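Since the -s and -e options take milliseconds, positions read off a player in hours, minutes and seconds have to be converted first. The following sketch shows the arithmetic. The function name to_ms is hypothetical and not part of the Ogg Video Tools.

```shell
# Hypothetical helper: convert a position given as HH:MM:SS into
# milliseconds, the unit used by the -s and -e options of oggCut.
to_ms() {
    printf '%s\n' "$1" | awk -F: '{
        ms = (($1 * 60 + $2) * 60 + $3) * 1000   # hours -> minutes -> seconds -> ms
        printf "%d\n", ms
    }'
}
```

With this helper, the start and end positions of the example above could be written as -s "$(to_ms 00:00:02)" and -e "$(to_ms 00:01:00)".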

As a video stream consists of I-frames (which are full pictures) and P-frames (which are delta pictures relative to the preceding I-frame), the oggCut algorithm searches for the first I-frame. If a video file started with a P-frame, the player would not be able to interpret this picture, as the I-frame it is based on would not be available.

oggCut starts the I-frame search at the start time given by the '-s' option, so expect the new file to be somewhat shorter than the requested time span.

If you really want to cut a film at a particular frame position, all the pictures at least up to the first I-frame must be recalculated. In that case using a movie cutter like kino would be a better choice.
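The effect of the I-frame search can be illustrated with a toy model. The sketch below is not oggCut's implementation: it simply treats a stream as a list of frame types and prints the index of the first I-frame at or after a requested start frame. The function name first_keyframe is hypothetical.

```shell
# Hypothetical helper: print the index of the first I-frame at or after
# START. Frame types are passed as separate arguments ("I" or "P") and
# indices start at 0.
first_keyframe() {
    start=$1
    shift
    idx=0
    for frame in "$@"; do
        if [ "$idx" -ge "$start" ] && [ "$frame" = "I" ]; then
            echo "$idx"
            return 0
        fi
        idx=$((idx + 1))
    done
    return 1   # no I-frame at or after the start position
}
```

For the sequence I P P P I P and a requested start at frame 2, the first usable frame is frame 4. The two frames in between are dropped, which is why the resulting file is shorter than requested.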

oggCat

Sometimes it would be nice to concatenate (join) two or more video files. For that you can use oggCat, which creates a continuous Ogg video file from the given files.

oggCat newFile.ogg firstFile.ogg secondFile.ogg ... 

However, the video files must match in frame rate, keyframe gap, frame size etc.

The first file is always taken as the reference file. Its parameters are compared with those of each following file. If a file does not match, it is left out of the concatenation and the next file is tested against the reference parameters.

For example, if the frame size does not match, the following information is printed:

theora parameter compare: height or width are not matching:360:288 != 640:480
I could not find enough matching streams for file <secondFile.ogg>

The frame positions of both video and audio are completely recalculated for the new file, so that there are no timestamp problems (e.g. with players like cortado).

oggLength

Although the Xiph.Org Foundation (the developers of Ogg) has recently created an additional header for Ogg media files that carries extra information (e.g. the duration of the file), this additional data is not widely used. Therefore oggLength does not rely on it. Instead, it calculates the duration of an Ogg media file by stream analysis and prints the calculated value:

oggLength analysisFile.ogg
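For reading the result more comfortably, a duration can be converted into hours, minutes and seconds. The sketch below assumes that the value is a plain number of milliseconds. Check the output of your oggLength version first, as that format is an assumption here. The function name ms_to_hms is hypothetical.

```shell
# Hypothetical helper: format a duration given in milliseconds as
# HH:MM:SS. Fractions of a second are dropped.
ms_to_hms() {
    total=$(( $1 / 1000 ))   # whole seconds
    printf '%02d:%02d:%02d\n' \
        $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}
```

Under the assumption above, ms_to_hms "$(oggLength analysisFile.ogg)" would print the duration of the file in a readable form.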

oggScroll

oggScroll displays every video frame of a video stream within an ogg media file and prints out the frame position of the frame that is shown.

oggScroll showFile.ogg 

To show the next frame, just hit any key (e.g. space). To jump to the next keyframe, hit the "+" key. To exit oggScroll, hit "q".

If the focus is accidentally placed on the video window, switch the focus back to the console oggScroll is running in (otherwise the keypress is lost).

Getting more information

If you are interested in news about the Ogg Video Tools, or if you would like to help develop new tools or improve the existing ones, join the streamnik mailing list (http://lists.streamnik.de/mailman/listinfo/streamnik-server-dev) or visit the streamnik web page (http://dev.streamnik.de/).