Captioning for Quicktime
Using SMIL to Add Captions to Quicktime Movies
Overview of SMIL
SMIL (Synchronized Multimedia Integration Language) is very much like HTML. It is actually a form of XML (Extensible Markup Language). SMIL is used to control the way multimedia content is displayed on the web. You can read about the SMIL standard at http://www.w3.org/AudioVideo/.
Note
SMIL is not supported in Quicktime versions before Quicktime 4.1. If you are targeting audiences that may not have Quicktime 4.1+, use Quicktime Pro to create a self-contained, captioned movie.
SMIL provides support for multiple languages for audio and text tracks and is much easier to implement if you are streaming Quicktime video from a Quicktime server. SMIL is also used by RealPlayer for media control, though the way in which it is implemented is slightly different.
SMIL Basics
<smil>
<head>
<layout>
[layout stuff goes here]
</layout>
</head>
<body>
[body stuff goes here]
</body>
</smil>
This is the basic layout for SMIL files. A few things to be aware of:
- SMIL files are just text files. You can create them using any text editor.
- The <smil> tag must come first in the file.
- All SMIL tags are case sensitive: they must be lower case.
- All SMIL tags must be closed, as in <layout>layout stuff</layout>. If a tag doesn't have an associated closing tag, it must be formatted like this: <meta name="title" content="Wizard of Oz Example" />. Notice the / before the closing bracket.
- The layout section is used to define areas within the presentation. We will be defining an area for the video and an area for the captions.
- The body section tells Quicktime where to get individual media elements.
- You can include HTML comments: <!-- comment -->.
Although it's easy to create your own SMIL file using a text editor, it is usually easiest to start with a template and make changes to accommodate your video and caption files. Download a template (see the following sample SMIL file code and descriptions of that code). Be sure to save the SMIL file with the .smil extension. The .smi extension is also allowed, but it may conflict with Microsoft's SAMI format, which uses the same extension. Because SMIL files typically open with Quicktime or RealPlayer, you may need to open your text editor first, then browse to your .smil file and open it from there.
A sample Quicktime SMIL file for captions
<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">
<head>
<meta name="title" content="Wizard of Oz Example"/>
<layout>
<root-layout background-color="white" width="320" height="290"/>
<region id="videoregion" top="0" left="0" width="320" height="240"/>
<region id="textregion" top="240" left="0" width="320" height="50"/>
</layout>
</head>
<body>
<par>
<!-- VIDEO -->
<video src="wizard_of_oz.mov" region="videoregion"/>
<!-- CAPTIONS -->
<textstream src="wizard_of_ozCaptions.mov" region="textregion" system-language="en" system-captions="on" title="english captions" alt="english captions"/>
</par>
</body>
</smil>
Let's take a look at each part of the file:
<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true">
The smil tag tells Quicktime that this is a SMIL file. The xmlns:qt attribute informs the Quicktime player that you are using Quicktime's version of SMIL. The address at apple.com is not actually accessed, but is used to identify the type of SMIL you are using. The xmlns section is required, after which you can add what Quicktime calls 'extensions' to the <smil> tag. This file contains qt:autoplay="true", which tells the player to automatically start playing the movie when it is opened, and qt:time-slider="true", which displays the time slider control in the player.
Other Quicktime SMIL extensions are listed in Apple's SMIL Scripting Guide for QuickTime, under QuickTime SMIL Extensions. You can do neat things with SMIL extensions, such as play a continual looping program or jump from movie to movie. The smil tag must be closed at the end of the file.
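As a rough sketch of the "jump from movie to movie" idea, the opening tag below adds a qt:next extension, which as far as we know names the presentation to play once the current one finishes. The file name part2.smil is hypothetical, and the exact behavior should be checked against Apple's list of extensions before relying on it.
<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true" qt:time-slider="true" qt:next="part2.smil">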
<head>
Within the head section, you define the size of the presentation window as well as individual component areas (called regions) within the presentation. You can also provide information about your movie within <meta> tags. The head tag (as with all SMIL tags) must be closed.
<meta name="title" content="Wizard of Oz Example"/>
This code allows the developer to display information about the presentation. The name value can be author, title, or copyright. Notice the "/> at the end. This information is supposed to display within the Quicktime player, but in current versions, does not.
<layout>
This tag informs Quicktime that the layout of the presentation is about to be defined. The layout tag must be closed.
<root-layout>
This tag defines the properties of the entire presentation window. The background color can be in #rrggbb format or a color name. Height and width are in pixels.
<region>
You can have several regions within your presentation. Each must have a unique id attribute. The top attribute defines how many pixels from the top of the presentation window (as defined in root-layout) to begin a region. Left defines how many pixels from the left to begin the region. Height and width define the dimensions of the region. Regions may overlap, but must all be contained within the root-layout dimensions. You can also use percentage values if you choose.
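A minimal sketch of a layout that uses percentage values might look like the following. It assumes a root-layout height of 300 pixels (not the 290 used in the sample file) so that the percentages work out to whole pixels: 80% gives a 240 pixel video region and 20% gives a 60 pixel caption region.
<layout>
<root-layout background-color="white" width="320" height="300"/>
<region id="videoregion" top="0" left="0" width="100%" height="80%"/>
<region id="textregion" top="80%" left="0" width="100%" height="20%"/>
</layout>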
<body>
This part of the SMIL file specifies the content that will display in defined regions. The body tag must be closed.
<par>
This tag stands for parallel. Any items defined within the <par> tags will play at the same time. We want our video file and the captions to play at the same time, so we'll use <par>.
<video src="wizard_of_oz.mov" region="videoregion"/>
This defines where to get video content and which region (as defined by <region> tags) to display the content in. The region value must match the id value of one of the region tags. It is usually best to make sure the dimensions of the video you are loading are the same as (or proportionate to) the region it will be displayed in. Scaling or resizing Quicktime movies usually produces poor video quality.
<textstream src="wizard_of_ozCaptions.mov" region="textregion" ...
This line tells Quicktime to get a file named wizard_of_ozCaptions.mov and display it in the region named textregion. The region value must match the id value of one of the region tags. The other attributes tell Quicktime to turn on the caption feature (which it does anyway), identify the language of the captions, and provide a title and alternate text to display while the captions are loading.
Note
You do not have to convert your caption file to a Quicktime movie if you are using SMIL. You can point the textstream src directly to your caption text file.
<textstream src="wizard_of_ozCaptions.txt" region="textregion" ...
This method saves a lot of time in converting captions to Quicktime text tracks manually and Quicktime Pro is not required.
Example SMIL layout
In this SMIL example, we defined the root-layout to be 320 x 290 pixels. The first region was defined as 320 x 240 pixels, which would begin 0 pixels (top="0") from the top of the root-layout area and 0 pixels (left="0") from the left of the root-layout area, which basically means to start the region in the upper left corner of the root-layout.
The second region has dimensions of 320 x 50. It begins 240 pixels from the top and 0 pixels from the left. Because the first region is 240 pixels high, the second region begins exactly where the first ends.
You usually want to account for all the area in your presentation, without leaving empty areas. SMIL allows for very complex layouts and multimedia presentation.
Cool Things You Can Do with SMIL
Supported media types in Quicktime
Use the following tags for certain media types:
- <audio />: Specifies sound-only movies. Can be mp3, wav, aiff, mov, au, etc.
- <video />: Specifies movie files - mov, mpg, etc.
- <img />: Specifies still image files - jpg, gif, bmp, png, tiff, pict, qti, etc.
- <text />: Specifies text files with the .txt extension. The text will appear the same way it would if it were imported into Quicktime. You can add descriptors for additional formatting.
- <textstream />: Specifies Quicktime text track movies (.mov) or Quicktime text files (.txt).
- <animation />: Specifies animation files (flc or animated gif).
A complete list of Quicktime supported formats can be found at http://www.apple.com/quicktime/player/specs.html
Playing media files in sequence
Within the body, add the <seq> tag. It contains the media files you wish to play, in order:
<seq>
<audio src="audio1.mp3" />
<audio src="audio2.aiff" />
<audio src="audio3.wav" />
</seq>
The above code would play three sound files of different types in succession.
<seq>
<img src="image1.jpg" region="r1" dur="5 sec" />
<img src="image2.gif" region="r1" dur="7 sec" />
</seq>
The dur attribute defines how long each item should be displayed. This is useful with images or if you only want to play a portion of a video/audio file. Times can also be written as timecode: 00:01:12.50 would be 1 minute, 12 and a half seconds.
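For comparison, here is a sketch of the same kind of slideshow with the durations written as timecode. The file names and the r1 region are the illustrative ones used above; the first image would show for 5 seconds and the second for 1 minute, 12 and a half seconds.
<seq>
<img src="image1.jpg" region="r1" dur="00:00:05.00" />
<img src="image2.gif" region="r1" dur="00:01:12.50" />
</seq>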
Playing media files in parallel
Within the body tags, use the par tag.
<par>
<audio src="themesong.mp3" />
<img src="poster.jpg" region="r1" dur="30 sec" />
<text src="lyrics.txt" region="r2" dur="30 sec" />
</par>
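The r1 and r2 regions used in these snippets have to be defined in the layout section. A complete file for the parallel example could be sketched as follows; the region sizes and background color are assumptions chosen only for illustration.
<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:autoplay="true">
<head>
<layout>
<root-layout background-color="black" width="320" height="300"/>
<!-- r1 holds the poster image, r2 holds the lyrics below it -->
<region id="r1" top="0" left="0" width="320" height="240"/>
<region id="r2" top="240" left="0" width="320" height="60"/>
</layout>
</head>
<body>
<par>
<audio src="themesong.mp3" />
<img src="poster.jpg" region="r1" dur="30 sec" />
<text src="lyrics.txt" region="r2" dur="30 sec" />
</par>
</body>
</smil>
The audio needs no region because it has nothing to display.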
Creating alternative content based on user settings
<par>
<switch>
<textstream src="french.mov" region="caps" system-language="fr" />
<textstream src="german.mov" region="caps" system-language="de" />
<textstream src="english.mov" region="caps" system-language="en" />
</switch>
<switch>
<audio src="french.aif" system-language="fr" />
<audio src="german.aif" system-language="de" />
<audio src="english.aif" system-language="en" />
</switch>
</par>
Based on the end user's Quicktime settings, the user will hear audio and see captions in their language if it is specified in the preferences and matches one of the languages available in the system-language attribute. Users can change the language played by changing the language in their Quicktime preferences.
<switch>
<video src="high.mov" system-bitrate="192000" />
<video src="medium.mov" system-bitrate="52800" />
<video src="low.mov" system-bitrate="24000" />
</switch>
Based on the user's connection speed, as set in the Quicktime player preferences, a different movie will be displayed. System-bitrate is set in bits per second.
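In the SMIL standard, a switch plays the first item whose test attributes match the user's settings, so the order of the items matters. That is why the bitrate example lists the highest bitrate first. The sketch below assumes Quicktime follows the standard here: the last item has no system-language attribute, so it should act as a default for users whose language is neither French nor German.
<switch>
<textstream src="french.mov" region="caps" system-language="fr" />
<textstream src="german.mov" region="caps" system-language="de" />
<!-- no test attribute: used when nothing above matches -->
<textstream src="english.mov" region="caps" />
</switch>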
Load content from another location
<video src="http://www.someotherserver.com/qtmovie.mov" region="r1" />
or
<video src="rtsp://www.somequicktimeserver.com/qtmovie.mov" region="r1" />
You can easily load a clip from anywhere on the Internet. RTSP is used if you are streaming Quicktime from Quicktime Streaming Server or Darwin Streaming Server. This is useful if you want to stream a video file while the caption file is located on a web server.
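A sketch of that arrangement, with the video streamed over RTSP and the captions loaded over HTTP, might look like this. The server names are the placeholder ones used above, captions.txt is a hypothetical file name, and the region names are borrowed from the sample caption file.
<par>
<video src="rtsp://www.somequicktimeserver.com/qtmovie.mov" region="videoregion" />
<textstream src="http://www.someotherserver.com/captions.txt" region="textregion" system-language="en" system-captions="on" />
</par>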
Repeat media clips
<seq>
<video src="mymovie.mov" region="r1" repeat="2" />
<audio src="background.mp3" repeat="indefinite" />
</seq>
Repeat any number of times, or indefinitely. This would play the first movie two times, then play the mp3 until the cows come home.
Playing a portion of a clip
<video src="mymovie.mov" region="r1" clip-begin="5 sec" clip-end="00:00:35.00" />
This would begin the clip at the 5 second point and end at the 35 second mark. Unless you are streaming the video or audio, the first 5 seconds' worth of data must still be downloaded before the clip begins to play. Again, you can use seconds (sec) or timecode.
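Clips can be combined with seq to play several excerpts of the same movie back to back. The following is only a sketch using the time notation shown above; the second excerpt (from 1:00 to 1:20) is an arbitrary choice for illustration.
<seq>
<video src="mymovie.mov" region="r1" clip-begin="5 sec" clip-end="00:00:35.00" />
<video src="mymovie.mov" region="r1" clip-begin="00:01:00.00" clip-end="00:01:20.00" />
</seq>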
Beginning and ending at a certain time
<audio src="music.mp3" id="music" begin="5 sec" end="15 sec" />
The music will begin after the movie (or seq or par) has been open for 5 seconds then end 10 seconds later.
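To show how the timing relates to its container, here is a small sketch that places the same audio line inside a par with a video (mymovie.mov and the r1 region are illustrative names). The video starts as soon as the par starts. The music waits 5 seconds, then plays for 10 seconds.
<par>
<video src="mymovie.mov" region="r1" />
<audio src="music.mp3" id="music" begin="5 sec" end="15 sec" />
</par>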
Play media until one of them completes, then stop
<par endsync="id(backgroundMusic)">
<audio src="music.mp3" id="backgroundMusic" />
<img src="somebanner.gif" region="top" dur="10 sec" repeat="indefinite" />
<text src="info.txt" region="info" dur="10 sec" repeat="indefinite" />
</par>
The endsync attribute indicates that as soon as the mp3 file (id="backgroundMusic") ends, the others will stop.
Combining par and seq
<seq>
<img src="intro.gif" region="top" dur="10 sec" />
<par>
<audio src="music.mp3" id="music" />
<img src="banner.jpg" region="top" begin="5 sec" end="id(music)(end)" />
<text src="copy.txt" region="bottom" begin="id(music)(10 sec)" end="id(music)(end)" />
</par>
<video src="conclusion.mov" region="bottom" />
</seq>
Intro.gif will display for 10 seconds after which the music will begin. Five seconds later, banner.jpg will appear. When the music has played for 10 seconds, copy.txt will appear. When the music ends, banner.jpg and copy.txt will end, thus ending the par. The conclusion.mov video will then play.
<par>
<audio src="music.mp3" id="music" />
<seq>
<animation src="animation.gif" region="bottom" dur="5 sec" repeat="2" />
<img src="graphic.gif" region="bottom" dur="5 sec" />
</seq>
<video src="end.mov" region="r1" />
</par>
The music, video, and animation will all begin at the same time. The animation will play for 5 seconds, then repeat for another 5 seconds, after which graphic.gif will display for 5 seconds then disappear.
Creating interactive movies
<a href="hello.html"><animation src="welcome.mov" region="top"/></a>
<a href="http://google.com/"><video src="anothermovie.mov" region="r1"/></a>
When the user clicks on the screen (or region of the screen), they will go to the document you have specified. You can even link to other SMIL files for interactive presentations. The Quicktime player does not indicate that the user can or should click the region, so you will need to indicate this in some way.
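One way to signal that a region is clickable is to link a button-style image rather than the video itself. The sketch below assumes hypothetical files intro.mov, next_button.gif, and chapter2.smil, and reuses the region names from the sample caption file. How the linked file opens may vary between player versions.
<par>
<video src="intro.mov" region="videoregion" />
<a href="chapter2.smil"><img src="next_button.gif" region="textregion" dur="30 sec" /></a>
</par>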