In spite of the high profile of media types such as video, audio and images, many multimedia presentations rely extensively on text content. Text can be used for incidental labels, or as subtitles or captions that accompany other media objects. In a multimedia document, text content is constrained not only by the need to support presentation styles and layout, but also by the temporal context of the presentation. This involves both intra-text timing and extra-text synchronization with other media objects. This paper describes a new timed-text representation language that is intended to be embedded in a non-text host language. Our format, which we call aText (for the Ambulant Text Format), balances the need for text styling with the requirement for an efficient representation that can be easily parsed and scheduled at runtime. aText, which can also be streamed, is defined as an embeddable text format for use within declarative XML languages. The paper presents a discussion of the requirements for the format, a description of the format, and a comparison with other existing and emerging text formats. We also provide examples of aText embedded within the SMIL and MLIF languages and discuss our experience implementing aText in the Ambulant Player.
ACM
Network Infrastructure Support for Convergent Interactive Media
ACM Symposium on Document Engineering
Distributed and Interactive Systems

Bulterman, D., Jansen, J., César Garcia, P. S., & Cruz-Lara, S. (2007). An efficient, streamable text format for multimedia captions and subtitles. In Proceedings of the ACM Symposium on Document Engineering 2007 (pp. 101–110). ACM.