Hypermedia: The Link with Time

Lynda Hardman, Jacco van Ossenbruggen, Lloyd Rutledge, Dick C.A. Bulterman

CWI
Kruislaan 413, PO Box 94079, 1090 GB Amsterdam, The Netherlands
Lynda.Hardman@cwi.nl, http://www.cwi.nl/~lynda

Abstract:
This essay presents a brief discussion of combining temporal aspects of multimedia presentations with hypertext links. Three ways of combining linking with temporally synchronized components of a presentation are described. We describe work that has been done to incorporate both temporal and linking information within the W3C language SMIL (Synchronized Multimedia Integration Language). We conclude with a discussion of future directions, namely providing support for linking within and among non-linear presentations and the ability to add temporal information to existing XML document languages.

Categories and Subject Descriptors: I.7.2 [Text Processing]: Document Preparation - hypertext/hypermedia; H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems - Hypertext navigation and maps

General Terms: Languages

Additional Key Words and Phrases: Multimedia, document models, W3C recommendations

The hypertext community has based its research on documents which contain discrete pieces of information connected by links. In order to be able to compare systems and provide a basis for interoperability, the Dexter hypertext reference model was developed [Halasz and Schwarz, 1994]. The Dexter model has formed the basis of a large amount of research and discussion within the hypertext community (see, for instance, the Communications of the ACM's special issue on hypermedia [Grønbæck and Trigg, 1994]). The Dexter hypertext model is, however, not strictly text based, since it is a mechanism for combining nodes of different media types into composite objects, among which links can be created. While Dexter is in this sense a hypermedia reference model, it does not include the notion of time explicitly. This becomes a problem when multimedia presentations (that is, presentations composed of multiple media items with predetermined temporal constraints among them) need to be expressed, and when tools for creating and manipulating them need to be developed.

Adding time to a hypertext document brings with it a number of problems, of which some are conceptual and some technical. On the conceptual side, the semantics of composition may or may not include temporal aspects. In addition, providing linking constructs within multimedia documents requires defining not only the start and end of the link, but also how the link relates to the behaviour of the presentation when a reader follows it. On the technical side, network bandwidth plays a prominent role when media items are continuous, such as audio, and also high-bandwidth, such as video. In addition, the synchronization relations specified within the document need to be met. While these problems are traditionally addressed in classical multimedia, providing support for linking has the consequence that standard solutions, such as pre-loading media items or buffering streaming media, increase the burden on network resources and CPU time excessively.

In this essay we cover the conceptual aspects of incorporating time in a hypermedia document model. Additionally we discuss the role these aspects play in the context of the World Wide Web Consortium's (W3C) Synchronized Multimedia Integration Language [Hoschka et al. 1998].

1 Incorporating time in a hypermedia document model

While a hypermedia document model needs to combine a wide range of aspects [Hardman, 1998], we concentrate here on the relationship between time-based presentations and linking. In brief, a time-based, or multimedia, presentation is composed of a number of media items, each of which has its own duration. These can be combined into a presentation by specifying the temporal relationships among the different items. These relationships are often in the form of constraints, for example in [Buchanan and Zellweger, 1993; Ackermann, 1994; Jourdan et al. 1998; Kim and Song, 1995]. We term a collection of constrained media items a temporal composite.
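As an illustration only, the notion of a temporal composite can be sketched with the two simplest relationships: children that play in parallel and children that play in sequence. The class names, the "par" and "seq" mode strings, and the rule that a parallel group lasts as long as its longest child are all assumptions of this sketch; constraint-based systems such as those cited above are considerably richer.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class MediaItem:
    """A single media item with its own duration in seconds (hypothetical type)."""
    name: str
    duration: float


@dataclass
class Composite:
    """A temporal composite; mode is "par" (children overlap) or "seq" (children follow one another)."""
    mode: str
    children: List[Union["MediaItem", "Composite"]]


def duration(node: Union[MediaItem, Composite]) -> float:
    """Duration implied by the composition, under the assumptions stated above."""
    if isinstance(node, MediaItem):
        return node.duration
    child_durations = [duration(child) for child in node.children]
    if node.mode == "par":
        # Assumption: a parallel group ends when its longest child ends.
        return max(child_durations, default=0.0)
    if node.mode == "seq":
        return sum(child_durations)
    raise ValueError(f"unknown composition mode: {node.mode}")


# A video shown in parallel with two captions that follow each other.
scene = Composite("par", [
    MediaItem("video", 30.0),
    Composite("seq", [MediaItem("caption-1", 12.0), MediaItem("caption-2", 10.0)]),
])
print(duration(scene))
```

Because composites nest, the same function covers a whole presentation tree; the captions here occupy the first 22 seconds of the 30-second video.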

1.1 Linking within linear multimedia presentations

Interaction with a temporal composite component can take place through playing, pausing or stopping the presentation, using the player controls supplied. Alternatively, links can be created which allow jumps into the presentation to be specified by the author. If the link source and destination are within a single presentation (temporal composite), then the link is in effect specifying a fast forward or reverse operation that is otherwise available through the player controls.

1.2 Linking among multimedia presentations

The main characteristic of hypermedia presentations is that there are multiple pages, or scenes, among which a reader can select and thus create an individual path through the information. When traversing a link in hypertext the behaviour is normally determined by the hypertext system itself. For example, in NoteCards [Halasz et al. 1987] a new window is opened for each node, whereas in KMS [McCracken and Akscyn, 1984] a maximum of two windows is on the screen and subsequent nodes replace those being displayed. In common HTML browsers, by contrast, the choice is given to the reader, who can have the link destination displayed in the original window or in a new one. In time-based hypermedia, it is often useful to have even more control over the behaviour, since a running presentation consumes both network resources and CPU time. When the reader selects a new presentation it may be useful to have the original presentation continue in parallel, have it pause, or indeed be replaced completely.
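The three options for the original presentation (continue in parallel, pause, or be replaced) can be made concrete with a small sketch. The enum, the Presentation class and the follow_link function are hypothetical names introduced for illustration; they do not correspond to any particular player's API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class SourceBehaviour(Enum):
    CONTINUE = "continue"  # source keeps playing alongside the destination
    PAUSE = "pause"        # source is suspended while the destination plays
    REPLACE = "replace"    # source is removed in favour of the destination


@dataclass
class Presentation:
    name: str
    state: str = "playing"  # "playing" or "paused"


def follow_link(active: List[Presentation], source: Presentation,
                destination: Presentation,
                behaviour: SourceBehaviour) -> List[Presentation]:
    """Return the presentations on screen after the reader follows a link."""
    result = list(active)
    if behaviour is SourceBehaviour.PAUSE:
        source.state = "paused"
    elif behaviour is SourceBehaviour.REPLACE:
        result.remove(source)
    # CONTINUE needs no change to the source: it simply keeps playing.
    destination.state = "playing"
    result.append(destination)
    return result


tour = Presentation("city-tour")
museum = Presentation("museum")
on_screen = follow_link([tour], tour, museum, SourceBehaviour.PAUSE)
print([(p.name, p.state) for p in on_screen])
```

Only the CONTINUE case leaves two presentations consuming network and CPU resources at once, which is why the choice is worth leaving to the author.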

1.3 Linking within and among non-linear multimedia presentations

In addition to allowing jumps among presentations, links can be created among parts of presentations, allowing the reader to compose their own presentations as they navigate, breaking the traditional linearity of a multimedia presentation. This allows scenarios such as the following. A video, along with some accompanying music, is played in the background of a window. On top of the video a list of text choices is displayed, which the reader can browse. As the reader navigates through the text choices, the video and music continue uninterrupted. In this example, the background video and music form a temporally synchronized unit (a temporal composite), but are grouped together with the text elements atemporally. This is similar to the use of frames in HTML 4.0 [Raggett et al. 1998].

In order to supply this (apparently basic) functionality, a means of gathering together multiple scenes is needed, along with a way of specifying potential traversal routes. We term a composite that collects scenes without adding any further temporal information an atemporal composite. Within a document model, the source and destination of a link then have to be specified in combination with the atemporal composition in the document. Further details on modelling these types of links can be found in [Hardman et al. 1999].
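The video-and-text scenario can be sketched as a container that groups a continuously playing background with a set of scenes and the links among them. This is a minimal sketch under stated assumptions: the class, its fields and the scene names are hypothetical, and the background is reduced to a list of item names rather than a full temporal composite.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AtemporalComposite:
    """Groups scenes without imposing timing among them (illustrative only)."""
    background: List[str]        # items of a temporal composite that keeps playing
    links: Dict[str, List[str]]  # scene name -> scenes reachable by a link
    current: str                 # the scene currently shown

    def follow(self, destination: str) -> None:
        """Traverse a link between scenes; the background is left untouched."""
        if destination not in self.links.get(self.current, []):
            raise ValueError(f"no link from {self.current!r} to {destination!r}")
        self.current = destination

    def on_screen(self) -> List[str]:
        """Everything the reader currently sees or hears."""
        return self.background + [self.current]


doc = AtemporalComposite(
    background=["video", "music"],
    links={"menu": ["history", "credits"], "history": ["menu"], "credits": ["menu"]},
    current="menu",
)
doc.follow("history")
print(doc.on_screen())
```

Following a link changes only the current scene, so the video and music are never restarted; the traversal routes are exactly the entries of the links table.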

2 Links with the real world

The types of composition described above are useful to include in an academic model for time-based hypermedia. To extend their acceptance on a wider scale, they need to be incorporated in a real-world environment, such as the World Wide Web. In 1996, a birds of a feather session was held during the WWW5 conference where a number of parties came together to discuss the requirements for a declarative multimedia language for the Web. While multimedia presentations had already been available via the Web for some years, these were based on document formats that were not part of the W3C suite of languages.

Out of this first session, and subsequent meetings [Hoschka, 1997], the W3C Synchronized Multimedia working group was set up in 1997. The charter of the group was to develop a declarative language, editable using a text editor, which could describe the synchronization and linking relationships among multiple media elements. The result of the working group was the publication of the SMIL 1.0 Recommendation [Hoschka et al. 1998] in June 1998. The goals for this first version of SMIL were to provide a complete, but not overly complex, hypermedia document format that could be tried out in the wider context of the Web and form a basis of further developments within W3C.

SMIL 1.0 includes a number of aspects of linking within time-based hypermedia, such as temporal composition (the par and seq elements), linking within temporal composites, and linking among temporal composites. When specifying a link, the show attribute allows the choice of behaviour of the source presentation using the replace, new or pause values. The more complex aspects of atemporal composition were omitted in SMIL 1.0, although a particular type of atemporal composition for allowing the selection of one of a number of media items (e.g. different language commentary, or different bandwidth media) was included as the switch element.

3 The future of time in hypermedia

Time-based hypermedia has emerged from the ivory tower and has become part of the Web's infrastructure - SMIL 1.0 is currently supported by RealNetworks' G2 player [RealNetworks, 1999] and the Oratrix GRiNS editor [Oratrix, 1999], among others. SMIL 1.0 is only the beginning, however. Work on SMIL Boston [Ayars et al. 1999] is underway, which increases the power of SMIL 1.0 by, for example, adding a form of atemporal composition that allows for truly non-linear multimedia presentations. Other expected extensions include a more general event-based model integrated with a scheduled timing model. Link traversal can then be seen as a special case of a more general event type.

In order to make time a feature as ubiquitous as linking, scheduling features such as those found in SMIL need to be applicable to other document languages. There are still many open issues that need to be solved before temporal and spatial multimedia semantics can be generally combined with existing structured document formats and style sheets. However, a first step towards integrating time in the larger Web infrastructure has been made by the development of SVG (Scalable Vector Graphics, [Ferraiolo, 1999]), which uses SMIL timing to create animation effects. Work on the integration of time in XML documents in general [Bray et al. 1998] is currently part of the SMIL Boston [Ayars et al. 1999] effort.

References

Ackermann, 1994: Direct Manipulation of Temporal Structures in a Multimedia Application Framework, Philipp Ackermann, Proceedings of ACM Multimedia 1994, San Francisco, CA, USA (pages 15-58) October 1994.
Ayars et al. 1999: Synchronized Multimedia Integration Language (SMIL) Boston Specification, W3C Working Draft, November 15, 1999, Edited by Jeff Ayars, Aaron Cohen, Ken Day, Erik Hodge, Philipp Hoschka, Rob Lanphier, Nabil Layaïda, Jacco van Ossenbruggen, Lloyd Rutledge, Bridie Saccocio, Patrick Schmitz, Warner ten Kate, Ted Wugofski, Jin Yu and Thierry Michel. http://www.w3.org/TR/smil-boston/. W3C Working Drafts are available at http://www.w3.org/TR.
Bray et al. 1998: Extensible Markup Language (XML) 1.0 Specification, Tim Bray, Jean Paoli, and C. M. Sperberg-McQueen, W3C February 10, 1998. W3C Recommendations are available at http://www.w3.org/TR.
Buchanan and Zellweger, 1993: Automatically Generating Consistent Schedules for Multimedia Documents, M. Cecelia Buchanan and Polle T. Zellweger, Multimedia Systems 1(2), pp 55-6, 1993.
Ferraiolo, 1999: Scalable Vector Graphics (SVG) Specification, W3C Working Draft, December 3, 1999, Edited by Jon Ferraiolo. http://www.w3.org/TR/SVG/. W3C Working Drafts are available at http://www.w3.org/TR.
Grønbæck and Trigg, 1994: Special Issue on Hypermedia, Edited by K. Grønbæck and R. Trigg, Communications of the ACM 37(2), February 1994.
Halasz et al. 1987: NoteCards in a Nutshell, F.G. Halasz, T.P. Moran and R.H. Trigg, Proceedings of the ACM Conference on Human Factors in Computing Systems, Toronto, Canada, April 1987.
Halasz and Schwarz, 1994: The Dexter Hypertext Reference Model, F. Halasz and M. Schwarz, Communications of the ACM 37(2), pp 30-39, February 1994, Edited by K. Grønbæck and R. Trigg.
Hardman, 1998: Modelling and Authoring Hypermedia Documents, Lynda Hardman, PhD Thesis, University of Amsterdam 1998, ISBN: 90-74795-93-5. http://www.cwi.nl/~lynda/thesis
Hardman et al. 1999: Do You Have the Time? Composition and Linking in Time-based Hypermedia, Lynda Hardman, Jacco van Ossenbruggen, Lloyd Rutledge, K. Sjoerd Mullender and Dick C. A. Bulterman, Proceedings of ACM Hypertext 99, Darmstadt, February 1999.
Hoschka, 1997: Toward Synchronized Multimedia on the Web, Philipp Hoschka, World Wide Web Journal, Spring 1997, see http://www.w3journal.com/6/s2.hoschka.html.
Hoschka et al. 1998: Synchronized Multimedia Integration Language (SMIL) 1.0 Specification, W3C June 15, 1998, Edited by Philipp Hoschka. http://www.w3.org/TR/REC-smil/. W3C Recommendations are available at http://www.w3.org/TR.
Jourdan et al. 1998: Madeus, an Authoring Environment for Interactive Multimedia Documents, M. Jourdan, N. Layaida, C. Roisin, L. Sabry-Ismail and L. Tardif, Proceedings of ACM Multimedia '98, Bristol UK.
Kim and Song, 1995: Multimedia Documents with Elastic Time, M.Y. Kim and J. Song, Proceedings of ACM Multimedia '95, San Francisco CA, pp 143-154.
McCracken and Akscyn, 1984: Experiences with the ZOG Human Computer Interface System, D. McCracken and R.M. Akscyn, International Journal of Man-Machine Studies, volume 21, pages 293-310, 1984.
Oratrix, 1999: The GRiNS Player and Editor for SMIL, Oratrix Development B.V., 1999. See http://www.oratrix.com/GRiNS/
Raggett et al. 1998: HTML 4.0 Specification, W3C Recommendation, revised 24 Apr 1998, Edited by Dave Raggett, Arnaud Le Hors and Ian Jacobs. http://www.w3.org/TR/REC-html40/.
RealNetworks, 1999: RealSystem G2, RealNetworks, 1999. See http://www.real.com/g2/

See also the ACM SIGWEB Work group "Time in Hypermedia": http://www.acm.org/sigs/sigweb/Timegroup.html.