Universität Stuttgart

Fakultät Informatik

Institut für Parallele und
Verteilte Höchstleistungsrechner
Universität Stuttgart
Breitwiesenstraße 20 - 22
D-70565 Stuttgart

CINEMA - An Architecture for
Configurable Distributed Multimedia
Applications

Kurt Rothermel, Ingo Barth, Tobias Helbig

CR Classification: C.2.4, D.2.2, H.5.1

Fakultätsbericht 3/1994
Technical Report
April 1994


Abstract

Distributed multimedia applications combine the advantages of distributed
computing with the capability of processing discrete and continuous 
media in an integrated fashion. The development of multimedia 
applications in distributed environments requires specific abstractions 
and services, which are usually not provided by generic operating 
systems. Those services are typically realized by software components, 
often referred to as middleware. The CINEMA (Configurable INtEgrated 
Multimedia Architecture) project aims at the development of powerful 
abstractions for multimedia processing in distributed environments. This 
paper presents a flexible mechanism for the dynamic configuration of 
applications. The proposed mechanism allows for the definition of 
arbitrarily complex flow graphs connecting various types of multimedia
processing elements. Further, processing elements can simply be composed 
from other ones to provide higher levels of abstraction. We also propose 
the abstraction of a clock hierarchy to permit grouping, controlling, 
and synchronization of media streams. An appealing property of this 
abstraction is that it harmonizes well with component nesting.

1 INTRODUCTION

Advances in computer and communication technology have stimulated
the integration of digital audio and video with computing, leading to 
the development of distributed multimedia systems. This class of systems 
combines the advantages of distributed computing with the capability of 
processing discrete media, such as text or images, and continuous media, 
such as audio or video, in an integrated fashion. The capability of 
integrated multimedia processing not only enhances conventional 
application environments, but also opens the door for new and innovative 
applications. A major advantage of multimedia computing in distributed 
environments is the possibility of sharing resources among applications 
and users, where shared resources may be data objects such as multimedia 
titles, special processing elements such as compression modules, or 
special devices such as professional VCRs.

The processing and communication of media streams requires specific 
system services. In general, media streams are associated with a certain 
quality that has to be maintained by the underlying system. To be able 
to guarantee the required stream quality, system services for allocating 
and reserving system resources, such as CPU cycles or network bandwidth, 
are needed. Moreover, applications need to control the flow of streams, 
i.e. they should be able to start, pause, continue or scale individual 
streams. In many scenarios, it is desirable to group related streams and 
to control groups of streams rather than individual streams. Finally, 
powerful services to synchronize multiple streams are required. Those 
services should permit applications to specify which streams are to be 
synchronized and how these streams temporally relate to each other.

Generic operating systems usually do not provide those specific 
multimedia services. The gap between the functionality offered by 
operating systems and the specific needs of distributed multimedia 
applications is closed by software components often referred to as 
middleware. The CINEMA (Configurable INtEgrated Multimedia Architecture) 
system, which is currently under development at the University of 
Stuttgart, belongs to this system category. It provides abstractions for 
the dynamic configuration of distributed multimedia applications. 
Clients may define arbitrary data flow graphs, connecting various 
processing elements called components. Moreover, component nesting is 
supported to achieve higher levels of abstraction by simply composing
more complex components from already existing ones. The abstraction of a 
session allows for atomic resource allocation and reservation for any 
group of connected components. CINEMA provides the concept of a clock 
hierarchy for grouping and controlling streams and groups of streams. 
The same abstraction permits the expression of arbitrarily complex
stream synchronization requirements.

The remainder of the paper is organized as follows. In the next section, 
a brief overview of related work is given. Then, in Section 3, the way
applications are configured in CINEMA is described in some detail.
This section also introduces the concept of component nesting. The 
abstractions for grouping, controlling and synchronizing media streams 
and groups of streams are presented in Section 4. Finally, we conclude 
with a brief summary.

2 RELATED WORK

The multitude of problems that arise when integrating multimedia 
processing into conventional computer systems and attempting to develop 
distributed multimedia applications are addressed in several projects, 
which emphasize different issues. In the SUMO project [CBRS93],
the Chorus [RAA+90] micro-kernel is extended to support continuous 
media. This is done by using the real-time features of Chorus and adding 
stream-based data transfer and quality of service control inside the 
operating system. The features are accessible through a low-level API. The
focus of this work lies on operating system issues like scheduling but 
not on providing a universal platform and high-level abstractions for 
developing and configuring distributed multimedia applications. The 
problem of configuring distributed applications by using software 
components that are interconnected by linked ports is addressed by Conic 
[KrMa85] and its follow-up project REX [MKSD90]. Conic offers languages 
for programming components and configuring applications without 
supporting multimedia data handling. The configuration process is 
centralized in a configuration manager which accepts change 
specifications for altering configurations.

Specific abstractions for controlling multimedia data streams have been 
proposed as well. Some of them apply to non-distributed environments 
only (e.g. QuickTime [Appl91] or IBM's Multimedia Presentation Manager 
[IBM92]), while others are tailored to specific configurations (e.g. 
ACME [AGH90] and Tactus [DNNR92]), and essentially are extensions of 
network window systems to support streams of digital audio and video 
data. General requirements that should be met by architectures 
supporting distributed multimedia applications are specified in the 
Request for Technology [IMA92] of the Interactive Multimedia Association 
(IMA). A response to this request contributed by some companies [Hewl93] 
proposes abstractions to structure and control distributed multimedia 
environments while using multi-vendor processing equipment. The proposal 
assumes generic multimedia processing elements producing and consuming 
multimedia data via ports that are associated with formats. However, the 
nesting of processing elements is not supported and, although grouping 
is used to handle resource acquisition, stream control and the
specification of end-to-end quality of service, no means to specify
synchronization relationships between data streams are provided.

3 CONFIGURATION OF MULTIMEDIA APPLICATIONS

In order to build large software systems it is necessary to decompose a 
system into modules each of which can be separately programmed and 
tested. The system is then composed as a configuration of these software 
components. Component programming and component configuration are 
separate activities which have been referred to as 
"programming-in-the-small" and "programming-in-the-large", respectively
[DeKr75].

Configuration may be static or dynamic. In the first approach to system 
building, all components of the system are configured at the same time. 
If a modification of the system is required, the complete system has to 
be stopped and rebuilt according to the new configuration specification. 
Obviously, static configuration is not a feasible approach in the 
context of distributed multimedia systems, in which configurations often 
depend on the available resources and the quality of service the user 
asks for at run time. Moreover, multimedia applications are often highly 
dynamic in the sense that users may join and leave the application 
during run time. Usually, each change in the user community implies a 
modification of the configuration. Examples of such applications can
be found in the area of video conference systems or CSCW systems.
Consequently, for multimedia systems the ability to extend and modify a
system while it is running is essential. The approach of
dynamic configuration provides this ability: new components can be 
introduced, existing ones may be replaced and the interconnection of 
components can be modified at run time.

In CINEMA, an application consists of at least one client and a set of 
data flow graphs. In a data flow graph, the nodes represent so-called 
components, while the edges are communication links interconnecting the 
components. A component provides the basic abstractions for the 
processing of continuous media streams, such as video or audio streams. 
A continuous media stream is defined to be a sequence of data units, 
each of which is associated with a media time (for a detailed
definition see, e.g., [Herr91]). The nature of a component's processing depends on
the type of the component. We distinguish between source components, 
which produce (e.g. capture) data streams, sink components that consume 
(e.g. play out) streams, and intermediate components acting as both 
consumers and producers (e.g. filters or mixers). Media streams may 
originate at multiple sources, traverse a number of intermediate 
components and end at multiple sinks.


A client is a software entity that - by using the CINEMA services - 
defines data flow graphs and controls the flow of data within these 
graphs during run time. It configures (its portion of) an application 
just by naming the components to be used and interconnecting them 
according to the application logic that has to be achieved. Further, it 
may dynamically change the initial configuration during run time as 
needed. A data flow graph may be arbitrarily distributed over several 
nodes of a distributed system. As will be seen below, components are 
configuration independent, which means that their internal logic is 
independent of the configuration they are used in. Thus, from the 
client's point of view, there is no conceptual difference whether two 
adjacent components run either on the same node interconnected by a 
local link or on different nodes connected by a remote link.

A client may only control the flow of streams in the flow graphs defined 
by itself. In particular, a client may start, halt or scale data streams 
only in its so-called application domain, which is defined to be the set 
of data flow graphs specified by this client. Depending on the type of 
application, one or more clients may participate in the process of 
configuring the application. If multiple clients participate, the 
application is structured into several application domains, one for each 
participant. Each client only knows and controls the objects in its 
domain. When sharing components between clients, their domains overlap. 
The overlapping portions contain the shared components. Clearly, shared 
components may be controlled by multiple clients. For example, consider 
the simple conferencing scenario depicted in Figure 1. In this scenario, 
the application consists of several domains, each of which links two 
components - a virtual microphone and speaker of a given user - to a 
shared mixer component. Whenever a new user joins the application, a new 
domain linking the new user's (virtual) microphone and speaker to the 
shared mixer is added.

Figure 1: Application Domains in a Conferencing Scenario
[Figure: two application domains, each containing a microphone and a speaker, overlapping in a shared mixer]


After this brief overview of the process of configuration in CINEMA, we 
can now take a closer look at the concepts provided for defining flow 
graphs, which are components, ports and links.

3.1  Components and Ports

The processing of continuous media data streams is done by software and 
hardware modules, called devices. Devices may be, for example,
microphones or speakers having specific hardware interfaces and software
drivers. In CINEMA, the processing functionality is abstracted by
components, each of which covers at least one device. When creating a
component, a client specifies
the devices that are to be used. Components consume data units of 
streams reading from their input ports and produce data by writing to 
their output ports. To build up data flow graphs, components are 
interconnected by links between the components' ports.

From the client's point of view, a component offers three interfaces
to control and manipulate its behavior: the component control interface,
the port interface, and the clock interface. The component control
interface is used to access state information of a component and alter 
its stream handling behavior. It is specific in the sense that it 
depends on the processing functionality performed by the component. For 
example, the interface of a component abstracting from a speaker device 
may provide a method to adjust the volume of the presentation.

Figure 2: The Component's Interfaces
[Figure: a component with two ports carrying stream data, a component control interface, and a clock interface]

The port interface is used by components to send stream data to other
components that are interconnected by links or to receive data from
them. This decouples the multimedia processing from the transmission of
data units between processing stages and allows the usage of the same
component in scenarios having local as well as remote communication. To
be able to detect mismatching connections, each port is associated with a
stream type. If a component handles multiple stream types, a new stream
type containing the others may be defined. In Figure 3, we show an
example of a stream type hierarchy. In this example, a port of type
"video" can be connected to a port of type "video", "video-grey", or
"video-color". In a stream type hierarchy, the descendants of a node are
specializations of this node.
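
The type check on ports can be pictured as a walk up the stream type hierarchy. The following C++ fragment is only a sketch under stated assumptions, not CINEMA code: the class StreamTypeHierarchy and its methods are hypothetical, placing "video" directly below "root" is an assumption, and the text does not say whether the check is symmetric, so the sketch treats it as directional.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of a stream type hierarchy; not part of the CINEMA API.
// The hierarchy is assumed to be acyclic.
class StreamTypeHierarchy {
public:
    // Registers `type` as a direct specialization of `parent`.
    void add(const std::string& type, const std::string& parent) {
        assert(type != parent);  // a type cannot specialize itself
        parent_[type] = parent;
    }

    // True if `type` is `ancestor` itself or one of its descendants.
    bool isSpecializationOf(std::string type, const std::string& ancestor) const {
        for (;;) {
            if (type == ancestor) return true;
            auto it = parent_.find(type);
            if (it == parent_.end()) return false;  // walked past the root
            type = it->second;
        }
    }

    // A port of type `portType` accepts a peer port whose type is the
    // same type or a specialization of it.
    bool canConnect(const std::string& portType, const std::string& peerType) const {
        return isSpecializationOf(peerType, portType);
    }

private:
    std::map<std::string, std::string> parent_;  // type -> parent type
};

// Builds the hierarchy named in Figure 3 ("video" below "root" is assumed).
inline StreamTypeHierarchy figure3Hierarchy() {
    StreamTypeHierarchy h;
    h.add("video", "root");
    h.add("video-grey", "video");
    h.add("video-color", "video");
    return h;
}
```

With this model, a "video" port accepts "video-grey" and "video-color" peers, whereas a "video-grey" port does not accept a "video-color" peer.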

The clock interface is optional for sources and mandatory for sinks and 
is used to control the flow of data units. A detailed description of 
clocks and the interface to control them is given in Section 4.

The interfaces described above are used by clients to control components 
and to connect them to build up data flow graphs. In the CINEMA system, 
components are managed by additional interfaces. An example of such an
internal interface is the resource allocation interface, which is used to
negotiate the quality of service and to reserve the required resources 
to ensure the negotiated quality of service.

After looking at the interfaces provided by the components, we now focus 
on the definition of components. Configuration independence [KrMa85] is
a key property of components that are to be used in a
dynamically configured distributed system. It makes it possible to use
a component in arbitrary configurations without having to change its 
processing functionality.

Configuration independence is achieved by developing components using a 
special programming language and compiling and linking them to 
independent objects. In CINEMA, we use an object-oriented programming 
language, the Component Programming Language (CPL) that is based on C++, 
to program components. All methods of a component can only access local 
objects. A component exchanges stream data by reading data from its 
local input ports and writing data to its local output ports, i.e. 
components need not know their neighbours in the data flow graph.
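
The effect of configuration independence can be illustrated with a toy model in plain C++. This is a sketch only: Port, Doubler and Link are hypothetical names, data units are reduced to integers, and real-time scheduling is ignored. The point is that the component's action touches nothing but its own ports, so the same class can be placed anywhere in a flow graph.

```cpp
#include <cassert>
#include <deque>

// Hypothetical port: a FIFO queue of data units (plain ints here).
struct Port {
    std::deque<int> units;

    void put(int unit) { units.push_back(unit); }
    bool empty() const { return units.empty(); }

    // Removes and returns the oldest data unit; the port must not be empty.
    int get() {
        assert(!units.empty());
        int unit = units.front();
        units.pop_front();
        return unit;
    }
};

// Hypothetical intermediate component: doubles every data unit.
// Its action accesses only the component's local ports.
struct Doubler {
    Port in;
    Port out;

    void action() {
        while (!in.empty()) out.put(2 * in.get());
    }
};

// Hypothetical link: moves all pending units from an output port
// to an input port. Only the link knows both endpoints.
struct Link {
    Port& from;
    Port& to;

    void transfer() {
        while (!from.empty()) to.put(from.get());
    }
};
```

Chaining two Doubler instances through a Link requires no change to the Doubler code; whether the link is local or remote would be hidden behind transfer().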

Figure 3: Stream Type Hierarchy
[Figure: a tree with the nodes root, video, video-grey and video-color]

Programming components in an object-oriented programming language
enables the creation of a class hierarchy with inheritance to build up
specialized component classes out of existing ones. The following
example shows the programming of a microphone component in the Component
Programming Language.

COMPONENT microphone
:: SOURCE // class to derive from
MAP ( device MICRO ); // device parameter
dev_name = MICRO; // handle the device parameter
ENDMAP
INIT ( int sensitivity ); // specific client-IF
dev = open(dev_name,"r"); // open the device
dev_set_samplerate(dev, 8000); // rate = 8000 Hz
dev_set_sensitivity(dev, sensitivity); // set value
ENDINIT
// 8kHz_Audio is a specialized form of Audio
TYPE 8kHz_Audio :: Audio; // stream-type definition
// a port named audio is defined
OUTPORT audio 8kHz_Audio;
// the interface provides a method to adjust
// the sensitivity of the microphone
METHOD int sensitivity_adjust(int sensitivity)
dev_set_sensitivity(dev, sensitivity); // set value
result = dev_get_sensitivity(dev); // read the value back
return result; // return the value now in effect
ENDMETHOD
// the stream-handling function
ACTION // manipulate data units
data = dev_get_data(dev); // get audio samples
audio->put(data); // put samples to output port
ENDACTION
ENDCOMPONENT

In the CINEMA system, the code segments of a component are executed in 
different threads. The stream handling segment, defined in the ACTION 
clause, is periodically executed in a real-time thread, whereas the 
methods of the component control interface are executed in a 
non-real-time thread. The resource requirements of the real-time thread
are calculated when a session is established (see Section 4.1).

3.2  Creation of Data Flow Graphs

So far, we have introduced the definition of the components, the 
functional building blocks. In this section, we will describe how a 
client builds up data flow graphs by connecting the components' ports by 
means of links.

To build an application, a client first establishes the processing 
functionality by creating the appropriate components. This is done by 
using a library with a set of functions and classes that is provided by 
the CINEMA system. No specialized configuration language is needed,
which makes it possible to expand and shrink applications dynamically at
run time depending on actual requirements. Moreover, it allows the
integration of multimedia processing functionality into existing 
(non-multimedia) applications. Creating and accessing components does 
not differ from accessing normal C++-objects. It is done by using 
appropriate object methods.

As shown above, components may be shared by multiple clients, if more 
than one client participates in the configuration of an application. In 
CINEMA, shared components are associated with a globally unique 
identifier. All clients sharing a given component create this component 
in their application domain by providing the component's global 
identifier. Of course, only the create operation issued first 
establishes the component, while all succeeding ones just enable the 
callers to access the (already existing) component.
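
The create-or-attach behaviour for shared components can be sketched as follows. This is an illustration under assumptions, not the CINEMA library: ComponentFactory, Component and create are hypothetical names, the factory is a single in-process object rather than a distributed service, and the check that later creators name the same component class is an added assumption.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a component object.
struct Component {
    std::string className;
};

// Hypothetical factory: a component created with a global identifier is
// established once; later create calls with the same identifier return
// the existing component.
class ComponentFactory {
public:
    std::shared_ptr<Component> create(const std::string& className,
                                      const std::string& globalId = "") {
        if (globalId.empty()) {
            // Private component: every call creates a new object.
            return std::make_shared<Component>(Component{className});
        }
        auto it = shared_.find(globalId);
        if (it != shared_.end()) {
            // Assumption: all sharers must name the same component class.
            assert(it->second->className == className);
            return it->second;
        }
        auto component = std::make_shared<Component>(Component{className});
        shared_[globalId] = component;
        return component;
    }

private:
    std::map<std::string, std::shared_ptr<Component>> shared_;
};
```

Two clients that both create "audio_mixer" with the identifier "conference" obtain the same object, while two "microphone" components created without an identifier remain distinct.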

The following code fragment shows the creation of the component objects 
in the conferencing example illustrated in Figure 1. The mixer component 
is defined as a shared component using the global identifier conference.

micro = COMPONENT("microphone",micro_dev);
mixer = COMPONENT("audio_mixer",NULL,"conference");
speaker = COMPONENT("speaker",speaker_dev);

For component initialization, each component provides a method called 
"init". The code example below initializes the microphone and the 
speaker component and sets the sensitivity to 50 and the volume to
40. The initialization has to be done before defining a session (see 
Section 4.1).

micro->init("sensitivity",50); // initialization
speaker->init("volume",40);

After component objects have been created, they can be connected by 
creating links between their ports. The component's port objects are 
accessed by using the method port in connection with the port 
identifier. In our code fragment, we link the output port of the 
microphone component (named audio) and the input port of the mixer 
component (named audio_in). A second link is established between the 
output port of the mixer component (audio_out) and the input port of the 
speaker component (audio).


link(micro->port("audio"),mixer->port("audio_in")); 
link(mixer->port("audio_out"),speaker->port("audio"));

It is important to mention that building up a data flow graph only 
describes the topology of an application. Linking components does not 
imply the reservation of resources. To enable communication, so-called 
sessions have to be established, which are the abstraction for atomic 
resource reservation (see Section 4.1).

3.3  Nesting Components

In many areas, nesting has turned out to be a very powerful concept for 
building higher levels of abstraction. In CINEMA, more complex
components, called compound components, can be composed from other 
components. Compound components contain a part of a data flow graph. 
They are used like non-nested, basic components, i.e. from the client's 
point of view, there is no difference in using basic or compound 
components since the internal structure of a compound component is 
hidden.

Constructing compound components from existing ones is straightforward. 
Instead of programming an ACTION clause, a part of a data flow graph is 
defined using already existing components. The component control 
interfaces of the nested components are accessed through a common 
interface provided by the compound component. The mapping of these 
interfaces must be defined when building a compound component.

Compound components have no ACTION clause. Instead, the components used
to build the compound component must be declared (in the USE clause) and
the way they are interconnected by links is defined (in the LINK
clause). The interface methods are declared in the same manner as for
basic components.

Figure 4: Compound Component
[Figure: a compound component stereomicrophone containing two microphone components, with two ports and a component control interface]

As an example of the programming of a compound component (see Figure
4), we show the definition of a component representing a stereo
microphone. This component uses two components of class
microphone as they were declared in Section 3.1.

COMPONENT stereo_micro
MAP ( device MICRO_l, device MICRO_r );
dev_MICRO_l = MICRO_l; // handle the device param
dev_MICRO_r = MICRO_r; // handle the device param
ENDMAP
INIT ( int sensitivity );
// initialize the contained components with the
// provided parameters
micro_l->init(sensitivity); // initialize micro_l
micro_r->init(sensitivity); // initialize micro_r
ENDINIT
// define the ports of the compound component
OUTPORT audio_l 8kHz_Audio ;
OUTPORT audio_r 8kHz_Audio ;
USE
// create component objects
micro_l = COMPONENT("microphone",dev_MICRO_l);
micro_r = COMPONENT("microphone",dev_MICRO_r);
ENDUSE
LINK
// build up flow graph with links
// use "this" to refer to compound component
link(micro_l->port("audio"),this->port("audio_l"));
link(micro_r->port("audio"),this->port("audio_r"));
ENDLINK
// map the specific interfaces
METHOD int sensitivity_adjust(int sensitivity)
// use the interfaces of the used components
result = micro_l->sensitivity_adjust(sensitivity);
result = micro_r->sensitivity_adjust(sensitivity);
return result; // return value
ENDMETHOD
ENDCOMPONENT


4 COMMUNICATION AND SYNCHRONIZATION

Multimedia data streams are transmitted in arbitrarily structured flow 
graphs of interconnected components. Determining their temporal 
properties, controlling them at run time, and guaranteeing a certain 
stream quality gives rise to a multitude of requirements that need to be
fulfilled by the abstractions of a multimedia platform. Usually, 
multimedia data streams are designed to be consumed by human users. 
Thus, their quality is determined by the sensitivity of human senses. 
Ensuring a satisfying stream quality over long periods of time while 
using current computer and network equipment makes the reservation of 
resources inevitable.

Due to the temporal dimension of time-dependent data streams, there is a
need to specify and control temporal properties of streams. Setting
initial parameters like data rate or start values has to be enabled as
well as scaling (i.e. changing speed or direction) at presentation time.
The appropriate control interface in CINEMA is the media clock. However,
an interface that only allows individual data streams to be handled is
insufficient. Due to tight relationships between different streams, they
need to be grouped together and be handled as a unit. This facilitates
the control over complex scenarios and is a prerequisite for specifying
synchronization relationships between data streams. The latter is
particularly important in a multimedia system where the quality of a
presentation of time-dependent data streams strongly depends on
observing given synchronization requirements (e.g. lip synchronization
of audio and video where the tolerable skew is in the range of 80 ms
[StEn93]). The grouping of data streams has to be supported by concepts
that are adaptive to the dynamics of interactive and cooperative 
multimedia applications where at any time new users enter running 
applications (e.g. teleconferencing) and others leave. In CINEMA, the 
means to group control interfaces, to handle them as a unit and to 
specify synchronization relationships is given by the concept of clock 
hierarchies. In the following, the abstractions to fulfil the 
requirements are explained in detail.

4.1  Session

In CINEMA, a session is the abstraction of resource reservation. It is 
associated with a set of quality of service parameters. By creating a 
session, a client causes the CINEMA system to reserve the resources that 
are needed to guarantee the specified quality of service requirements. 
This is done in an all-or-nothing fashion. After a session has been 
established, the transmission and processing of multimedia data may be 
started.
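
The all-or-nothing character of session establishment can be sketched with a tentative copy of the free capacities that is committed only if every demand can be met. The fragment is a simplified illustration: Capacity, Demand and reserveAll are hypothetical names, resources are reduced to abstract integer units per node, and the negotiation of quality of service is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical resource model: free capacity per node in abstract units.
using Capacity = std::map<std::string, int>;

// Hypothetical resource demand of one component on one node.
struct Demand {
    std::string node;
    int units;
};

// Reserves all demands or none: `free` is only modified if every demand
// can be satisfied. Returns true on success.
bool reserveAll(Capacity& free, const std::vector<Demand>& demands) {
    Capacity tentative = free;  // work on a copy
    for (const Demand& d : demands) {
        assert(d.units >= 0);
        auto it = tentative.find(d.node);
        if (it == tentative.end() || it->second < d.units) {
            return false;  // reject; `free` is left untouched
        }
        it->second -= d.units;
    }
    free = tentative;  // commit all reservations at once
    return true;
}
```

A failed call leaves the capacities exactly as they were, which corresponds to a session that could not be established.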


A session encompasses parts of the flow graph defined by a
client. Its actual extent is defined by specifying a set of source
and sink components. Intermediate components and interconnecting data 
paths are determined from the data flow graph by the CINEMA system. For 
example, a point-to-point audio session may be created by the following 
statement. It describes the components and their ports that are part of 
the session as well as desired quality of service parameters:

create_session(micro->port("audio"),
               speaker->port("audio"),
               QoS(Rate(min = 8000, max = 44100),
                   SampleSize(min = 8, max = 16),
                   Delay(min = 50, max = 150)));

The success or failure of the establishment of a session determines 
whether a given application can be started and maintained according to 
the specified quality of service. Thus, creating a session is the 
prerequisite to transmit and process data units. Based on this, the 
following sections describe how temporal properties of streams are 
specified and data streams are controlled at run time.
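
How the extent of a session might be derived from its endpoints can be sketched as a graph search: a component belongs to the session if it is reachable from the source and the sink is reachable from it. The fragment below is an assumption about one possible realization, not the CINEMA algorithm; FlowGraph, link and componentsBetween are hypothetical names, components are identified by strings, and "recorder" is an invented component used only in the example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical flow graph that records topology only.
class FlowGraph {
public:
    // Records a link from component `from` to component `to`.
    void link(const std::string& from, const std::string& to) {
        assert(from != to);  // self-loops are not modelled
        successors_[from].push_back(to);
        predecessors_[to].push_back(from);
    }

    // Components on some path from `source` to `sink`, endpoints included.
    // The result is empty if the sink cannot be reached from the source.
    std::set<std::string> componentsBetween(const std::string& source,
                                            const std::string& sink) const {
        std::set<std::string> forward = reach(successors_, source);
        std::set<std::string> backward = reach(predecessors_, sink);
        std::set<std::string> result;
        for (const std::string& c : forward) {
            if (backward.count(c) > 0) result.insert(c);
        }
        return result;
    }

private:
    using Adjacency = std::map<std::string, std::vector<std::string>>;

    // All components reachable from `start` along `adjacency`.
    static std::set<std::string> reach(const Adjacency& adjacency,
                                       const std::string& start) {
        std::set<std::string> seen{start};
        std::vector<std::string> todo{start};
        while (!todo.empty()) {
            std::string current = todo.back();
            todo.pop_back();
            auto it = adjacency.find(current);
            if (it == adjacency.end()) continue;
            for (const std::string& next : it->second) {
                if (seen.insert(next).second) todo.push_back(next);
            }
        }
        return seen;
    }

    Adjacency successors_;
    Adjacency predecessors_;
};
```

For a graph micro -> mixer -> speaker with an additional branch mixer -> recorder, a session from micro to speaker covers micro, mixer and speaker, but not recorder.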

4.2  Clocks

The temporal dimension of continuous media streams is defined by 
so-called media time systems. The media time system associated with a 
stream is the temporal framework to determine the media time of the 
stream's data units. In CINEMA, media time systems are provided by media 
clocks (or clocks for short). A clock C is defined as follows:

C ::= ( R, M, T, S )

The clock attributes have the following meaning: R determines the ratio 
between real-time and media time. R time units in media time correspond 
to 1 second in real-time. M is the start value of the clock in media 
time, i.e. the value of the clock at the first clock tick. T is the 
start time of the clock in real-time. S determines the speed of the 
clock: S*R time units in media time correspond to one second in 
real-time. Consequently, media time progresses in normal speed if S 
equals 1. A speed greater than 1 causes the clock to move faster, a 
speed less than 1 causes it to progress slower, and a negative speed 
causes it to move backwards. A clock relates media time to real-time. It
"ticks" after it has been started, and media time (m) can be derived
from real-time (t):

$$ m = M + S \cdot R \cdot (t - T) $$

Clocks are the basic abstraction for clients to control the flow of
media streams. They may be attached to source and sink components, but
never to intermediate ones. A clock attached to a sink component
controls the temporal progress of all data streams processed by this
component. This is expressed more precisely by the clock condition: a
data unit having media time m is processed at real-time t only if the
controlling clock is ticking and its value equals m at time t.
Conceptually, this means that the presentation of a stream is started,
paused or scaled when the controlling clock is started, halted or the
clock speed is changed, respectively. Clocks attached to source
components are typically required in flow graphs where multiple sources
contribute data to a given sink (e.g. in a mixer scenario). Here, source
clocks are needed to individually start sources and to determine their
start values. For more details on source clocks refer to [RoHe94].

The most important clock operations for controlling streams are the
following. Start(M) starts the clock at media time M and thereby starts
the controlled stream(s) as well. The clock attribute T is set to the
real-time at which the clock is actually started. Halt(M) halts the
clock when it reaches clock value M, i.e. the stream(s) controlled by
this clock are paused. Prepare(M) prepares the starting of the clock at
media time M by preloading the buffers along the communication paths of
the controlled stream(s). After Prepare has been performed, the clock
can be started immediately when Start is issued. Clear() clears the
internal buffers associated with the controlled stream(s). Scale(M,S)
changes the speed of the clock to S when media time M is reached, i.e.
it scales the stream(s) controlled by the clock.

In the simple scenario shown in Figure 5, clock C controls the
presentation of a video stream. The play out is started with frame 15.
The play out rate is doubled when the presentation reaches frame 3000,
and the presentation is halted when reaching frame 5000.

Figure 5: Controlling a Video Stream
[Figure: a video stream flowing from a video source to a video sink, controlled by clock C through the operations C.Start(15), C.Scale(3000,2) and C.Halt(5000)]
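
The relation between media time and real-time, m = M + S * R * (t - T), and the effect of Scale can be sketched in C++. The struct below is a hypothetical model, not the CINEMA clock interface: MediaClock, mediaTime, realTime and scaledAt are invented names, and a speed change is modelled by continuing with a new clock whose start values are the media time and real-time at which the change takes effect.

```cpp
#include <cassert>

// Hypothetical media clock following C ::= (R, M, T, S).
struct MediaClock {
    double R;  // media time units per real-time second at speed 1
    double M;  // start value in media time
    double T;  // start time in real-time (seconds)
    double S;  // speed factor

    // Media time shown by the ticking clock at real-time t.
    double mediaTime(double t) const { return M + S * R * (t - T); }

    // Real-time at which the ticking clock shows media time m.
    double realTime(double m) const {
        assert(S * R != 0.0);  // a clock with rate zero never reaches m
        return T + (m - M) / (S * R);
    }

    // Clock that continues this one with speed newS from the moment
    // media time m is reached (a simple model of Scale(m, newS)).
    MediaClock scaledAt(double m, double newS) const {
        return MediaClock{R, m, realTime(m), newS};
    }
};
```

For the scenario of Figure 5, assuming a rate of 25 frames per second and a start at real-time 0, a clock {25, 15, 0, 1} reaches frame 3000 after 119.4 seconds. Continuing with scaledAt(3000, 2), frame 5000 is reached 40 seconds later, at which point Halt(5000) would stop the clock.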


4.3  Clock Hierarchies

In this section, we will introduce the notion of a clock hierarchy, 
which is the basic abstraction for grouping media streams, controlling 
groups of streams, and stream synchronization.

Recall that a clock attached to a component controls all streams 
processed by this component. A number of streams can be grouped by 
linking their controlling clocks hierarchically to a common clock, which 
then controls the entire group. Stream groups can in turn be combined 
into groups at a higher level. In the example given in Figure 6, clock 
C6 controls streams S1 and S2, while C7 controls S4 and S5. C8 controls 
the subgroups represented by C6 and C7 as well as stream S3, and thus 
all streams in the given scenario can be started, halted or scaled 
collectively by means of this clock.

A clock operation issued at a clock affects not only this clock but also 
the entire (sub)hierarchy rooted at it. Conceptually, an operation 
called at a clock is propagated in a root-to-leaf direction through the 
clock's (sub)hierarchy, where it is performed at every clock in this 
hierarchy. In general, clock operations can be issued at every level of 
the clock hierarchy.
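
The root-to-leaf propagation can be sketched as follows. This is an 
illustrative model only: the class Clock and its methods link and issue 
are hypothetical names, and the transformation of operation arguments 
between time systems is left out. The hierarchy built at the end mirrors 
the grouping described for Figure 6.

```python
class Clock:
    """Hypothetical clock node in a clock hierarchy."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.performed = []   # operations performed at this clock

    def link(self, child):
        """Attach child below this clock and return it."""
        self.children.append(child)
        return child

    def issue(self, op, *args):
        """Perform op here, then propagate it towards the leaves."""
        self.performed.append((op, args))
        for child in self.children:
            child.issue(op, *args)


# Grouping as described for Figure 6.
c8 = Clock("C8")
c6 = c8.link(Clock("C6"))
c3 = c8.link(Clock("C3"))
c7 = c8.link(Clock("C7"))
c1, c2 = c6.link(Clock("C1")), c6.link(Clock("C2"))
c4, c5 = c7.link(Clock("C4")), c7.link(Clock("C5"))

c8.issue("Start", 0)      # reaches every clock in the hierarchy
c6.issue("Halt", 100)     # reaches only C6, C1 and C2
```

Because issue can be called on any node, operations can be issued at 
every level, and a subhierarchy such as the one below C6 can be halted 
without touching C3 or C7.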

Additionally, clock hierarchies may dynamically grow and shrink even if 
clocks are ticking. This feature together with the capability of halting 
and starting individual subhierarchies is very important in interactive 
applications, especially in those where multiple users with their 
individual needs participate in the same application.

Figure 6:     Grouping Streams (clock hierarchy: C8 is linked to C6, C3 
and C7; C6 is linked to C1 and C2, C7 to C4 and C5; each clock Ci, 
i = 1..5, controls stream Si)

Clocks provide individual media time systems, which may relate to each 
other in various ways. Clock synchronization and the propagation of 
clock operations are based on so-called reference points. A reference 
point defines the temporal relationship of two media time systems. More 
precisely, reference point [C1 : P1, C2 : P2] defines that media time P1 
in C1's time system corresponds to media time P2 in C2's time system, 
which means that P1 and P2 relate to the same point in real-time (see 
Figure 7). Given this reference point, media time can be transformed 
from one time system to the other as follows:

$$m_2 = (m_1 - P_1) \cdot \frac{R_2}{R_1} + P_2$$
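
A small sketch of this transformation is given below. It assumes the 
form m2 = (m1 - P1) * R2 / R1 + P2, with R1 and R2 taken to be the rates 
of the two clocks; the names ReferencePoint, to_c2 and to_c1 are 
hypothetical. Exact fractions are used so that transforming back and 
forth yields the original value.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ReferencePoint:
    """[C1 : p1, C2 : p2]: p1 in C1's time system corresponds to p2 in C2's."""
    p1: Fraction
    p2: Fraction


def to_c2(m1, ref, r1, r2):
    """Transform media time m1 of C1 into C2's time system."""
    return (Fraction(m1) - ref.p1) * Fraction(r2) / Fraction(r1) + ref.p2


def to_c1(m2, ref, r1, r2):
    """Inverse transformation, from C2's time system into C1's."""
    return (Fraction(m2) - ref.p2) * Fraction(r1) / Fraction(r2) + ref.p1


ref = ReferencePoint(Fraction(5), Fraction(500))
same_rate = to_c2(30, ref, r1=25, r2=25)     # both clocks tick at the same rate
double_rate = to_c2(30, ref, r1=25, r2=50)   # C2 ticks twice as fast as C1
print(same_rate, double_rate)
```

The reference point itself always maps onto its counterpart, 
independently of the rates: to_c2(5, ref, 25, 50) yields 500.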

Clocks may be linked in two different ways: a link may establish either 
a control or a synchronization relationship between two clocks. A 
control relationship between two clocks enables the propagation of clock 
operations without synchronizing the clocks. Typically, control 
relationships are defined in settings where groups of streams are to be 
controlled collectively and a rather loose temporal coupling of the 
grouped streams is sufficient. Control hierarchies include reference 
points, but this information is used only when clock operations are 
propagated, namely to transform the operation's arguments automatically. 
After a hierarchy has been started, its clocks may drift out of 
synchronization and may be manipulated arbitrarily. For example, two 
different subhierarchies of the same hierarchy may be scaled in 
different ways, or clocks in the hierarchy may be halted and continued 
at any later time with arbitrary start values.

A synchronization relationship goes a step further. In addition to 
propagation, it ensures that the involved clocks progress synchronously. 
From the clock condition introduced in the previous section it can be 
concluded that two streams are synchronized if their controlling clocks 
are synchronized. Thus, synchronization hierarchies are a general and 
very powerful concept to specify arbitrary synchronization requirements 
between media streams. The structure of the synchronization hierarchy 
specifies which streams have to be synchronized, while the reference 
points in the hierarchy define how the temporal dimensions of the 
streams relate to each other. The system guarantees that all streams 
controlled by the clocks of the hierarchy are processed synchronously. 
When a subhierarchy is halted and started once again at a later point in 
time, this is performed in conformance with the temporal constraints.
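
The difference between the two relationships can be made concrete with a 
check of whether two clocks still satisfy their reference point at a 
given real-time. The sketch rests on assumptions: a clock is modeled as 
the linear function m = M + S * R * (t - T), media time is transformed 
as m2 = (m1 - P1) * R2 / R1 + P2, and the names Clock, transform and 
in_sync as well as the tolerance parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Clock:
    """Hypothetical running clock: start value M, speed S, rate R, start time T."""
    M: float
    S: float
    R: float
    T: float

    def media_time(self, t):
        return self.M + self.S * self.R * (t - self.T)


def transform(m1, p1, p2, r1, r2):
    """Map media time m1 into the second time system via [p1, p2]."""
    return (m1 - p1) * r2 / r1 + p2


def in_sync(c1, c2, p1, p2, t, tolerance=1e-9):
    """True if c2 shows the media time that corresponds to c1's at real-time t."""
    expected = transform(c1.media_time(t), p1, p2, c1.R, c2.R)
    return abs(c2.media_time(t) - expected) <= tolerance


# Two clocks started together according to reference point [5, 500].
a = Clock(M=5, S=1, R=25, T=0)
b = Clock(M=500, S=1, R=25, T=0)
before = in_sync(a, b, 5, 500, t=10)

# Under a control relationship, b may be scaled on its own at t = 10 ...
b = Clock(M=b.media_time(10), S=2, R=25, T=10)
after = in_sync(a, b, 5, 500, t=20)   # ... after which the clocks have drifted apart
print(before, after)
```

In a synchronization hierarchy the system would have to keep such a 
check satisfied for all linked clocks, whereas a control hierarchy uses 
the reference point only at the moment an operation is propagated.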

Figure 7:     Transforming Media Time (time systems of clocks C1 and C2, 
with media times P1 and P2 related by reference point [C1 : P1, C2 : P2])

4.3.1  Example

Figure 8 shows a simple telecooperation scenario with two users. The 
subject of the cooperation is an experiment shown in video V2. We assume 
that separate speech channels exist which allow the users to talk to 
each other. Both users view V2. To ensure that both see the same 
information at the same time, V2 must be played out synchronously at 
both sites. Besides V2, user 1 views video V1, which shows the same 
experiment from a different perspective. Consequently, V1 and V2 are to 
be synchronized. User 2 additionally views video V3, which shows a 
similar experiment. Since the two experiments roughly correspond to each 
other in their temporal dimension, V2 and V3 are grouped by a control 
relationship. We assume that media time 500 in V3 corresponds to media 
time 5 in V2.

The presentation of all video streams can be started by issuing Start at 
clock C5. Moreover, this clock can be used to collectively scale, pause 
and restart the entire configuration. User 1 may pause V1 or V2 by 
halting C1 or C2, respectively. Halted clocks may be continued in a 
synchronized fashion, i.e. after a restart of C2, for example, the 
presentation of V2 is synchronized not only with V1 but also with V2's 
presentation at the site of user 2. Since C3 and C4 are linked by a 
control edge, V3 can be scaled, paused and restarted at any position 
independently of the state of the presentations of V1 and V2. Thus, the 
presentation of V3 can be adjusted manually as needed.
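
How a Start issued at C5 reaches the individual clocks can be sketched 
as below. Several points are assumptions rather than facts from the 
scenario: the topology (C5 linked by sync edges to C1, C2 and C3, and C3 
linked by a ctrl edge to C4), equal rates for all clocks (so that the 
transformation reduces to an offset), and the names Edge, Clock, link 
and start. The edge kind is only recorded; enforcing synchronization is 
not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Edge:
    child: "Clock"
    kind: str        # "sync" or "ctrl" (recorded only in this sketch)
    p_parent: int    # reference point value in the parent's time system
    p_child: int     # corresponding value in the child's time system


@dataclass
class Clock:
    name: str
    edges: List[Edge] = field(default_factory=list)
    started_at: Optional[int] = None

    def link(self, child, kind, p_parent, p_child):
        self.edges.append(Edge(child, kind, p_parent, p_child))
        return child

    def start(self, m):
        """Start here, then propagate with the argument transformed per edge."""
        self.started_at = m
        for e in self.edges:
            # Equal rates assumed: only the offset of the reference point applies.
            e.child.start(m - e.p_parent + e.p_child)


c5 = Clock("C5")
c1 = c5.link(Clock("C1"), "sync", 5, 5)     # V1 at user 1
c2 = c5.link(Clock("C2"), "sync", 5, 5)     # V2 at user 1
c3 = c5.link(Clock("C3"), "sync", 5, 5)     # V2 at user 2
c4 = c3.link(Clock("C4"), "ctrl", 5, 500)   # V3 at user 2

c5.start(5)
print({c.name: c.started_at for c in (c1, c2, c3, c4, c5)})
```

With these assumptions, V1 and both presentations of V2 start at media 
time 5, while V3 starts at media time 500, as the reference point 
[5,500] requires.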

If another user desires to join the scenario, the clock hierarchy has to 
be extended dynamically. After the corresponding session has been 
established, the clock controlling V2's presentation at the new user's 
site is linked by a sync edge to clock C5. By issuing the start 
operation, V2's presentation is started synchronously to the ongoing 
presentations.

Figure 8:     A Simple Telecooperation Scenario (clocks C1 to C5; User1 
views V1 and V2, User2 views V2 and V3; three sync edges with reference 
point [5,5] and one ctrl edge with reference point [5,500])

4.4  Clock Hierarchies and Nesting

In the context of synchronization, nesting means that arbitrarily 
complex clock hierarchies may be defined within compound components and 
thus remain invisible to the components' outside world. The clock 
hierarchy of a compound component is defined at the time the component 
is composed and specifies the internal synchronization and control 
relationships between the clocks defined within this component. Only the 
root of an internal clock hierarchy is exported and thus becomes visible 
to the component's outside world. The operations issued at an exported 
clock are propagated through the clock hierarchy and thereby control the 
internal processing. Exported clocks may again be involved in clock 
hierarchies at higher levels of abstraction.

The compound component shown in Figure 9 provides the abstraction of a 
television set, capable of playing out a video stream and two audio 
streams in a synchronized fashion. The component contains two basic 
components, a video decompression component (D) and a sink component 
implementing a video output window (W). In addition, it includes another 
compound component, which consists of two filter components (F) and two 
speaker components (S). The nested compound component provides the 
abstraction of an audio output device, whose operation is controlled by 
clock C2. The TV component exports clock C1, which is used to start, 
pause or scale the audio-visual output.
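
The encapsulation can be sketched as follows. The classes Clock, 
AudioDevice and TVSet, the property exported_clock, and the internal 
clocks named "video", "audio1" and "audio2" are hypothetical; only the 
roles of C1 and C2 are taken from the description above.

```python
class Clock:
    """Hypothetical clock node; operations propagate root-to-leaf."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.performed = []

    def link(self, child):
        self.children.append(child)
        return child

    def issue(self, op, *args):
        self.performed.append((op, args))
        for child in self.children:
            child.issue(op, *args)


class AudioDevice:
    """Nested compound component; only its root clock C2 is exported."""

    def __init__(self):
        self._c2 = Clock("C2")
        # Assumed internal clocks, one per audio channel.
        self._channels = [self._c2.link(Clock(f"audio{i}")) for i in (1, 2)]

    @property
    def exported_clock(self):
        return self._c2


class TVSet:
    """Compound component; only its root clock C1 is exported."""

    def __init__(self):
        self._c1 = Clock("C1")
        self._video = self._c1.link(Clock("video"))   # assumed internal clock
        self._audio = AudioDevice()
        # The exported clock of the nested component joins this hierarchy.
        self._c1.link(self._audio.exported_clock)

    @property
    def exported_clock(self):
        return self._c1


tv = TVSet()
tv.exported_clock.issue("Start", 0)
tv.exported_clock.issue("Scale", 100, 2)
```

A user of TVSet sees only exported_clock, yet an operation issued there 
reaches the clocks inside the nested audio device.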

Figure 9:     Nested Components (TV component exporting clock C1, with a 
Video path through D and W and two audio paths, Audio 1 and Audio 2, 
each through F and S inside the nested audio component controlled by 
clock C2; two sync edges)

5 SUMMARY

The efficient development of distributed multimedia applications 
requires abstractions and services that are provided by a specialized 
software layer. CINEMA is such a development platform. It uses the 
functionality of given transport and operating systems, adds the 
necessary features to support the processing of multimedia data in 
distributed environments and makes them accessible by appropriate 
abstractions. The abstractions of CINEMA and their usage were presented 
in this paper. We described components that provide multimedia 
processing functionality and produce, respectively consume, units of 
data streams via typed ports. To facilitate the reusability of software 
and to achieve higher levels of functional abstraction, components may 
be nested.

Distributed multimedia applications are created by interconnecting the 
components' ports with links, which allows the definition of arbitrary 
flow graphs. Before the flow of data units is started, the creation of 
sessions allows the specification of quality of service requirements and 
results in the reservation of the system resources needed to meet these 
requirements. Once an application has been established, the need for 
controlling its behavior (i.e. the flow of data units) arises. With 
media clocks and clock hierarchies we proposed abstractions to control 
individual data streams as well as groups of streams. We discussed the 
usage of clock hierarchies to specify synchronization relationships 
between data streams and showed how they may be used to handle the 
requirements of highly dynamic, interactive and cooperative multimedia 
applications. Finally, we outlined how clock hierarchies are used to 
control the propagation of operations in compound components.

Currently, the implementation of CINEMA is in progress. A first 
prototype is working. It supports a restricted set of the functionality 
described in this paper. For example, it is possible to establish 
applications in a distributed environment and to control and to 
synchronize the flow of data units in limited configurations. Our future 
work is directed at completing our prototype. By experimenting with 
applications, we aim at gaining more experience in using the 
abstractions. This will enable us to verify their practical usefulness.

6 REFERENCES

[AGH90] David P. Anderson, Ramesh Govindan, George Homsy. Abstractions 
for Continuous Media in a Network Window System. Report No. UCB/CSD 
90/596, Computer Science Division (EECS), University of California, 
Berkeley, CA, 11 1990.

[Appl91] Apple Computer Inc., Cupertino, CA, USA. QuickTime Developer's 
Guide, 1991.

[CBRS93] Geoff Coulson, Gordon S. Blair, Philippe Robin, Doug Shepherd. 
Extending the Chorus Micro-Kernel to support continuous media 
applications. In Proceedings of the 4th International Workshop on 
Network and Operating Systems Support for Digital Audio and Video, pp. 
49-60, 11 1993.

[DeKr75] F. DeRemer, H. Kron. Programming-in-the-large versus 
programming-in-the-small. In Proceedings of the Conference on Reliable 
Software, pp. 114-121, 1975.

[DNNR92] Roger B. Dannenberg, Tom Neuendorffer, Joseph M. Newcomer, Dean 
Rubine. Tactus: Toolkit-Level Support for Synchronized Interactive 
Multimedia. In 3rd International Workshop on Network and Operating 
System Support for Digital Audio and Video, 11 1992.

[Herr91] Ralf Guido Herrtwich. Time Capsules: An Abstraction for Access 
to Continuous-Media Data. The Journal of Real-Time Systems, Kluwer 
Academic Publishers, pp. 355-376, 3 1991.

[Hewl93] Hewlett-Packard Company and International Business Machines 
Corporation and SunSoft Inc. Multimedia System Services, Version 1.0, 
available via ftp from ibminet.awdpa.ibm.com, 7 1993.

[IBM92] IBM Corporation. Multimedia Presentation Manager Programming 
Reference and Programming Guide 1.0, IBM Form: S41G-2919-00 and 
S41G-2920-00, 3 1992.

[IMA92] Interactive Multimedia Association, Compatibility Project, 
Annapolis, MD, USA. Request for Technology: Multimedia System Services, 
Version 2.0, available via ftp from ibminet.awdpa.ibm.com, 11 1992.

[KrMa85] Jeff Kramer, Jeff Magee. Dynamic Configuration for Distributed 
Systems. IEEE Transactions on Software Engineering, SE-11(4):424-436, 4 
1985.

[MKSD90] Jeff Magee, Jeff Kramer, Morris Sloman, Naranker Dulay. An 
Overview of the REX Software Architecture. In 2nd IEEE Computer Society 
Workshop on Future Trends of Distributed Computing Systems, 10 1990.

[RAA+90] M. Rozier, V. Abrossimov, F. Armand, I. Boule, M. Gien, M. 
Guillemont, F. Herrmann, C. Kaiser, S. Langlois, P. Léonard, W. 
Neuhauser. Overview of the Chorus Distributed Operating System. Chorus 
Systèmes CS/TR-90-25, 4 1990.

[RoHe94] Kurt Rothermel, Tobias Helbig. Clock Hierarchies: An 
Abstraction for Grouping and Controlling Media Streams. Technical Report 
2/94, University of Stuttgart, 4 1994.

[StEn93] Ralf Steinmetz, Clemens Engler. Human Perception of Media 
Synchronization. Technical Report 43.9310, IBM ENC, Heidelberg, Germany, 
1993.