Universität Stuttgart

Fakultät Informatik

The Shadow Approach: An Orphan

Detection Protocol for Mobile Agents

Joachim Baumann, Kurt Rothermel

Bericht 1998/08

July 1998

The Shadow Approach: An Orphan Detec-

tion Protocol for Mobile Agents

Joachim Baumann, Kurt Rothermel

Email: Joachim.Baumann@informatik.uni-stuttgart.de

Institut für Parallele und Verteilte

Höchstleistungsrechner (IPVR)

Fakultät Informatik

Universität Stuttgart

Breitwiesenstr. 20 - 22

D-70565 Stuttgart

1 Introduction 1

The Shadow Approach: An Orphan Detection Protocol

for Mobile Agents

Joachim Baumann, Kurt Rothermel

IPVR (Institute for Parallel and Distributed High-Performance Systems) 
Breitwiesenstraße 20-22 D-70565 Stuttgart 
EMail:Joachim.Baumann@informatik.uni-stuttgart.de

Abstract. Orphan detection in distributed systems is a well researched 
field for which many solutions exist. These solutions exploit well 
defined parent-child relationships given in distributed systems. But 
they are not applicable in mobile agent systems, since no similar 
natural relationship between agents exist. Thus new protocols have to be 
developed. In this paper one such protocol for controlling mobile mobile 
agents and for orphan detection is presented. The `shadow' approach 
presented in this paper uses the idea of a placeholder (shadow) which is 
assigned by the agent system to each new agent. This defines an 
artificial relationship between agents and shadow. The shadow records 
the location of all dependent agents. Removing the root shadow implies 
that all dependent agents are declared orphan and eventually be 
terminated. We introduce agent proxies that create a path from shadow to 
every agent. In an extension of the basic protocol we additionally allow 
the shadow to be mobile. The shadow approach can be used for termination 
of groups of agents even if the exact location of each single agent is 
not known.

1 Introduction

A mobile agent is regarded as a piece of software roaming the network on 
behalf of a user, e.g. searching for information in different databases, 
buying a flight ticket and renting a car, or trying to find the cheapest 
flower shop. Mobile agents seem to be the solution to many of the 
problems in the area of distributed systems. But while the idea of 
mobile agents is quite appealing, and while many researchers are working 
in this area, some very important problems have not been solved. Most of 
the research concentrates on providing the basic system support for 
migration, communication, the security of the platform underlying the 
agent system and for the asynchronous operation of agents. Some 
solutions for these problems already exist and have been implemented in 
different agent systems (e.g. [12], [4], [8], [14], [7], [6]). But until 
now no protocols exist for orphan detection in mobile agent systems.

Orphan detection in an agent system is very important both from the 
user's and from the system side, because a running agent uses resources 
which are valuable to both user and system. The user has to pay for 
resources (at least in principle), and the system has only a limited 
amount of them. So if the user does not need the results of a 
distributed computation in progress anymore, he wants to be able to 
terminate the computation to minimize the resulting cost. With an orphan 
detection mechanism the user simply declares the agents to be terminated 
as orphans. Orphan detection guarantees that the now useless agents can 
be determined by the system and ended, thus freeing the resources they 
have bound. In this paper we will present a new protocol, the shadow 
protocol, that al-

2 The Agent Model 2

lows both control of mobile agents and orphan detection. The paper is 
organized as follows: Sect. 2 presents our agent model. In Sect. 3 the 
shadow protocol is presented with different extensions and 
optimizations. Sect. 4 presents related work, and in Sect. 5 the 
conclusion and outlook is given.

2 The Agent Model

In this section we will give you a short overview of our agent model, 
that has been described in more detail in [12], [1] and [4]. Our model 
of an agent-based system - as many other models - is mainly based on the 
concepts of agents and places. Places provide the environment for safely 
executing local as well as visiting agents. An agent system consists of 
a number of (abstract) places, being the home of various services. 
Agents are active entities, which may move from place to place to meet 
other agents and access the places' services. Each agent is identified 
by a globally unique agent identifier. An agent's identifier is 
generated by the system at agent creation time. The creating place can 
be derived from this name. It is independent of the agent's current 
place, i.e. it does not change when the agent moves to a new place. In 
other words, the applied identifier scheme provides location 
transparency. A place is entirely located on a single node of the 
underlying network, but multiple places may be situated on a given node. 
For example, a node may provide a number of places, each one assigned to 
a certain agent community, allowing access to a certain set of services 
or implementing a certain prizing policy. Places are divided into two 
types, depending on the connectivity of the underlying system. If a 
system is connected to the network all the time (barring network 
failures and system crashes), a place on this system is called 
connected. If a system is only part-time connected to the network, e.g. 
a user's PDA (Personal Digital Assistant), the place is called 
associated.

3 The Shadow Protocol

In this section we discuss the basic Shadow Protocol with its agent 
proxies, the extension that allows the shadows to be mobile, and discuss 
possible optimizations.

3.1 The Idea

In the shadow concept each application creates one or more shadows, a 
data structure on a connected place. The place where the shadow is 
created does not necessarily have to run on the same host on which the 
creating application runs. Each agent created by the application depends 
on such a shadow (Fig. 2). As long as the shadow exists in the system, 
no contact of agents to the application itself or to the computer system 
on which

Hi

migration

place B

place C application

service agent mobile agent

place A

Fig. 1. The Agent Model

comm.

Application creates

Place

Agent

Shadow

Fig. 2. The Creation of a Shadow

3 The Shadow Protocol 3

the application runs is necessary. In regular intervals (called time to 
live or ttl) the system checks for each agent if the associated shadow 
still exists. If the shadow does no longer exist (because the 
application removed it), the agent is declared to be an orphan and is 
removed.

If an agent creates a new agent, the system assigns the to this new 
agent the shadow of the creating agent, and the same remaining ttl until 
the next check (Fig. 3). This assignment cannot be changed by the 
agents. Limiting the time span to the remaining ttl of the creating 
agent (and not to the original time interval) is necessary to prevent 
malicious agents from living infinitely. Otherwise the mechanism could 
be circumvented simply by creating a new agent with again the whole ttl 
just before the life span of the old agent ends. If a place on which a 
shadow resides cannot be reached, the system tries to contact the place 
several times. If still the place cannot be reached, the shadow is 
presumed no longer existent and its associated agents are killed. The 
disadvantage of this approach is that regardless of what an agent does, 
it has to connect to its shadow's place in regular intervals. The 
advantage on the other hand is that we have a worst-case time bound for 
the termination of agents through removing the shadows. This upper bound 
is exactly the sum of ttl of the agents and the timeout for contacting.

Until now the protocol only allows passive termination. By removing a 
shadow all dependent agents are declared orphans, and after the ttl it 
is guaranteed that all agents have been removed by the orphan detection. 
By adding the path concept to this protocol, we also allow active 
termination, i.e. termination of an agent while its ttl is greater 0. 
Agent proxies are structures at each place that keep track of the 
movement of all agents dependent of a specific shadow, thus creating a 
path leading to the agent. By storing the place at which the agent got 
checked the last time we can find the beginning of a path for every 
agent. Even if the path gets lost, the agent will contact the shadow 
after the ttl.

If an agent arrives at a place where not yet an agent proxy for this 
shadow exists, one is created (Fig. 4). As soon as the agent migrates to 
another place, the destination (being part of the path leading to the 
agent) is stored in the proxy together with the ttl.

When the end of the ttl is reached, the agent's shadow gets a request 
for extending the agent's life, and thus the new place of the agent is 
made known to the shadow (Fig. 5). The path entries stored in the 
different agent proxies along the agent's way is now superfluous and can 
be removed using the knowledge about the ttl stored in the proxy. An 
entry can also be removed if the agent migrates back to this place (this 
simply optimizes the now circular path by removing the loop).

An agent proxy contains, for a specific place, all path segments of 
agents belonging to the same shadow. It exists exactly as long as there 
is a path entry in it. As soon as the agent proxy contains no more 
entries, it can be removed as well. This is especially helpful if the 
agents are actively terminated, i.e. the system actively sends messages 
to terminate the agents as fast as possible. In that case, all entries 
are removed from the agent proxy, allowing the system to delete the 
proxy as well.

creates

Place

Agent

Shadow

Place

depend Agent

Fig. 3. Creating a New Agent

3 The Shadow Protocol 4

3.2 The Protocol

We will discuss the different parts of the protocol separately. The 
protocol is presented in an objectoriented pseudo-code notation.

The place on which the agent resides, decrements in regular intervals 
the ttl of the agent. As soon as the ttl of the agent is 0, a message is 
sent back to the home place of the shadow, containing the id of agent 
and shadow. At the same time a timer is started with a timeout, and the 
agent enters the check phase (Fig. 6). To allow greater flexibility each 
shadow (and thus the group of associated agents) can have a timeout of 
its own. This allows for a loophole by setting a very long timeout. But 
this can be corrected by introducing a per-place timeout. The timeout 
finally chosen is the minimum of agent timeout and place timeout.

Fig. 4. Proxie Paths Fig. 5. Regular Update of Proxies

Shadow

Place

Agent

Shadow

Place

Agent

Agent Proxy

Agent Proxy

Shadow

Place

Agent

Agent Proxy

Shadow

Place

Agent

Regular Intervals: for each agent agent.timeToLive - -; if 
(agent.timeToLive == 0) sendCheck(agent.shadowHome, current- Place, 
agent.shadowId, agent.id); startTimer(min(place.TimeOut,agent.time- 
Out), agent.proxy, agent);

onArrival(agent) agentproxy = proxyList.find(agent.shadowId); 
if(agentproxy == null) agentproxy = new Proxy(agent.id, agent.timeTo- 
Live, agent.shadowHome, current- Place); proxyList.add(agentproxy); else
agentproxy.add(agent.agentId, agent.timeTo- Live); agent.proxy = 
agentproxy; agentList.add(agent); agent.start();

onLeaving(agent, target)

Fig. 6. System Methods

3 The Shadow Protocol 5

The check message is received by the home place of the shadow. First a 
timer is stopped that has been started the last time the ttl has been 
sent back to the agent. This allows to detect agents that have been 
terminated (see below). The ttl is requested from the responsible 
shadow, and if greater 0 is sent back by the system to the requesting 
agent. As soon as the message is received, the timer for the timeout is 
stopped, and the agent's ttl is set (see Fig. 7). This ends the agent's 
check phase and allows it to migrate again. When an agent arrives at a 
place, the list of agent proxies is searched for a proxy of that agent. 
If none exists, a new one is created, and the agent gets a reference on 
it. As soon as an agent wants to leave, its ttl is checked. This is done 
to prevent an agent who is in the check phase to migrate. If it is not 
in the check phase, the information in the agent proxy is updated to 
point to the target place. At the same time a timer is started that 
removes the path after the sum of remaining ttl and timeout (see Fig. 
6). The shadow can decide on a case-by-case basis if an agent's life 
time is to be extended, and by which interval.

In Fig. 8 we present an example policy, that for all of the agents 
returns the same ttl. This method checks first if an agent entry already 
exists for this agent (in case a newly created agent contacts the 
shadow), updates the information about the location of the agent, and 
returns the ttl. The shadow is also called if the system has detected 
(via the timeout), that an agent has been terminated. The simplest 
policy is to remove the related entry from the list. We now discuss the 
reaction to the different timeouts (see Fig. 9). One possible reaction 
to the timeout of the check message has been sketched out above. Here we 
present a

receiveCheck(from, shadowId, agentId) stopTimer(agentId); shadow = 
shadowList.find(shadowId); timeToLive = shadow.timeToLive(from, 
agentId); if (timeToLive > 0) startTimer(timeToLive + 
shadow.getTimeOut(agentId), shadow, agentId); sendAllowance(from, 
agentId, timeToLive);

receiveAllowance(agentId, timeToLive) stopTimer(agentId); agent = 
agentList.findAgent(agentId); agent.timeToLive = timeToLive; Fig. 7. The 
Check Phase

timeToLive(from, agentId, shadowId) [here an example policy is 
presented] shadowproxy = listOfProxies.find(shadowId); agententry = 
shadowproxy.get(agentId) if( agententry != null) agententry.target = 
from; else
agententry = new AgentEntry(from, agentId, timeToLive); shadowproxy.add 
(agententry); return agententry.timeToLive;

remove(agentId) [implement policy] agentproxy = 
listOfProxies.find(agentId); Fig. 8. Methods in the Shadow Object

onTimer(proxy, agent) // check timeout [here an example policy is 
presented] agentList.remove(agent); agentproxy.remove(agentId); 
if(agentproxy.entries() == 0) proxyList.remove(agentproxy);

onTimer(agentproxy, agentId)// path redundant [implement policy] 
agentproxy.remove(agentId); if(agentproxy.entries() == 0) 
proxyList.remove(agentproxy);

onTimer(shadow, agentId) // ag. terminated Fig. 9. System: Reaction to 
Timeouts

3 The Shadow Protocol 6

simple alternative; the agent is removed at once. The next timeout 
affects the paths. As soon as an agent migrates, the path segment 
pointing to its new location is created, and a timer started. As soon as 
this timer ends, we know that the path information in the shadow itself 
has been updated, and this part of the path can safely be removed. The 
last method is called if an agent has not tried to contact the shadow 
for the sum of ttl and timeout. In this case the agent has terminated. 
The shadow method (see Fig. 9) is called to react to it.

Finding Agents. If we want to actively terminate a specific agent, we 
have to find it first. This can be done with the help of the information 
stored in the agent proxies. If the agent is in the local list of active 
agents, it is already found. If not, the related agent proxy is 
searched. If it is not found, an error is returned. If it is discovered, 
a find request is sent to the target found in the proxy. At the target 
place the list of active agents is again examined. If the agent is 
found, a success message is sent back. If not, the related agent proxy 
is searched again. If no proxy exists, an error is sent back. Otherwise, 
the message is sent on. This is repeated until the agent is found or the 
path ends (see Fig. 10).

3.3 Mobile Shadows

In cases where many of the agents depending on a shadow move somewhere 
far away (i.e. communication costs are high), every one of the agents 
has to contact the shadow independently, resulting in unnecessarily high 
communication costs. If the migration behaviour is known in advance, the 
shadow can be placed in a way that reduces the communication cost. But 
in many cases the behaviour is not known in advance, or the group moves 
as a whole from area to area (e.g. from one organization to another). In 
these cases it would be much better if the shadow moved with the agents. 
Possible policies where to place the shadow could be:

- at a place where the communication cost to all dependent agents would 
be lowest.

- where one agent important for the computation is situated. If the 
place becomes unavailable (e.g. crashes), both shadow and agent would 
not be reachable, and the other dependent agents would be terminated.

While in the first case the shadow would have to be persistent, in the 
second case it would have to be transient to implement the policy.

find(agentId) if (agentList.find(agentId) != null) return(this); 
agentproxy = shadowList.find(agentId); if(agentproxy != null) 
sendFind(agentproxy.target(agentId), this, agentId); else
return(notFoundError);

receiveFind(searcher, agentId) if (agentList.find(agentId) != null) 
sendFound(searcher, this, agentId); if((agtproxy = 
proxyList.find(agentId)) != null) sendFind(agtproxy.target(agentId), 
searcher, agentId); else
sendError(searcher, notFoundError, agentId);

receiveFound(from, agentId) return(from);

receiveError(error, agentId)

Fig. 10. Finding Agents

3 The Shadow Protocol 7

To move a shadow two problems have to be dealt with. The first is that 
the agents depending on the shadow have somehow to be notified about the 
new location of the shadow. The second is that the application still has 
to be able to reach the shadow, e.g. in case it wants to terminate the 
agents. Both problems can be solved similar to the approach used with 
the agent proxies. When a shadow moves, a shadow proxy stays behind. 
Thus over time a shadow path is built. By contacting the copy at the 
home place in regular intervals this path can be cut short. As 
alternative to intervals at which to cut the path short, a maximum path 
length would be suitable. But using a maximum path length adds 
communication along the path, because as soon as the maximum path length 
has been reached the shadow proxies along the path have to be notified 
that they are no longer needed. A combination of these policies seems 
the most flexible.

Now, when an agent requests a new ttl, the shadow might already have 
moved somewhere else. In this case, the request is sent to the new place 
of the shadow. If the shadow already has moved again, the request is 
forwarded along the path of shadow proxies until the shadow itself is 
reached. The shadow sends a new grant back to the agent together with 
its new place. The next time the agent sends its request directly to the 
new place.

The shadow proxies can be removed as soon as the path is no longer 
needed and no agent still has the reference to a shadow proxy. Thus the 
maximum of agent and shadow ttl is the maximum time the proxy has to be 
hold. One exception has to be made though. The first proxy, that stays 
at home, cannot be removed as long as the shadow is elsewhere.

The Protocol. We first examine the shadow part of the protocol. Moving 
the shadow to another place creates a path to the target and starts a 
timer. After the timeout of this timer the path has to be deleted. The 
path is created by leaving a shadow proxy behind. Removing the shadow is 
done by sending a message along the path (see Fig. 11). Each shadow gets 
a ttl, after which it must contact its home place. This time is not 
necessarily the same as for the agents.

In regular intervals this ttl is decremented. As soon as the shadow's 
ttl is 0, the shadow enters the check phase. A message containing the 
shadow id and its current place is sent to the home place and a timer is 
started (see Fig. 12). The check message for the shadow contains the new 
place of the shadow. If the shadow proxy at home still exists, it is 
updated and the ttl is sent back. If the answer

move(target) if (timeToLive != 0) sendShadow(target, this); 
if(currentPlace != null) // part of path pathTimeOut = timeToLive + 
timeOut; startTimer(pathTimeOut, shadow); currentPlace = target;

terminateShadow() if (currentPlace != null) // shadow moved 
sendTerminate(currentPlace, id);

Fig. 11. Additional Shadow Methods

Regular Intervals: [agent related part stays the same] for each shadow 
if (shadow.homePlace != place.name()) shadow.timeToLive--; if 
(shadow.timeToLive == 0) sendCheck(shadow.homePlace, shadow.id); 
startTimer(shadow.timeOut, Fig. 12. Extended System Methods: Regular 
Intervals

3 The Shadow Protocol 8

is not received until the timeout, the shadow is removed (more complex 
reactions with retries can be chosen instead).

As soon as it is received, the timer is stopped and the ttl is set (see 
Fig. 13). The shadow proxies creating the path between home place and 
shadow get a similar timeout after the sum of ttl of the shadow, of the 
agent (see below) and the communication timeout. At that point the path 
is redundant and can be removed (see below). This way the path created 
by the shadow is cut short in regular intervals. If the shadow comes 
back to its home place, the shadow proxy is replaced by the original.

In the basic protocol the agent check message is sent to the shadow's 
home place. Now it is sent to the place from which the last ttl message 
has been received. This is done by storing it in an additional 
attribute. If the shadow moves between two such messages, the check 
message is sent to a shadow proxy (somewhere on the path) instead of the 
original. The shadow proxy now forwards this agent check message along 
the path. The original, upon receiving the message, sends back the ttl 
and its own place. The path is superfluous as soon as the shadow's place 
is known at the home place and no agent still references a part of it 
(see Fig. 14).

onTimer(shadow) // this path seg. is redundant 
shadowList.remove(shadow);

receiveAllowance(shadowId, timeToLive) shadow = 
shadowList.find(shadowId); stopTimer(shadow); shadow.timeToLive = 
timeToLive;

receiveCheck(from, shadowId) shadow = shadowList.find(shadowId); 
if(shadow != null) shadow.currentPlace = place; sendAllowance(from, 
shadowId, Fig. 13. Additional System Methods: Checking the Shadow

receiveCheck(from, shadowId, agentId) stopTimer(agentId); 
if(currentPlace != place.name()) sendCheck(currentPlace, from, shadowId, 
agentId); else
shadow = shadowList.find(shadowId); timeToLive = shadow.timeToLive(from, 
agentId); if (timeToLive > 0) startTimer(timeToLive + 
shadow.getTimeOut(agentId), shadow, agentId); sendAllowance(from, 
place.name(), agentId, timeToLive);

receiveAllowance(shadowPlace, agentId, timeToLive) stopTimer(agentId); 
agent = agentList.findAgent(agentId); agent.timeToLive = timeToLive;

Fig. 14. Changed System Methods: Extending the Agent's Life

3 The Shadow Protocol 9

Together with sending back the ttl to the agent the shadow starts a 
timer. If after this timeout the agent did not send a check message, the 
shadow knows that the agent has terminated. But since the timeout is 
detected at a place and not inside the shadow, the information might 
only reach a proxy and not the shadow itself. In this case the shadow 
has to be informed. Thus a message is sent along the path containing the 
information that the agent has terminated. Every proxy sends the 
information onward until it reaches the shadow. Now the agent entry is 
removed (see Fig. 15).

3.4 Optimizing the Communication

As soon as more than one agent belongs to a shadow, optimizations of the 
communication are possible. Three optimizations exist:

- If two agents belonging to the same shadow come to the same place, the 
ttl of the one with the lower remaining time interval is set to the ttl 
of the other one. This works with an arbitrarily large number of agents 
on a place and happens conveniently at the arrival of a new agent.

- If an agent's shadow has been checked, then this information also gets 
transferred to all other agents belonging to the same shadow on the same 
place as the agent.

- The combination of shadow and agent proxies creates a spanning tree 
that follows the agents' movements with the shadow as the root. The tree 
can be optimized by simply using common paths for the parts of the paths 
that are the same for different agents. This effectively reduces the 
number of messages that flow without changing the functionality. 
Furthermore, the agents on nodes along the tree can be updated 
simulataneously.

The proxies allow to find an agent, e.g. to terminate it actively. But 
with all of the mentioned optimizations the path to a specific agent can 
be lost. This can happen if an agent gets additional ttl from another 
agent, and the path assuming the original ttl is removed. The 
optimizations make it impossible to terminate a specific agent.

The interesting point though is that this doesn't matter for the 
termination of the whole group of agents. If the termination message is 
sent to all known proxies, then these proxies forward the termination 
message along all of the paths they are part of. Ultimately this 
termination message reaches all of the agents, even those no longer 
directly known to the shadow. The path segment for an agent exists 
exactly for the current ttl of the agent. So if it got additional time, 
then at that place the agent proxy holds the path from that place for 
that remaining time. Every time an agent gets additional time from 
another

onTimer(shadow, agentId) // agent terminated shadow.remove(agentId); if 
(shadow.currentPlace != place.name() ) sendRemoved( currentPlace, 
shadowId, agentId);

receiveRemoved(shadowId, agentId) shadow = shadowList.find(shadowId); 
if(shadow != null) if(shadow.currentPlace != place.name()) sendRemoved( 
currentPlace, shadowId, agentId); else
shadow = shadowList.find(shadowId); Fig. 15. Changed System Methods: 
Detecting Terminated Agents

3 The Shadow Protocol 10

agent, there exists a valid path to that other agent. So, by first 
following the path to the other agent, and then the still valid path to 
our agent, every agent gets the termination message. This way, all of 
the mentioned optimizations can be used without compromising 
functionality for the group as a whole.

3.5 Fault Tolerance

Our fault model contains two types of failures, node failures 
(fail-stop) and network partitions. It is important to note that from 
the viewpoint of a node these failures are not distinguishable. By 
introducing a path of proxies the fault sensitivity of the protocol is 
increased. If only one of the nodes containing a proxy is not reachable, 
either through node failure or network partitioning, the path is broken. 
Different mechanisms have to be used for the two different kinds of 
paths. While in the case of a broken agent proxy path only one agent is 
no longer reachable until its ttl is 0, in the case of a broken shadow 
proxy path the agents trying to extend their life are threatened. The 
mechanism employed for the agent proxy paths has already been presented 
in Sect. 3, and is only discussed briefly. The mechanism used for shadow 
proxy paths has not yet been discussed in the protocol section and is 
examined in the following in detail.

Agent Proxy Path. By introducing the ttl, after which the agent has to 
contact the shadow's place, it is guaranteed that even if the path is 
broken, the new location of the agent can be identified after the ttl 
(as a worst-case bound), as long as either the network partition is 
short-term, or agent place and shadow place are in the same partition. 
If after the ttl (plus the timeout) the agent has not contacted the 
shadow, the shadow knows that the agent does not exist any longer 
(either because it has terminated or has been declared orphan and 
removed by the system).

Shadow Proxy Path. Two strategies are possible for dealing with a broken 
shadow proxy path. The first strategy does not change the 
characteristics of the protocol, but manages only short-term failures. 
It lets the last shadow proxy of the still-existing path try to contact 
the next shadow proxy again. The problem though is that the new ttl has 
to be sent to the agent before the system decides to terminate it.

The second strategy allows for longer failures but changes the 
worst-case bound for passive termination of the agents (the worst-case 
bound is 2ttl  in this variant). If the last shadow proxy detects the 
break, it sends a new ttl back to the agent, but with the home place of 
the shadow as the new location. The new ttl is the minimum of the 
remaining shadow ttl and the agent ttl. If the shadow would have been 
removed, then the shadow proxy would know about it (and would have been 
removed as well). Thus the shadow still exists and it is correct to send 
the allowance. The home place of the shadow is sent instead of the 
location of the next shadow proxy in the path, to guarantee that the 
agent has a valid place to send the request for the next ttl. If the ttl 
of the agent is shorter than the remaining time of the shadow proxy 
path, then the next request will be sent along the same path (that 
hopefully is connected again). If the ttl of the path is shorter, then 
the agent will contact the home place of the shadow when the shadow 
itself has requested a new ttl. This means that the home place holds the 
new location of the shadow and forwards the request correctly.

4 Related Work 11

4 Related Work

In the area of mobile agent systems the current research concentrates on 
the basic system support. But now that many different agent systems 
existing support the functionality needed to realize applications, 
mechanisms providing the functionality presented in this paper are 
essential. Thus the problem areas of orphan detection and termination of 
agents are beginning to evoke the interest of the research community. 
But apart from the mechanisms developed at the University of Stuttgart 
(see [5] describing a group concept or [2] discussing an energy concept 
and a path concept) no publications present similar functionality for 
mobile agent systems. However, in the area of distributed systems many 
algorithms exist that solve similar problems. The area of distributed 
algorithms, and especially distributed termination detection (in [9] and 
in [13] a discussion of many algorithms can be found) and distributed 
garbage collection (one example is the work on Stub Scion Pair Chains 
[11]), has to be seen as related work.

But two differences prevent the use of these algorithms for mobile agent 
systems. First of all, the fault model is different. The possibility of 
network partitions or node crashes does not exist in the fault model 
used for most distributed algorithms. Mobile agent systems explicitly 
include these faults in their fault model. Furthermore, the fault model 
supports the asynchrony of agents. The second difference is the autonomy 
of the ?objects? in question that very much influences the processing 
model. A process (or object) in the distributed system area is not 
normally seen as autonomous. Here a process is seen as a cooperating 
part of a larger application. For a mobile agent the autonomy is one of 
the important prerequisites. This autonomy leads to the problem that a 
malicious agent might try to remove itself from the control by the 
system. These differences make it impossible to use the existing 
distributed algorithms in the area of mobile agent systems. It might be 
possible to use one such algorithm as the basis for a new design 
tailored to the needs of mobile agent systems. But the changes in the 
fault model and in the processing model effect so many changes in the 
algorithm itself that a correct transformation would be problematic at 
best. Nevertheless we believe that in principle it is possible to 
transform these algorithms correctly into algorithms that take the 
peculiarities of mobile agent systems into account. The key to this is 
an automatic transformation that, used on e.g. an algorithm for 
distributed garbage collection, turns it into a orphan detection and / 
or termination algorithm for mobile agent systems. An analogon to such 
an algorithm exists for the automatic transformation of termination 
detection algorithms into distributed garbage collection algorithms 
[10].

5 Conclusion and Future Work

In this paper we presented the shadow protocol. The shadow protocol has 
still some disadvantages: it introduces additional communication into 
the system and resources (memory) are bound to store the different path 
information. But the advantages outweigh the disadvantages by far: the 
mechanism is robust against malicious or faulty agents, the path 
information is updated without additional communication costs (no 
outdated path information exists), and the time until all agents are 
terminated in the worst case can be determined exactly. The presented 
protocol has been implemented in our agent system Mole (for a 
description of Mole see [12], [1], and [4]).

5 Conclusion and Future Work 12

We will examine the area of fault tolerance in detail. The presented 
mechanism is robust against short time network partitioning and system 
faults, but does not cope well with lasting faults. We will investigate 
in which way the shadow concept can be made fault resilient by 
replication of the control structures.

Acknowledgements: Parts of the protocol have been implemented by M. 
Zepf. The comments of F. Hohl, M. Schwehm and M. Straßer improved the 
quality of the paper.

A The Protocol

In this appendix the protocols are listed as a whole. Each of them is 
presented as methods in a pseudo object-oriented fashion. Some basic 
object types, e.g. lists are assumed existing. Methods can be called 
asynchronously, e.g.

- startTimer(time, agentId) will call a method onTimer(agentId) after 
time.

- sendMessage(place , ?Hello?) is sent to place and calls 
receiveMessage(?Hello?).

Methods bodies containing [here an example policy is presented] can be 
used to implement a specific policy, e.g. for reacting to a message 
asking for more energy (in the energy concept). The presented 
implementation is one of the possible policies.

A.1 Objects needed

Needed Objects

Object Shadow Method timeToLive(AgentId); Method remove(AgentId); 
Attribute listOfProxies:List of AgentProxy; Attribute 
timeToLive:Integer; Attribute timeOut:Integer; ...

Object AgentProxy Attribute id:AgentId; Attribute timeToLive:Integer; 
Attribute target:placeName;

Object ShadowProxy Attribute agents:List of AgentProxy;

Object Agent Attribute id:AgentId; Attribute timeToLive:Integer; 
Attribute timeOut:Integer; Attribute proxy:ShadowProxy; Attribute 
shadowId:ShadowId; Attribute shadowHome:PlaceName; Attribute 
home:PlaceName; ...


5 Conclusion and Future Work 13

A.2 Methods in the Object Shadow

Shadow

timeToLive(from, agentId) [here an example policy is presented] 
agentProxy = listOfProxies.find(agentId); if(agentProxy != null) 
agentProxy.target = from; else
agentProxy = new AgentProxy(from, agentId, timeToLive); return 
agentProxy.timeToLive;

remove(agentId) agentProxy = listOfProxies.find(agentId); 
listOfProxies.remove(agentProxy);

A.3 Basic Protocol with Proxies

Place: Methods

Regular Intervals: for each agent agent.timeToLive - -; if 
(agent.timeToLive == 0) sendCheck(agent.shadowHome, currentPlace, 
agent.shadowId, agent.id); startTimer(min(localTimeOut, agent.timeOut), 
agent.proxy, agent);

onArrival(agent) agentproxy = proxyList.find(agent.shadowId); 
if(agentproxy == null) agentproxy = new Proxy(agent.id, 
agent.timeToLive, agent.shadowHome, currentPlace); 
proxyList.add(agentproxy); else
agentproxy.add(agent.agentId, agent.timeToLive); agent.proxy = 
agentproxy; agentList.add(agent); agent.start();

onLeaving(agent, target) if (agent.timeToLive > 0) 
agentList.remove(agent); agent.proxy.setTarget(agent.id, target)); 
startTimer(agent.timeToLive + agent.timeOut, agent.proxy, agent.id); 
SendAgent(target, agent); else
SendException (agent);

onTimer(agentproxy, agentId) // time for the proxy path has ended 
agentproxy.remove(agentId); if(agentproxy.entries() == 0) 
proxyList.remove(agentproxy);

5 Conclusion and Future Work 14

onTimer(agentproxy, agent) // allowance has not been sent in time [here 
an example policy is presented] agentList.remove(agent); 
agentproxy.remove(agentId); if(agentproxy.entries() == 0) 
proxyList.remove(agentproxy);

onTimer(shadow, agentId) // agent has been killed due to timeout 
shadow.remove(agentId);

receiveAllowance(agentId, timeToLive) stopTimer(agentId); agent = 
agentList.findAgent(agentId); agent.timeToLive = timeToLive; 
proxyList.setTime(agentId, timeToLive);

receiveCheck(from, shadowId, agentId) stopTimer(agentId); shadow = 
shadowList.find(shadowId); timeToLive = shadow.timeToLive(from, 
agentId); if (timeToLive > 0) startTimer(timeToLive + 
shadow.getTimeOut(agentId), shadow, agentId); sendAllowance(from, 
agentId, timeToLive);

createAgent(creatingAgent, AgentClass, parameterList) newAgent = new 
AgentClass(parameterList); newAgent.timeToLive = 
creatingAgent.timeToLive; newAgent.timeOut = creatingAgent.timeOut; 
newAgent.shadowId = creatingAgent.shadowId; newAgent.shadowHome = 
creatingAgent.shadowHome; onArrival(newAgent);

A.4 Finding Agents

Place: Methods

find(agentId) if (agentList.find(agentId) != null) return(currentPlace); 
if(shadowList.find(agentId) != null) 
sendFind(agentproxy.target(agentId), currentPlace, agentId); else
return(notFoundError);

receiveFind(searcher, agentId) if (agentList.find(agentId) != null) 
sendFound(searcher, currentPlace, agentId); if(proxyList.find(agentId) 
!= null) sendFind(proxy.target(agentId), searcher, agentId); else
sendError(searcher, notFoundError, agentId);

5 Conclusion and Future Work 15

receiveFound(from, agentId) return(from);

receiveError(error, agentId) if (error == notFoundError) return(error);

A.5 Mobile Shadows

Shadow: Additional Attributes

Object Shadow Method move(placeName); Attribute currentPlace:PlaceName; 
// null if shadow is at home Attribute homePlace:PlaceName; Attribute 
timeToLive:Integer;

Shadow: Additional Methods

move(target) if(timeToLive != 0) sendShadow(target, this); 
if(currentPlace != null) // we are part of the path 
startTimer(timeToLive + timeOut, shadow); currentPlace = target; 
terminateShadow() if (currentPlace != null) // the shadow has moved out 
sendTerminate(currentPlace, id); delete(this);

Place: Additional and Extended Methods

Regular Intervals: for each agent agent.timeToLive - -; if 
(agent.timeToLive == 0) sendCheck(agent.shadowHome, this, 
agent.shadowId, agent.id); startTimer(min(localTimeOut, agent.timeOut), 
agent.proxy, agent); for each shadow if (shadow.homePlace != 
place.name()) // only if not at home place shadow.timeToLive--; if 
(shadow.timeToLive == 0) sendCheck(shadow.homePlace, shadow.id); 
startTimer(shadow.timeOut, shadow);

onTimer(shadow, agentId) // agent has been killed 
shadow.remove(agentId); // due to timeout if (shadow.currentPlace != 
place.name() ) sendRemoved(currentPlace, shadowId, agentId);

onTimer(shadow) shadowList.remove(shadow); // shadow path can be removed

5 Conclusion and Future Work 16

receiveAllowance(shadowPlace, agentId, timeToLive) stopTimer(agentId); 
agent = agentList.findAgent(agentId); agent.timeToLive = timeToLive; 
agent.shadowHome = shadowPlace; proxyList.setTime(agentId, timeToLive);

receiveAllowance(shadowId, timeToLive) shadow = 
shadowList.find(shadowId); stopTimer(shadow); shadow.timeToLive = 
timeToLive;

receiveCheck(from, shadowId, agentId) stopTimer(agentId); 
if(currentPlace != place.name()) // not the current shadow 
sendCheck(currentPlace, from, shadowId, agentId); else
shadow = shadowList.find(shadowId); timeToLive = shadow.timeToLive(from, 
agentId); if (timeToLive > 0) startTimer(timeToLive + 
shadow.getTimeOut(agentId), shadow, agentId); sendAllowance(from, 
place.name(), agentId, timeToLive);

receiveCheck(from, shadowId) shadow = shadowList.find(shadowId); 
if(shadow != null) shadow.currentPlace = place; sendAllowance(from, 
shadowId, shadow.timeToLive);

receiveShadow(shadow) if(shadow.timeToLive != 0) if(shadow.homePlace != 
place.name()) shadow.currentPlace = place.name(); 
shadowList.add(shadow); else // shadow comes back home 
shadowList.find(shadow.shadowId); shadowList.remove(orig_shadow); 
shadowList.add(shadow); shadow.currentPlace = null;

receiveRemoved(shadowId, agentId) shadow = shadowList.find(shadowId); 
if(shadow != null) if(shadow.currentPlace != place.name()) 
sendRemoved(currentPlace, shadowId, agentId); else
shadow = shadowList.find(shadowId); shadow.remove(agentId);

5 Conclusion and Future Work 17

B References

1. J. Baumann, F. Hohl, N. Radouniklis, K. Rothermel, M. Straßer. 
?Communication Concepts for Mobile Agent Systems?, in Mobile Agents `97, 
LNCS 1219, Springer-Verlag, pp. 123 - 135, 1997. 2. J. Baumann. ?A 
Protocol for Orphan Detection and Termination in Mobile Agent Systems?, 
Tech. Report 1997/09, Fac. of Computer Science, U. of Stuttgart, 1997. 
3. J. Baumann, F. Hohl, K. Rothermel, M. Straßer. ?Mole - Concepts of a 
Mobile Agent System?, in WWW Journal, Special Issue on Software Agents, 
to appear. 4. J. Baumann, N. Radouniklis. ?Agent Groups for Mobile Agent 
Systems?, in Distributed Applications and Interoperable Systems, H. 
König et al., Eds., Chapman & Hall, pp. 74 - 85, 1997. 5. J. Baumann, C. 
Tschudin, J. Vitek. ?Mobile Object Systems: Workshop Summary?, Workshop 
Proceedings for the 2nd Workshop on Mobile Object Systems, in Workshop 
Reader ECOOP '96, d-punkt.verlag, pp. 301 - 308, 1996. 6. General Magic, 
?Odyssey Web Site?. URL: http://www.genmagic.com/agents/ 7. IBM. ?The 
Aglets Workbench?. URL: http://www.trl.ibm.co.jp/aglets/ 8. F. Mattern. 
?Verteilte Algorithmen?, Springer-Verlag, 1989. 9. G. Tel, F. Mattern. 
?The Derivation of Distributed Termination Detection Algorithms from 
Garbage Collection Schemes.?, ACM TOPLAS 15:1, pp. 1-35, 1993. 10. M. 
Shapiro, P. Dickman, D. Plainfoss?. ?SSP Chains: Robust, Distributed 
References supporting acyclic Garbage Collection?, Tech. Report No. 
1799, INRIA, Rocquencourt, Frankreich, 1992. 11. M. Straßer, J. Baumann, 
F. Hohl. ?Mole - A Java Based Mobile Agent System?, in Workshop Reader 
ECOOP '96, d-punkt, pp. 327 - 334, 1996. 12. G. Tel. ?Distributed 
Algorithms?, Cambridge University Press, 1994. 13. J. E. White. 
?Telescript Technology: The Foundation of the Electronic Marketplace?, 
General Magic, 1994.