3 The Shadow Protocol 10
agent, there exists a valid path to that other agent. So, by first following the path to the other agent, and then the still valid path to our agent, every agent gets the termination message. This way, all of the mentioned optimizations can be used without compromising functionality for the group as a whole.
3.5 Fault Tolerance
Our fault model contains two types of failures, node failures (fail-stop) and network partitions. It is important to note that from the viewpoint of a node these failures are not distinguishable. By introducing a path of proxies the fault sensitivity of the protocol is increased. If only one of the nodes containing a proxy is not reachable, either through node failure or network partitioning, the path is broken. Different mechanisms have to be used for the two different kinds of paths. While in the case of a broken agent proxy path only one agent is no longer reachable until its ttl is 0, in the case of a broken shadow proxy path the agents trying to extend their life are threatened. The mechanism employed for the agent proxy paths has already been presented in Sect. 3, and is only discussed briefly. The mechanism used for shadow proxy paths has not yet been discussed in the protocol section and is examined in the following in detail.
Agent Proxy Path. By introducing the ttl, after which the agent has to contact the shadow's place, it is guaranteed that even if the path is broken, the new location of the agent can be identified after the ttl (as a worst-case bound), as long as either the network partition is short-term, or agent place and shadow place are in the same partition. If after the ttl (plus the timeout) the agent has not contacted the shadow, the shadow knows that the agent does not exist any longer (either because it has terminated or has been declared orphan and removed by the system).
Shadow Proxy Path. Two strategies are possible for dealing with a broken shadow proxy path. The first strategy does not change the characteristics of the protocol, but manages only short-term failures. It lets the last shadow proxy of the still-existing path try to contact the next shadow proxy again. The problem though is that the new ttl has to be sent to the agent before the system decides to terminate it.
The second strategy allows for longer failures but changes the worst-case bound for passive termination of the agents (the worst-case bound is 2ttl in this variant). If the last shadow proxy detects the break, it sends a new ttl back to the agent, but with the home place of the shadow as the new location. The new ttl is the minimum of the remaining shadow ttl and the agent ttl. If the shadow would have been removed, then the shadow proxy would know about it (and would have been removed as well). Thus the shadow still exists and it is correct to send the allowance. The home place of the shadow is sent instead of the location of the next shadow proxy in the path, to guarantee that the agent has a valid place to send the request for the next ttl. If the ttl of the agent is shorter than the remaining time of the shadow proxy path, then the next request will be sent along the same path (that hopefully is connected again). If the ttl of the path is shorter, then the agent will contact the home place of the shadow when the shadow itself has requested a new ttl. This means that the home place holds the new location of the shadow and forwards the request correctly.
Seite 10: vorherige Seite | nächste Seite | Übersicht
4 Related Work 11
4 Related Work
In the area of mobile agent systems the current research concentrates on the basic system support. But now that many different agent systems existing support the functionality needed to realize applications, mechanisms providing the functionality presented in this paper are essential. Thus the problem areas of orphan detection and termination of agents are beginning to evoke the interest of the research community. But apart from the mechanisms developed at the University of Stuttgart (see [5] describing a group concept or [2] discussing an energy concept and a path concept) no publications present similar functionality for mobile agent systems. However, in the area of distributed systems many algorithms exist that solve similar problems. The area of distributed algorithms, and especially distributed termination detection (in [9] and in [13] a discussion of many algorithms can be found) and distributed garbage collection (one example is the work on Stub Scion Pair Chains [11]), has to be seen as related work.
But two differences prevent the use of these algorithms for mobile agent systems. First of all, the fault model is different. The possibility of network partitions or node crashes does not exist in the fault model used for most distributed algorithms. Mobile agent systems explicitly include these faults in their fault model. Furthermore, the fault model supports the asynchrony of agents. The second difference is the autonomy of the ?objects? in question that very much influences the processing model. A process (or object) in the distributed system area is not normally seen as autonomous. Here a process is seen as a cooperating part of a larger application. For a mobile agent the autonomy is one of the important prerequisites. This autonomy leads to the problem that a malicious agent might try to remove itself from the control by the system. These differences make it impossible to use the existing distributed algorithms in the area of mobile agent systems. It might be possible to use one such algorithm as the basis for a new design tailored to the needs of mobile agent systems. But the changes in the fault model and in the processing model effect so many changes in the algorithm itself that a correct transformation would be problematic at best. Nevertheless we believe that in principle it is possible to transform these algorithms correctly into algorithms that take the peculiarities of mobile agent systems into account. The key to this is an automatic transformation that, used on e.g. an algorithm for distributed garbage collection, turns it into a orphan detection and / or termination algorithm for mobile agent systems. An analogon to such an algorithm exists for the automatic transformation of termination detection algorithms into distributed garbage collection algorithms [10].
5 Conclusion and Future Work
In this paper we presented the shadow protocol. The shadow protocol has still some disadvantages: it introduces additional communication into the system and resources (memory) are bound to store the different path information. But the advantages outweigh the disadvantages by far: the mechanism is robust against malicious or faulty agents, the path information is updated without additional communication costs (no outdated path information exists), and the time until all agents are terminated in the worst case can be determined exactly. The presented protocol has been implemented in our agent system Mole (for a description of Mole see [12], [1], and [4]).
Seite 11: vorherige Seite | nächste Seite | Übersicht
5 Conclusion and Future Work 12
We will examine the area of fault tolerance in detail. The presented mechanism is robust against short time network partitioning and system faults, but does not cope well with lasting faults. We will investigate in which way the shadow concept can be made fault resilient by replication of the control structures.
Acknowledgements: Parts of the protocol have been implemented by M. Zepf. The comments of F. Hohl, M. Schwehm and M. Straßer improved the quality of the paper.
A The Protocol
In this appendix the protocols are listed as a whole. Each of them is presented as methods in a pseudo object-oriented fashion. Some basic object types, e.g. lists are assumed existing. Methods can be called asynchronously, e.g.
- startTimer(time, agentId) will call a method onTimer(agentId) after time.
- sendMessage(place , ?Hello?) is sent to place and calls receiveMessage(?Hello?).
Methods bodies containing [here an example policy is presented] can be used to implement a specific policy, e.g. for reacting to a message asking for more energy (in the energy concept). The presented implementation is one of the possible policies.
A.1 Objects needed
Needed Objects
Object Shadow Method timeToLive(AgentId); Method remove(AgentId); Attribute listOfProxies:List of AgentProxy; Attribute timeToLive:Integer; Attribute timeOut:Integer; ...
Object AgentProxy Attribute id:AgentId; Attribute timeToLive:Integer; Attribute target:placeName;
Object ShadowProxy Attribute agents:List of AgentProxy;
Object Agent Attribute id:AgentId; Attribute timeToLive:Integer; Attribute timeOut:Integer; Attribute proxy:ShadowProxy; Attribute shadowId:ShadowId; Attribute shadowHome:PlaceName; Attribute home:PlaceName; ...
Seite 12: vorherige Seite | nächste Seite | Übersicht
5 Conclusion and Future Work 13
A.2 Methods in the Object Shadow
Shadow
timeToLive(from, agentId) [here an example policy is presented]
agentProxy = listOfProxies.find(agentId); if(agentProxy != null)
agentProxy.target = from; else
agentProxy = new AgentProxy(from, agentId, timeToLive); return
agentProxy.timeToLive;
remove(agentId) agentProxy = listOfProxies.find(agentId); listOfProxies.remove(agentProxy);
A.3 Basic Protocol with Proxies
Place: Methods
Regular Intervals: for each agent agent.timeToLive - -; if (agent.timeToLive == 0) sendCheck(agent.shadowHome, currentPlace, agent.shadowId, agent.id); startTimer(min(localTimeOut, agent.timeOut), agent.proxy, agent);
onArrival(agent) agentproxy = proxyList.find(agent.shadowId);
if(agentproxy == null) agentproxy = new Proxy(agent.id,
agent.timeToLive, agent.shadowHome, currentPlace);
proxyList.add(agentproxy); else
agentproxy.add(agent.agentId, agent.timeToLive); agent.proxy =
agentproxy; agentList.add(agent); agent.start();
onLeaving(agent, target) if (agent.timeToLive > 0)
agentList.remove(agent); agent.proxy.setTarget(agent.id, target));
startTimer(agent.timeToLive + agent.timeOut, agent.proxy, agent.id);
SendAgent(target, agent); else
SendException (agent);
onTimer(agentproxy, agentId) // time for the proxy path has ended agentproxy.remove(agentId); if(agentproxy.entries() == 0) proxyList.remove(agentproxy);
Seite 13: vorherige Seite | nächste Seite | Übersicht
5 Conclusion and Future Work 14
onTimer(agentproxy, agent) // allowance has not been sent in time [here an example policy is presented] agentList.remove(agent); agentproxy.remove(agentId); if(agentproxy.entries() == 0) proxyList.remove(agentproxy);
onTimer(shadow, agentId) // agent has been killed due to timeout shadow.remove(agentId);
receiveAllowance(agentId, timeToLive) stopTimer(agentId); agent = agentList.findAgent(agentId); agent.timeToLive = timeToLive; proxyList.setTime(agentId, timeToLive);
receiveCheck(from, shadowId, agentId) stopTimer(agentId); shadow = shadowList.find(shadowId); timeToLive = shadow.timeToLive(from, agentId); if (timeToLive > 0) startTimer(timeToLive + shadow.getTimeOut(agentId), shadow, agentId); sendAllowance(from, agentId, timeToLive);
createAgent(creatingAgent, AgentClass, parameterList) newAgent = new AgentClass(parameterList); newAgent.timeToLive = creatingAgent.timeToLive; newAgent.timeOut = creatingAgent.timeOut; newAgent.shadowId = creatingAgent.shadowId; newAgent.shadowHome = creatingAgent.shadowHome; onArrival(newAgent);
A.4 Finding Agents
Place: Methods
find(agentId) if (agentList.find(agentId) != null) return(currentPlace);
if(shadowList.find(agentId) != null)
sendFind(agentproxy.target(agentId), currentPlace, agentId); else
return(notFoundError);
receiveFind(searcher, agentId) if (agentList.find(agentId) != null)
sendFound(searcher, currentPlace, agentId); if(proxyList.find(agentId)
!= null) sendFind(proxy.target(agentId), searcher, agentId); else
sendError(searcher, notFoundError, agentId);
Seite 14: vorherige Seite | nächste Seite | Übersicht
5 Conclusion and Future Work 15
receiveFound(from, agentId) return(from);
receiveError(error, agentId) if (error == notFoundError) return(error);
A.5 Mobile Shadows
Shadow: Additional Attributes
Object Shadow Method move(placeName); Attribute currentPlace:PlaceName; // null if shadow is at home Attribute homePlace:PlaceName; Attribute timeToLive:Integer;
Shadow: Additional Methods
move(target) if(timeToLive != 0) sendShadow(target, this); if(currentPlace != null) // we are part of the path startTimer(timeToLive + timeOut, shadow); currentPlace = target; terminateShadow() if (currentPlace != null) // the shadow has moved out sendTerminate(currentPlace, id); delete(this);
Place: Additional and Extended Methods
Regular Intervals: for each agent agent.timeToLive - -; if (agent.timeToLive == 0) sendCheck(agent.shadowHome, this, agent.shadowId, agent.id); startTimer(min(localTimeOut, agent.timeOut), agent.proxy, agent); for each shadow if (shadow.homePlace != place.name()) // only if not at home place shadow.timeToLive--; if (shadow.timeToLive == 0) sendCheck(shadow.homePlace, shadow.id); startTimer(shadow.timeOut, shadow);
onTimer(shadow, agentId) // agent has been killed shadow.remove(agentId); // due to timeout if (shadow.currentPlace != place.name() ) sendRemoved(currentPlace, shadowId, agentId);
onTimer(shadow) shadowList.remove(shadow); // shadow path can be removed
Seite 15: vorherige Seite | nächste Seite | Übersicht
5 Conclusion and Future Work 16
receiveAllowance(shadowPlace, agentId, timeToLive) stopTimer(agentId); agent = agentList.findAgent(agentId); agent.timeToLive = timeToLive; agent.shadowHome = shadowPlace; proxyList.setTime(agentId, timeToLive);
receiveAllowance(shadowId, timeToLive) shadow = shadowList.find(shadowId); stopTimer(shadow); shadow.timeToLive = timeToLive;
receiveCheck(from, shadowId, agentId) stopTimer(agentId);
if(currentPlace != place.name()) // not the current shadow
sendCheck(currentPlace, from, shadowId, agentId); else
shadow = shadowList.find(shadowId); timeToLive = shadow.timeToLive(from,
agentId); if (timeToLive > 0) startTimer(timeToLive +
shadow.getTimeOut(agentId), shadow, agentId); sendAllowance(from,
place.name(), agentId, timeToLive);
receiveCheck(from, shadowId) shadow = shadowList.find(shadowId); if(shadow != null) shadow.currentPlace = place; sendAllowance(from, shadowId, shadow.timeToLive);
receiveShadow(shadow) if(shadow.timeToLive != 0) if(shadow.homePlace != place.name()) shadow.currentPlace = place.name(); shadowList.add(shadow); else // shadow comes back home shadowList.find(shadow.shadowId); shadowList.remove(orig_shadow); shadowList.add(shadow); shadow.currentPlace = null;
receiveRemoved(shadowId, agentId) shadow = shadowList.find(shadowId);
if(shadow != null) if(shadow.currentPlace != place.name())
sendRemoved(currentPlace, shadowId, agentId); else
shadow = shadowList.find(shadowId); shadow.remove(agentId);
Seite 16: vorherige Seite | nächste Seite | Übersicht
5 Conclusion and Future Work 17
B References
1. J. Baumann, F. Hohl, N. Radouniklis, K. Rothermel, M. Straßer. ?Communication Concepts for Mobile Agent Systems?, in Mobile Agents `97, LNCS 1219, Springer-Verlag, pp. 123 - 135, 1997. 2. J. Baumann. ?A Protocol for Orphan Detection and Termination in Mobile Agent Systems?, Tech. Report 1997/09, Fac. of Computer Science, U. of Stuttgart, 1997. 3. J. Baumann, F. Hohl, K. Rothermel, M. Straßer. ?Mole - Concepts of a Mobile Agent System?, in WWW Journal, Special Issue on Software Agents, to appear. 4. J. Baumann, N. Radouniklis. ?Agent Groups for Mobile Agent Systems?, in Distributed Applications and Interoperable Systems, H. König et al., Eds., Chapman & Hall, pp. 74 - 85, 1997. 5. J. Baumann, C. Tschudin, J. Vitek. ?Mobile Object Systems: Workshop Summary?, Workshop Proceedings for the 2nd Workshop on Mobile Object Systems, in Workshop Reader ECOOP '96, d-punkt.verlag, pp. 301 - 308, 1996. 6. General Magic, ?Odyssey Web Site?. URL: http://www.genmagic.com/agents/ 7. IBM. ?The Aglets Workbench?. URL: http://www.trl.ibm.co.jp/aglets/ 8. F. Mattern. ?Verteilte Algorithmen?, Springer-Verlag, 1989. 9. G. Tel, F. Mattern. ?The Derivation of Distributed Termination Detection Algorithms from Garbage Collection Schemes.?, ACM TOPLAS 15:1, pp. 1-35, 1993. 10. M. Shapiro, P. Dickman, D. Plainfoss?. ?SSP Chains: Robust, Distributed References supporting acyclic Garbage Collection?, Tech. Report No. 1799, INRIA, Rocquencourt, Frankreich, 1992. 11. M. Straßer, J. Baumann, F. Hohl. ?Mole - A Java Based Mobile Agent System?, in Workshop Reader ECOOP '96, d-punkt, pp. 327 - 334, 1996. 12. G. Tel. ?Distributed Algorithms?, Cambridge University Press, 1994. 13. J. E. White. ?Telescript Technology: The Foundation of the Electronic Marketplace?, General Magic, 1994.
Seite 17: vorherige Seite | Übersicht