Universität Stuttgart Fakultät Informatik Fakultät Informatik Institut für Parallele und Verteilte Höchstleistungsrechner Universität Stuttgart Breitwiesenstraße 20 - 22 D-70565 Stuttgart Problem Formulations, Models and Algorithms for Mapping Distributed Multimedia Applications to Distributed Computer Systems Problem Formulations, Models and Algorithms for Mapping Distributed Multimedia Applications to Distributed Computer Systems Alexander Hagin, Gabriel Dermler, Kurt Rothermel CR-Klassifikation: C.2.4, C.4, G.1.6, G.2.2, I.6 Fakultätsbericht 3/1996 Technical Report Februar 1996 G.Dermler, K.Rothermel A.A.Hagin University of Stuttgart/IPVR Technical University of St.-Petersburg Breitwiesenstr. 20 - 22 Polytechnicheskaja str. 29 D-70565 Stuttgart 195251 Saint-Petersburg Germany Russia Abstract In the report, the problem of optimal allocation of Distributed Multimedia Applications (DMA) into Distributed Computer Systems (DCS) is examined. We are given precedence graphs representing topologies of the DMA and data streams between components of DMA. Nodes and arcs of the graphs are weighted by the computational and communica- tion resources needed to meet quality of service requirements of DMA. A special-purpose precedence graph model is proposed to present the structure of DCS including computers, virtual channel connections and communication resources of DCS over which the chan- nels are routed. Nodes and links of the graph are weighted by the computational resources and capacities of communication resources available to mapped DMA. An approach, based on the solving two kinds of the mapping problem, is proposed. The first one is formulated as a nonlinear integer programming problem with cost function under constraints on DCS resources available to mapped DMA. If the first one has not an acceptable solution, then other problem, formulated as minimax nonlinear integer one to find the DMAlocation into the DCS with minimum DCS resource gap, is solved. To solve both problems, effective algorithms based on the branch-bound method are proposed. Computational efficiency of the algorithms is examined and illustrated by numerical examples. 2 Contents 1. Introduction 1.1. Framework for DMA implementation 1.2. Mapping problem 2. Graph models for DMA and DCS specification 2.1. Parametrization of the application graph 2.2. Parametrization of the system graph 3. Approach to the mapping problem solution 4. Original mapping problem formulation 5. Procedure for location of attached components 6. Optimal solution procedure 6.1. Problem solution technique 6.2. Problem relaxation 6.3. Forward procedure 6.3.1. SubAlgorithm 1 6.3.2. SubAlgorithm 2 6.3.3. Example using subAlgorithm 2 6.4. Roll-back procedure 6.5. Penalty algorithm 7. Reduction procedure 7.1. Problem formulation 7.2. Problem solution technique 7.3. Algorithm 8. Examples 8.1. Example 1 8.2. Example 2 9. Computational complexity analysis of algorithms 9.1. Algorithms of optimization procedure 9.2. Algorithm of reduction procedure 10. Conclusions and future works 1 Introduction 3 1 Introduction Recent technological developments in high speed networks and multimedia workstations have given rise to entirely new classes of distributed applications such as distance learning, desktop videoconferencing, remote multimedia database access and so on. Multimedia systems combine a variety of information sources, such as voice, graphics, animation, images, audio, and full- motion video, into a wide range of applications. Research and development efforts in multime- dia computing can be divided into two groups. One group centers on the stand-alone multimedia workstation and associated software systems and tools. The other combines multimedia com- puting with distributed systems. Potential new applications are based on distributed multimedia systems [1]. 1.1 Framework for DMA implementation A distributed multimedia application (DMA) needs a convenient framework for its implemen- tation in a distributed computer system (DCS). One successful approach is the CINEMA (Con- figurable INtEgrated Multimedia Architecture) development platform providing abstractions for dynamic configuration of DMA and for arbitrary complex stream synchronization require- ments [2]. Distributed multimedia applications (DMA) are employed to generate, process and consume (e.g. present) continuous (e.g. audio, video) data streams across distributed locations into com- puter network. CINEMA allows a client to compose an application out of components and links as one or some data flow graphs. A component is an individually schedulable unit (e.g. by mapping to a thread). In the graph the nodes represent components, while links (arcs) represent the data streams between components. The streams may have different rates. A component accesses the data units of streams via ports. Further, the initial configuration may be dynamically changed during run time as need. Dynamic configuration of DMA allows to take into account changes of available resources in distributed computer system (DCS) and changes of the required quality of service the user asks for at run time. Moreover DMA are often highly dynamic in the sense that users may join and leave the application during run time that causes necessity of the reconfiguration of DMA without session interruption. On the other hand, DCS is a dynamic system with set of different changeable DMA executed, changeable available computational and communication resources for allocation and execution of a new DMA. Components encapsulate processing of multimedia data, e.g. for generating (source compo- nents), presenting (sink components) or manipulating (filters and mixers) data. To provide a uni- form data access point for the components, ports are used that deliver data units to the compo- nent (input port) or take the data units from the component (output port). A component designer 1 Introduction 4 has to associate with each component port the streamtype to be used making all related infor- mation available at the port. A client constructs an application by specifying a topology of com- ponents interconnected via links. A link provides an abstraction from underlying communica- tion mechanisms which may be used to perform the transport of data units. A data flow graph may be arbitrary distributed over several nodes of a distributed computer sys- tem. Components are configuration independent, which means that their internal logic is inde- pendent of the configuration they are used in.Thus, from the client's point of view, there is no conceptual difference whether two adjacent components run either on the same computer-node or on different computer-nodes connected by a remote link. Before using an application, a client has to specify desired QoS (Quality of Service) to CIN- EMA with respect to output data generated by sink components (e.g. presented video frame size and rate). Client requests are related to sink component ports, assuming that the client has knowledge about sensed output QoS and corresponding media specific QoS required as input to sink components. CINEMA offers the concept of a session allowing the client to specify the application to be instantiated and the QoS expected with respect to output generated by sink components. A ses- sion is the unit of resource reservation and QoS negotiation. By creating a session, a client causes the CINEMA system to reserve the resources that are needed to guarantee the specified quality of service requirements. After a session has been established, the transmission and pro- cessing of multimedia data may be started. A session encompasses parts of the data flow graph which is defined by a client. Its actual exten- sion is defined by specifying a set of source and sink components. Intermediate components and interconnecting data paths are determined from the data flow graph by the CINEMA system. In CINEMA, a new protocol type termed negotiation and resource reservation protocol NRP is proposed. NRP is carried out by a corresponding protocol engine (PE) in three phases during which (in phase 1) the so-called application flowspec (AFS) available at the sink port is propa- gated towards the source components for resource reservation, (in phase 2) AFS from source ports propagated back to the sink to prepare resource relaxation, and (in phase 3) resource res- ervation are relaxed while propagating the AFS towards source again. Thus, general framework supported by CINEMA session is following. 1. Constructing and specifying by client an DMA (data flow graphs) and QoS with respect to consumed data. 2. Checking the logical correctness (agreement) between application specification and QoS requirements taking into account limited functional capabilities of components. (For instance, all components connected to an output port of the same mixer have to get a picture with same characteristics of size, rate and so on. Two interconnected component ports have to be associated with the same streamtype. A limited resource availability of any compo- nent or intermediate link affects QoS to be provided by all other components or links and so on). 1 Introduction 5 3. Mapping media specific parameter values (e.g. video frame size and rate and so on) to com- putational and communication resource requirements that agree QoS requests of client. 4. Mapping weighted data flow (application) graphs of DMA to DCS taking into account required computational and communication resources of DMA and, on the other hand, available computational and communication resources of DCS, and cost expenses of DCS resources that will be reserved. 5. Resource reservation in DCS by NRP protocol. Return(s) from each of these 5 stage to previous one may be needed depending on result obtained on the stage. The paper deals with the problem 4 of mapping a DMA to a DCS. 1.2 Mapping problem A DMA can be represented by one or some application precedence graphs [2]. In an application graph (or simply DMA graph), nodes represent components that are interconnected by arcs rep- resenting data streams between components. Each component is associated with at least one device that produces (a source component) or processes (an intermediate component - filter or mixer) or consumes (a sink component) data streams. Each node of the application graph is weighted by computational requirements of the corresponding component and each arc is weighted by the channel capacity needed for remote communication between adjacent compo- nents. Media streams can originate at multiple sources, traverse a number of intermediate com- ponents and end at multiple sinks. The DCS considered is heterogeneous and consists of computers (workstations) communicating with each other through point-to-point physical link, local area network(s) (LAN) and/or wide area network(s) (WAN). For every DMA, the part of the DCS that will always be considered consists of computers each of which can be used for executing the functions of one or more components. The communication structure of a DCS is described by a system oriented graph (or simply DCS graph). In the system graph, nodes represent computers that are interconnected by arcs representing virtual channels of the DCS. The DCS graph includes weights per nodes (values of available computational resources and cost functions for such resources) and per arcs (values of available channel capacity and capacity cost functions). The mapping problem has been thoroughly investigated for parallel systems [3]; if usually arises when the number of computational modules required by the application exceeds that of processors available, or when the interconnection structure of the application computational modules differs from that of the parallel machine [4,5]. In [6], the mapping problem is defined as the assignment of modules to processors and maximizing the number of pairs of communi- cating modules that are allocated to pairs of directly connected processors. It is shown that the mapping problem is equivalent to the graph isomorphism problem, or to the bandwidth reduc- 1 Introduction 6 tion problem. In either case, an exact and simple algorithm for the mapping problem is extremely difficult to develop. Research in this area has concentrated on efficient heuristics which give good solutions in most cases [3 -6]. The problem of mapping a DMA to a DCS differs from the one mentioned above in the follow- ing: a) a DCS is heterogeneous, b) the relevant part of the DCS network with computational and communication resources available may not be fully interconnected, in particular because of the channel utilization by other currently performed applications, c) communication is performed via data passing between the computers through a point-to point channel, local and/or global networks by using access to channel resources belonging to an adjacent computer pair or shared by some computer pairs connected to the same transmission media, d) the purpose of the DMA allocation scheduling in a DCS is to minimize the cost of DCS resources used for the DMA allo- cation, provided the DCS provides resources for the DMA needed to satisfy the quality of user service requirements. Mapping a DMA to a DCS can be used to provide two kinds of service: an advance reservation service and an immediate one [10]. In the first case the provider has to do some planning for future allocations using two time parameters of the client request - the starting time and the duration of DMA. The mapping server takes into account all accepted requests of advance ser- vice, time parameters of which are overlapped by the ones of the arrived request. The mecha- nism of interval tables [10] can be used to determine resources of DCS available for the DMA request. Such advance reservation service mode allows to assume that the utilization state of the DCS will be not changed in the future interval of the DMA execution and therefore mapping server deals with the static state of DCS resource utilization. The immediate reservation service has to do in the real-time mode. The mapping is made during the session establishment phase and is adjustable depending on the current utilization states of the DCS computers and channels. In the paper a mathematical formulation of the mapping problem is considered. Algorithms for its solution is proposed. The algorithms can be used for an advance reservation service and, after some modifications, an immediate ones. In section 2, the graph models for DMA and DCS resource specification as input data for map- ping problem are presented. In section 3, an approach to the mapping problem solution is pro- posed. In section 4, problem formulation of the mapping a DMA to a DCS is presented. In sec- tion 5, pre-starting procedure for locating attached (source and sink) components of DMA onto DCS are examined. In section 6, some algorithms, based on the branch-bound method, for obtaining the optimal location of DMA onto DCS are considered. In section 7, a problem for- mulation and an algorithm of reduction procedure, based on the branch-bound method, for obtaining a feasible location of DMA onto DCS with minimum resource requirement relaxation are presented. Reduction procedure is performed if available resources of DCS can not meet resource requirements of DMA. In section 8, numerical examples are used to demonstrate the algorithms and to evaluate its computing efficiency. In section 9, conclusions about this approach and its use for DCSs are discussed. 2 Graph models for DMA and DCS specification 7 2 Graph models for DMA and DCS specification 2.1 Parametrization of DMA graph Let us consider a DMA graph and determine the values of node and arc weights. Every node is weighted by computational requirements of the corresponding component of the DMA. Let denote the arrival rate (messages per second) of input data streams to component in the DMA graph. To process every message, component needs processor operations. Let denote the component computational requirement (operations per second). To exclude computer saturation, it is necessary that . In this case we get utilization factor for computer with virtual processor speed . To calculate the component com- putational requirement, we must construct a model for computer performance evaluation and take into account the quality of service parameters of the DMA. For instance, if we use the sim- ple classic M/M/1 queueing system model and acceptable mean delay of a message in the component , then the computational requirement of component is . On the other hand virtual processor speed is a function of real processor speed and values of main (high-speed) and disk memories used for storage and distribution of program modules and data. We could use more detailed and exact models for computer performance evaluation, for exam- ple [7]. For the present, it is important that we can weight every node of the DMA graph by computational requirement of the corresponding component. l i i i Vi C i Ci liVi > r liVi Ci ? = Ci Ti i i C i V i 1 l i V i + ( ) T i ? = C i i Ci b a c d e f g 1 3 2 4 1 3 2 3 4 3 3 3 3 Figure 2.1. An example of DMA graph 2 Graph models for DMA and DCS specification 8 Every arc in the DMA graph is weighted by capacity requirement (bits per second) that can be calculated similarly to component computational requirement . An example of a DMA graph is presented in Figure 2.1. The topology of DMA is composed of three source-components connected to two mixing components and last of which provides data streams to two sink components and . Numbers at nodes and arcs denote com- putational and communication resource requirements respectively. 2.2 Parametrization of DCS graph Let us now consider a system graph of the DCS. Every node is weighted by computational resource available (operations per second) and cost functions of used computational resource . If is the total computational resource of computer in the DCS and is its computational resource already used by all applications processed in the DCS, then the computational resource available of computer is The system graph representation of the DCS shows possible virtual channel connections (VC) between the computers of the DCS. A VC1 is a direct oriented logical connection between two computers (endsystems) with some assigned capacity. A VC is routed over one or more com- munication resources of DCS (physical links, networks) to achieve sender-computer to receiver-computer connectivity. By a communication resource of DCS we shall basically mean maximum aggregated physical communication unit of DCS that is characterized by bandwidth (capacity), transmission direction and is used for connecting at least one computer pair of DCS. 1 A VC concept is similarly to virtual path concept in an ATM network [8] excluding that here it is used to the application level. i j , ( ) Cij C i a b c , , d e f g n Rn f n x ( ) x Bn n bn n Rn Bn bn ? = 1 2 3 4 LAN a) 1 3 2 b) 4 Figure 2.2,a) DCS example, b) Representation of computer connections by VCs in the system graph 2 Graph models for DMA and DCS specification 9 For DMA mapping problem we consider only such VCs each of that includes only two comput- ers of DCS, first one as a sender and second one as a receiver, and is not routed over other com- puters, as intermediate ones. For example Figure 2.2 presents the DCS structure and system graph using acceptable VCs. A VC begins from the output interface of a computer-sender and ends at the input interface of a computer-receiver, i.e. a VC includes the interfaces of the end- systems. The available capacity of a VC is equal to minimum available capacities of all DCS communi- cation resources over which the VC is routed. Let be the capacity of DCS communication resource ; be the set of DCS communication resources used by VC from com- puter to computer of DCS. Then the available capacity of the VC is (2.1) Let us consider following basic kinds of computer connections: by an individual point-to-point (physical) channel, through a wide area network (WAN), through a local area network (LAN). In the first case (see Fig. 2.3) the capacity of the point-to-point physical channel belongs to the pair of computers connected by this channel. However the capacity of output and input inter- faces of each computer are shared by all VCs routed over the output and input interfaces of the computer. Therefore, if is the available capacity of the channel in direction from computer to computer , are available capacities of the output and input interfaces of computer , then the capacity of VC available for the mapped DMA is (2.2) In the second case (computer connection through a WAN, see Figurte 2.4) the system graph rep- resentation is similar to one shown in Figure 2.3,b. Formula (2.2) can be used also but here denotes the throughput of the WAN from network access point of computer to one of com- puter . To calculate the throughput of gigabit networks, the relation between latency and bandwidth must be taken into account [9]. A s s rnm n m , ( ) n m n m , ( ) Rnm m i n As s rnm ? , { } = Anm n m In out In in , n n m , ( ) Rnm min In out Im in Anm , , { } = Anm n m 2 Graph models for DMA and DCS specification 10 In the third case the capacity of the LAN transmission (see Figure 2.5) line does not belong to any pair of computers but is distributed among all computers of the LAN. The LAN provides a virtual channel between any pair of computers. Therefore, a system graph for the LAN is a log- ical graph representing all possible VCs between computers connected to the LAN (see Figure Inout Im in Inin Im out Anm Amn Computer n Computer m point-to-point channel a) n m VC (n,m) , Rnm VC (m,n) , Rmn b) Figure 2.3, a) Computer connection by point-to point physical channel, b) Representation of computer connection by VCs in the system graph I n out I m in I n in I m out A Amn Computer n Computer m nm WAN Figure 2.4. Connections in the DCS through wide area network 2 Graph models for DMA and DCS specification 11 2.6). Available capacity of the LAN transmission line is distributed among all data-exchang- ing computer pairs. Therefore , (2.3) Capacity is available for every possible VC in the system graph. It means that if the capac- ity available of the LAN, for example, is decreased by , then the capacity available of every possible virtual channel with decreases by the same value , too. Note that interfaces of computers are distributed among all channels routed over the interfaces. Therefore for every interface of a computer routed by some VCs the following capacity con- straints have to be satisfied . Figure 2.6,a illustrates a more complicated case of the DCS structure that can be represented by the logical system graph in Figure 2.6,b. Here the capacities available for every computer pair connection are the following: , , etc. A 0 Rnm A ? n m , ( ) ? ? Rnm m i n I n ou t Im i n A , , { } = A a Rnm A = a n Rnm In ou t Rmn In i n ? m ? , ? m ? 1 2 3 4 LAN A 1 4 2 3 LAN A a) b) Figure 2.5,a,b) Connections in the DCS through local area network, 1 2 3 4 c) c) Representation of computer connections by VCs in the system graph R12 R21 R12 m i n I1 out I2 in ALAN1 , , { } = R13 m i n I1 ou t I3 i n ALAN1 AWAN ALAN2 , , , , { } = 2 Graph models for DMA and DCS specification 12 Further every arc of the system graph represents corresponding VC of DCS and is weighted by available capacity (bits per second) and communication cost function of the VC . Let us consider an example of system graph construction for DCS depicted in Figure 2.7. DCS is composed of two LANs Ethernet, WAN and endsystems. Points indexed by 1 and 2 denote points through which LAN1 and LAN2 are connected to WAN. Points 3, 4, and 5 denote WAN access points for LANs and computer . Suppose, that each of LANs has bus with transmission rate C = 10 Mb/s and physical length L = 0.8 Km, length of network packet l = 1 Kb. Assuming the signal propagation delay equal to 5 mcs/Km we get one for the bus = 4 mcs/Km. Then the packet delay t = l/C = 100 mcs and maximum throughput of the LAN Ethernet is determined by formula [13] A = C / [1 + (2e + 1)p/t] = 8 Mb/s. Suppose that LAN1 has and LAN2 has of available capacities for a DMA location. Let us assume that WAN access points 3, 4, 5 have available capacities , and respectively. In Figure 2.8 the DCS is represented by the system graph. The points on links indexed by num- bers correspond to access points to communication resources of DCS. (We shall name the points shortly commpoints). The commpoint representation in the system graph is useful to compute available capacity of every virtual channel connection using formula (2.1). For example, capac- ity of channel (A,C) is . n m , ( ) n m , ( ) Rnm gnm x ( ) n m , ( ) WAN 1 2 3 4 a) LAN1 LAN2 1 3 2 4 LAN1 WAN LAN2 b) Figure 2.6, a) Communications in the DCS through some differnet networcs. b) Representation of computer connections by VCs in the system graph C p 5 L ? = A1 5Mb s ? = A2 6Mb s ? = A3 3Mb s ? = A4 10Mb s ? = A5 20Mb s ? = RAC min A1 A3 A4 , , { } 3Mb s ? = = 2 Graph models for DMA and DCS specification 13 A . . . . . . B E D C WAN LAN 1 LAN 2 Figure 2.7. An example of DCS topology 1 2 3 4 5 A B C D E WAN LAN 1 LAN 2 1 1 1 1 3 3 3 3 3 3 1 1 1 4 4 4 4 2 2 2 2 2 2 2 5 5 5 5 5 5 Figure 2.8. DCS graph representation with commpoints 2 Graph models for DMA and DCS specification 14 Preliminary analysis of commpoint capacities allows to simplify DCS graph and further capac- ity computations. For example, summary capacity of all virtual channels routed over access point 4 (line of all commpoints 4) is less than capacity of the access point, i.e. , , and consequently = . Therefore we can discard commpoint 4. In the same way commpoint 5, for which , , , can be discarded too. In Figure 2.9, the final system graph is presented. Commpoints 1, 2, 3 corre- sponds to DCS communication resources of LAN1, LAN2 and WAN respectively. Assuming that available resources of computers are , one can represent initial values of DCS resources available to mapped DMA by following matrix R (2.3) . Note that capacities of virtual channels are interdependent through common communication resources used (see Figure 2.9). A B C D E A 5 5 3 3 3 B 5 4 3 3 3 C 3 3 6 6 6 D 3 3 6 8 6 E 3 3 6 6 7 RAC RCA RBC RCB + + + m i n A1 A3 A4 , , { } A3 3 = = = RDC RCD + R + EC RCE + m i n A2 A4 A5 , , { } A2 6 = = = RAC RCA RBC RCB REC RCE RDC RCD + + + + + + + A3 A2 + 9 A4 < 10 = = min A1 A3 A5 , , { } A3 = min A2 A4 A5 , , { } A2 = A3 A2 + 3 6 + 9 A5 < 20 = = = RA 5 RB , 4 RC , 6 RD , 8 RE , 7 = = = = = 3 Approach to the Mapping Problem Solution 15 3 Approach to the Mapping Problem Solution An approach proposed consists of following procedures (see Fig. 3.1): 1. Attached Component Location Procedure for locating attached components1 of DMA onto DCS. If not all attached components can be assigned because of a gap of DCS resources then Reduction Procedure is performed as next one to obtain a relaxed solution for DMA allo- cation. 1 For such component there is only one particular computer that the component is assigned to. Such components are called attached ones. Generally, set of computers on which a component can be assigned depends on whe- ther the computer configuration has devices and resources required to perform multimedia functions of a certain component type. On the other hand, source-components and/or sink-components can be assigned to certain computers in advance. Then the components are attached ones too. A B C D E WAN LAN 1 LAN 2 1 1 1 1 3 3 3 3 3 3 1 1 1 2 2 2 2 2 2 2 Figure 2.9. Final DCS graph representation with commpoints 3 Approach to the Mapping Problem Solution 16 2. Optimal Solution Procedure. The procedure obtains the solution of the mapping problem with cost objective function and unrelaxed resource requirements. This procedure uses Forward procedure, that exe- cutes component assignments, and Roll-back Procedure, that allows to look possible com- ponent assignments over. SubAlgorithm 1 and subAlgorithm 2 solve relaxed subProblems of computing bounding functions used in the branch-bound method. 3. Reduction Procedure The procedure is performed only if DCS resources available to mapped DMA are not enough to meet resource requirements of DMA. The procedure obtains the location of Attached Component SubAlgorithm 1 SubAlgorithm 2 Forward Procedure Roll-back Procedure Optimal Solution Procedure Location Procedure Figure 3.1. Scheme of the approach to mapping problem solution Reduction Procedure 4 Problem Formulation 17 DMA that guarantees minimum resource requirement relaxation for components and links of DMA. 4 Problem Formulation We are given - application graph of DMA with set of nodes (components) , set of directed arcs connecting components with each other , set of required computational resources for components and required commu- nication resources , - DCS graph with set of nodes (computers) , set of VCs (or simply channels) connecting computers with each other , set of available (vacant) computational resources of computers , set of communication resources in the DCS1, set of communication resources of DCS used by channel , , set of channels routed over shared communication resource , , set of communication resource capacities available to mapped DMA , set of resource cost functions for computers and for channels , set of acceptable locations of every component in the DCS , The solution variables are such that , if component is assigned to computer , and otherwise. Then the Original Mapping Problem is , (4.1) 1 Interfaces of DCS computers can be also included into set . h l i j , ( ) i j h ? , , { } = di i h ? , { } dij i j , ( ) l ? , { } z p n m , ( ) n m z ? , , { } = Rn n z ? , { } r r rnm n m , ( ) rnm r rnm n m , ( ) ? ? ? , ? ps s r ? p s s r ? ? p = A s s r ? , fn D ( ) n z ? , { } gnm D ( ) n m , ( ) ? ? , { } zi i h ? , { } x i n x i n 1 = i n xin 0 = F xin ( ) minxin x i n f n d i ( ) x i n n m , ( ) p ? ? i j, ( ) l ? ? x j mgnm dij ( ) } + n zi ? ? i h ? ? { = gnm d i j ( ) g s d i j ( ) s pnm ? ? = 5 Procedure for Location of Attached Components 18 subject to (4.2) (4.3) (4.4) where is the cost function for channel ; if ; if and ; if or . Here is the available capac- ity of channel that is not enough to satisfy a resource requirement of any arc of component graph, ; In this formulation, objective function minimizes the summary cost of computational and communication resources used for the DMA allocation in the DCS. The first term in the objec- tive function identifies the cost of resources of computers that components are assigned to. The second term represents the cost of communication resources of channels on which DMA arcs are placed. Constraint set (4.2) guarantees that every component will be placed into the DCS and only onto one computer. Constraint set (4.3) guarantees that resources used by components assigned to a computer do not exceed the available resource of the computer. Constraint set (4.4) guarantees that capacity of communication resource in DCS used by all DMA arcs placed on resource do not exceed the available capacity of the resource. Analysis of the objective function and constraints of the mapping problem (4.1) - (4.4) shows that it is, in general, a nonlinear integer programming problem with Boolean variables. 5 Procedure for Location of Attached Components Let us present an additional notation of variable sets used further: be the set of components assigned to computers, ; be the set of components that are not yet assigned to computers, ; be the set of components assigned to computer , ; be the function returning the index of DCS computer to which component is assigned; x i n n zi ? ? 1 i h ? " , = x i ndi Rn n z ? " , ? i h ? ? x i n i j, ( ) l ? ? x j mdi j As s r ? " , ? n m , ( ) p s ? ? gnm d ( ) n m , ( ) gnm 0 = n m = gnm 0 ? n m ? n m , ( ) p ? gnm - = n m , ( ) p ? Rnm enm 0 ? = enm n m , ( ) p ? enm m i n D i j { } < F i h ? s s ha ha h ? hu hu h \ ha = hn n z ? hn n z ? ? ha = c i( ) n z ? i ha ? 5 Procedure for Location of Attached Components 19 be the set of computers on which component may be assigned provided that components of the are already allocated in the DCS, ; be the set of computers that are acceptable for component locations provided that components of the are already allocated in the DCS, , , . be the set of DMA arcs already assigned, 1; be the set DMA arcs that are not yet assigned, ; be the set of DMA arcs assigned to DCS channel ; be the set of DMA arcs assigned to communication resource , ; be the available capacity of channel ; The algorithm of the procedure is as follows: 1. Initializing work variables: ; , ; ; ; ; ; ; 2. Assigning attached components and their adjacent arcs: for every attached component ; ; ; ; for every output arc such that ; ; ; for every communication resource ; ; for every input arc such that 1 We treat that a DMA arc is assigned if it is located into a DCS channel or components, connected by the arc, are placed into a same DCS computer. The last case we call as arc absorption. zi hu ? ? ? ö i hu ? ha h \ hu = z i hu ? ? ? ö z i h ( ) hu h ? , ? z hu ? ? ? ö ha h \ hu = z hu ? ? ? ö zi hu ? ? ? ö i hu ? ? = z hu ? ? ? ö z h ( ) ? hu h ? la la l ? lu lu l \ la = lnm n m , ( ) p ? ls s r ? ls lnm n m , ( ) ps ? ? = DRnm n m , ( ) hu h = ha ? = z i hu ? ? ? ö z i i hu ? " , = z hu ? ? ? ö z = DRn Rn = hn ? n " z ? , = lu l = la ? = lnm ? n m , ( ) p ? " , = DAs As = l s ? s r ? , = i DRc i() DRc i() di ? Ü hc i() hc i( ) i ? Ü ha ha i ? Ü hu hu\i Ü i j , ( ) lu ? j ha ? ? ? ? ö c i( ) c j ( ) ? ( ) ( ) ? la la i j , ( ) ? Ü lu lu\ i j , ( ) Ü lc i( ) c j( ) lc i( ) c j( ) i j , ( ) ? Ü s rc i() c j( ) ? l s ls i j , ( ) ? Ü DAs DA s di j ? Ü j i , ( ) lu ? j ha ? ? ? ? ö c i( ) c j ( ) ? ( ) ( ) ? 6 Optimal Solution Procedure 20 ; ; for every communication resource ; ; 3. Computing capacities of DCS channels: for every channel 4. Checking a resource gap of the DCS and computing cost of such assignments: If resources of DCS was enough for allocation of attached components, i.e. and , then computing initial value of the cost function and go to Optimization Procedure else go to Reduction Procedure. 6 Optimal Solution Procedure The procedure obtains the optimal solution of the Original Mapping Problem (4.1) - (4.6) with cost objective function and unrelaxed resource requirements. 6.1 Problem Solution Technique A solution technique proposed is based on a branch-bound method [11]. Performance of the method can be represented by a directed search-tree (see Figure 6.1). The vertices of the tree correspond to the subsets of complete set = { or 1, } of possible DMA locations into DCS. The vertices are partitioned into levels in the following way: There is a single vertex of level 0 (called the root of the tree) which corresponds to the complete set consisting of elements1, where is number of components in DMA and is number of computers in DCS. To construct the level 1 of the tree, we must first choose a component, for example indexed by 1. The level 1 contains vertices denoted by (1,1), (1,2),..., (1,N). The first one corresponds to the subset of for which , i.e. component 1 is assigned to compute 1. The last vertex of level 1 corresponds to the subset of for which , i.e. component 1 is assigned to computer 1 In accordance with constraint (4.2), every component can be placed into only one computer. Therefore set consists of feasible elements. la la j i , ( ) ? Ü lu lu \ j i , ( ) Ü lc j( ) c i( ) lc j( ) c i( ) i j , ( ) ? Ü s rc j() c i( ) ? l s ls i j , ( ) ? Ü DAs DA s dj i ? Ü n m , ( ) p ? DRnm m i n DAs s rnm ? " , { } = F DRn 0 n z ? " , ? DAs 0 s r ? " , ? F0 fc i() di ( ) gc i() c j() d i j ( ) i j, ( ) l a ? ? + i ha ? ? = S x i n 0 = i h n z ? " , ? " S 2I N ? I h = S NI N z = N S 1 1 , ( ) S x11 1 x12 , 0 x13 , 0 ? x1N , , 0 = = = = 1 N , ( ) S 1 N , ( ) S x11 0 x12 , 0 x13 , 0 ? x1N , , 1 = = = = 6 Optimal Solution Procedure 21 . We have, evidently, . Subsets form a partition of set . We say that we have `branched' with respect to the component 1. In the same way for every vertex of level 1, we construct level 2. In Figure 6.1 set is branched with respect to the component 2 and obtained subsets , form a partition of set . The lowest level of the tree corresponds to placements of the last component of DMA. Every subset of terminal vertex of level corresponds to a single element of and determines one of DMA placements. The placement is determined by all vertices intersected by the branch joining the root to the terminal vertex. For example vertex determines DMA placement , other . N S 1 n , ( ) S n , ? 1 N , = S 1 n , ( ) n , 1 N , = S S . . . 1,1 1,2 1,n 1,N possible placements of component 1 Level 1: Level 0 . . . S0 S(1,1) S(1,N) S(1,n) . . . 2,1 1,2 2,n 2,N possible placements of component 2 Level 2: . . . S(1,n)(2,n) . . . I,1 I,2 I,n I,N possible placements of component I Level I: . . . . . . . . . . . . . . . . . . I-1,m S(1,n)(2,n)...(I-1,m) S(1,n)(2,n)...(I-1,m)(I,N) S(1,n)(2,n)...(I-1,m)(1,N) Figure 6.1. A search-tree of the branch-bound method S 1 n , ( ) S 1 n , ( ) 2 m , ( ) m , 1 N , = S 1 n , ( ) I I I S I N , ( ) x1n 1 x2n , 1 ? x I 1 m , ? , , 1 ? x1N , , 1 = = = = xin 0 = 6 Optimal Solution Procedure 22 To find the optimal solution of the mapping problem of important size without having to con- struct complete search-tree, the concept of bounding function [1] is used. We present a branch-bound method, that take into account peculiarities of the mapping problem (4.1)-(4.4) by choosing (a) the bounding function, (b) the branching vertex at each stage, (c) the component (variable ) relative to which the branching of the chosen vertex has to be done. Let us examine these three points. At each stage of an algorithm for every terminal vertex of the search-tree, we construct a lower bound of objective function (1). For the choice of (b) we use the so-called `depth first search' method [1], which chooses a vertex of maximum depth among those vertices not yet branched. If there is more than one, then one could choose that which corresponds to the lowest bounding value. This method aims at exhibiting a (good) solution of the problem as soon as possible. Then value of the got solution is used for narrowing the domain of optimal solution search by rejection of those vertices for which . By limiting, at each stage, the computation of the bounds to the successors of only those vertices for which , a considerable reduction of the number of vertices actually examined can be obtained, a reduction sufficient to deal with problems of a fairly large size. When a terminal vertex with objective function value of a DMA location into DCS is reached, one attempts to improve obtained solution by choosing and branching terminal vertices with bounds . The consideration of such terminals is started from the terminal vertex of the obtained solution towards the root of the search tree. This approach aims to improve the solution as soon as possible and, using the improved solution, allows to decrease the set of ter- minal vertices examined at upper levels of the search tree. Every obtained solution is checked whether it is optimal. An obtained one is optimal if - the bounding value for a terminal vertex of the lowest level is satisfied to equality , where is an initial value of lower bound for the root of the search-tree, - or all terminal vertices of all levels have bounding values . For the choice of (c), two approaches are presented: - one is based on a fixed ordering components is presented; - other is based on the so-called `penalty' method that associates with every pair - acceptable location of not yet assigned component to computer - a number equal to the penalty if this assignment will be not done. If the branching is then carried out relative to the pair associated with the largest penalty , then one obtains two new subsets, one of which has a much better chance of containing an optimal solution than the other. i h ? t Bt F s t B t F > s t B t F ? s Fs Bt F ? s I F s B0 = B0 B t F s < i n , ( ) i n p i n i n , ( ) p i n 6 Optimal Solution Procedure 23 In some way it is a question of the `most informative and quick' choice. It leads to minimizing the risk of having to explore in vain a branch of the tree while the optimal solution is contained in the other branch. Further, we refer to the first algorithm based on the fixed ordering components as Ord-Algo- rithm and to the second one as Penalty-Algorithm. Computational efficiency of algorithms, based on the branch-bound method, depends on an accuracy and computational complexity of a method for construction of an objective function bound for vertices of the search-tree. The accurate is the method the more unpromising branches are rejected during an optimization procedure. However increasing an accuracy of bounds causes increasing the computational complexity. Let us consider the construction of an objective function bound for every vertex of the search- tree. Assume that some components and arcs connecting their with each other are already assigned. Using formula (4.1), one can computes the placement cost for these compo- nents. The cost is an exact partial of the complete cost , where is the cost of placement of other components and their arcs that are not yet assigned. To construct a lower bound for , we propose an approach based on a search of independent best place- ment (with minimum cost) of every unassigned component into DCS taking into account only available resources of DCS not used. Then for every vertex one has to examine not more than acceptable placements for every yet unassigned component . Therefore for every ver- tex one has to examine not more than placements of individual components. Sum- mary number of component placements examined is not more than , where is the number of vertices examined in the search tree. Thus solving the mapping problem is advantageous if number of examined vertices is con- siderable less than summary number of vertices in the search-tree, i.e. . Value of depends on an accuracy of lower bound construction. Two methods of independent component placements are proposed: - the first one (see Figure 6.2,a) bases on a placement of a component with his adjacent arcs and components that are not yet assigned, - the second one (see Figure 6.2,b) bases on a placement of an unassigned component only. Evidently, the first method is characterized by more exact bound construction but more tedious complexity of bound computation too. However it allows to take into account resource requirements of adjacent arcs and components. Therefore it rejects more unpromising vertices in the search-tree and can obtain a considerable reduction of the number of vertices actually examined. On the other hand, the second method can be useful for the case when channels of DCS have the same cost and DCS has sufficiently much communication resources such that constraints (4.4) have a faint effect on the problem solution. ha F0 ha ? ? ? ö F F0 ha ? ? ? ö F1 hu ? ? ? ö + = F1 hu ? ? ? ö hu B F1 N i hu ? N hu N I ? ? N I V ? ? V V V NI ? V B 6 Optimal Solution Procedure 24 Further we consider in detail the first method of bound construction which is more general. Now let us examine constructing the search tree. Suppose that, for any terminal vertex of the search-tree, component assignments have already been made, . The DMA arc assignments to DCS channels , depends in a unique fashion on the component assignments. On the other hand, having the arc assignments to channels, one can get corresponding arc assignments to DCS communication resources . Thus, components of set have already been located and computers of set are accept- able for location of not yet assigned components . This means that every computer 1 2 3 4 DCS graph DMA graphs a) b) Figure 6.2. Mapping a DMA component to DCS components or a) with his adjacent arcs and b) without ones ha h < hn n z ? , ( ) hn n z ? ? ha = lnm n m , ( ) p ? , ( ) l n m , ( ) n m , ( ) p ? ? la = l s lnm s r ? , n m , ( ) ps ? ? = ? ? ? ö ha z hu ? ? ? ö i hu ? 6 Optimal Solution Procedure 25 has enough resources to locate at least one of components . For terminal ver- tex , the Original Problem (4.1)-(4.4) can be rewritten as follows: (6.1) (6.2) (6.3) (6.4) (6.5) subject to (6.6) (6.7) (6.8) where is the set of components that are adjacent to component and are direct senders to component ; is the set of components that are adjacent to component and are direct recipients from the component in the DMA graph. In formula (6.1) characterizes the cost of component and their arcs assignments that have been already done. is the objective function of minimization problem for further placement of not yet assigned components and their arcs. Term of takes into account computational resource cost of component placed onto computer and the cost of communications of component with already assigned components . Term determines the cost of communications of component with not yet assigned components. n z hu ? ? ? ö ? i hu ? hn n z ? , ( ) F F0 ha ? ? ? ö F1 h\ha ? ? ? ö + F0 ha ? ? ? ö F1 hu ? ? ? ö + = = F0 ha ? ? ? ö fc i() d i ( ) gc i( ) j, d i j ( ) j Out i( ) ? j h a ? ? ? ? ö ? + ? ? ? ? ? ? ? ö i ha ? ? = F1 hu ? ? ? ö m i nx i n xin M i n ha ? ? ? ö L i n hu ? ? ? ö + ? ? ? ö n z i hu( ) ? ? i hu ? ? ? ? ? ? ? ü = M i n ha ? ? ? ö fn di ( ) gn c j() , d i j ( ) gc j() n , dj i ( ) j I n i() ? j ha ? ? ? ? ö ? + j Ou t i() ? j ha ? ? ? ? ö ? + = L i n hu ? ? ? ö 1 2--- xjmgnm d i j ( ) xjmgmn d j i ( ) m zj hu( ) ? ? j I n i( ) ? j hu ? ? ? ? ö ? + m z j hu( ) ? ? j Out i( ) ? j h u ? ? ? ? ö ? ? ? ? ? ? ? ? ö = xin n zi hu( ) ? ? 1 i hu ? " , = x i nd i Rn d i i hn ? ? ? ? ? ? ö ? i h u ? ? DRn hn ( ) n z hu ? ? ? ö ? " , = x i nx j mdij i j, ( ) l u ? ? n m , ( ) ps ? ? As d i j i j, ( ) ls ? ? ? ? ? ? ö ? DA s ls ( ) s r ? , = In i( ) i i Out i ( ) i F0 ha ? ? ? ö F1 hu ? ? ? ö i hu ? M i n ha ? ? ? ö F1 hu ? ? ? ö i n i j ha ? Lin hu ? ? ? ö i 6 Optimal Solution Procedure 26 Constraints (6.6) - (6.8) take into account resources of computers and capacities of communication resources remaining after previous assignments. The factor of 1/2 is used in (6.5) in order that the responsibility for the communication costs common to the two assignments, component and adjacent component , will be divided equally between these assignments1. 6.2 Problem Relaxation Let us derive a lower bound of in (6.1) for a terminal vertex that is characterized by vector of already executed component assignments . The derivation is based on the assumption that every unassigned component and unassigned arcs adjacent to him can be placed independent of other unassigned ones. Cost term in formula (6.1) is a con- stant. Therefore let us derive a lower bound of . >= , , (6.9) where and are determined by formula (6.4) and (6.5) respectively. Thus the lower bound is (6.10) Using lower bound we can relax the problem (6.3) - (6.8) to two subprob- lems: subProblem 1. For every component , the potential best assignment is deduced by solv- ing the problem (6.11) 1This approach allows to take into account communication resource expenses, caused by all output and input arcs of a component, each time when the component is assigned to a computer, and therefore to get more exact value of bounfing function further. DRn hn ( ) n z hu ? ? ? ö ? " , DA s ls ( ) s r ? " , ha i j F hn n z ? , ( ) i hu ? F0 ha ? ? ? ö F1 hu ? ? ? ö F1 hu ? ? ? ö m i nx i n ? i hu ? ? { } = minn zi hu( ) ? Sin ha hu , ? ? ? ö ? ? ? ö i hu ? ? b1 hu ? ? ? ö = S i n ha hu , ? ? ? ö Min ha ? ? ? ö m i nxjm L i n hu ? ? ? ö { } + = Min ha ? ? ? ö L i n hu ? ? ? ö B ha hu , ? ? ? ö F0 ha ? ? ? ö b1 hu ? ? ? ö + F0 ha ? ? ? ö minn zi hu( ) ? Sin ha hu , ? ? ? ö ? ? ? ö F ? i h u ? ? + = = b1 hu ? ? ? ö F1 hu ? ? ? ö ? i hu ? m i nn zi hu( ) ? Sin ha hu , ? ? ? ö m i nn zi h u ( ) ? Min ha ? ? ? ö minxjm Lin hu ? ? ? ö { } + ? ? ? ö = 6 Optimal Solution Procedure 27 subject to 1 subProblem 2. For every acceptable assignments of component to computer , the potential best placement of unassigned arcs adjacent to the component is deduced by solving the problem = = (6.12) subject to for every computer adjacent to computer , i.e. , (6.13) for every communication resource over which output channel and input ones of computer are routed, i.e. , , (6.14) Thus the bounding function is computed by formula (6.10) that needs, for every unassigned component, to solve the subProblem 1 of the potential best assignment of the com- ponent. At that the solution of subProblem 1 needs to solve, for every acceptable assignment of the component, subProblem 2 of the potential best placement of his unassigned arcs. 6.3 Forward Procedure The Forward Procedure executes following key operations: 1 The form of objective function (6.11) allows us to argue: if subProblem 1 has solution then constraints (6.6) will be necessary satisfied else component can not be allocated and the corresponding vertex does not contain an acceptable solution. If the vertex is one of level 0, i.e. (called the root of the directed search-tree) and component can not be allocated, then the original problem (4.1) - (4.4) has not a solution. i hn n z ? , ( ) ? h , ( ) i di DRn hn ( ) n z hu ? ? ? ö ? " , ? i hu ? n z i hu ? ? ? ö ? L m i n i n hu ? ? ? ö minxjm Lin hu ? ? ? ö { } = 1 2--- m i nx j m x j mgnm dij ( ) x j mgmn dji ( ) m z j hu() ? ? j In i( ) ? j h u ? ? ? ? ö ? + m z j hu() ? ? j Out i( ) ? j hu ? ? ? ? ö ? ? ? ? ? ? ü m n m zj hu ? ? ? ö j Out i( ) In i() ? ? ? ? " xjmd j DRm hm ( ) ? j hu ? ? n m , ( ) m n , ( ) n s rnm rmn ? ? " m zj hu ? ? ? ö j Out i() I n i() ? ? ? ? " xjmdij x j md j i DA s ls ( ) ? j hu ? ? m n , ( ) ps ? ? + j hu ? ? n m , ( ) p s ? ? B ha hu , ? ? ? ö 6 Optimal Solution Procedure 28 - Choosing next component for location. - Examining all acceptable locations of the component. - Computing by formula (6.10) the bounding value for every acceptable location of the com- ponent using subAlgorithm 1 (see below). - Choosing the best location of the component using the bounding values. - Checking an obtained acceptable component location whether it is optimal. The Forward Procedure is based on the fixed ordering of component assignments and it's pre- sentation is as follows: 1. If there are `non-alternative' assignments1 for components yet unassigned then executing the assignments; increasing cost in the value of expenses for such assignments; If at least one `non-alternative' component can not be assigned then go to Roll-back Procedure; If all components are assigned then the optimal solution is obtained; stop. 2. Ordering and renumbering components in decreasing with respect to (or , where and are cost functions of an arbitrary computer and channel of DCS respectively)2. 3. Computing initial bounding value for current state of assignments performed in accordance with formula (6.10) using subAlgorithm 1, step 2 (see below). 4. If there are `non-alternative' assignments for components yet unassigned then executing the assignments; increasing cost in the value of expenses for such assignments; if at least one `non-alternative' component can not be assigned then go to Roll-back Procedure else go to step 10 1 We say that component has the `non-alternative' assignment if and only if one computer of DCS, for example , can meet computational resource requirements of component , i.e. constraint is satisfied only for computer and fails for all other computers. 2 At this step an order of summation with respect to index in (6.10) is determined. i n c i() = i DRn di i hu ? , ? n F0 i hu ? d i f d i ( ) g dij ( ) g dji ( ) + [ ] j ? + f g i B0 ha hu , ? ? ? ö F0 6 Optimal Solution Procedure 29 5. Choosing the next component from ordered set to perform his assignment, i.e. 6. terminal vertices will be built at level of the search-tree. Each of such vertices corresponds to one computer , that component could be assigned to. For all terminal vertices of level , corresponding bounding values are computed in accordance with formula (6.10) by subAlgorithms 1 and 2 solving subProb- lems 1 and subProblems 2. 7. Choosing the terminal vertex of level that has the lowest bounding value in set . Suppose element corresponding to placement of component into computer is found. 8. If then go to Roll-back Procedure 9. Assigning component to computer and recomputing cost function: ; ; ; ; . Assigning arcs adjacent to component and recomputing cost function: Every output arc and input one of component , that were not yet assigned and connects the component with already assigned components , are assigned to channel and respectively and are included in sets of assigned arcs . Recomputing capacities of communication resources and channels , of DCS that the arcs are assigned to. The algorithm for arc assignments is as follows: for every component for every shared resource ; ; ; ; ; l hu l l 1 + Ü zl hu ? ? ? ö 0 > l v 1 z l hu ? ? ? ö , = n z l hu ? ? ? ö ? l l Bln ha l ? hu\l , ? ? ? ö l B l n ha l hu \ l , ? ? ? ? ö n zl hu ? ? ? ö ? , { } B?lk l k B? l k Fbest ? l k ha ha l ? Ü hu hu\l Ü hn hn l ? Ü DRk DRk dl ? Ü F0 F0 fk dl ( ) + Ü l l j , ( ) j l , ( ) l j ha ? k c j( ) , ( ) c j( ) k , ( ) la lk c j() , lc j() k, ls , , , DAs Rkm Rmk j Out l ( ) ha ? ? s rk c j( ) , ? la la l j , ( ) ? Ü lu lu \ l j , ( ) Ü l s ls l j , ( ) ? Ü lk c j( ) , lk c j( ) , l j , ( ) ? Ü DAs DA s d l j ? Ü F0 F0 gs d l j ( ) + = 6 Optimal Solution Procedure 30 for every component for every shared resource ; ; ; ; ; for every component ; for every component 10. If all components are assigned, i.e. or in other terms then a. the vertex with value determines an acceptable solution of problem (4.1) - (4.4). If another acceptable solution was obtained before, then choosing the best one out of the both and now characterizes the cost of the current best solution. (At the beginning of Forward Procedure ). b. Vertices for which (6.15) are not promising because they do not contain the optimal solution and, therefore, they are eliminated from further consideration. c. If is equal to initial value of bounding function , i.e. , or rela- tion (6.15) is satisfied for all terminal vertices then found value is optimal and, therefore, , stop else the terminal vertices showing promise are branched and the search-tree is con- structed further (by choosing the vertex and corresponding level that is nearest one to the obtained solution and has a bounding value less than )1 till relation (6.16) will be satisfied for all remaining terminal vertices. For that go to step 7. 11. If not all components are assigned, i.e. then go to step 4. Let us consider in detail some steps of the algorithm. At step 1 and 4 `non-alternative' component assignments are detected. For every such compo- nent that have to be assigned to computer it is checked whether capacities of adjacent chan- nels of the computer are enough to place yet unassigned arcs of the component. 1 For the vertex, all parameters have to be reset. j In i( ) ha ? ? s rc j() k, ? la la j l , ( ) ? Ü lu lu \ j l , ( ) Ü l s ls j l , ( ) ? Ü lk c j( ) , lk c j( ) , j l , ( ) ? Ü DAs DA s d j l ? Ü F0 F0 gs d j l ( ) + = j Out l ( ) ha ? ? DRk c j( ) , m i n DA s s rk c j() , ? " , { } = j In i( ) ha ? ? DRc j() k, m i n DA s s rc j( ) k , ? " , { } = i h = ha h = B? h k F0 h ( ) = Fbest Fbes t - = Bin Fbest i , ? 1 h 1 ? , = Fbest B0 Fbes t B0 = Fbest F x i n ( ) Fbes t = l Fbes t ha hu h n la lu l n m l s DR n DRnm DAs Bi n , , , , , , , , , , l h < i n 6 Optimal Solution Procedure 31 Note that assignment of component to computer may be possible if at least the following relations are satisfied: - computer has enough computational resource to locate component , i.e. (6.16) - communication resources used for location of output and input arcs of component (which are not yet assigned and connect the component with already assigned components) have enough available capacities, i.e. , (6.17) , . Therefore the condition of resource capability to allocate a `non-alternative' components is checked by following operation: If for any `non-alternative' component , at least one relations of (6.17) is not satisfied then go to Roll-back Procedure else assigning the component to computer using procedure of step 9. 6.3.1 SubAlgorithm 1 The subAgorithm 1 computes bounding values for all vertices of a same level in the search-tree by examination every acceptable location of a component not yet assigned by the Forward Pro- cedure. The locations of the components are considered as mutually independent. The subAl- gorithm executes following key operations: - Examining all acceptable locations of a not yet assigned component. - Computing by formula (6.10) the bounding value for every acceptable location of the com- ponent using subAlgorithm 2 (see below). - Choosing the best location of the component using the bounding values. Suppose that we examine a level of the search-tree at which we construct terminal vertices, each for one possible assignment of component to corresponding computer of set . Let us consider level of the search-tree including vertices corresponding to possible assignments of component . Then a representation of the subAlgorithm 1 is as follows. i n n i d i DRn hn ( ) ? i di j d j i DA s ls ( ) ? j ha ? j c j() ( ) m = ? ? ? ? ö ? m n , ( ) p s ? ? + j ha ? j c j( ) ( ) m = ? ? ? ? ö ? n m , ( ) ps ? ? s rnm rmn ? ? " m " j ha In i ( ) Ou t i( ) ? ( ) ? ? ? ? ? ö c ( j( ) ? ? m ) = i i n l zl hu ? ? ? ö l zl hu ? ? ? ö l l 6 Optimal Solution Procedure 32 for every computer , i.e. for every possible assignment of component 1. Checking the resource capability of computer and his adjacent channels to locate compo- nent with his adjacent arcs: if relations (6.16) - (6.17) are satisfied for assignment of component to computer then `virtual' assigning1 component to computer and output and input arcs of compo- nent , that were not yet assigned and connect component with already assigned compo- nents , to corresponding channels of computer . else corresponding terminal vertex of level in the search-tree get bounding value and considering the next possible location of component , i.e the next computer , by return to the beginning of step 1. 2. Best placing of not yet assigned components : for every unassigned component ; (here is cost value of the potential best placement of component ) a. for every computer , i.e. for every possible assignment of component i if the DCS has potentially sufficient computational and communication resources for assignment of component to computer , i.e. relations (6.16), (6,17) are sat- isfied then `virtual' assigning component and its arcs connecting with already assigned components ; computing cost initial value of such assignments using formula (6.4) else component can not be assigned to computer , , and go to step f b. if there are `non-alternative' assignments of any component(s) adjacent to component , i.e. of set then `virtual' assigning `non-alternative' components and their arcs connecting with component ; increasing cost in the value of expenses for such assignments if not all `non-alternative' components can be assigned then component can not be assigned to computer , , and go to step f else go to step f c. If all components of are assigned 1 `Virtual' assignment means that only computational and communication resource expenses are calculated wit- hout real assignment of component to computers. The procedure of assignment is the same with one of step 9 of Forward procedure (see section 6.3) but work state-parameters are used instead of , cost parameter is used instead of and instead of . t zl hu ? ? ? ö ? l t l l t l t h a h u hn l a lu lnm ls DRn DRnm DAs , , , , , , , , , Blt F0 t k l l j ha ? t l B l t - = l t zl hu ? ? ? ö ? i hu \ l ? i hu \ l ? S i best - = Sibest i hu\l ? n z i hu ? ? ? ö ? i n i j ha Ou t i ( ) In i( ) ? ( ) ? ? Sin Min Ü i n S i n - = i hu Out i ( ) In i ( ) ? ( ) ? i S i n i n S i n - = hu Ou t i( ) In i ( ) ? ( ) ? 6 Optimal Solution Procedure 33 then an assignment of component to computer is acceptable and costs , go to step f d. Solving of subProblem 2 for given to deduce the best potential placement of unassigned arcs of component assigned (virtually) to computer e. if a solution of subProblem 2 is obtained then increasing cost in the got value of subProblem 2 else there is no solution for assignment of component to computer , . f. if then Recomputing lower bounding value for the terminal vertex corresponding to the assign- ment of component to computer : 6.3.2 SubAlgorithm 2 The subAlgorithm 2 is used for problem (6.12) - (6.14) solution and performs locating all not yet assigned adjacent arcs of a component by the branch-bound method. It executes following key operations: - Choosing next adjacent arc of given component - Examining all acceptable locations of the arc. - Computing the bounding value for every acceptable location of the arc. - Choosing the best location of the arc using the bounding values. The objective function (6.12) of subProblem 2 takes into account only communication resource cost of mapping not yet assigned components adjacent to component provided that component is assigned to computer . Constraints (6.13) guarantee that computational resources used by the adjacent components assigned to a computer do not exceed the available resource of the computer. Constraints (6.14) guarantee that summary capacity of the output and input arcs of component placed onto communication resource does not exceed the available capacity of the resource. Let us consider an approach based on the branch-bound method to solve the problem (6.12) - (6.14). Let us denote additional work parameters: be the set of components assigned to computers including `virtual' assignments per- formed by Forward Procedure; i n Sin - < i n , ( ) i n Sin L min in i n Sin - = Sin S i best < Sibe s t Sin = l t B l t Blt S i best + Ü i i i n i s höa 6 Optimal Solution Procedure 34 be the set of components assigned to computer including `virtual' assignments, ; be the set of arcs assigned to communication resource including `virtual' assign- ments; be the set of components that are not yet assigned taking into account `virtual' assign- ments performed, . Note, that subAlgorithm 2 gets sets , , and from the subAlgorithm 1 and does not change its; , be the variable sets of unassigned components, adjacent to component , that are connected to component by his output arcs and input ones respectively. The initial values of the sets , , ; , be the variable sets of components, adjacent to component and corre- sponding to output and input arcs of the component, that are assigned by subAlgorithm 2. Initial values of the sets , , . Always during the performance of subAlgorithm 2, the following relations are satisfied , ; be the initial number of arcs adjacent to component that are not yet assigned, ; be the set of arcs assigned by subAlgorithm 1 to communication resource ; be the set of such computers that are connected to computer by corresponding out- put channels which can be used for placing yet unassigned output arcs of compo- nent ; be the set analogous with but for input channels of computer and input arcs of component ; Inasmuch as index of component is fixed and there is only one arc between any pair of adja- cent components, we can use for numbering output and input arcs of component the same indexes of components that are adjacent to component . hön n hön n ? höa = lös s r ? höu höu h \ höa = höa hön lös höu DhOu t u DhIn u i i DhOu t u höu Ou t i( ) ? = DhIn u höu In i( ) ? = Dhu DhOu t u DhIn u ? = DhOu t a Dh In a i DhOu t a ? = DhIn a ? = Dha DhOut a Dh In a ? ? = = DhOu t a DhOut u ? höu Out i ( ) ? = DhIn a DhIn u ? höu In i( ) ? = J i J Dhu Dha ? = Dls j Dha ? s zOu t n n m , ( ) i z In zOu t m n , ( ) n i i i i 6 Optimal Solution Procedure 35 A search-tree construction will be carried out with respect to levels corresponding to arcs . Each vertex of level is characterized by corresponding set which determines assignments of first arcs to chan- nels. Here pair denotes that arc is assigned to channel 1 and . Let us derive a lower bound of objective function (6.12) for the vertex of level . Taking into account that arcs, adjacent to component , are already assigned, the objective function can be rewritten as , , (6.18) where determines the cost of assignments of first arcs that were already per- formed. The term is a cost value of further placements of arcs that are not yet assigned. To obtain a lower bound of we relax the integer programming problem (6.12) - (6.14) to continuous one. The relaxation allows to place arcs and components adjacent to com- ponent fractionally, i.e. more than one instance of an arc or a component may exist with each instance handling just a fraction of load imposed for the (individual, atomic, not replicated) arc or the component respectively. Let us define new solution variables - , , that determines a part of capacity of arc absorbed by computer to which corresponding fraction of component is assigned, i.e. , (6.19) where is the available resource of computer ; - ( , , ) denoting a value of capacity allocated for arc into channel . Arc may obtain required capacity from one or some different channels. Thus , (6.20) , (6.21) 1 To simplify notation we assume if is an output arc of component then is the input channel of computer (and the output channel of computer ). Otherwise if is an input arc of component then is the output channel of computer (and the input channel of computer ). j 1 J , = j Dha ? t J t j1 k1 , ( ) j2 k2 , ( ) ? j t k t , ( ) , , , { } = t jt k t , ( ) jt k t j t i k t k t n j t i kt k t n Dha j 1 j 2 ? jt , , , { } = t t J < i L min in höu ? ? ? ö 1 2--- LJ t Dha ? ? ? ö LJt Dhu ? ? ? ö + ? ? ? ö = LJt Dha ? ? ? ö gnc j() d i j ( ) gc j( ) n d j i ( ) j DhIna ? ? + j DhOu t a ? ? = LJ t Dha ? ? ? ö t j J t ? LJ t Dhu ? ? ? ö j Dhu ? LJ t Dhu ? ? ? ö i zjn j Dhu ? j n j 0 zjn m i n DRn dj , { } d j ------------------------------------ d i j dij ? ? ? DRn n z j k j Dhu ? k n ? k zOu t z In ? ? j k j 0 zjk zjn + d i j ? ? j " DhOut u k " zOu t ? , ? zjk zjn + k zO ut ? ? di j = j DhOu t u ? " 6 Optimal Solution Procedure 36 , (6.22) , (6.23) Now, taking into account formulation of suProblem 2 (6.12) - 6.14) and relations (6.19) - (6.22), the minimization problem for solving lower bound of in (6.18) can be formu- lated as (6.24) subject to (6.25) ,(6.26) , where if , if , and , since (6.27) Here is the cost of capacity allocated for arc in channel if , and in channel if . If component is placed into the same computer already containing component then arc connecting components and does not need a communication resource and therefore . Problem (6.24) - (6.26) can be solved, for example, by the simplex method. However we present a more simple and economical computation approach. Let us order output channels and input ones so that , and . For every arc adjacent to component the row of cost values and of assignments to corresponding channels are determined. (Note that ). If arc can not be assigned, for 0 zjk zjn + d j i ? ? j DhIn u k " zIn ? , ? " z j k z j n + k zI n ? ? d j i = j DhIn u ? " BJ t LJt Dhu ? ? ? ö LJt Dhu ? ? ? ö BJ t Dhu ? ? ? ö ? min z j k zjk j DhO ut u ? ? a j k z j kajk j DhI nu ? ? k zI n n ? ( ) ? ? + k zO ut n ? ( ) ? ? ? ? ? ? ? ü = zjnd j d i j ----------- DRn hön ( ) dj j Dha ? j c j() \ n = ( ) ? ? ? ö ? ? ? j Dh u ? ? DRn t Dha ? ? ? ö = zjk z j k DAs lö s ( ) dab a b , ( ) Dl s ? ? ? ? j DhInu ? ? n k, ( ) ps ? ? + j DhO ut u ? ? n k , ( ) ps ? ? DA s t Dls ( ) = s rnk rkn ? ? " k zOut zIn ? ? " ajk gnk dij ( ) d i j -------------------- = j DhOu t u k zOu t ? , ? ajk gnk dji ( ) d j i -------------------- = j DhIn u k zIn ? , ? ajn 0 = gnn 0 = zjka j k z j k d ? i j i j , ( ) n k , ( ) k DhOu t u ? k n , ( ) k DhIn u ? j Dhu ? n i i j ajn 0 = n k , ( ) k , zOu t ? k n , ( ) k , zIn ? gnk D ( ) gn k l + ( ) D ( ) l 1 ? , ? gkn D ( ) g k m + ( ) n D ( ) m 1 ? , ? j i gnk gkn gnn 0 = j 6 Optimal Solution Procedure 37 example to channel , then we set and do not take into account this channel, when arc is considered. Now the solution of problem (6.24) - (6.26) can be obtained by sequential filling - at first, computer by unassigned components , adjacent to component and ordered (in accordance with their arcs) in decreasing with respect to , till relation (6.25) is satisfied. - then channels, ordered in increasing of cost functions and , by unassigned arcs adjacent to component and ordered as mentioned above. Let us present the algorithm for the problem (6.12) - (6.14) solution. 1. Ordering the output and input channels of computer in increasing with respect to and , as mentioned above. Ordering the output and input arcs of component in decreasing with respect to . 2. . 3. Constructing the next level of the search tree (by considering all possible placements of the next unassigned arc adjacent to component ) ; 4. terminal vertices will be built at level of the search-tree. (Here denotes a num- ber of acceptable assignments of arc including the case of absorption the arc by computer ). For every such vertex corresponding to assignment considered at level , the bounding value is computed , (6.28) where the first term is calculated by formula (6.18). For computing , the problem (6.24) - (6.26) is solved by sequential filling at first computer then channels in the order as men- tioned above by components and arcs respectively. 5. Choosing the terminal vertex of level that has the lowest bounding value . Here chosen vertex corresponds to placement of arc into channel (computer) 6. If then return to the previous level of the search tree resetting state parameters of assignments corresponding to this level. If then go to step 5 else the original problem (6.12) - (6.14) has not an acceptable solution. 7. Assigning arc to channel (computer) . 8. If all arcs are assigned, i.e. then a. The vertex with value determines n k , ( ) gnk - = j n j i a j k gnk gkn i n gnk gkn i ajk t 0 Jt , ? = = j t i t t 1 + Ü K t 0 > t Kt j t n jt m , ( ) t PJt m ( ) LJt Dha j t ? ? ? ? ö BJt Dhu jt ? ? ? ? ö + = BJt n t PöJt PJt kt ( ) m i nm PJt m ( ) { } = = j t k t PöJt - = t t 1 ? Ü t 0 > jt k t t J = PöJJ gnc j() dij ( ) gc j() n dji ( ) j J J ? j DhI na ? ? ? ? ? ? ö ? + j JJ ? j DhOu t a ? ? ? ? ? ? ö ? = 6 Optimal Solution Procedure 38 an acceptable solution of problem (6.12) - (6.14). If another acceptable solution was obtained before, then choosing the best one from these both and restoring it's cost as b. Vertices for which (6.29) are not promising because they do not contain the optimal solution and, therefore, they are eliminated from further consideration. c. If relation (6.29) is satisfied for all terminal vertices then found value is optimal and, therefore, else the terminal vertices showing promise are branched and the search-tree is constructed further (by choosing the terminal vertex that is nearest one to the obtained solution and has a bounding value less than )1 till relation (6.29) will be right for all remaining terminal vertices. Go to step 4. 9. If then the vertex of level with lowest bounding value will be branched with respect to the variable , go to step 3. At step 2 and step 4 computing bounding values is performed: a. Computing bounding value for the case of absorption of arc by computer : if component has enough resource to place adjacent component (in other words, to absorb arc ), i.e. then `virtual' assigning component to computer (that means absorption of arc by computer ); best `virtual' placing yet unassigned arcs adjacent to component by sequen- tial filling, at first computer by absorbing the arcs, and then ordered channel, adja- cent to the computer, by ordered arcs; computing bounding value else ; b. Computing bounding values of placements of output arc into every output channel : if arc is output one with respect to component then 1 All parameters of the vertex have to be reset. Lbest PJt Lbest t , ? 1 J 1 ? , = Lbe s t L m i n i n 1 2--- Lbe s t = Lbest t J < t PöJ t j t 1 + PJ t m ( ) PJ t n ( ) jt n n j t jt DRn d j t ? j t n jt n j Dhu ? i n PJt n ( ) PJt n ( ) - = PJt m ( ) jt n m , ( ) jt i 6 Optimal Solution Procedure 39 for every output channel of computer , i.e. for every possible assign- ment of arc if channel has enough resource to place arc , i.e. then `virtual' assigning arc to channel best `virtual' placing yet unassigned arcs adjacent to component by sequential filling, at first computer by absorbing the arcs, and then ordered channel, adjacent to the computer, by ordered arcs; computing bounding value else ; c. Computing bounding values of placements of input arc into every input channel : if arc is input one with respect to component then for every input channel of computer , i.e. for every possible assignment of arc if channel has enough resource to place arc , i.e. then `virtual' assigning arc to channel best `virtual' placing yet unassigned arcs adjacent to component by sequential filling, at first computer by absorbing the arcs, and then ordered channel, adjacent to the computer, by ordered arcs; computing bounding value else At step 7 assignment of arc to channel (or computer) is as follows: , , ; if arc is absorbed by computer , i.e. then ; else if arc is output one of component then for every shared resource ; ; ; Recomputing available capacity of output channel : n m , ( ) n j t n m , ( ) jt DRnm dijt ? j t n m , ( ) j Dhu ? i n PJt m ( ) PJ t m ( ) - = PJt m ( ) jt m n , ( ) jt i m n , ( ) n jt m n , ( ) jt DRmn djti ? j t m n , ( ) j Dhu ? i n PJt m ( ) PJ t m ( ) - = jt kt Dha Dha jt ? Ü Dhu Dhu \ j t Ü Jt J t jt k t , ( ) ? Ü j t n k t n = Dhn Dhn jt ? Ü DRn DRn d j t ? Ü jt i s rnk t ? Dl s Dls i jt , ( ) ? Ü DA s t DA s t d i j t ? Ü LJ t LJt gs dijt ( ) + Ü n k t , ( ) 6 Optimal Solution Procedure 40 ; if arc is input one of component then for every shared resource ; ; Recomputing available capacity of output channel : . 6.3.3 Example using SubAlgorithm 2 Let us consider an example of locating arcs of component d (see Figure 2.1). Assume that com- ponent d is assigned to computer C (see Figure 2.9) and no other components are yet assigned. In Figure 6.3 the part of DMA graph, that have to be assigned, and the part of DCS graph, used for the assignment, are depicted. Table 6.1 presents initial data of components, arcs and channels ordered in accordance with step 1 of subAlgorithm 2. (The initial data are chosen to present more general and interesting case). Every arc adjacent to component d is presented in the table by two columns: the first one for a component adjacent to component d and the second one for the arc connecting the component to component d. Cells of the last row contain values of required resources of components and arcs adjacent to component d. Last four columns are used for values of available DCS resources: column R - for DRnk t min DA s s rnk t ? " , { } = jt i s rk t n ? Dl s Dls jt i , ( ) ? Ü DA s t DA s t d j t i ? Ü LJ t LJt gs djti ( ) + Ü k t n , ( ) DRk t n min DA s s rk t n ? " , { } = b c d e 3 3 3 4 4 2 2 A B C d E D 1 3 1 3 2 2 a) b) Figure 6.3,a) Component d of DMA graph with adjacent arcs and components, b) Computer C of DCS graph with adjacent channels 6 Optimal Solution Procedure 41 resource of computer C and for capacities of virtual channels, columns - for capac- ities of communication resources over which corresponding channels are routed. Cost values computed by formula (6.27) and used for ordering arcs and channels are pre- sented in columns of components and arcs. The filling in the table by takes into account that input arcs of component d can be placed only into input channels of computer C and output arcs - into output channels. Rows of the table follow the order: the row for computer C, the rows for input channels of computer C and the rows for output channels. The second row of the table corresponds to computer C which component d is assigned to. The computer can absorb (completely or partly) an arc adjacent to component d. A partly absorption of an arc in accord with formula (6.19) is acceptable provided that at least one not yet assigned component adjacent to component d can be previously completely allocated into computer C1. Table 6.1 Arc allocation are executed by filling in the table from link to right and from top to down direc- tions. An illustration of arc (b,d) allocation will make this clear. The search-tree of possible alternatives is depicted in Figure 6.4. The vertex C of the first level gets bounding value since computer C has not enough resource to locate component b and to absorb arc (b,d). Let us consider the next vertex (D,C) of the first level. The allocation of arc (b,d) into channel (D,C) causes decrease (by ) of capacity of: - channel (D,C), - all communication resources over which channel (D,C) is routed, - all other channels that are routed over the communication resources, capacity of which were decreased. 1 This condition allows to get more exact bounding value. b (b,d) e (d,e) c (c,d) R A1 A2 A3 C 0 0 0 2 (D,C) 4 2 5 5 (E,C) 4 2 5 5 (A,C) 5 3 3 4 3 (B,C) 5 3 3 4 3 (C,D) 3 5 5 (C,E) 3 5 5 (C,A) 4 3 4 3 (C,B) 4 3 4 3 d 3 3 4 3 2 2 A1 A2 A3 , , ajk ajk - dbd 3 = 6 Optimal Solution Procedure 42 Table 6.2 represents the resource state of DCS after assignment of arc (b,d) to channel (D,C). In the cell corresponding to allocation arc (b,d) into channel (D,C), value 3 shows that arc (b,d) gets completely required capacity by placing into channel (D,C). In column R and A2, capacities of channel (D,C), communication resource 2 involved into the channel, and other channels that are routed over communication resource 2 are decreased from 5 to 2 by . Table 6.2 b (b,d) e (d,e) c (c,d) R A1 A2 A3 C 0 0 0 2 (D,C) 4 3 2 2 5 2 5 (E,C) 4 2 2 5 2 5 (A,C) 5 3 3 4 3 (B,C) 5 3 3 4 3 (C,D) 3 2 5 2 5 (C,E) 3 2 5 2 5 (C,A) 4 3 4 3 (C,B) 4 3 4 3 d 3 3 4 3 2 2 dbd 3 = dbd 3 = DC - - - - C DC EC AC BC C CD CE CA CB C CD CE CA CB C DC EC AC BC 22 22 26 26 24 24 24 24 24 28 28 30 30 - - - assigning arc (b,d) arc (d,e) arc (c,d) Figure 6.4. Search-tree for optimal assigning arc adjacent to component d 6 Optimal Solution Procedure 43 In accordance with the table, next must be assigned arc (d,e). First one attempts to locate com- ponent e into computer C. However available resource of computer C is not enough to locate component e completely. To locate component e partly, first one must to locate at least one not yet assigned component into computer C completely. Next component c in the table is such one. After allocation of component c into computer C, available resource of computer C decrease to . Therefore arc (e,d) now can not be absorbed by computer C and it must be allocated into an input channel of computer C. One uses the following rule: locate an arc into the cheapest acceptable channel. Following the rule, arc (e,d) is located partly: 2 units of required capacity to channel (C,D) and 1 unit - to channel (C,A). Table 6.3 represents the resource state of DCS after these assignments and allows to compute the bounding value of the vertex (D,C) by formula (6.24): . Table 6.3 Using the proposed procedure of filling in the table 6.1 for other vertices of the first level, one obtains bounding values for the vertices. Choosing the vertex (D,C) with minimum value, one continues the branching search-tree. At the third level one gets the first acceptable solution: arc (b,d) assigned to channel (D,C), arc (d,e) - to channel (C,A) and arc (c,d) is absorbed by com- puter C. The solution has value of objective function L = 24. Only one vertex (E,C) of the first level has value 22 less than L. Therefore one executes branch- ing the vertex. However new bounding values, obtained at the second level, are not less than L. Thus the first solution obtained is optimal. b (b,d) e (d,e) c (c,d) R A1 A2 A3 C 0 0 0 2 0 2 (D,C) 4 3 2 0 2 5 0 2 5 (E,C) 4 2 0 2 5 0 2 5 (A,C) 5 3 3 4 3 (B,C) 5 3 3 4 3 (C,D) 3 2 0 2 5 0 2 5 (C,E) 3 0 2 5 0 2 5 (C,A) 4 1 2 3 3 4 2 3 (C,B) 4 2 3 3 4 2 3 d 3 3 4 3 2 2 DRC 2 = DRC 0 = PDC a j kz j k j k, ? 0 2 4 3 3 2 4 1 ? + ? + ? + ? 22 = = = 6 Optimal Solution Procedure 44 6.4 Roll-back Procedure The Roll-back Procedure is as follows: 1. Storing the reached `not full' DMA location in set that can be used further by Reduction Procedure if it will be performed. 2. Rolling back to the nearest level of the search-tree. 3. If the level is highest one, i.e. indexed by 0 and consists of a single root-vertex then there is no solution of Original Mapping Problem for given data, go to Reduction Procedure else the acceptable solution obtained is optimal, stop 4. If at the level there is at least one acceptable vertex (with bounding cost value ) not yet examined then choosing the best one, resetting state-parameters of the vertex ( ), and go to step 7of the Forward Procedure else go to step 2 6.5 Penalty Algorithm 6.5.1 Criteria and method of choosing next assignment The structure of the Penalty Algorithm is analogous with one of the Ord-Algorithm presented in Figure 6. However the content of the Forward Procedure distinguishes from the one of the Ord-Algorithm and will be considered below. Let us examine the Forward Procedure. A criteria of choosing next assignment of component to computer based on cost matrix and suggested by Vogel's Approximation Method [4]. Ele- ment computed by formula (6.9) determines the minimum attainable cost associated with assigning component to computer . The Vogel's Method, used for solving transportation problems, falls into a class of so-called penalty methods. It is based on the `difference' associ- ated with each row and column in the matrix . A row or column `difference' is here defined as the arithmetic difference between the second smallest and the smallest element in that row or column. This quantity provides a measure of priorities for making allocations to the respective rows and columns since it indicates the minimum unit penalty incurred by failing to make the assignment to the smallest cost cell in that row or column. The selection of an element Y B? Fbes t < ha hu hn la lu lnm l s DRn DRnm DA s Bin , , , , , , , , , , i n , ( ) i n Sin i hu n z hu ? ? ? ö ? , ? , { } S i n i n S i n { } 6 Optimal Solution Procedure 45 is then made by selecting the row or column with the largest difference and choosing the smallest cost cell in that row or column. Thus determination the assignment by the penalty method is as follows. 1. Computing the row difference by finding the arithmetical difference between the second smallest and the smallest element in each row of matrix . 2. Computing the column differences in the same way. 3. Finding the largest of the row and column differences and then the smallest cost cell in that row or column. Suppose element has been located in this way. Then assignment will be done next. 6.5.2 Forward Procedure Let us present the Forward Procedure. 1. Checking and executing `non-alternative' assignments similarly to step 1 of Forward Pro- cedure of Ord-Algorithm (see section 6.3). 2. Computing initial value of bounding function similarly to step 3 of Forward Procedure of Ord-Algorithm. 3. Checking and executing `non-alternative' assignments similarly to step 4 of Forward Pro- cedure of Ord-Algorithm. 4. Computing the bounding cost matrix by formula (6.9) using the subAlgorithm 1 for every component . 5. If there is at least one component yet unassigned that can not be assigned then go to Roll- back procedure. 6. Applying the penalty method to determine assignment of component to computer that will be done next. 7. Assigning component to computer . Increasing cost in the value of expenses of such assignment. 8. If all components are assigned then checking whether the solution obtained is optimal sim- ilarly to step 10 of Forward Procedure of Ord-Algorithm. 9. If the current value of cost does not exceed the cost of the best solution found previously then go to step 3 else go to Roll-back Procedure. Sin Sin { } S i n i n , ( ) Sin i hu n z hu ? ? ? ö ? , ? , { } i hu ? i n , ( ) i n i n F F Fbest 7 Reduction Procedure 46 7 Reduction Procedure The procedure obtains the DMA placement into DCS with minimum relaxation of resource requirements. 7.1 Problem Formulation Let be the summary resource of computer used by components assigned to the computer; be the summary capacity of channel used by DMA arcs assigned to the channel. Let us consider ratio which means a fraction of computer resource used by all components assigned to computer . If then location of the components can be done onto computer without resource requirement relaxation. Otherwise, is the measure of resource gap of computer . If we have a procedure for a distribution of the resource gap between all components assigned to the computer1, then we can get the resource gap of the computer with respect to individual component . Value can be used further for evaluation of QoS relaxation. The communication resource requirement relaxation can be determined similarly by utilization factor for every DCS communication resource . Following mapping problem formulation allows to get an acceptable DMA location if DCS resources are enough to meet resource requirements of DMA, otherwise to get an optimal DMA location that guarantees minimum resource requirement relaxation with respect to computers and communication resources of DCS: (7.1) subject to (7.2) 1 This is another optimization problem does not examined here. Dn hn ( ) d i i hn ? ? = n hn Dnm lnm ( ) d i j i j, ( ) lnm ? ? = n m , ( ) lnm dn Dn Rn ? = Rn n dn 1 ? n dn 1 0 > ? n d i n i d i n e s Dnm lnm ( ) n m , ( ) p s ? ? A s ? = s r ? G x ( ) m i n max dixin i hu ? ? Rn ----------------------- n z ? " , ? ? ? ? ? ? ? ? ? ö d i j x i nx j m i j, ( ) l ? ? n m , ( ) p s ? ? As -------------------------------------------------------------- s r ? " , ? ? ? ? ? ? ? ? ? ö , ? ? ? ? ? ? ? ? ? ü ? ? ? ? ? ? ? ? ? ü = x i n n z i hu() ? ? 1 i h ? " , = 7 Reduction Procedure 47 7.2 Problem Solution Technique An algorithm proposed is based on a branch-bound method that is analogous with one of Opti- mization Procedure (see section 6.1). Suppose that for any terminal vertex of the search-tree, component assignments , have already been made, . The DMA arc assignments to DCS chan- nels , , depends in a unique fashion on the component assignments. On the other hand, having the arc assignments to channels one can get correspond- ing arc assignments to DCS communication resources . To take into account different features of component and arc locations some lower bounds of objective function (7.1) are proposed. Evidently, maximum utilization of DCS computers produced by assigned components can be used as one of bounds: (7.3) For computing lower bound for a terminal vertex of the search tree, best placements of not yet assigned components are sought. Suppose that placement of such component k to computer l is examined. Then the conditional utilization of the computer will be . The placement of component k calls for locations of his adjacent arcs that connect the compo- nent to already assigned components. These arc locations onto channels adjacent to computer l can cause an utilization change of DCS communication resources over which the channels are routed. Let be maximum utilization of such communication resources provided that com- ponent k is placed into computer l, i.e. , where and (n,m) are the channels adjacent to computer l. Therefore maximum utilization of DCS resources obtained provided that component k is placed into computer l is . The best placement of component k is characterized by utilization factor . t ha h < = hn n z ? , hn n z ? ? ha = lnm n m , ( ) p ? , l n m , ( ) n m , ( ) p ? ? la = ls lnm n m , ( ) ps ? ? = G1 max dn hn ( ) n z ? , { } = G2 kl d l dk Rl ? + = G3 i n G3 kl maxs es k l fi ? { } = e s Dnm lnm ( ) n m , ( ) ps ? ? As ? = G23 kl max G2 k l G3 i kl , { } = G23 k m i n l G23 kl { } = 7 Reduction Procedure 48 Thus the other bound of objective function can be maximum utilization reached by best place- ment of one of not yet assigned components: (7.4) If there is a gap of computational resources in the DCS, next third bound of objective function is useful too. Let us order computers in decreasing with respect to computational resources available to mapped DMA: (7.5) where , If available resources of two (or more) computers are equal, then these computers are ordered in increasing with respect to utilization factors (7.6) Assume that not yet assigned components will be distributed between computers with most available resources such that the utilization of these computers would be uniform. Indexes of these computers are the first elements of the sequence ordered corresponding to relation (7.5) and (7.6). Then the average uti- lization factor of each such computer is (7.7) Evidently, objective function . Thus a lower bound of the objective function (7.1) for the vertex of performed assignments can be computed as follows: (7.8) Efficiency of the bound proposed depends on the order of not yet assigned components used for construction of next level of the search-tree, i.e. the method of choosing next component rela- tive to which the branching of the chosen vertex has to be done is important. We propose to choose such component using following criterions: a. component with most number of adjacent arcs that connect, the component to already assigned components (this property is important for the bound ), b. if there are some components that have equal most numbers of such adjacent arcs, then one chooses the component of these ones that requires most computational resource (this property is important for and ), G23 maxk G23 k { } = n1 n2 ? n z , , , ( ) DRn 1 DRn2 ? ? DRnz ? ? ? DRn z ? ? DRnz Rn z Dnz ? = dn1 dn2 ? dn z ? dn z ? ? ? ? ? hu z m i n hu z , { } = n1 n2 ? n z ? n z , , , , , ( ) G4 1 z--- dn k k 1 = z ? di i h u ? ? Rnk k 1 = z ? ------------------ + ? ? ? ? ? ? ? ? ? ? ? ö = G G4 ? hn n z ? , BG hn n z ? , ( ) max G1 G23 G4 , , { } G ? = G23 G1 G4 7 Reduction Procedure 49 c. if there is no component with such adjacent arcs, then one chooses the component with most required resource. 7.3 Algorithm The algorithm of the Reduction Procedure is as follows: 1. Computing initial lower bound using formulas (7.4), (7.7), (7.8) for and including only attached components that are already placed by the Attached Component Location Procedure. Let . 2. If not all components are assigned, i.e. then choosing next component using criterions mentioned above else go to step 6 3. Constructing terminal vertices at level of the search-tree. Each of such verti- ces corresponds to one computer that component could be assigned to. For every terminal vertex, bounding value are computed using formula (7.3) (7.4), (7.7), (7.8) provided that component is placed into corresponding computer 4. Choosing the terminal vertex of level that has the lowest bounding value. Suppose, it is the vertex with value . If then go to step 7. 5. Including the vertex into the current solution. Assigning component to computer . Go to step 2. 6. An acceptable solution is obtained. Suppose, value of the solution is . If then the solution is optimal, stop. If then storing the solution and let . 7. Excluding the last vertex from the solution. Rolling back to the previous level . If then stop, optimal the solution is obtained else restoring vertices of level and go to step 4. Remark. Reduction Procedure can use set of `not full' solutions of DMA that are storing by Roll-back Procedure (see section 6.4). Then before step 2, one has to choose the best `not full' solution of set and to restore the state parameters of the solution as initial ones. The best `not full' solution could be a solution with maximum number of assigned com- ponents, . The proposed approach allows to obtain quickly an acceptable solution, value of which can be used further to improve the solution. The obtained optimal solution could be used further in the following ways: GI0 ha hn n z ? , Gbest - = hu 0 > i hu ? l ha 1 + = n i BGn ha i ? ? ? ? ö i n l BGn BGn Gbe s t > i n BG BG BG0 = BG Gbe s t < Gbest BG = l l 0 = l Y Y ha hu hn la ls , , , , maxY ha { } 8 Examples 50 - Distributing the resource gap in every overflowed computer and channel between compo- nents and arcs assigned to them respectively. - Checking the reduction level of QoS caused by the DCS resource gap. - If the reduction level of QoS is held within allowable limits vertices then an acceptable relaxed solution is obtained, stop. 8 Examples 8.1 Example 1 Let us consider an example of mapping DMA depicted in Figure 2.1 to DCS depicted in Figure 2.7, system graph of which is presented in Figure 2.9. Required computational resources of components and capacities for component communications are shown in Figure 2.1 as weights of nodes and links of DMA graph respectively. Available capacities of DCS communication resources LAN1, LAN2 and WAN are respectively. Initial data of available capacities of virtual channels and components of DCS are presented by matrix (2.3). Let us determine cost functions. To simplify computations, we consider linear cost functions of capacity and computational resource. Suppose, that costs of one unit of the computational resource for computers of DCS are following: . Costs of one unit of capacity for LAN1 is , for LAN2 - and for WAN - . Then one can calculate cost factor for every channel of DCS by formula , where is the set of communication resources of DCS over which channel is routed. So the final cost factors for computers and virtual channels of DCS can be represented by matrix : Assume, that components and are attached to computers and respectively. Now all input data are given. A B C D E A 2 2 5 6 6 B 2 1 5 6 6 C 5 5 2 4 4 D 6 6 4 2 1 E 6 6 4 1 1 A1 5 A2 , 6 A3 , 3 = = = aA 2 aB , 1 aC 2 aD , 2 aE 1 = , = = , = = a1 2 = a2 1 = a3 3 = n m , ( ) anm a i i rnm ? ? = rnm n m , ( ) a f g E D 8 Examples 51 In accord with the procedure for location of attached components (see section 5), one gets: - ; ; - constant term of cost objective function ; - modified matrix of required resources d: - modified matrix of available resources R: Further computations are executed in accord with Optimization Procedure (see section 6). At step 1 of Forward Procedure, `non-alternative' assignments are not found. At step 2, ordering components, using for example cost function , results in the sequence of components e, d, b, c, a with cost values 14, 12, 6, 4, 2 respectively. This component sequence is used further to construct levels of the search-tree depicted in Figure 8.1. a b c d e f g a 1 1 b 3 3 c 2 2 d 4 3 e 4 3 3 f 0 g 0 A B C D E A 5 5 3 3 3 B 5 4 3 3 3 C 3 3 6 6 6 D 3 3 6 5 6 E 3 3 6 6 4 x f E 1 xfn , 0 n , A B C D , , , = = = xgD 1 xgn , 0 n , A B C E , , , = = = F0 aE df aD dg ? + ? 1 3 2 3 ? + ? 9 = = = f E di ( ) gE d i j ( ) gE d j i ( ) + [ ] j ? + 8 Examples 52 At step 3 for the vertex of level 0, one computes initial bounding value of best independent placements of all unassigned components using subAlgorithm 1. In Figure 8.2, bounding values computed by subAlgorithms 1 and 2 are presented for all possi- ble placements of every unassigned component. The value in brackets corresponds to term in formula (6.4), i.e. to cost of placements of component i into computer n and arcs that are adja- cent to the component and connect it to already assigned components. For example, bounding value for assignment of component e to computer C is equal to 39.5, the part of which 32 cor- responds to summary cost of placement of component e to computer C and placements of arcs (e,f) and (e,g) connecting component e to already assigned components f and g (see Figure 2.1). Therefore 39.5 - 32 = 7.5 is the bounding value of in formula (6.5) that is the half of the cost of placements yet unassigned adjacent arcs (a,e) and (b,e). B0 hu e d b c a , , , , { } = M i n Lin DC A B C D E - - - - - - A B C D E - - 48.5 (17) 48.5 (9) 64 (40) A B C D E - - - A B C D E - - - A B C D E - - - 64 (42) 67 (63) 64 (60) 64 (64) Assigning component d component e component a component b component c Figure 8.1. Search-tree constrained by Optimization Procedure Level 0 2 3 4 5 1 - - 8 Examples 53 Choosing the best placement of every unassigned component that is characterized by minimum cost value, one obtains the initial bounding value . In Figure 8.1, value of already assigned compo- nents f and g is shown in brackets near the value 48.5. In the process of computa- tion, one obtains cost bounding values matrix M of all possible placements of every isolated unassigned component: A B C D E a 2 2 2 2 1.5 b 9 6 12 7.5 4.5 c 6 4 4 5 3 d 20 e 39.5 12.5 10.5 B0 F0 10.5 20 4.5 3 1.5 + + + + + 48.5 = = F0 9 = B0 - - - - - - e A B C D E - - 39.5 (32) 12.5 (11) 10.5 ( 7) d A B C D E - - 20 (8) b A B C D E c A B C D E a A B C D E - - 9 (6) 6 (3) 12 (6) 7.5 (6) 4.5 (3) 6 (4) 4 (2) 4 (4) 5 (4) 3 (2) 2 (2) 2 (1) 2 (2) 2 (2) 1.5 (1) Figure 8.2. Bounding values for all possible independent of components e, d, b, c, a assignments 8 Examples 54 The matrix contains lowest bounding values for every possible placements of every component and shows: - component d can not be placed into computers A,B,D,E; - component e can not be placed into computers A and B; - component d can be placed only into one computer that is C. This information allows not to compute bounding values for vertices of next levels correspond- ing to such placements of components d and e. In accord with step 4 of Forward Procedure, `non-alternative' assignment of component d to computer C is considered at the next first level of the tree (see Figure 8.1). The bounding value of the vertex C is equal to 48.5 and value in brackets 17 shows the cost of already assigned com- ponents and arcs. Other vertices obtain value in accord with matrix M without additional computations. (In Figure 8.1 these values are enclosed in rectangles). After this assignment, computational resource of computer C is decreased to . At the second level of the tree, possible assignments of component e are examined. In accord with bounding matrix M vertices A and B get value . Computations by subAlgorithm 2 (see section 6.3.2 and 6.3.3) result in value for vertices C and E. Really, assignment of component e to computer C is impossible because available resource of the computer is not enough to meet resource requirement of component e. Assignment of the component to computer E is not acceptable too because, after this assignment, available capacity of com- munication resource and then arc (a,e) of component a can not be placed. SubAlgorithm 2 obtains value 64 for vertex D of second level. Value 40 in brackets corresponds to the cost of already assigned components f, g, d, e and their arcs (e,f), (e,d), (d,e) connecting the components with each other. At the step 7 of Forward Procedure the vertex D of second level is chosen for further branching. In the process of computing bounding value of the vertex D (at step 6), rows of cost bounding matrix M that correspond to not yet assigned components a,b,c are modified: Following Forward Procedure, one returns from step11 to step 4 and can see from these rows of matrix M that component a has `non-alternative' assignment to computer D. This assignment is done at the next third level of the search-tree. At that vertices A, B, C, E get value in accor- dance with matrix M without additional computations. A B C D E a 2 b 14 12 c 21 18 - - DRC 2 = - - DRC 2 = de 4 = DA2 0 = - - - - - - - - - - - 8 Examples 55 As shown in Figure 8.1 the first acceptable solution obtained has cost value 64 and is optimal. Figure 8.3 depicts the solution. b d c f g e a D E B C Figure 8.3. Optimal solution 3 3 3 4 1 3 3 4 2 6 4 7 3 2 1 3 2 8 by Optimal Procedure obtained DC A B C D E A B C D E A B C D E A B C E - D A B C D E 64 48.5 (9) - 58 56.5 (16) 65 Figure 8.4. Search-tree obtained by Optimal Procedure - - - - - - - - (20) - A B C D E 67 64 (36) - - - - 64 (40) (61) (58) - - - - 64 (62) - - - (64) Assigning component e d b c a without use of `non-alternative' assignments 8 Examples 56 Note, that checking `non-alternative' assignments at step 1 and step 4 of Forward Procedure allows to reduce the search-tree by decrease of number of vertices examined. By way of com- parison, Figure 8.4 depicts the search-tree constructing without use of the `non-alternative' assignments and consisting of 31 vertices examined (instead of 26 ones in Figure 8.1). 8.2 Example 2 Let us consider the previous example of mapping the DMA graph depicted in Figure 2.1 to the DCS graph depicted in Figure 2.9. However suppose, that components f, g and a are attached to computers E, D and C respectively, and components b and c can be placed only into computers A and B. For this case, Optimization Procedure obtains initial value = 80.5 and matrix M with following initial cost bounding values of all possible placements of unassigned compo- nents The matrix shows that components d and e have `non-alternative' assignments to computer C. However the computer has not enough resource to locate the components. Therefore the Reduc- tion Procedure is used to obtain the DMA placement into DCS with minimum resource gap. To compute bounds of objective function (7.1), following tables are useful: table 8.1 presenting values concerned with allocations of components onto computers and table 8.2 dis- playing the interdependence between capacities of virtual channels and communication resources. The tables contains initial data for Reduction Procedure. To compute bound (see formula (7.7)), in table 8.1 computers are arranged in decreasing order with respect to computational resource available to mapped DMA . Com- puters C and D, having the same available resources ( ), are ordered in increasing with respect to utilization factor . The column next to the last contains sums of utilization factors. The element, that is i-th of the order in this column, is equal to the sum of i first values , i.e. . Similarly, the last column contains sums of available resources of computers. The row contains values of required resources of unassigned components. The row A B C D E b 9 6 c 6 4 d 20 e 39.5 B0 - - - - - - - - - - - - - - I1 I23 I4 , , I4 DRn Rn Dn ? = DRC DRD 5 = = dn dnk dn k k 1 = i ? dik d i k k ? 8 Examples 57 used for computing bound contains sums of required resources of unassigned components. The element, that is i-th of the order in this row is equal to the sum of all last elements of the previous row, starting from element . In table 8.1, components are arranged in decreasing order with respect to required computa- tional resources. However each time, when next component have to be chosen for placement (see step 2 of Reduction Procedure in section 7.3), components are rearranged so that the first component must be one with most number of adjacent arcs that connect the component to already assigned components. At the start of Reduction Procedure, such component is e. Table 8.1 Table 8.2 Com- puter C o m p o n e n t unassigned attached e d b c a e f A 0 5 5 0 0 5 C 1 1 6 5 1/6 1/6 10 D 3 3 8 5 3/8 13/24 15 B 0 4 4 0 13/24 19 E 3 3 7 4 3/7 163 168 23 4 4 3 2 1 3 3 max 3/7 13 9 5 2 Com muni cat- ion resou rce Virtual channel connections A, B A, C A, D A, E B, A B, C B, D B, E C, A C, B C, D C, E D, A D, B D, C D, E E, A E, B E, C E, D 1 * * * * * * * * * * * * * * 5 0 0 2 * * * * * * * * * * * * * * 6 0 0 3 * * * * * * * * * * * * 3 0 0 max 0 G4 dik d ij Dn Rn DRn dn dn k k ? DRn k k ? dik dik k ? As Dnm nm ( ) ? e s Dnm 8 Examples 58 Table 8.2 is used for computation of utilization factors of communication resources over which virtual channels are routed. For example, if an arc with required capacity 3 is assigned to channel (A,E), then the table shows (see symbols *) that this assignment causes uti- lization of 3 units of capacity of communication resources LAN1, LAN2 and WAN, utilization factors of which become equal to 5/3, 6/3 and 3/3 respectively. The search-tree constructed by Reduction Procedure is depicted in Figure 8.5. The optimal solu- tion is to assign component e to computer C, component d to A, component b to B, and compo- nent c to A. Figure 8.6 depicts the solution obtained and tables 8.3 and 8.4 present computational resources and capacities of communication resources used by mapped DMA. Computer A has 20% of the resource gap ( , gap = 6 - 5 = 1) and communication resource LAN1 has 20% of the capacity gap ( , gap = 6 - 5 = 1) that causes the capacity gap of virtual channels (B,A) and (A,C) which are routed over LAN1. If the gap of the computer resource can be divided between assigned components d and c pro- portional to resource required for the components, then the resource relaxation for component d is equal to , i.e. % = 16.6% and for component c: , i.e. e s s rnm ? dA 6 5 ? = e1 6 5 ? = A B C D E A B C D E A B C D E A B C D E A B A B A B A B 5 6 12 6 9 6 6 5 7 3 7 3 6 6 7 6 7 6 6 5 12 6 6 5 6 5 6 5 6 5 7 6 7 6 10 6 10 6 11 8 11 8 7 5 6 5 6 5 8 5 5 3 5 3 5 3 5 3 Assigning component e d b c Figure 8.5. Search-tree constructed by Reduction Procedure 1 2 6--- ? 0.33 = 0.33 2 ---------- 100 ? 1 4 6--- ? 0.67 = 8 Examples 59 % = 16.6%. Thus component c gets 1.67 units of computational resource instead of required 2 units, and component d gets 3.33 units instead of 4. Table 8.3 Com- puter C o m p o n e n t assigned attached e d b c a g f A 4 2 6 5 -1 6/5 B 3 3 4 3 3/4 C 4 1 5 6 1 5/6 D 3 3 8 5 3/8 E 3 3 7 4 3/7 4 4 3 2 1 3 3 max 6/5 0.67 4 ---------- 100 ? c d b e a f g 2 4 3 4 1 3 3 C 6 E 8 D 7 B 4 A 5 1 3 1 3 3 2 2 3 3 6 5 3 7 3 8 5 6 3 4 Figure 8.6. Optimal solution obtained by Reduction Procedure Dn Rn DRn dn dn k k ? DRn k k ? d ik dik k ? 8 Examples 60 Similarly for the communication resources. Let us divide the gap of LAN1 capacity between channels (B,A) and (A,C), routed over LAN1, proportional to the capacity required by data streams from component d to component e and from b to d. Then the first stream gets 2.5 units of the capacity instead of required 3 units (that corresponds to 16.6% of the capacity require- ment relaxation) and the second stream is characterized by the same values. Table 8.4 Let us illustrate computing the bound for one of vertices using the tables 8.1 and 8.2. Suppose, that component e is assigned to computer D and location of component d into computer A is examined. The corresponding vertex is depicted in Figure 8.1 by the bold circle. Tables 8.5 and 8.6, corresponding to this vertex, are used to compute bounds and BG in accor- dance with formulas (7.3), (7.4), (7.7), (7.8). Note that in two last columns of table 8.5, only two first elements are computed and used further for calculation of bound . The number of these elements is not more than the number of unassigned components (see parameter z in formula (7.7)). Moreover, unassigned components b and c can be placed only into components A and B. Therefore only these computers are ordered in the table. The bound equal 7/8 is presented in table 8.5. In accord with formula (7.7) and table 8.5 the bound . To compute bound for the vertex, the best placement for every unassigned component is determined. Taking into account that unassigned components b and c can be placed only into computers A and B, one can calculate utilization factors of computers , and of commu- nication resources , for every acceptable placements of these components, n = A,B. Results of the computations of bounds and , that correspond to the best placements of components b and c, are presented in tables 8.7 and 8.8 respectively. Com muni cat- ion resou rce Virtual channel connections A, B A, C A, D A, E B, A B, C B, D B, E C, A C, B C, D C, E D, A D, B D, C D, E E, A E, B E, C E, D 1 * * * * * * * * * * 5 6 6/5 2 * * * * * * * * * * * * * * 6 6 6/6 3 * * * * * * * * * * * * 3 3 3/3 3 3 3 3 max 6/5 As Dnm nm ( ) ? e s Dnm G1 G23 G4 , , G4 G1 G4 1 2--- 4 5--- 5 9--- + ? ? ? ö ? 61 90 ------ = = G23 G2 bn G2 cn G3 bn G3 cn G23 b G23 c 8 Examples 61 In accord with formula (7.4), maximum utilization reached by the best placement of unassigned components b and c is = 6/5. Table 8.5 Table 8.6 Com- puter C o m p o n e n t unassigned attached e d b c a g f B 0 4 4 0 0 4 A 4 4 4 1 4/5 4/5 9 C 1 5 6 1 1/6 D 4 3 7 8 1 7/8 E 3 3 7 4 3/7 4 4 3 2 1 3 3 max 7/8 13 9 5 2 Com muni cat- ion resou rce Virtual channel connections A, B A, C A, D A, E B, A B, C B, D B, E C, A C, B C, D C, E D, A D, B D, C D, E E, A E, B E, C E, D 1 * * * * * * * * * * 5 3 3/5 2 * * * * * * * * * * * * * * 6 7 7/6 3 * * * * * * * * * * * * 3 3 3/3 3 1 3 max 6/5 G23 max G23 b G23 c , { } = Dn Rn DRn dn dn k k ? DRn k k ? d ik dik k ? As Dnm nm ( ) ? e s Dnm 8 Examples 62 Table 8.7 Table 8.8 Table 8.9 illustrates computing utilization factors of communication resources provided that component b is placed into computer B. In accord with formula (7.8), the final bound of the ver- tex examined is . Table 8.9 b max A 7/5 7/6 7/5 B 3/4 6/5 6/5 min 6/5 c max A 6/5 7/6 6/5 B 2/4 7/6 7/6 min 7/6 Com muni cat- ion resou rce Virtual channel connections A, B A, C A, D A, E B, A B, C B, D B, E C, A C, B C, D C, E D, A D, B D, C D, E E, A E, B E, C E, D 1 * * * * * * * * * * 5 6 6/5 2 * * * * * * * * * * * * * * 6 7 7/6 3 * * * * * * * * * * * * 3 3 3/3 3 3 1 3 max 6/5 G2 bn G3 bn G23 bn G23 b G2 cn G3 cn G23 cn G23 c BG max G1 G23 G4 , , { } max 7 8--- 6 5--- 61 90 ------ , , { } 6 5--- = = = As Dnm nm ( ) ? e s Dnm 9 Computational Complexity Analysis of Algorithms 63 9 Computational Complexity Analysis of Algorithms 9.1 Algorithms of Optimization Procedure Let us evaluate the computational efficiency of the algorithms. The complete set of all possible locations of DMA, consisting of I = 5 unassigned components, into DCS, consisting of N = 5 computers connected with each other, is consists of alternatives. The algorithm pro- posed have constructed the search-tree depicted in Figure 8.1 that consists of 26 vertices from which only for 9 ones bounding values are computed. Bounding values of the root has most computational complexity: 25 possible independent loca- tions of 5 isolated components onto 5 computers are examined (see cost bounding matrix M). In general case for a vertex of level i, number of examined such locations is equal to (I - i) N + 1. Then summary number of such independent locations needed to get first (may be optimal, as for the example considered) acceptable solution, provided that Roll-back Procedure will be not used is not more than = (9.1) For the example considered L = 300. Really, excluding vertices that obtain value without computations (see values enclosed in rectangles in Figure 8.1), one gets , where the first and third 1 cor- respond to `non-alternative' assignments. Thus only 88 independent locations of isolated components were examined by the algorithm instead of 3125 possible locations of DMA into DCS. The gain increases with increase of values N and I. If component i can be placed only into computers, then following relations evaluates computational complexity (9.2) The computational complexity of one component location is determined by the computational complexity of subAlgorithm 2, based on the branch-bound method, and depends on number K of unassigned arcs adjacent to the component and number W of channels adjacent to computer which the component is assigned to. A high bound of the summary number of unassigned arc locations is determined by following formula = N I 3125 = L I N I i ? ( ) N 1 + ( ) N i 1 = I ? + ? = 1 2--- N2I I 1 ? ( ) 2NI + - - L 5 5 1 3 5 1 + ? ( ) 3 1 1 5 1 + ? ( ) 2 1 + ? + + ? + + ? 88 = = Ni N ? L N i 1 N j j i 1 + = I ? + ? ? ? ? ? ö Ni i 1 = I ? + i 1 = I ? Ni 2 N j j i 1 + = I ? + ? ? ? ? ? ö i 1 = I ? = = La r c K k ? 1 + ( ) W k 1 = K ? = W K 1 + ( ) K 2 ----------------------------- 10 Conclusions and Future Works 64 If necessary, some approaches can be proposed to decrease the computational complexity of arc locations: 1. First to assign components that have large number of adjacent arcs. For that at step 4 of For- ward Procedure, the ordering of components have to be done in accord with the criterion. This approach allows to decrease value K. 2. Instead of subAlgorithm 2 to use an exhaustive method for to find an optimal placement of K adjacent unassigned arcs. 3. Instead of subAlgorithm 2 to use approximate trivial one that locate arcs (without fractional distribution), ordered in decreasing required capacity, into channels in sequence of increase of capacity cost. This approach gets approximate values for vertices of the search-tree con- structed by Forward Procedure and therefore obtains a suboptimal solution. 9.2 Algorithm of Reduction Procedure To estimate computational complexity of Reduction Procedure based on the branch-bound method, formulas (9.1) and (9.2) can be used to determine summary number of independent component locations needed to get a first acceptable solution provided that Roll-back Procedure will be not used. Taking into account numbers of arithmetical and logical operations needed for computation of parameters , order of computational complexity can be evaluated by value arithmetical and logical operations, where a is a constant. More detailed evaluation of computational complexity of algorithms proposed can be obtain by computer statistical experiments. 10 Conclusions and Future Works In this paper, the problem of mapping DMA to a DCS is examined. We have proposed models of weighted precedence graphs both for DMA description (including representation of topol- ogy, data streams between components, computational and communication resource require- ments) and for DCS description (including representation of feasible channel connections between computers, communication resources over which the channels are routed, parameters of computational and communication resources available to mapped DMA). An approach based on the solving two kinds of the mapping problem is proposed. The first one is formulated as a nonlinear integer programming problem with cost function under constraints KW 10 ? G1 G , 2 kl es G3 kl G23 k l G4 , , , , a N2 I2 ? ? 11 References 65 on DCS resources available to mapped DMA. If the first one has not an acceptable solution, then other problem, formulated as minimax nonlinear integer one to find the DMA location into the DCS with minimum DCS resource gap, is solved. To solve both problems, effective algo- rithms based on the branch-bound method are proposed. Computational efficiency of the algo- rithms is examined and illustrated by numerical examples. The nearest future work is to evaluate computational complexity of the algorithms proposed by computer statistical experiments and to develop algorithms taking into account the results of the experiments. We plan to implement the developed algorithms in CINEMA project [2] aimed at developing powerful abstractions for multimedia processing in distributed environments. The proposed algorithms can be successfully used for an advance reservation service when the provider has to do some planning for future allocations using such parameters of the client requests as the starting time, duration and resource requirements of DMA. We plan to adapt algorithms proposed for an immediate reservation service done in real-time mode. Other algorithm developments includes taking into account multicasting mode and peculiarities of variable filters used in DMA [14]. At last there is another problem addressed in the paper: development of a method for division the resource gaps of DCS between components and data streams of DMA, if DCS resources available to mapped DMA can not meet the DMA resource requirements. 11 References 1. Furth B., Multimedia systems: an overview. IEEE Multimedia Systems, 1, 1994, p. 47 - 59. 2. Rothermel K., Barth I., Helbig T., CINEMA - an architecture for configurable distributed multimedia applications. Tech. Report 3/1994, Universität Stuttgart, Fakultät Informatik, pp. 20. 3. Norman M.G., Thanisch P., Models of Machines and Computation for Mapping in Multi- computers. ACM Computing Surveys, 3, 1993, p. 263 - 302. 4. Berman F., Snyder L., On mapping parallel algorithms in parallel architecture. J. Parall. Distr. Comput., 5, 1987, p.439 - 458. 5. Houstis C.E., Aboelaze M., A comparative performance analysis of mapping applications to parallel multiprocessor systems: a case study. J. Parall. Distr. Comput., 1991, pp. 17 - 29. 11 References 66 6. Efe K., Heuristic models of task assignment scheduling in distributed systems. Computer, 6, 1982, p. 50 - 56. 7. Kapelnikov A., Muntz R.R., Ercedovac M.D.,A modelling methodology for the analysis of concurrent systems and computations. J. Parall. Distr. Comput., 6, 1989, p. 568 - 597. 8. Ammar M.H., Cheung S.Y., Scoglio C.M., Routing multipoint connections using virtual paths in an ATM network. IEEE INFOCOM' 93, 1993, p. 98 - 105. 9. Kleinrock L., The latency/bandwidth tradeoff in gigabit networks. IEEE Commun. Maga- zine, April, 1992, p. 36 - 40. 10. Ferrari D., Gupta A., Ventre G. Distributed advance reservation of real-time connections.- htp://www.icsi.berceley.edu, 1995. 11. Minoux M., Mathematical Programming, Theory and Algorithms, John Wiley and Sons, 1986. 12. Reinfeld N.V., Vogel W.R., Mathematical Programming, Englewood Cliffs: Prentice- Hall, 1958. 13. Schwarz M., Telecommunication Networks: Protocols, Modeling and Analysis, Addison- Wesley Publ. Comp., 1987. 14. Dermler G., Fiederer W., Barth I., Rothermel K., A Negotiation and REsource Reservation Protocol (NRP) for Distributed Multimedia Applications. Tech. Report 11/1995, Univer- sität Stuttgart, Fakultät Informatik, pp. 27.