- 11 -
4.1.1 A Multiple Sum Path in a Layered Graph
A Two Sum Path
In case of a TS path we consider a doubly (FW, SW) weighted graph in
which there are only two weights associated with each edge. Thus instead
of a single weight on each edge as in traditional graphs we have an
ordered pair of weights on each edge. As usual, a path between the start
and the end node would be composed of edges, the First Sum (FS)
associated with the path would be the sum of the first elements of the
ordered pairs of all edges in the path while the Second Sum (SS) would
be the sum of the second element of the ordered pairs of all the edges
in the path. We further assume that each weight associated with an
ordered pair is an integer. The problem is to find a path between the
start and the end node such that the FS and the SS associated with the
path are constrained by an upper limit which is an integer, let us
denote this by L. Note that the number of incoming paths at any cut
belonging to layer i would be equal to mi-1, that means the total number
of paths between the start and the end node would be proportional to
O(mn).
We try to restrict these paths at every layer by using the following procedure. Suppose there are two paths arriving at a cut belonging to any layer from the start node. We reject path P1 as
Fig. 7 : A Cut Path in the Layered Graph
S
E
layer 1
layer 2
layer c
(FW, SW) (FW, SW) (FW, SW)
(FW, SW) (FW, SW) (FW, SW)
(FW, SW) (FW, SW) (FW, SW)
(FW, SW) (FW, SW)
Seite 11: vorherige Seite | nächste Seite | Übersicht
- 12 -
compared to a path P2 provided FS(P1) as well as SS(P1) are larger than or equal to FS(P2) and SS(P2) respectively. Note that we can not reject a path P3 as compared to path P1 if FS(P3)>FS(P1) and SS(P3)<SS(P1). Thus we limit the number of paths leaving a cut to only L. That means that the number of incoming paths at any cut belonging to a layer i from all vertices belonging to layer i -1 would be at most equal to mL but the outgoing path from this very vertex would be only equal to L. A total number of O(mL) comparisons would have to be made at each cut in a layer in order to reduce the number of paths from mL to L. Thus the total number of comparisons per layer would be proportional to O(m2L). In order to find whether a path between the start and the end node exists such that FS as well as SS are both constrained by an integer limit L, would take time proportional to O(cm2L).
A Multiple Sum Path
The layered graph is now assumed to be a multiple weighted graph in
which there are k weights associated with each edge, and each weight is
an integer. The problem is to find a path between the start and the end
node in the layered graph such that the First Sum (FS), Second Sum (SS),
..., and the Kth Sum (KS) associated with the path are constrained by an
upper limit which is we also denote by L. We can use a similar procedure
to limit the number of outgoing paths from a cut, thus we have to make a
total number of O(mLK-1) comparisons in order to reduce the number of
mL-power K-1 incoming paths at each cut to LK-1 outgoing paths from that
very cut. The total comparisons made per layer would thus be at the most
equal to m2LK-1. The total complexity of this pseudo-plynomial algorithm
would be limited by O(cm2LK).
The Special Case of Uniform Edge Weights We consider here the special case where all edges, incident on a cut, in between any two adjacent layers in the graph have the same K weights associated with them. We can use a similar procedure to limit the number of outgoing paths from a cut, and we have to make a total number of mLK-1 comparisons in order to reduce the number of mLK-1 incoming paths at each cut to LK-1 outgoing paths from that very cut. The total steps made per layer would, however, be limited to mLK-1. The total complexity of the pseudo-polynomial algorithm would thus be O(cm LK) under the condition of uniform weights on edges in between two adjacent layers in the graph. Note that this case corresponds to our component allocation problem, since each edge leading to the same cut carries the same load tuples which are implied by the cut.
Seite 12: vorherige Seite | nächste Seite | Übersicht
- 13 -
4.1.2 Task Allocation for One Chain per Satellite We would use the Multiple Sum Path Algorithm to find a task allocation with minimal BRU. The layered allocation graph is shown in Fig. 8, note that there is only one chain anchored on a satellite. All m incoming edges at any cut ct in chain i from all possible cuts in chain i-1 are weighted with two k-dimensional tuples, i.e., the S-LRT, and the H-LRT pair associated with the cut ct in chain i. All incoming edges at the end node have zero weights associated with them.
In order to find a task allocation with minimal BRU we use a function Probe(L) to find if an allocation exits with BRU bounded by L. If Probe(L) returns true meaning that such an allocation really exists then we lower the limit L otherwise we increase it. We thus make an efficient search for an allocation with minimal BRU. The working of the function Probe(L) is described below.
The Function Probe(L)
We remove each edge in the layered allocation graph for which the
maximum resource usage factor SMRUi is higher than L for each satellite
i. Note that in the special case of one chain per
Fig. 8 : One Chain per Satellite
S
E
layer 1
layer 2
layer c
(S-LRT, H-LRT) (S-LRT, H-LRT) (S-LRT, H-LRT)
(0, 0)
(0, 0)
(0, 0)
(S-LRT, H-LRT) (S-LRT, H-LRT) (S-LRT, H-LRT)
(S-LRT, H-LRT) (S-LRT, H-LRT) (S-LRT, H-LRT)
sat 1
sat 2
sat c
Seite 13: vorherige Seite | nächste Seite | Übersicht
- 14 -
satellite each edge (except the one's terminating at the end node), in the allocation graph, represents an allocation for a chain i which is the only chain anchored on satellite i.
In the remaining layered allocation graph we ignore the k-dimensional S-LRT tuple and only consider the k-dimensional H-LRT tuple associated with each edge. Using the Multiple Sum Path algorithm we try to find if HMRU is less than or equal to L. If HMRU is bounded by L then the function Probe(L) returns true (as well as the possible allocation), otherwise it returns false. The complexity of the function Probe(L) is O(cm Lk). We can thus find an allocation with minimal BRU in steps proportional to O(cm [Lk]*log L).
4.1.3 Task Allocation for Multiple Chains per Satellite We would use a similar Probe(L) function and the Multiple Sum Path algorithm to find a possible allocation where the BRU is bounded by L. We again use an efficient search technique to find the allocation with minimal BRU.
We group all chains anchored on satellite 1, we shall first consider only these chains. The layered allocation graph is shown in Fig. 9, where all the d chains anchored on satellite 1 are shown. All m incoming edges at any cut ct in chain i are weighted with two k-dimensional
Fig. 9 : Chain Grouping per Satellite
S1
S2
layer 1
layer d
(S-LRT, H-LRT) (S-LRT, H-LRT) (S-LRT, H-LRT)
(0, 0)
(0, 0)
(0, 0)
(S-LRT, H-LRT) (S-LRT, H-LRT) (S-LRT, H-LRT)
sat 1
sat 1
layer 1
sat 2
Seite 14: vorherige Seite | nächste Seite | Übersicht
- 15 -
tuples, one such tuple is S-LRT, and the other is H-LRT, both these tuples corresponds to cut ct in chain i (note that all d chains are anchored on satellite 1). Thus each incoming edge at any cut in the allocation graph is weighted with 2k weights. We can now design a suitable Probe(L) based on the Multiple Sum Path algorithm to find if a path between the start node and the end node exists in the layered graph for which both SMRU1 and HMRU are bounded by L.
The total number of distinct paths leaving each cut in the last chain anchored on satellite 1 would be proportional to L2k. Thus there would be at most mL2k paths arriving at the end node. Corresponding to each path there would be an associated H-Load which is equal to the k-vector sum of all H-LTR's correspoding to all cuts of all chains anchored on Satellite 1. The k-dimensional H-Load can have at the most O(Lk) distinct possibilities, thus mL-power 2k allocations (or paths) can be reduced into O(Lk) distinct paths by rejecting unnecessary paths or paths that would lead to an overall degraded performance. The total number of comparisons made so far would be proportional to O(dmL2k).
This process is repeated for each satellite and it will take dmsL2k steps to find if an allocation exists where the BRU is bounded by L. The total complexity of finding an allocation with minimal BRU would be O( dms[L2k]*log L ).
4.2 Fast Heuristics
The previously described algorithms imply an overhead which is
increasing exponentially with increasing number of constraints. As
measurements show (see below), time overhead is at most tolerable for
the case of 1 constraint. In order to cope with situations where many
constraints are considered, we have developed and analyzed several
heuristics. The most performant one, referred to as the SQ(ueeze)
algorithm, is presented here along with performance measurement results.
SQ is compared with an algorithm proposed earlier, MMKP (m-dimensional
multiplechoice knapsack algorithm []), which was designed for a
different optimisation model. However, this model, with small
modifications, is applicable to the host-satellite allocation model as
well. We postpone a review of other heuristic approaches to Section xxx.
The SQ Algorithm
The basic idea of SQ is to calculate intial chain cuts using a
greedy-type approach, and then to gradually improve the chain cuts by
excluding ?bad? alternatives, until, for each chain, only one
alternative is left (see Figure below).
Seite 15: vorherige Seite | nächste Seite | Übersicht
- 16 -
The intial cut set is computed considering the chains one after the other in a fixed, though arbitrary sequence. For each chain, the cut is computed which yields the lowest BRU. The procedure is continued until all chains have been cut. The second part of SQ is based on several iterations. During each iteration, the chains are considered in the given fixed order. For each chain, both the best and the worst cut (in terms of BRU) are derived. For this, each chain cut is considered as a possible replacement for the cut of the previous iteration (or the initialization part). The best cut is then used as replacement, The worst cut is disabled, i.e. is not considered during following iterations.
The computation complexity of SQ is determined by the iterative part. During each iteration, one cut is disabled for each chain. Since there are m cuts, there is a total of (m-1) iterations. Per iteration, at most c*m cuts are considered, where each cut consideration involves k additions and comparisons. Hence, the overall complexity is of order o(k * c * m2).
The MMKP Algorithm for the Host-Satellite Problem The optimzation model of the MMKP algorithm assumes c groups (here chains), where from each group one of m elements (here cuts) has to be selected. All elements make use of the resources of one knapsack with r resources. Each element implies an (r-dimensional) load vector for the knapsack resources. Knapsack resources are limited by finite capacities. The goal of MMKP is to find the element selection (one from each group) which is within the given resource constraints and which minimizes overall cost (given as the sum of selected element costs). The reader is referred to [Mose96] for a more detailed description.
The host-satellite allocation model fits into the MMKP model, if all satellites and the host are considered as one super-system with a total of r = k*(noSatellites+1) constraints. Each chain forms an element group, where elements correspond to the cuts of a chain. However, there are two notable differences.
Firstly, the MMKP model seeks a solution withtin the resource constraints, not necessarily a solution with minimal BRU. However, the MMKP algorithm iteratively calculates and reduces the load for the bottleneck resource. It stops, when the (relative) BRU is below 100%. It is straight-forward to see from [Mose96], that if this stopping criterion is removed, the MMKP algorithm seeks a solution with minimal BRU. Secondly, the MMKP model seeks a solution optimizing for cost, in this respect it is more general than the SQ model. Including cost into
Seite 16: vorherige Seite | nächste Seite | Übersicht
- 17 -
Fig. 10 : The SQ algorithm
Input: c chains Xi, 1? i ? c each with m cuts XiCj 1? j ? m each with a
2*k-dimensional load tuple XiCjL1
1 satId XiS 1? i ? c
s satellites Sp 1? p ? s 1 host H
Output: Set of Cuts { 1 ? i ? c | XiC } Bottleneck resource usage BRU
Bottleneck resource id BR
Step 0: satellite and host load initialisation for all Sp, set used load(-vector) to 0 for H, set used load(-vector) to 0
Step 1: calculate inital cuts for chains for( i = 1; i <= c; i++) {
for( j = 1; j <= m; j++) {
vector-add load tuple of chain i at cut j to used load of XiS and H get
the currentBRU
if currentBRU lower than bestBRU so far, store j as best cut so far,
update bestBRU remove load of chain i from XiS and H }
vector-add load of chain i at the best cut found to used load of XiS and
H }
Step 2: improve iteratively the selected cuts for( iter = 1; iter <=
m-1; iter++ ) { for( i = 1; i <= c; i++ ) {
remove load of chain i from XiS and H as implied by best cut so far find
best cut and worst cuts for chain i as in Step 0 }
mark worst cut as disabled
vector-add load of chain i at the best cut found to used load of XiS and
H }
Step 3: compile result
for( i = 1; i <= c; i++)
XiC = best cut after last iteration of Step 2 BRU = bottleneck resource
usage after last iteration of Step 2
1. Load elements are assumed to be normalized to values between 0 and 1 with respect to the corresponding capacity constraints.
Seite 17: vorherige Seite | nächste Seite | Übersicht
- 18 -
our model is easy, for instance each cut may be associated with the communication cost of the cut.
The main expected drawback of MMKP was its relatively high complexity o( k* c2 *NC2 ), when applied to the host-satellite allocation problem. However, measurement results (see below) show that the SQ algorithm is superior not only with respect to time complexity but also concerning the quality of achieved results.
4.3 Measurement Results
We implemented the three approaches introduced above and evaluated them
for various problem sizes. In addition we considered a fourth approach
(RCut) randomly selecting for each chain a cut.
Since the approximative scheme yields optimal solutions, we used it as an absolute comparison base. However, this scheme proved to imply too much overhead for problems including more than one constraint per computer. For these cases, the results were compared in relative terms.
We considered various problem sizes, varying both the number of chains and the number of cuts per chain, while keeping the number of chains per satellite fixed (to 2 for the measurements below). We selected for each cut (linearly distributed) random load figures between a minimum and a maximum value [minload = 100, maxload = 300]. Similarly, for all satellite capacities we assumed random values between [mincap = 800, maxcap = 2400].
We balanced the host capacities accordingly, i.e. its constraints were randomly set between [mincap*NoSatellites, maxcap*NoSatellites], given that a host has to support all chains coming from all satellites. We selected this setting since it allows to consider settings where the botlleneck resource varies its location equally between satellites and the host. For each pro-
Seite 18: vorherige Seite | nächste Seite | Übersicht
- 19 -
blem setting, we performed a run of 200 measurements and computed for each approach the averaged relative BRU with respect to the optimum result (of the approximative scheme).
Table 1. Relative BRU (1 constraint per computer)
Table 2. Time Overhead in msec (1 constraint per computer)
.....noSats
noCuts 2 4 8
2: Approx
RCut
SQ
MMKP
100
134
101
102
100
146
100
102
100
153
100
102
4: Approx
RCut
SQ
MMKP
100
153
102
107
100
182
102
110
100
221
101
113
8: Approx
RCut
SQ
MMKP
100
164
104
114
100
189
103
120
100
236
101
128
.....noSats
noCuts 2 4 8 16
2: Approx
RCut
SQ
MMKP
261
0
0
0
594
0
0
0
3019
0
1
3
4118
0
1
3
4: Approx
RCut
SQ
MMKP
333
0
0
0
721
0
0
2
1141
0
1
5
5449
0
2
15
8: Approx
RCut
SQ
MMKP
515
0
0
2
950
0
1
7
1700
0
3
20
7932
0
5
56
Seite 19: vorherige Seite | nächste Seite | Übersicht
- 20 -
Table 1 shows the relative BRU (in %) obtained for the RandomCut, SQ and the MMKP algorithms for various problem sizes, while Table 2 shows worst-case time overheads. It is obvious that the approximative scheme is too slow for a computation which is required to be done in real-time. For more than one computer constraint, the overhead is in the range of, at least, minutes. On the other hand, the tables show that SQ performs very well both in terms of achieved BRU and time overhead. The achieved BRU deviates only 2% from the optimum, while time overhead stays within the range of a few msec. SQ always outperforms the MMKP approach.
More measurements have been done, confirming qualitatively above results. For instance, in cases when satellites had abundant resources (i.e. the host was the bottleneck), MMKP results deteriorated, while SQ results did not. Also, the results were comparable for higher number of constraints. For instance, Table 3 shows the BRU achieved for the case of three constraints again indicating the superiority of SQ over the other approaches.1
.
Table 3. Relative BRU (3 constraints per computer)
4.4 Related Work
The multimedia component allocation problem, as presented in this paper,
bears similarity to the problem of partitioning and mapping parallel
programs onto multiprocessor architectures
1. For small problem sizes the table contains results obtained by enumeration. In these cases the absolute optimum could be considered as reference base.
.....noSats
noCuts 2 4 8 16
2: Enum
RCut
SQ
MMKP
100
145
100
121
100
142
100
109
X
146
100
105
X
146
100
105
4: Enum
RCut
SQ
MMKP
100
191
101
111
X
184
100
114
X
183
100
120
X
183
100
124
8: Enum
RCut
SQ
MMKP
100
202
103
117
X
197
100
122
X
200
100
133
X
195
100
139
Seite 20: vorherige Seite | nächste Seite | Übersicht