Improved distribution with relays
The archive deployment is functionally a form of multicast: the PageBox Repository deploys the archive to a group made of all PageBoxes that have subscribed to this Repository or for this archive on the Repository. Multicast is supported with IP but:
Unicast raises another issue. It can’t handle acknowledgement. When it deploys an archive a Repository must receive a message from each PageBox telling it if the deployment was successful and if not why. Whereas the deployment is a one-to-many request the deployment acknowledgement is a many-to-one traffic.
Therefore to shorten the deployment time in the PageBox for Java version we used a relay mechanism described first in the Grid V2 document. The PageBox for Java implementation is described in the deployment with Relays document. This relay deployment is effective but it can be further improved if we select the PageBoxes with the fastest connections to be the Relay PageBoxes.
A PageBox constellation is made of one Repository and n PageBoxes.
The deployment problem consists in deploying an archive from the Repository to each PageBox. The archive is big (100KBs to several MBs.)
With relayed deployment the Repository selects a Relay PageBox and sends a first deployment request to this PageBox. The Repository piggybacks half its deployment requests on this request.
In the next step the Repository chooses a second Relay PageBox and sends a second deployment request to this PageBox. The Repository piggybacks half its remaining deployment requests on this request. In the same step, the first Relay PageBox:
And so on. At the last step n / 2 Relay PageBoxes issue deployment requests in parallel.
When the Repository and the PageBoxes have the same kind of IP link (bandwidth and latency) and when links are independent (when two or PageBoxes don’t share the same link) this model achieve two objectives:
Note on link independence:
On Internet the link between two PageBoxes is made of a large number of links. This link list is not static and we can practically ignore shared links except on the end points.
When the Repository and the PageBoxes don’t have the same kind of link the solution is no more optimal. In this paper we explore solutions addressing this problem.
PageBox subscription is dynamic. Therefore a first solution is to ask the subscriber to grade the subscribed PageBox: For instance if the PageBox has a DSL link, the subscriber will give a grade of one. It the PageBox has an OC3 link the subscriber will give a grade of four. The Repository and the Relay PageBoxes will first deploy on the highest grade PageBoxes.
Best suited for:
Intranet constellations. In these constellations it is easy to grade PageBoxes because there are only two or three sorts of links. The network is simple and its behavior is easy to predict.
One of the goals of the PageBox project is to enable a pervasive architecture where millions of PageBoxes are subscribed to hundreds of Repositories. There is no central point of administration. Many kinds of links coexist. Some PageBoxes and Repositories are linked to Internet using dialup and DSL connections.
A PageBox or Repository requires a permanent connection (like DSL connections) but doesn’t require a permanent IP address. PageBox 0.0.8 and above even include a dynamic DNS update function.
A deployment involves:
We can represent the Repository and the subscribed PageBox set with an undirected and complete graph G where the Repository and the subscribed PageBoxes are vertices. For background information about graphs you can read our graph introduction. At the first deployment we randomly explore this graph from the Repository. We can record deployment times and build a second graph Gd, which is a weighted dag:
We compute the bandwidth between an origin x and a destination y with the formula:
Bandwidth(x,y) = archive_size / deployment_time
Our knowledge of G is minimum for two reasons:
The problem consists in analyzing Gd to create a Gd’ graph used to make a second deployment. For the third deployment we analyze Gd’ to build a Gd” graph and so on.
The problem is further complicated by the dynamic nature of constellations: PageBoxes subscribed for the first deployment can have been unsubscribed before the second deployment and new PageBoxes can have been subscribed between the first and the second deployment.
We ignore this issue for clarity in the simplified algorithm.
The deployment time from one Repository or PageBox x to a PageBox y
d(x,y) = latency(x,y) + archive_size / bandwidth(x,y)
The size of an archive is most of the time > 100 KB. Therefore we probably can ignore the latency. The bandwidth is the bandwidth of the slowest link, most probably the bandwidth of the connection link of x or x.
A Repository or a Relay is the origin of many deployments. Therefore we can compare the deployment times from this origin. If the deployment times are the same, the origin can be the slowest link. If the deployment times are different we can assume that the destination with the shortest deployment time has the fastest connection and the destination with the longest deployment time has the slowest connection. We can use this information to create Gd’. In this graph we put the PageBoxes with the fastest connection closer to the Repository (we promote them.) These PageBoxes will act as relays and have a bigger contribution to the deployment.
We scan the Gd graph. For each vertex we compare the weight of the outbound edges. If the weights are of the same order of magnitude and high the destinations have fast connections and we can promote all of them. If the weights are of the same order of magnitude and low the origin might have a slow connection, we don’t know. If the weights are different we promote the vertices with the highest weight.
We represent the Gd graph with an adjacency list.
The Gd graph is created at the first deployment.
Once the deployment is completed we scan the adjacency list and we update each subscribed PageBox object with a rank:
Then we serialize the updated subscribed PageBox objects into subscribers.xml.
At the next deployment, we sort the subscribed PageBox objects by bandwidth and we first deploy on the PageBoxes whose connection is certainly fast and next we exercise the PageBoxes whose connection speed is unknown. Because these PageBoxes receive then deployment requests from PageBoxes with fast connections we can determine if they have fast or slow connections.
At each deployment the deployment graph is further improved up to the point where either:
The rank of a PageBox moves from unknown to known but not from known to unknown: after the first deployment unknown means that the deployment origin has a slow connection and the PageBox bandwidth is very likely to be still as fast as at the previous deployment.
When a PageBox is unsubscribed the corresponding subscribed PageBox object is removed. When a PageBox is subscribed the subscribed PageBox object is created with a 0 rank.