Polaris B implementation
Polaris B is based on Polaris A. Polaris B shows how to use the Grid API to synchronize caches and replicate database updates.
The Polaris client implements a shopping basket.
Users can change their shopping baskets, for instance to add new articles. Their changes stay on the client side until they commit their commands. The Polaris Web service is then invoked and records the command on the server side.
We want more than one instance of the Polaris client to be able to serve end users, which implies that each Polaris client instance notifies the other instances of its changes.
We need to synchronize client instances so that user browsers can call any client instance at any time.
This is useful when client load balancing takes place.
Client load balancing is outside the PageBox scope and has to use a mechanism supported by browsers, which can be:
DNS round robin
Proxy load balancing
Server-driven load balancing
DNS round robin is supported by all browsers.
It relies on the fact that processing a request involves two steps:
Retrieving the IP address of the Application server from its host name through a DNS query
Making the actual HTTP request to the server
Each time, the DNS server returns a different IP address chosen from the list of Polaris client instances.
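The rotation can be pictured with a minimal Python sketch. It is not a DNS server: the `RoundRobinResolver` class, the host name and the addresses are hypothetical, and the sketch only shows successive queries for the same name receiving successive addresses from the instance list.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy resolver: hands out the addresses of one host name in rotation."""

    def __init__(self, records):
        # records maps a host name to the list of instance addresses.
        self._cycles = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        # Each query returns the next address in the list, wrapping around.
        return next(self._cycles[name])


resolver = RoundRobinResolver(
    {"polaris.example.com": ["192.0.2.10", "192.0.2.11", "192.0.2.12"]}
)
answers = [resolver.resolve("polaris.example.com") for _ in range(4)]
```

The fourth query wraps around to the first address, which is what spreads browsers over the instances.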
Pros:
DNS round robin is simple and fast
Cons:
Resolved host names are kept in a local cache, so a browser keeps calling the same instance until the cache entry expires
The DNS server should only balance between running instances. The DNS server is generally able to detect network problems and server crashes. However it is often unable to detect that an Application server instance is no longer able to serve pages (due to a deadlock, to a loop or to a resource shortage).
It is often possible to configure a browser to use an HTTP proxy. The browser requests are then first processed by the proxy. The mechanism was designed primarily for static content:
The proxy first tries to serve the resource (page, image) from its cache
If the resource is not in its cache the proxy asks its neighbours (sibling proxies) for the resource
If the resource is not in its neighbours’ caches the proxy can ask an upper level proxy for the resource
If the resource is not found the proxy asks the Web server for the resource
It is possible to implement a proxy that balances dynamic requests between Application server instances.
Pros:
A proxy server is a single point of failure, but if each office has its own proxy server, a failure affects only that office.
A custom proxy can call custom probes to check if Application server instances can still serve pages
Cons:
The implementation of a custom proxy is complex
There is no single standard protocol for proxy-to-proxy communication, and the existing protocols tend to be bandwidth hungry. Proxies work well as LAN-wide caches of Internet content but not in more complex, multi-layered architectures.
End users access the Application server pages through a Welcome page.
The Welcome page can include links to the Application server pages or redirect to an Application server page. The Welcome page is a dynamic page that sets the link or redirection URL from the list of running Application instances.
This solution can look odd at first: we use a dynamic page to balance requests between dynamic page servers and to enhance their reliability. However, the reliability of a server page depends on the complexity of the application. A dynamic page that implements no business logic, accesses no database and writes to no log file can be as reliable and almost as fast as a static page.
We call this Welcome page the routing page. Here is a simple implementation.
A routing server hosts a routing page.
The routing page selects a URL in a table of active Application server instances.
This in-memory table is populated by a Web service that we call I’m_here.
When a server page of an Application server instance is invoked, it calls I’m_here if a timeout has expired since its previous call.
When a URL of the active instance table has been selected but has expired (its entry is older than the timeout), the routing page removes this URL from the active instance table.
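The routing table can be sketched as follows. This is a Python illustration under assumptions, not the PageBox implementation: the class and method names are hypothetical, random selection is an assumed policy, and the clock is injectable only so that the expiry logic can be exercised.

```python
import random
import time


class ActiveInstanceTable:
    """In-memory table of Application server URLs, refreshed by I'm_here calls."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout      # seconds after which an entry is stale
        self.clock = clock          # injectable so the sketch can be tested
        self.last_seen = {}         # URL -> time of the last I'm_here call

    def im_here(self, url):
        # Called (through the Web service) by an Application server instance.
        self.last_seen[url] = self.clock()

    def select(self):
        # Pick URLs at random; drop any selected entry that has expired.
        while self.last_seen:
            url = random.choice(list(self.last_seen))
            if self.clock() - self.last_seen[url] <= self.timeout:
                return url
            del self.last_seen[url]
        return None                 # no active instance is known
```

Expired entries are removed lazily, only when they happen to be selected, which matches the description above and keeps the routing page free of background work.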
Pros:
We can use multiple instances of the Routing server. To balance requests between Routing server instances we can use DNS round robin.
It is easy to customize the downtime detection.
Multiple server solutions tend to widen the gap between the downtime perceived by the end users (typically in the range 25-33%) and the downtime measured by the operators (typically in the range 0.2-0.5%). Nobody lies:
Users don’t complain when something fails once per year. When they complain, the downtime is much higher.
The aggregate downtime of the set {hardware, Operating System, Application server} is around 1%. Application and operation problems explain the gap. These problems are mainly due to lack of training and human errors.
We have seen that load balancing implies sending and processing extra messages to detect downtime conditions. The Application server instances must also manage the same data, which means that each instance must notify the other instances of its changes. This is what we call cache synchronization.
It follows that:
A multiple server solution is more reliable than a single server solution only if the aggregate MTBF of its components is higher than the MTBF of a single server. The components of a multiple server solution are the load balancing, the Application instances and the cache synchronization.
Load balancing between multiple server instances is more scalable than a single server solution only if a) load balancing is cheap and b) cache synchronization is cheap
We can note that at best the number of load balancing messages grows linearly with the number of Application server instances. The number of cache synchronization messages can be constant with multicast or broadcast but the processing cost still grows linearly because each instance processes all cache messages. Therefore there is a scalability limit.
At a given time an authenticated user handles a shopping basket that contains a set of articles in given quantities.
A shopping basket is represented by a Hashtable whose items have the article name as key and a CmdRow object as value. The CmdRow class contains the number of articles and, for convenience, repeats the article name and price.
Therefore we can represent the cache data with:
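As an illustration only (Python here, whereas Polaris B uses C# Hashtables), the cache can be pictured as a map from user name to basket, each basket mapping article names to quantities. The user and article names, the quantities and the `basket_of` helper are made up for the example.

```python
# user name -> {article name -> quantity}; all names and values are examples.
cache = {
    "User_1": {"Article_1": 5, "Article_2": 1},
    "User_2": {"Article_7": 2},
}


def basket_of(cache, user):
    """Return the user's basket, or an empty one if the user has no entry."""
    return cache.get(user, {})
```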
We need two protocols:
Newcomer protocol. The newcomer sends a message to all running Application server instances to get the cache data
Synchronization protocol. Application server instances send cache data to each other
The Polaris application sends and receives cache data during the processing of user requests:
A user can be served by any Application server instance.
Every user has a different user account, which means that at a given time only one Application server instance processes this user’s requests and acts as the synchronization source.
The first request of user 1 is processed by the first Application server instance.
The user adds an article to their shopping basket. The server page first retrieves messages from other Application server instances to update the cache. Even if it is the first time user 1 updates their shopping basket, this step also brings the cache entries of other users up to date. Then the server page adds the article to the cache and uses the Grid API to send the update to the other Application server instances.
The second request of user 1 is processed by the second Application server instance.
The user adds another article. The server page first retrieves messages from other Application server instances to update the cache. Then the server page adds the article to the cache and uses the Grid API to send the update to the other Application server instances.
The third request of user 1 is processed by the first Application server instance.
The user commits their command. The server page first retrieves messages from other Application server instances to update the cache. Then the server page calls the Polaris Web service with the user’s shopping basket as parameter. Next the server page removes the user entry from the cache and uses the Grid API to send the update to the other Application server instances.
The synchronization protocol uses the Grid API in UDP transport mode, with multicast if possible.
The server page only sends the change data. The change can be:
Adding a new article. In this case the user is implicitly added to the remote caches if not already present. The number of articles is also set. The message format is { seqno } + { user } : { article } : { quantity }
Modifying the number of articles in a user’s shopping basket. The message format is { seqno } = { user } : { article } : { quantity }
Removing an article from a user’s shopping basket. The message format is { seqno } - { user } : { article }
Removing a user from the cache. This action takes place at commit and rollback. The message format is { seqno } - { user }
Seqno is a sequence number that the source instance increments at each message.
If the destination instance detects that a message was skipped (a gap in the sequence numbers of a source), it uses the newcomer protocol to refresh its cache.
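The four message formats and the gap check can be sketched in Python as below. This is an illustration, not the Grid API or the Polaris code: the function names are hypothetical, and the sketch assumes that user and article names contain no ':' and that a gap means a sequence number more than one above the last one seen from the same source.

```python
def format_message(seqno, cmd, user, article=None, qty=None):
    """Build e.g. '15+User_1:Article_1:5' or '16-User_1'."""
    fields = [user] + [str(f) for f in (article, qty) if f is not None]
    return "%d%s%s" % (seqno, cmd, ":".join(fields))


def parse_message(message):
    """Split a message into (seqno, cmd, user, article, qty)."""
    # The sequence number is numeric, so the first '+', '=' or '-'
    # encountered is the command character.
    for index, char in enumerate(message):
        if char in "+=-":
            break
    else:
        raise ValueError("no command character in %r" % message)
    seqno = int(message[:index])
    fields = [f.strip() for f in message[index + 1:].split(":")]
    user = fields[0]
    article = fields[1] if len(fields) > 1 else None
    qty = int(fields[2]) if len(fields) > 2 else None
    return seqno, char, user, article, qty


def missed_messages(last_seqno, seqno):
    """True when at least one message from this source was skipped."""
    return seqno > last_seqno + 1
```

A '-' message with an article removes that article; without an article it removes the whole user entry, so the two removal cases are told apart by the number of fields.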
The newcomer protocol uses a SOAP Web service implemented by Polaris B and called Newcomer. This Web service implements a GetCache method that returns an XML string:
<c>
  <u>
    <n>User_1</n>
    <a>
      <n>Article_1</n>
      <q>5</q>
    </a>
    ...
  </u>
  <d>
    <n>http://www.otherInstance.com/otherPageBox/PolClient/Newcomer.asmx</n>
    <s>15</s>
  </d>
</c>
Where c stands for cache, u for user, a for article, n for name, q for quantity, d for destination and s for sequence number.
The XML string contains a cache image and a destination array. For each instance, the destination array contains the last sequence number used to build this cache image and the URL of that instance’s Newcomer Web service. The newcomer starts receiving synchronization messages before it gets the response to its Newcomer call, and uses the destination array to discard messages older than the cache image.
The newcomer uses the GetParms method of the Grid API to retrieve the URLs of the Newcomer web service of Application instances already involved in the cache synchronization. Then it randomly selects one of the newcomer Web service URLs, instantiates a proxy and calls its GetCache method.
Note:
The newcomer processes requests only once its cache is populated.
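A newcomer could turn the cache image into its two tables along these lines. The Python sketch below is not the Polaris code: `parse_image` and `is_stale` are hypothetical names, the receiver is assumed to know which instance sent each synchronization message, and treating a sequence number equal to the stored one as already reflected in the image is an assumption.

```python
import xml.etree.ElementTree as ET

IMAGE = """<c>
  <u><n>User_1</n><a><n>Article_1</n><q>5</q></a></u>
  <d><n>http://www.otherInstance.com/otherPageBox/PolClient/Newcomer.asmx</n><s>15</s></d>
</c>"""


def parse_image(xml_text):
    """Return (cache, seqnos) built from a Newcomer cache image."""
    root = ET.fromstring(xml_text)
    cache = {
        u.findtext("n"): {
            a.findtext("n"): int(a.findtext("q")) for a in u.findall("a")
        }
        for u in root.findall("u")
    }
    seqnos = {d.findtext("n"): int(d.findtext("s")) for d in root.findall("d")}
    return cache, seqnos


def is_stale(seqnos, source_url, seqno):
    # A message already reflected in the image must not be applied again
    # (assumption: "already reflected" includes the stored number itself).
    return seqno <= seqnos.get(source_url, -1)
```

Messages buffered while the GetCache call was in flight can then be filtered with `is_stale` before they are applied to the freshly loaded cache.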
The ActiveNaming Web service uses the concept of location to implement location-dependent routing. A Web service instance is chosen if it has the same location as the client.
We need a similar concept to define the set of Application instances whose caches are synchronized. For simplicity Polaris reuses the ActiveNaming location.
The Polaris Web application (the client) uses the ActiveNaming to perform
Data-dependent routing
Location-dependent routing
Load balancing
When the user of a PolClient instance in Location 1 commits a shopping basket for a user belonging to range 1, the ActiveNaming balances PolServer invocation between instances defined in Location 1 and managing Range 1 data.
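The combination of the three routing criteria can be sketched as a filter followed by a choice. This Python illustration is not the ActiveNaming implementation: the `Instance` record, the function names and the random choice used for load balancing are all assumptions.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    url: str
    location: str
    user_low: int
    user_high: int


def candidates(instances, location, user_number):
    """Instances in the caller's location whose range covers the user."""
    return [
        inst for inst in instances
        if inst.location == location
        and inst.user_low <= user_number <= inst.user_high
    ]


def route(instances, location, user_number):
    # Load balancing: pick one candidate at random (assumed policy).
    matching = candidates(instances, location, user_number)
    return random.choice(matching) if matching else None
```

Location and user range narrow the set of PolServer instances; only the final pick among the remaining candidates is load balancing proper.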
We must replicate data updates on all instances managing the same user range even if they are not in the same location and even if they are not running when the data update occurs. We accept that the replication takes some time: if we update PolServer i and then query PolServer j, PolServer j has not necessarily processed the change that occurred on PolServer i yet.
PolServer implements a ShoppingOrder Web service.
When a PolClient instance commits a shopping basket it invokes an Order method of this ShoppingOrder service. Order has a prototype:
bool Order(string accountID, OrderRow[] lines);
The OrderRow class describes an order row:
Article
Quantity
Unit price
The PolServer instance that processed the Order request builds an object containing the user ID (accountID) and an array of OrderRow (lines) and calls the Grid API in SMTP transport mode.
The Grid API sends a single mail to the mail addresses of the other PolServer instances managing the same user range.
The user range defines the set of PolServer instances whose updates are replicated to each other.
The user range has the form { userLow } : { userHigh }, where userLow is the lowest user number in the range and userHigh is the highest.
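How a user range selects replication peers can be sketched as below. This is a Python illustration under assumptions, not the Grid API: the function names and mail addresses are hypothetical, and user numbers are assumed to be integers.

```python
def parse_range(text):
    """Turn '1:1000' into (1, 1000)."""
    low, high = (int(part) for part in text.split(":"))
    if low > high:
        raise ValueError("userLow must not exceed userHigh")
    return low, high


def replication_peers(me, ranges):
    """Mail addresses of the other instances that manage the same user range."""
    mine = parse_range(ranges[me])
    return sorted(
        addr for addr, text in ranges.items()
        if addr != me and parse_range(text) == mine
    )
```

Unlike the routing on the client side, the peer set ignores location: two instances replicate to each other as soon as their ranges are identical.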
PolServer instances subscribe a message handler to the Grid API.
This handler is called each time the Grid API receives a mail. The handler method runs in the Listener thread and simulates a database update (PolServer doesn’t actually access a database).
The cache synchronization is implemented in PolClient.
The cache synchronization is mostly implemented in the ShoppingBasket class defined in ShoppingBasket.ascx.cs. ShoppingBasket is the code behind the ShoppingBasket user control. The ShoppingBasket Page_Load, OnAdd, OnDelete, Prepare, Commit and Rollback methods are modified, and two new private methods, Initialize and SetCache, are added. Read the Polaris A implementation document for a presentation of ShoppingBasket.
The behavior of ShoppingBasket changes in one respect: users must be authenticated before using the ShoppingBasket control. This is needed because the account ID is the only way to link the shopping basket of an Application instance to the shopping baskets of the other Application instances.
The main change in the code is that ShoppingBasket uses the cache and no longer keeps shopping basket data in the Coordinator heap.
The ShoppingBasket control takes these additional parameters:
Name | Function |
---|---|
GridRepURL | Repository used by the Grid API |
UDPAddress | UDP address used by the Grid API |
UDPPort | UDP port used by the Grid API |
Log | Name of the Log file used by the Grid API |
Initialize is called by Page_Load. Its function is to populate and update the cache and the sequence numbers of the other applications participating in the cache synchronization. The cache is stored in a static Hashtable called cache. The key of each cache item is the user name and the value is itself a Hashtable whose items have the article name as key and the article quantity as value. The sequence numbers are stored in a static Hashtable called seqnos, whose items have the Newcomer URL of the destination as key and the destination sequence number as value. Because cache and seqnos are static, the cache is shared by the whole Application server instance.
Initialize first instantiates the Grid object. It uses two service methods, ComputePageBoxUrl and ComputeNewcomerURL, to compute the URLs of the PageBox that controls PolClient and of the Newcomer Web service from the URL of the ShoppingBasket control.
When the cache is not initialized, Initialize calls SetCache to populate it. Then Initialize reads messages from the other applications participating in the cache synchronization and updates the cache accordingly. Initialize handles the synchronization protocol described above. Synchronization messages are carried in Synch objects. The Synch class has this definition:
class Synch {
    int seqno = -1;        // sequence number of the source instance
    char cmd = '=';        // '+' add, '=' modify, '-' remove
    string user = null;
    string article = null;
    int qty = -1;
}
Initialize ignores Synch messages with a sequence number lower than the corresponding entry in seqnos. Initialize applies the change described by the other Synch messages to cache.
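The filtering and update step of Initialize can be sketched as follows. This Python illustration mirrors the Synch fields but is not the Polaris code: the `refresh` function is hypothetical, and the way the receiver learns which instance sent a message (here a source URL paired with each Synch) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Synch:
    seqno: int = -1
    cmd: str = "="
    user: Optional[str] = None
    article: Optional[str] = None
    qty: int = -1


def refresh(cache, seqnos, incoming):
    """Apply (source, Synch) pairs to cache, skipping those already seen."""
    for source, s in incoming:
        if s.seqno < seqnos.get(source, -1):
            continue                       # older than what the cache reflects
        seqnos[source] = s.seqno
        if s.cmd in ("+", "="):
            # Adding or modifying; the user entry is created if needed.
            cache.setdefault(s.user, {})[s.article] = s.qty
        elif s.article is not None:
            cache.get(s.user, {}).pop(s.article, None)   # remove one article
        else:
            cache.pop(s.user, None)                      # remove the user
```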
SetCache uses the Newcomer protocol to create cache and seqnos.
SetCache first calls the GetParms method of the Grid object to retrieve the URLs of the Newcomer Web services of the other application instances. Then it randomly selects one of these URLs. Next SetCache instantiates a Newcomer proxy with this URL and calls its GetCache method to get the cache in the Newcomer XML format described above. Finally SetCache parses this XML stream to populate cache and seqnos.
Page_Load is called when the ShoppingBasket control is loaded.
Page_Load calls Initialize to create or refresh the cache and the seqnos Hashtables.
At the first invocation of a ShoppingBasket instance (when IsPostBack is false), Page_Load retrieves the user’s shopping basket from the cache and uses a BuildCmdRows method to convert the cache data to an array of CmdRow. Then Page_Load binds the article Datagrid to this array.
Note:
When IsPostBack is true, Page_Load cannot use the cache data. Other instance updates will be reflected in the OnAdd and OnDelete methods.
OnAdd is called when the user clicks on the Add button.
OnAdd first checks that the user is authenticated. If the user is not authenticated OnAdd displays a message "Authenticate first".
Then OnAdd retrieves the shopping basket from the cache, updates the shopping basket, creates a Synch describing the user request and uses the Grid API to scatter the Synch to the other Application instances. Next OnAdd updates the cache and calls BuildCmdRows to convert the cache data to an array of CmdRow. Finally OnAdd binds the article Datagrid to this array.
OnDelete is called when the user clicks on a Delete button of the article Datagrid.
OnDelete first checks that the user is authenticated. If the user is not authenticated OnDelete displays a message "Authenticate first".
Then OnDelete retrieves the shopping basket from the cache, updates the shopping basket, creates a Synch describing the user request and uses the Grid API to scatter the Synch to the other Application instances. Next OnDelete updates the cache and calls BuildCmdRows to convert the cache data to an array of CmdRow. Finally OnDelete binds the article Datagrid to this array.
Prepare is a method of the Coordinator’s Transaction interface. Prepare is called when a source calls the Commit method.
Prepare is modified in two ways in Polaris B:
Prepare checks if there is a shopping basket to commit in the cache
In case of failure Prepare removes the user’s shopping basket from the cache, populates a Synch object and uses the Grid API to remove the shopping basket from the other Application instances
Commit is a method of the Coordinator’s Transaction interface. Commit is called when a source calls the Commit method.
Commit is modified in Polaris B: it checks in the cache whether there is a shopping basket to commit, removes the user’s shopping basket from the cache, populates a Synch object and uses the Grid API to remove the shopping basket from the other Application instances.
Rollback is a method of the Coordinator’s Transaction interface. Rollback is called when a source calls the Rollback method.
Rollback is modified in Polaris B: it checks in the cache whether there is a shopping basket to roll back, removes the user’s shopping basket from the cache, populates a Synch object and uses the Grid API to remove the shopping basket from the other Application instances.
The Newcomer Web service is implemented by PolClient and allows other PolClient instances to get an image of the cache.
The Newcomer Web service implements a single method, GetCache:
byte[] GetCache();
GetCache returns a byte array containing a cache image in the format described in the Newcomer protocol section.
GetCache first creates an XmlTextWriter on a MemoryStream. Then GetCache enumerates the cache and seqnos content and writes the corresponding elements on the XmlTextWriter. Finally GetCache returns MemoryStream.GetBuffer().
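The enumeration performed by GetCache can be sketched in Python as below. It is an illustration, not the Polaris code: `build_image` is a hypothetical name, the element names come from the Newcomer format described earlier, and UTF-8 output is an assumption.

```python
import xml.etree.ElementTree as ET


def build_image(cache, seqnos):
    """Serialize cache and seqnos into the <c> document as UTF-8 bytes."""
    root = ET.Element("c")
    for user, basket in cache.items():
        u = ET.SubElement(root, "u")
        ET.SubElement(u, "n").text = user
        for article, qty in basket.items():
            a = ET.SubElement(u, "a")
            ET.SubElement(a, "n").text = article
            ET.SubElement(a, "q").text = str(qty)
    for url, seqno in seqnos.items():
        d = ET.SubElement(root, "d")
        ET.SubElement(d, "n").text = url
        ET.SubElement(d, "s").text = str(seqno)
    return ET.tostring(root, encoding="utf-8")
```

The sketch returns exactly the serialized bytes. In .NET terms this corresponds to MemoryStream.ToArray(); GetBuffer() returns the stream's whole internal buffer, which can include unused bytes after the written data.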
NewcomerProxy.cs is a proxy of the Newcomer Web service. NewcomerProxy.cs has been generated using wsdl and modified to allow setting the Web service URL in the constructor. The NewcomerProxy class is called by ShoppingBasket.SetCache.
The replication is implemented in PolServer.
The ShoppingOrder class is defined in the ShoppingOrder.asmx.cs file.
The support of the replication is mainly implemented in three static methods:
StartGrid
ReadConfig
ReadConfig2
StartGrid calls the ReadConfig methods to read the Grid parameters. Then StartGrid instantiates a Grid object and calls its Subscribe method with a Callback object as parameter.
StartGrid has two formats (overloads):
Format | Usage |
---|---|
static bool StartGrid(string userLow, string userHigh, System.Web.HttpContext context) | Called by RangeLoc at user range setting |
static bool StartGrid(System.Web.HttpContext context) | Called at the first invocation of CheckAccount or Order |
Where:
userLow is the lowest user number in the data-dependent routing range
userHigh is the highest user number in the data-dependent routing range
context is the Web service or page context
The second format instantiates a Grid object to start processing and sending replication messages.
In addition, the first format inserts the PolServer instance into a list of replicated instances.
ReadConfig has the same two formats as StartGrid:
Format | Usage |
---|---|
static bool ReadConfig(string userLow, string userHigh, System.Web.HttpContext context) | Called by the first StartGrid method. Saves the user range and calls ReadConfig2. |
static bool ReadConfig(System.Web.HttpContext context) | Called by the second StartGrid method. Reads the user range and calls ReadConfig2. |
PolServer uses the data-dependent routing range (userLow and userHigh) to define at Grid instantiation which list of replicated instances this PolServer instance belongs to.
When ReadConfig is called with the first format it saves the data-dependent routing range in a file called PolRange.txt in the PolServer directory. When ReadConfig is called with the second format it reads the data-dependent routing range from PolRange.txt.
ReadConfig expects to find a configuration file called PolServer.xml in the PolServer directory that contains:
<polaris>
  <repository>Repository_URL</repository>
  <SMTP-server>SMTP_server_address</SMTP-server>
  <mail-address>Mail_address</mail-address>
  <POP-server>POP3_server_address</POP-server>
  <POP-user>POP3_user</POP-user>
  <POP-password>POP3_password</POP-password>
</polaris>
Where:
Repository_URL is the repository used by the Grid API, typically the same as the Repository used to deploy PolServer
SMTP_server_address is the IP address of the SMTP server used to send replication mails
Mail_address is the mail address (in the usual user@server format) used to send replication mails
POP3_server_address is the IP address of the POP3 server that the Grid API will contact to check for replication mails. The POP3 server must have access to the mails of Mail_address.
POP3_user and POP3_password are the credentials used to connect to the POP3 server
ReadConfig uses an XmlTextReader to parse the PolServer.xml file.
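Reading such a file can be sketched as follows. The Python below is an illustration, not ReadConfig itself: the `read_config` function, the sample values and the choice to reject missing or empty settings are assumptions, while the element names are those of PolServer.xml.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<polaris>
  <repository>http://repository.example.com/PageBox</repository>
  <SMTP-server>192.0.2.25</SMTP-server>
  <mail-address>polserver1@example.com</mail-address>
  <POP-server>192.0.2.110</POP-server>
  <POP-user>polserver1</POP-user>
  <POP-password>secret</POP-password>
</polaris>"""

FIELDS = ("repository", "SMTP-server", "mail-address",
          "POP-server", "POP-user", "POP-password")


def read_config(xml_text):
    """Return the six settings as a dict; fail if one is missing or empty."""
    root = ET.fromstring(xml_text)
    if root.tag != "polaris":
        raise ValueError("root element must be <polaris>")
    config = {name: (root.findtext(name) or "").strip() for name in FIELDS}
    missing = [name for name, value in config.items() if not value]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return config
```

Failing early on an incomplete file avoids starting the Grid with a mail account it cannot poll.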
ReadConfig also sets the log file used by the Grid API to gridLog.txt and defines a separate log file for PolServer messages, called ShoppingOrder.txt. Both files are created in the PolServer directory.
The main function of ShoppingOrder.txt is to log update requests.
The Order Web service method is modified in Polaris B to implement replication.
If the Grid object is not yet instantiated, Order calls StartGrid to create it
Note:
This code was also added to the CheckAccount method.
Replication messages are wrapped in a Command class:
class Command {
    string accountID;
    OrderRow[] lines;
}
The Command class contains the same data as the Order method, the user name (accountID) and the article list (lines).
Then the Order method:
Creates and populates a Command object
Uses Grid.Scatter to send this Command object to the other PolServer instances (with the same Repository and user range)
Logs the Command content in ShoppingOrder.txt
The Callback class implements the GridCallback interface and therefore a Notify method.
The Callback class is defined in the ShoppingOrder.asmx.cs file.
Because StartGrid subscribes a Callback object the Grid instance calls the Notify method each time it receives a message from another PolServer instance.
The Notify method casts the message object into a Command object and logs its content in ShoppingOrder.txt.
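The scatter and notify flow can be sketched with a stand-in for the Grid API. This Python illustration is not the Grid API: `LoopbackGrid` delivers messages in process instead of by mail, each subscriber stands for the handler of a remote PolServer instance, and all names and log formats are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Command:
    account_id: str
    lines: List[Tuple[str, int, float]]   # (article, quantity, unit price)


class LoopbackGrid:
    """Stand-in for the Grid API: delivers scattered messages to subscribers."""

    def __init__(self):
        self.callbacks = []

    def subscribe(self, callback):
        self.callbacks.append(callback)

    def scatter(self, message):
        for callback in self.callbacks:
            callback.notify(message)


@dataclass
class Callback:
    log: list = field(default_factory=list)

    def notify(self, message):
        # Simulated database update: just record what was replicated.
        self.log.append(
            "replicated %s: %d line(s)" % (message.account_id, len(message.lines))
        )


def order(grid, log, account_id, lines):
    """Wrap the order in a Command, scatter it, then log it locally."""
    command = Command(account_id, lines)
    grid.scatter(command)
    log.append("order %s: %d line(s)" % (account_id, len(lines)))
    return True
```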
The RangeLoc class is defined in the RangeLoc.aspx.cs file.
The only change in Polaris B is in the OnSet method.
OnSet is called when the user clicks on the Set button.
In the Polaris B implementation, OnSet calls the StartGrid method of ShoppingOrder.
Contact:support@pagebox.net
©2001-2004 Alexis Grandemange.