New versions of PageBox: simple to use with examples
The PageBox idea is to deploy and update Web Applications from central points called Repositories.
A PageBox agent is a specialized Web Application that processes deployment requests coming from Repositories.
A Repository is another specialized Web Application that processes two kinds of requests:
Requests from PageBox administrators who want to get the Web Applications. These administrators subscribe to Repositories.
Requests from Web Application providers. These providers publish their Web applications to Repositories.
This model only uses proven technologies:
Web pages for publication
HTTP upload to publish
Web services for deployment
It builds on the scalable and reliable application servers available today.
A Repository and the PageBox agents subscribed to this Repository form a PageBox constellation.
In the PageBox documentation we also refer to the PageBox agent simply as PageBox.
PageBox is a form of caching. Instead of caching HTML pages, PageBox caches the code able to produce these pages. This code can use a local database or local files and can invoke Web services. Because the code cached by PageBox runs closer to the user, the response time is better. Because the provider doesn’t have to send the presentation with the data, the provider saves resources and the consumer saves bandwidth, a win-win situation.
Caching is commonly used on the Web. For instance proxies keep downloaded pages in cache and call the server only when they don’t have the requested page in cache. Today the Internet infrastructure is over-sized, which means that there is little value in reducing the traffic on the Internet backbones. The advent of increasingly fast links between small sites and ISPs (ADSL, WiFi...) has reduced the value of running a Web application locally rather than remotely. This is very true for bandwidth and somewhat less for latency.
The Internet is actually a peer-to-peer network. Routers have huge routing tables. There is even a standard, RPSL (Routing Policy Specification Language), that helps the autonomous systems that make up the Internet keep in sync. The response time is made up of:
The time spent between the PC or LAN and the ISP. This time depends on the message size.
The time spent between your ISP and the server. This time primarily depends on the number of routers, which, thanks to the routing tables, is almost independent of the distance between the ISP access point and the server.
The server response time.
Some studies have shown that network bandwidth grows faster than processor speed. If these studies are right, latency could become the next bottleneck, since routing takes place in CPUs.
Overall the value of PageBox depends on the cost and speed of the PC or LAN links to the Internet.
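The three components of the response time listed above can be put into a small model. This is only a sketch: the class name, the constant per-router delay and the numbers in main are illustrative assumptions, not measurements.

```java
class ResponseTimeSketch {
    // Time on the access link between the PC or LAN and the ISP: grows with the message size.
    static double accessSeconds(long messageBytes, double linkBitsPerSecond) {
        return messageBytes * 8.0 / linkBitsPerSecond;
    }

    // Total = access link + routed path (routers x assumed per-router delay) + server time.
    static double totalSeconds(long messageBytes, double linkBitsPerSecond,
                               int routers, double secondsPerRouter,
                               double serverSeconds) {
        return accessSeconds(messageBytes, linkBitsPerSecond)
                + routers * secondsPerRouter
                + serverSeconds;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: a 100 KB page and 50 ms of server time.
        long page = 100_000;
        // Remote server reached over a 512 kbit/s access link and 15 routers at 2 ms each.
        double remote = totalSeconds(page, 512_000, 15, 0.002, 0.05);
        // Same code deployed on the consumer LAN (100 Mbit/s, no routers crossed).
        double local = totalSeconds(page, 100_000_000, 0, 0.0, 0.05);
        System.out.printf("remote: %.4f s, local: %.4f s%n", remote, local);
    }
}
```

With a slow access link the first term dominates, which is why the benefit depends mostly on the PC or LAN link.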
To be effective PageBox must be installed:
On the consumer LAN or
Close to the ISP access point
For the moment we chose to focus on the first target: providing a lightweight infrastructure to install components on LAN Web sites. These components are sometimes stand-alone (which could be the case for a franchised business) and sometimes used in portals/WebTops.
The installation and the update of applications are costly for both customers and software publishers.
PageBox automates this process and makes it free. Two scenarios are possible:
The software publisher hosts the Repository. When the customer buys the application, the publisher gives the customer an account allowing them to subscribe to the Repository. The application is then automatically deployed on the customer’s PageBox agent(s). When the publisher updates the application, the update is automatically deployed to all subscribers.
The customer hosts the Repository. When the customer buys the application, the customer gives the software publisher an account allowing the publisher to publish the application to the customer’s Repository.
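In both scenarios the flow is the same: agents subscribe, publishers publish, and every subscriber receives each published version. The following minimal in-memory sketch illustrates that flow. It is not the PageBox implementation: the class and method names and the archive name shop.war are hypothetical, and real deployments go through Web services rather than direct method calls.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PublishSubscribeSketch {
    // Hypothetical stand-in for a PageBox agent: records the version installed per archive.
    static class Agent {
        final Map<String, Integer> installed = new HashMap<>();

        // In PageBox this would be a deployment request received from the Repository.
        void deploy(String archive, int version) {
            installed.put(archive, version);
        }
    }

    // Hypothetical stand-in for a Repository.
    static class Repository {
        private final List<Agent> subscribers = new ArrayList<>();
        private final Map<String, Integer> published = new HashMap<>();

        // A new subscriber immediately receives everything already published.
        void subscribe(Agent agent) {
            subscribers.add(agent);
            published.forEach(agent::deploy);
        }

        // Publishing or updating an archive pushes it to every subscriber.
        void publish(String archive, int version) {
            published.put(archive, version);
            for (Agent agent : subscribers) {
                agent.deploy(archive, version);
            }
        }
    }

    public static void main(String[] args) {
        Repository repository = new Repository();
        Agent customer = new Agent();
        repository.subscribe(customer);
        repository.publish("shop.war", 1);
        repository.publish("shop.war", 2); // the update reaches the subscriber
        System.out.println("customer runs version " + customer.installed.get("shop.war"));
    }
}
```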
Though PageBox is optimized for the deployment of Web applications, it can actually deploy any kind of application. The Java version, which is the most advanced version to date, allows running an installation step where the application can create and populate a database or run an installation class. This facility allows designing dummy applications whose sole purpose is to update a reference database or to run a job. In that case PageBox acts like an RPC facility, with the difference that the publisher knows neither how many machines will run the procedure nor which machines they are.
Web application instances deployed from the same Repository are identical and offer the same services and serve the same requests. Moreover a Web application instance can get the location of its clones. Therefore PageBox is a framework for implementing distributed applications.
This property is used by the PageBox API. An application can use this API to send messages to its clones. An Active Naming feature allows routing a request to the most suitable instance.
In a constellation each machine may have access to a specific data set, to a special device or to a given resource. A PageBox-installed Web application instance can inform its clones on other machines about the data, device or other resource that it controls. Then any of these clones can route requests regarding this data, device or resource to the Web application using Active Naming.
This property can be used to implement data-dependent routing (distributed database) or to implement monitoring applications.
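A minimal sketch of such data-dependent routing follows. It assumes a simple registry that maps a resource name to the URL of the instance controlling it. The class name, method names and URLs are hypothetical and do not reproduce the PageBox API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class ActiveNamingSketch {
    // Resource name -> URL of the instance that controls it (hypothetical registry).
    private final Map<String, String> owners = new HashMap<>();

    // Called when a clone announces the data set, device or resource it controls.
    void advertise(String resource, String instanceUrl) {
        owners.put(resource, instanceUrl);
    }

    // Returns the instance that should serve a request about this resource, if one is known.
    Optional<String> route(String resource) {
        return Optional.ofNullable(owners.get(resource));
    }

    public static void main(String[] args) {
        ActiveNamingSketch naming = new ActiveNamingSketch();
        // Each clone tells the others which data set it holds (example URLs).
        naming.advertise("customers-europe", "http://paris.example/app");
        naming.advertise("customers-asia", "http://tokyo.example/app");

        // Any clone can now forward a request to the instance holding the data.
        String target = naming.route("customers-asia").orElse("no instance known");
        System.out.println("route to " + target);
    }
}
```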
For security PageBox only needs existing and mature technology.
A Repository can be configured to use HTTPS, which implies that it has a certificate. The publisher and the subscriber can check the Repository certificate. Thanks to HTTPS the data is encrypted.
The Repository checks if the publisher, the subscriber and the Repository administrator are authenticated.
In the same way the subscribed PageBox agent can be configured to use HTTPS. The Repository can check the certificate of the PageBox agent when it deploys a Web application. Thanks to HTTPS the data is encrypted.
When the Repository deploys a Web application or when the PageBox administrator wants to use the agent’s user interface the PageBox agent checks if the Repository is authenticated.
In the Java version, which is the most advanced version to date, the administrator can:
Define from which Repository and from which publisher a Web application can be deployed
Define if the Web archive is allowed to run an installation step that can create and populate a database or run an installation class
Run the Web application in a Java 2 sandbox with controlled access to sensitive resources such as the disk or a serial interface
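The first two settings amount to an allow-list checked at deployment time. The sketch below assumes the administrator grants rights per (Repository, publisher) pair. The class name, the key format and the URLs are hypothetical, and the sandbox itself is outside the scope of the sketch.

```java
import java.util.HashSet;
import java.util.Set;

class DeploymentPolicySketch {
    // "repositoryUrl|publisher" pairs the administrator accepts (hypothetical key format).
    private final Set<String> allowedSources = new HashSet<>();
    // Subset of those pairs whose archives may also run an installation step.
    private final Set<String> installStepAllowed = new HashSet<>();

    private static String key(String repositoryUrl, String publisher) {
        return repositoryUrl + "|" + publisher;
    }

    // Administrator grants deployment rights, optionally with the installation step.
    void allow(String repositoryUrl, String publisher, boolean mayRunInstallStep) {
        allowedSources.add(key(repositoryUrl, publisher));
        if (mayRunInstallStep) {
            installStepAllowed.add(key(repositoryUrl, publisher));
        }
    }

    boolean mayDeploy(String repositoryUrl, String publisher) {
        return allowedSources.contains(key(repositoryUrl, publisher));
    }

    boolean mayRunInstallStep(String repositoryUrl, String publisher) {
        return installStepAllowed.contains(key(repositoryUrl, publisher));
    }

    public static void main(String[] args) {
        DeploymentPolicySketch policy = new DeploymentPolicySketch();
        policy.allow("https://repo.example/repository", "alice", true);
        policy.allow("https://repo.example/repository", "bob", false);

        System.out.println(policy.mayDeploy("https://repo.example/repository", "bob"));         // allowed source
        System.out.println(policy.mayRunInstallStep("https://repo.example/repository", "bob")); // no install step
        System.out.println(policy.mayDeploy("https://other.example/repository", "alice"));      // unknown Repository
    }
}
```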
Both the Repository and the PageBox agent implement logging and Web pages displaying the log records. The PageBox agent implements an API allowing deployed applications to log on the agent log.
The Java version allows a publisher connected to the Repository to display a non-sensitive, Web application dependent subset of the agent log for troubleshooting.
The new version of PageBox has been designed to be:
Small and simple to install
Implementable with all Web server technologies
Easy to port and extend
We provide three flavours of the new version:
PageBox for Java
PageBox for .NET
PageBox for PHP
Primarily because these technologies were then terra incognita for us, we first implemented the new version in PHP and .NET. The new version extensively reuses the most effective mechanisms introduced in the first Java version, especially the publish and subscribe model. However the design is more modular. The Repository has more functions and is server-only. There is a clean separation between:
The distribution process
The monitoring system responsible for retrying deployments
The installation process
The API designed to be called by deployed Web applications
Though it uses the same modular design as PageBox for PHP and PageBox for .NET, PageBox for Java introduces new functions:
A more scalable deployment model described in the Grid API V2. You can find details about this model in the Deployment with relays section.
A delta deployment. Only the difference between the current version and the installed version is sent to the target PageBox.
An installation API allowing fully automated deployments and updates.
A better security model. In PageBox for Java the PageBox administrator grants installation rights to Repositories and publishers. The PageBox subscriber and the publisher are checked everywhere.
A token API. Deployment with relays allows creating constellations of a thousand or more PageBox agents. If all these PageBoxes called each other with Web services, it would generate a huge number of messages. PageBox for Java implements a token ring to address this issue. A subset of the PageBox API, the Token API, allows Web application instances to broadcast or send messages to other Web application instances. These messages are stacked locally until a token frame is received. Then these messages are added to the frame and the frame is sent to the next PageBox on the ring. This token ring architecture minimizes the number of messages.
An improved Active Naming that leverages the Token API. This Active Naming allows imperative matching and weighted load balancing based on application-defined criteria and on the available resources (CPU, memory and network latency).
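The token ring mechanism described above can be sketched as follows. This is an in-memory illustration only: the class names are hypothetical, the frame is passed by direct method calls instead of over the network, and the rule that a sender removes its own messages once the frame comes back is an assumption of the sketch, not a documented PageBox behaviour.

```java
import java.util.ArrayList;
import java.util.List;

class TokenRingSketch {
    static class Message {
        final String sender;
        final String text;

        Message(String sender, String text) {
            this.sender = sender;
            this.text = text;
        }
    }

    // The token frame carries the messages collected along the ring.
    static class Frame {
        final List<Message> messages = new ArrayList<>();
    }

    static class Node {
        final String name;
        private final List<String> pending = new ArrayList<>();
        final List<String> received = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        // Applications stack messages locally; nothing is sent yet.
        void broadcast(String text) {
            pending.add(text);
        }

        void onToken(Frame frame) {
            // Our messages from the previous turn have visited every node: drop them.
            frame.messages.removeIf(m -> m.sender.equals(name));
            // Everything left in the frame comes from other nodes.
            for (Message m : frame.messages) {
                received.add(m.sender + ": " + m.text);
            }
            // Add the local backlog to the frame before it moves on.
            for (String text : pending) {
                frame.messages.add(new Message(name, text));
            }
            pending.clear();
        }
    }

    // Passes the frame around the ring: one transfer per node per rotation.
    static void circulate(List<Node> ring, Frame frame, int rotations) {
        for (int r = 0; r < rotations; r++) {
            for (Node node : ring) {
                node.onToken(frame);
            }
        }
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        Node b = new Node("B");
        Node c = new Node("C");
        a.broadcast("first");
        a.broadcast("second");
        b.broadcast("hello");
        circulate(List.of(a, b, c), new Frame(), 2);
        System.out.println("C received " + c.received);
    }
}
```

However many messages the applications queue, each rotation costs one frame transfer per node, which is the point of the design.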
We identified four kinds of components:
Distributed applications. A distributed application is a Web application that relies on the fact that it can retrieve the URL of its clones with the PageBox API. This feature allows instances to cooperate. For instance an instance can redirect some requests to the clone that has the needed data.
Controls. A PageBox control is like a Windows control. It implements a small part of a user interface. For instance a Web service provider can publish a PageBox control as well as its Web service. The consumer just needs to insert the control in its portal or WebTop. When the Web service interface changes, the provider just needs to publish the corresponding update of the PageBox control.
Grid components. The idea here is to distribute workload to the deployed components. When you have a resource-consuming process and have identified a suitable parallel algorithm, you can implement a Web application offering a Web service. Then you publish the application on a PageBox Repository, you use the PageBox API to enumerate the instances of your application and you distribute your workload to the Web services of your application instances.
Site replication. Sites host a PageBox agent and subscribe to a Repository. Authors publish page and Web application archives. The PageBox agents inflate and install the archives on the sites.
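For grid components, the enumerate-and-distribute step can be sketched as follows. The instance URLs are examples, the round-robin split is just one possible strategy, and callInstance is a local placeholder for the Web service call. None of these names come from the PageBox API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GridSketch {
    // Splits the task list across the instances in round-robin order.
    static Map<String, List<Integer>> assign(List<String> instanceUrls, List<Integer> tasks) {
        if (instanceUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one instance is required");
        }
        Map<String, List<Integer>> plan = new LinkedHashMap<>();
        for (String url : instanceUrls) {
            plan.put(url, new ArrayList<>());
        }
        for (int i = 0; i < tasks.size(); i++) {
            String url = instanceUrls.get(i % instanceUrls.size());
            plan.get(url).add(tasks.get(i));
        }
        return plan;
    }

    // Placeholder for the Web service offered by each instance: here, a sum of squares.
    static long callInstance(String url, List<Integer> chunk) {
        long sum = 0;
        for (int n : chunk) {
            sum += (long) n * n;
        }
        return sum;
    }

    // Distributes the tasks, calls every instance and combines the partial results.
    static long run(List<String> instanceUrls, List<Integer> tasks) {
        long total = 0;
        for (Map.Entry<String, List<Integer>> entry : assign(instanceUrls, tasks).entrySet()) {
            total += callInstance(entry.getKey(), entry.getValue());
        }
        return total;
    }

    public static void main(String[] args) {
        // In PageBox the list of instances would come from the PageBox API.
        List<String> instances = List.of("http://host1.example/grid", "http://host2.example/grid");
        System.out.println("sum of squares = " + run(instances, List.of(1, 2, 3, 4, 5)));
    }
}
```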
PageBox for PHP was the first implementation of the new version.
It shows that we can implement a PageBox with a scripting language.
PageBox for PHP has been tested with PHP 4.0.6 and 4.1.0.
We checked that it works in combination with Apache and BadBlue.
BadBlue is a very small Web server. The idea with the BadBlue support is to show that PageBox can have very small memory and disk requirements. The combination BadBlue + PHP + PageBox requires less memory than an office application like Word. An end user could easily run such a personal Web server and get their Web applications using PageBox.
The PageBox for .NET design is based on the PageBox for PHP design.
It is written in C# and makes extensive use of features introduced in ASP.NET such as Web Forms.
Its monitoring system is implemented as a Windows (NT) service.
The PageBox for Java design is based on the PageBox for .NET design. Like PageBox for .NET, PageBox for Java can use SOAP Web services, but it implements a pluggable network layer: besides SOAP you can use raw HTTP or implement your own protocol. The reference version of PageBox for Java runs on the Java Web Services Developer Pack (WSDP) and on Tomcat with Axis.
PageBox for Java can be easily ported to other Application servers that support:
JavaServer Pages (JSP) specification 1.2 or 2.0
Servlet specification 2.3 or 2.4
COS, the com.oreilly.servlet package written by Jason Hunter for the Web archive upload (MultipartRequest class)
Reservation was developed as an example of an application distributed with PageBox for .NET. It is:
A real application
With an XML configuration
Making database requests to a local database
Making Web service requests to other Reservation instances
GoogleControl is a PageBox control that implements a user interface to the Google API and allows setting
The control width
The control style with CSS classes
The Google license
GoogleControl is an example tested with PageBox for .NET.
Pandora is an example of a distributed application tested with PageBox for Java.
Pandora uses three Web applications:
distributed, which provides the user interface to enter orders
central, which emulates a central server where orders are processed
paydeliver, which emulates either a payment server or a delivery server
Though Pandora is only an example, it illustrates an e-commerce scenario: many portals sell some articles. The orders are processed by remote servers, which subcontract the payment processing and the article delivery to third parties.
Pandora implements two functions of interest:
Sandbox checking that can help PageBox administrators to check that they properly set up their environment
Trusted referrer, as explained on the Trusted web site page. The idea is (1) to check that the user is coming from a trusted site, (2) to delegate the user authentication to this trusted site and (3) to get from this site the information needed to charge and invoice the user and to deliver the goods if needed. With this mechanism Pandora doesn’t check whether the user is authenticated. It can safely rely on the calling site for that. You can use this mechanism when you need to provide single sign-on, for instance when integrating with a portal.
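Step (1) of the trusted referrer idea can be sketched as a host check. The sketch assumes that the calling site is identified by the HTTP Referer header and that the administrator keeps a list of trusted host names. Both are assumptions made for illustration; the actual Pandora mechanism is the one described on the Trusted web site page.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

class TrustedReferrerSketch {
    // Host names of the sites allowed to send users (hypothetical configuration).
    private final Set<String> trustedHosts;

    TrustedReferrerSketch(Set<String> trustedHosts) {
        this.trustedHosts = trustedHosts;
    }

    // True when the Referer value is a URL whose host is on the trusted list.
    boolean isTrusted(String refererHeader) {
        if (refererHeader == null) {
            return false;
        }
        try {
            String host = new URI(refererHeader).getHost();
            return host != null && trustedHosts.contains(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            // A malformed header is treated as untrusted.
            return false;
        }
    }

    public static void main(String[] args) {
        TrustedReferrerSketch check = new TrustedReferrerSketch(Set.of("portal.example"));
        System.out.println(check.isTrusted("https://portal.example/shop?item=42"));
        System.out.println(check.isTrusted("https://unknown.example/"));
        System.out.println(check.isTrusted(null));
    }
}
```

A header check alone is weak, since a client can set the header freely. Steps (2) and (3) need a channel to the trusted site that the sketch does not cover.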
Epimetheus is an example that illustrates the use of resources and extensions in PageBox.
A resource is a servlet container resource made available to the Web application through the PageBox API according to a setting defined by the PageBox administrator.
An extension is a component, typically calling native code, made available to the Web application through the PageBox API according to a setting defined by the PageBox administrator.
EuroLCC is an example for finding routes covered by Low Cost Carriers. Initially its list only contained European LCCs, hence the name. EuroLCC illustrates the use of the generic installation class. This installation class creates a database and tables where it stores the airport codes, the airport names and the routes.
Prometheus is a simple chat application that illustrates the use of the Token API and of Active Naming.
©2001-2004 Alexis Grandemange Last modified