Publish once, deploy everywhere

If you are an architect, you are often asked to design an application serving ten thousand or more users at a thousand or more locations. Many companies and government organizations have offices or agencies in every town. You don’t need expensive marketing studies to recognize the fact. As a consumer and citizen, wherever you live, you expect to find banks, travel agencies and government offices. However, when the RFP lands on your desk, you know that you will suffer. OK, it doesn’t have to be a bed of roses, but things should be simpler today.

Basically you have to choose between Charybdis, the whirlpool, aka the fat client with a client/server protocol, and Scylla, the reef, aka the Web server farm. Either way, you are shipwrecked. The customer maintains a comprehensive list of all the client/server projects that failed in the last couple of years and asks questions about deployment. He knows Scylla less well, but he learns quickly about the bandwidth requirements and outages of redundant, fault-tolerant server farms. The fact is that the customer is right in both cases.

Technology has made enough progress during the last couple of years to consider another approach, PageBox. To explain it, I first introduce the technology that it uses.

Java

Java is well known for its portability and comprehensive
network support. It is the language that allows downloading a piece of code from the Internet and running it safely in a browser. It provides:

1. Class loaders able to download code from sources such as HTTP servers
2. Java 2 security, which allows running code with rights that depend on its signature, that is, on how much the host trusts its certificate.

J2EE application servers

A J2EE application server is a set of containers that provides an environment for Web and EJB archives. Web archives and EJB archives are deployment units. The most important for our needs is the Web archive. It gathers servlets, JavaServer Pages, taglibs, beans, Java classes, libraries and resources.

As the interface between containers and deployment archives is well specified, it is relatively easy to insert a package that the Application Server sees as a Web archive and that the Web archive sees as a container.

Publishing framework

Web archives have a drawback for a large-scale deployment:
they include procedural (Java) code. On the one hand this is good, as it allows applying Object Oriented Programming concepts and Design Patterns to presentation, but on the other hand it is tough and expensive to get the same level of robustness as with, for instance, XSL templates.

Therefore it is useful to consider alternatives such as Publishing frameworks. An example of a Publishing framework is the Apache project’s Cocoon.
When a user asks for a page, the framework starts a pipeline whose first step is a producer. The producer produces an XML document. The next step is a processor. A processor is something that handles an input XML document and generates an output XML document. What I call a network processor is a generic processor that parses the input document, uses that information to call a network service (RMI, CORBA, EJB, SOAP, XML-RPC) and uses the response to generate the output document. A special case of network processor is a Web service processor. In that case, the processor is generated from the WSDL definition of a Web service.

The next step is a standard XSL processor. It transforms the document that it receives from the network processor using an XSL style sheet.
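
As a rough illustration of this XSL step, the sketch below applies a style sheet to an XML document with the standard JAXP API (javax.xml.transform). The class name XslStep, the booking document and the style sheet are assumptions made up for the example; they are not taken from Cocoon or PageBox.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical XSL step: applies a style sheet to an XML document held in a string. */
final class XslStep {
    // Illustrative style sheet (an assumption for this example only).
    static final String STYLE_SHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/booking\">"
      + "<p>Flight <xsl:value-of select=\"flight\"/>"
      + " is <xsl:value-of select=\"status\"/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Transforms the XML text with the given style sheet and returns the result as text. */
    static String transform(String xml, String styleSheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(styleSheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Unchecked wrapper keeps the calling code short in this sketch.
            throw new IllegalArgumentException("cannot transform document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<booking><flight>AF123</flight><status>confirmed</status></booking>";
        System.out.println(transform(xml, STYLE_SHEET));
    }
}
```

Because the style sheet is plain data, a presentation can be changed by replacing the style sheet alone, without touching the Java code.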
The last step is a formatter that converts the document that it receives from the XSL processor into a stream returned to the user.

A publishing framework runs in an Application server, and the Application server sees the publishing framework as a Web archive. XML and XSL files are regular resources and network processors are regular Java classes. We have a layered architecture like this:
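
In code terms, each pipeline step is a regular Java class that the framework chains to the next one. The following is a minimal sketch under that assumption. Every name in it (PipelineSketch, Producer, Processor, Formatter, QuoteService) is hypothetical, the network call is faked, and a simple wrapper stands in for the XSL processor.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a publishing pipeline; all names are illustrative. */
final class PipelineSketch {
    /** First step: turns a user request into an XML document. */
    interface Producer { String produce(String request); }

    /** Middle step: XML document in, XML document out. */
    interface Processor { String process(String document); }

    /** Last step: converts the final document into the bytes streamed to the user. */
    interface Formatter { byte[] format(String document); }

    /** Stand-in for a remote service reached over RMI, SOAP and so on. */
    interface QuoteService { String priceOf(String item); }

    /** Network processor: reads the input, calls the service, builds the output. */
    static Processor networkProcessor(QuoteService service) {
        return document -> {
            // Naive extraction to stay short; a real processor would parse the XML.
            int start = document.indexOf("<item>") + "<item>".length();
            String item = document.substring(start, document.indexOf("</item>"));
            return "<quote><item>" + item + "</item><price>"
                + service.priceOf(item) + "</price></quote>";
        };
    }

    /** Runs the producer, then each processor in order, then the formatter. */
    static byte[] run(String request, Producer producer,
                      List<Processor> processors, Formatter formatter) {
        String document = producer.produce(request);
        for (Processor processor : processors) {
            document = processor.process(document);
        }
        return formatter.format(document);
    }

    /** Wires a toy pipeline together and returns what the user would receive. */
    static String demo(String request) {
        Producer producer = r -> "<request><item>" + r + "</item></request>";
        Processor network = networkProcessor(item -> "42.00"); // fake service reply
        Processor style = d -> "<html><body>" + d + "</body></html>"; // stands in for XSL
        Formatter formatter = d -> d.getBytes(StandardCharsets.UTF_8);
        byte[] stream = run(request, producer, Arrays.asList(network, style), formatter);
        return new String(stream, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(demo("ticket"));
    }
}
```

Since the steps share nothing but the document passed between them, a network processor can be swapped for another without changing the producer or the formatter.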
Presentation and data access on different machines

Therefore the diagram below makes more sense:

It addresses our initial problem well. The Web archive or presentation doesn’t access central databases. We can install as many of them as we want, as close to the users as we need. We have roughly the same bandwidth requirement as client/server applications because we use client/server protocols, and we no longer need a big central server farm.

However, the customer objects: “With that solution, we have extra servers to administer, with FTP as the main tool.” And he is still right; we need a better solution.

PageBox

We can insert a layer between the Application server and the Web archive or Publishing framework:
This PageBox layer acts as an Application server for the Web archive and the Publishing framework, and acts as a Web archive for the Application server. The PageBox layer takes care of deployment and security with a simple principle: download the Presentation or the Web archive just as a browser downloads an applet, and run it in a sandbox.

The whole picture is no more complex:

Support libraries exist to avoid downloading libraries shared by different presentations or Web archives.
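
One way to picture the download principle and the shared support libraries is with standard Java class loaders: one URLClassLoader per presentation, all sharing a parent loader that holds the support libraries. This is only a sketch under that assumption, not PageBox's actual code. The class PresentationLoaders and the repository.example addresses are hypothetical. The sandbox is left out, since it would rely on the Java 2 security policy, and the Security Manager behind that policy is deprecated in recent JDKs. A real Web archive also keeps its classes under WEB-INF/classes, which a plain URLClassLoader does not handle.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch: one class loader per downloaded presentation. */
final class PresentationLoaders {
    /** Loader for the support libraries shared by every presentation. */
    static URLClassLoader supportLoader(URL... sharedLibraries) {
        return new URLClassLoader(sharedLibraries,
                                  PresentationLoaders.class.getClassLoader());
    }

    /**
     * Loader for one presentation. Lookups go to the parent first, so shared
     * classes come from the support loader and only the remaining classes are
     * looked up at the archive URL.
     */
    static URLClassLoader presentationLoader(URL archive, ClassLoader support) {
        return new URLClassLoader(new URL[] {archive}, support);
    }

    /** Builds a URL, turning the checked exception into an unchecked one. */
    static URL url(String location) {
        try {
            return URI.create(location).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(location, e);
        }
    }

    public static void main(String[] args) {
        // Placeholder addresses; this sketch only wires the loaders together
        // and never asks them for a class.
        URLClassLoader support =
            supportLoader(url("http://repository.example/support.jar"));
        URLClassLoader booking =
            presentationLoader(url("http://repository.example/booking.jar"), support);
        URLClassLoader news =
            presentationLoader(url("http://repository.example/news.jar"), support);
        // Both presentations share the same support loader as parent.
        System.out.println(booking.getParent() == news.getParent());
    }
}
```

Giving each presentation its own loader also keeps presentations from seeing each other's classes, while the common parent avoids fetching the shared libraries more than once.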
An issue remains, however: we have to configure the PageBox to let it know where it can download a presentation.

Publish once, deploy everywhere

We have a population of PageBoxes, which need to know where to download presentations, and a population of Presentation providers, who want to deploy their presentations. We can use a Publish and Subscribe protocol to automate the deployment:
1. The PageBox host subscribes to a repository.
2. The presentation author publishes (uploads) a presentation to the repository.
3. The repository notifies its subscribers, using a deploy command, that a new presentation is available.
4. The PageBox downloads the presentation from the repository.

As we use only HTTP, the repository itself is a Web Application. The model has a couple of interesting properties:

a) The PageBox host can subscribe to many repositories
b) The presentation author can publish to many repositories
c) Everyone can create PageBoxes or repositories
d) PageBoxes can call Web services or cooperate using a P2P protocol:

Let’s look at some use cases.

Portal

The simplest case is the one where you provide a Portal. You want to include a site that has a PageBox repository. It
stores the repository URL in an RSS file. You use the repository URL to subscribe to the repository, and you automatically get the site’s presentations on your Application server. You serve your customers faster, and the remote site saves machines because it no longer serves presentations.

Constellation

The customer still has to install and maintain a server per
location and he wastes resources. An Application server installed on a single-processor PC can typically serve twenty requests per second, or eighty to two hundred concurrent users, whereas each of these application servers only has to serve ten users on average. We cannot address these issues if we keep the Application servers in the company. We have to move them to the ISP/ASP side. There are several benefits to doing so:

1. Presentation application servers are better used. They can be shared between many customers.
2. ISPs and ASPs have bigger links than companies, and these links cost them less. One reason is that they avoid paying for the last mile. They rent racks or are hosted in network nodes.
3. ASPs are hosting specialists.

We can create constellations of PageBoxes. In a constellation,
a set of ISPs and ASPs host PageBoxes and subscribe to Presentation providers’ repositories. PageBoxes access the Web services of Data providers and are accessed by browsers.

We show here two worldwide configurations, white and blue in the figure. When a user makes a request, she or he is served by one of the closest PageBoxes. We get five benefits:

1. Reduced latency. As the PageBox is close to the user, she or he gets the response faster. This is not negligible for professional applications: the latency across the Atlantic is about 130 ms in the best case.
2. Lower server cost. Data providers no longer need to operate complex server farms.
3. Fewer round trips. The PageBox maintains reference data and a cache. It invokes the Data provider only for updates.
4. Cheap fault tolerance. A user can be served by many PageBoxes.
5. Bandwidth. As with client/server applications, only the data goes over the wire.

Conclusion

Note that PageBox is just the implementation for dynamic
content of something that already exists for static content. Have you never downloaded from an Akamai server?

In the long term, the PageBox model can have a significant impact. As it allows deploying Web Applications anywhere, Presentation can become an Internet commodity. If data owners provide Web services, anyone can compete to provide the best presentation and publish it. PageBox repositories are presentation yellow pages, just as UDDI directories are Web service yellow pages. You can find which presentations are available and who hosts them.

PageBox doesn’t only allow the development of Web applications with sub-second response times; it also allows creating a more cooperative infrastructure and new kinds of applications.

Resources

We develop PageBox in Open Source (GNU LGPL). You can
download it from http://pagebox.net or http://pagebox.sourceforge.net/doc.html. We also provide a simplified installation procedure at http://pagebox.net/install.htm. The product is, however, still in beta.

Author Biography

Alexis Grandemange is an architect and developer for a Computerized Reservation System. A Java developer with 19 years of experience in the computer field, Alexis previously worked on large Intranet solutions at BEA Systems. He can be contacted at alexis.grandemange@pagebox.net.
Contact: support@pagebox.net