
The Scenario

Most public utility web sites offer some kind of lookup facility, such as a telephone directory or exam results. These are usually built around conventional software components: a relational database engine, an application server and a scripting language. This approach is fine for most web sites and runs on moderately powerful hardware. For exam result lookup services, however, either the software components cannot respond fast enough, or the hardware resources become inadequate, or both. This happens because result lookup traffic has some special (and interesting) characteristics.

Characteristics of Exam Result Traffic

It is like a furious hurricane. It comes from nowhere, strikes suddenly, sweeps everything away, and then vanishes. Such traffic normally won't last more than 4-6 hours. But when it peaks it WILL hit the ceiling of your server's capacity, for sure. Here the word peak means a hit rate in the region of 50-500 per second, or even more [1]. A rule of thumb says: expect all of your visitors (3-4 lakhs for SSLC results) in the first three hours!
This is so because no candidate with Internet access has any reason to delay checking his/her result on the web site. As the time of release approaches, the site starts getting hits. When that time arrives it is like opening a dam, and the web site will be on its toes for the next few hours: trying to serve requests, keeping some of them in a queue and rejecting others.
The graph shows typical result lookup traffic (taken from 2003 SSLC/HSC traffic data).
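As a rough check on the rule of thumb above, the following Python sketch converts the visitor estimate into an average request rate. It assumes exactly one hit per visitor and a perfectly even spread over three hours. Neither holds in practice, so the result is only a floor under the real peak. The function name is illustrative.

```python
LAKH = 100_000  # one lakh = 100,000

def average_hits_per_second(visitors: int, window_hours: float) -> float:
    """Average request rate if every visitor makes exactly one hit (an assumption)."""
    return visitors / (window_hours * 3600)

for lakhs in (3, 4):
    rate = average_hits_per_second(lakhs * LAKH, 3)
    print(f"{lakhs} lakh visitors in 3 hours: about {rate:.0f} hits per second on average")
```

This prints about 28 and 37 hits per second. Even the flat average is in the tens of hits per second. Because the traffic arrives as a surge rather than evenly, and a visitor may make more than one request, the peak can be expected to sit well above it.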

Problem with Conventional Approach

Conventional web sites are designed to handle slowly varying traffic with a manageable peak. In a result serving scenario the first component hit by such a surge is the relational database engine. This is because a relational database engine is designed to handle only a limited number of database connections at any given time. With a badly designed database schema, query processing time can increase as the number of queries increases, leading to a runaway situation. Moreover, a relational database provides a number of abstractions to tackle the complexities of enterprise system requirements. Such abstractions make processing slower still. And unfortunately none of these features is required to create a lookup web site; it is overkill.
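The runaway effect can be illustrated with a deliberately simplified model, sketched here in Python. It assumes a fixed arrival rate and a fixed number of lookups the database can complete per second; the numbers are hypothetical. In the situation described above the capacity itself degrades under load, which would make the backlog grow faster than this model shows.

```python
def backlog_over_time(arrivals_per_sec, capacity_per_sec, seconds):
    """Return the number of waiting requests at the end of each second."""
    backlog = 0
    history = []
    for _ in range(seconds):
        backlog += arrivals_per_sec                 # new requests join the queue
        backlog -= min(backlog, capacity_per_sec)   # the server completes what it can
        history.append(backlog)
    return history

# Hypothetical numbers: 200 lookups arrive per second, the database completes 150.
print(backlog_over_time(200, 150, 5))   # [50, 100, 150, 200, 250]
# With arrivals below capacity the queue never builds up.
print(backlog_over_time(100, 150, 5))   # [0, 0, 0, 0, 0]
```

Once arrivals exceed capacity the queue grows without bound for as long as the surge lasts, so requests wait longer and longer or have to be rejected.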
The next component to be brought to its knees is the web server. In the conventional approach, scripting languages like PHP, ASP, etc. are used to build result-display pages on the fly. This is a demanding job for the web server/application server because the script has to be parsed and interpreted to build the result page each time a visitor makes a request. In some technologies, like JSP and Java Servlets, this is avoided to a great extent by precompiling the code [2]. The browser, on the other hand, is left with just the job of rendering the page sent to it by the server.
A quick solution is to beef up the hardware (and software) until the setup is strong enough. But that raises the question of financial feasibility.

The Scuba Approach

Scuba, a high performance key-value lookup engine developed at Linuxense, removes the need for a relational database engine. It uses Berkeley DB for data storage, thus avoiding the performance problems associated with relational database engines. Berkeley DB is amazingly fast and lightweight. Scuba does not need to parse any scripts either. Instead it makes the browser do half of the job, because Scuba believes in one fact: clients are as powerful as servers.
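To give a feel for what a key-value lookup involves, here is a minimal Python sketch. It is not Scuba's code. It uses Python's standard dbm module as a stand-in for Berkeley DB, and the record layout, roll numbers and function names are assumptions made for illustration.

```python
import dbm
import os
import tempfile

def build_store(path, results):
    """Write roll-number -> result-record pairs into an on-disk key-value file."""
    with dbm.open(path, "c") as db:
        for roll_no, record in results.items():
            db[roll_no.encode()] = record.encode()

def lookup(path, roll_no):
    """Fetch one record by key; returns None when the roll number is unknown."""
    with dbm.open(path, "r") as db:
        value = db.get(roll_no.encode())
    return value.decode() if value is not None else None

# Hypothetical sample data, for illustration only.
store_path = os.path.join(tempfile.mkdtemp(), "results")
build_store(store_path, {"100001": "PASS|412", "100002": "FAIL|198"})
print(lookup(store_path, "100001"))   # PASS|412
print(lookup(store_path, "999999"))   # None
```

Each request is a single fetch by key: no SQL to parse, no query plan and no connection pool. That is all a result lookup needs.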
Most desktops nowadays are driven by processors running at 1GHz or more, yet only a fraction of this power is used while browsing. So, in Scuba, the job of creating the HTML code that renders the result is shifted to the browser, where it is done with Javascript. The Javascript is generated by the core component of Scuba, mod_scuba, an Apache module written in C for high performance. The browser, with the help of the Javascript code sent by mod_scuba, generates the HTML code by itself in a few milliseconds and renders the page. In the meantime Scuba is serving more requests, which would otherwise have been queued up for later processing.
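The division of labour might look roughly like the following Python sketch. It is only an illustration of the idea, not mod_scuba's actual output: the renderResult helper, the field names and the markup are all hypothetical. The server emits little more than the raw record wrapped in a Javascript call, while a static helper, which the browser can download once and cache, builds the HTML.

```python
import json

# Static helper the browser would download once and cache (hypothetical name and
# markup; values are assumed trusted here).
RENDER_JS = """
function renderResult(r) {
  document.getElementById("result").innerHTML =
    "<h2>" + r.name + "</h2><p>Roll no: " + r.roll_no + ", total: " + r.total + "</p>";
}
"""

def build_payload(record: dict) -> str:
    """Return a one-line script that hands the raw record to the browser-side renderer."""
    data = json.dumps(record, separators=(",", ":"))
    # Escape "<" so the data cannot close an enclosing <script> element.
    data = data.replace("<", "\\u003c")
    return f"renderResult({data});"

print(build_payload({"roll_no": "100001", "name": "A. Candidate", "total": 412}))
# renderResult({"roll_no":"100001","name":"A. Candidate","total":412});
```

The per-request work on the server is reduced to serializing one record. The string concatenation that produces the page runs on the visitor's machine.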
We are currently working on making Scuba more generic and plan to create tools that make this kind of web publishing easier. If you would like to know more about Scuba or Linuxense Performance Engineering services, please send an email to info@linuxense.com.