by Dhaval Udani
In 1984, Citicorp Overseas Software Limited (COSL) was created by Citibank to produce low-cost software for its various banking operations. Citicorp Information Technologies India Ltd. (CITIL), now known as i-Flex, was formed out of this company around 10 years ago to service non-Citi clients. In 2001, COSL was merged with another arm of Citibank, India, known as the Global Support Unit (GSU), to form OrbiTech Solutions Ltd, which in turn merged with Polaris Software Labs in 2002. With its expertise in the banking domain, OrbiTech undertook to develop a suite of banking products. However, with several players in the market, it needed something innovative and fast. With the aim of increasing productivity, an initiative was started to develop tools, code generators and reusable components for use within the organization. It was in this context that I got involved with HTMLParser.
We were developing an MVC-based framework for performing static maintenance of information such as bank accounts and customer records. To simplify development for our users, we asked them to write simple static HTML pages, which we would convert to JSP pages capable of showing dynamic data. To that end, I needed a tool that could parse HTML tags and let me manipulate them. I searched high and low for options. One of them was the W3C HTML DOM standard and its APIs. However, their inability to process JSP tags, and to change tags and reproduce them, meant I had to discard that option. Another implementation of the DOM standard was provided by NekoHTML.
However, it had similar problems and was too complex. These factors drew
me to HTMLParser. Initially it was difficult to understand, but once I had
written my first parsing routine, it became very easy. I especially love the
easy manner in which scanners are registered and removed so that scanning
is enabled or disabled for particular tags. This feature is absolutely
fantastic. Having to search for tags that the original HTMLParser did not
support caused a slight flutter in my heart. However, Somik encouraged
me not to give up and to write my own tag-scanner pairs.