How to Build the HTML Parser libraries


Set up Java. I won't include instructions here, just a link to the Sun J2SE site. I use version 1.5 (1.5.0_06-b05). You need a JDK (Java Development Kit), not a JRE (Java Runtime Environment).

Test your installation by typing the command:


This should display help on the java compiler options.


Set up Maven (Maven 2), the Java-based build tool from the Apache Maven project. Test your installation by typing the command:

mvn -help

This should display help on maven options.
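
The installation checks can also be scripted. The following is a minimal sketch, not part of the project: the function name check_tool is made up for this example, and it assumes the tools are invoked as javac, mvn and svn on your PATH (svn is the Subversion client needed for the checkout step that follows).

```shell
#!/bin/sh
# Report whether a build prerequisite is on the PATH.
# check_tool is a hypothetical helper name, not something the project ships.
# Returns 0 if the command is found, 1 otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
        return 0
    fi
    echo "missing: $1"
    return 1
}

# javac is the Java compiler and only comes with a JDK, not a JRE;
# mvn is Maven; svn is the Subversion client.
missing=0
for tool in javac mvn svn; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing prerequisite(s) missing"
```

The script only checks that each command can be found. It does not check versions, so it will not tell you whether the Maven it finds is Maven 2.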


The latest sources are only available from the Subversion repository on SourceForge.

Install the Subversion client, change to an appropriate subdirectory and issue the checkout command:

      svn co htmlparser
If you've never done this before, it will ask you to validate the SourceForge Subversion server SSL certificate and then proceed to fetch the head revision files (in trunk):
A    htmlparser\trunk
A    htmlparser\trunk\lexer
A    htmlparser\trunk\lexer\src
A    htmlparser\trunk\lexer\src\main
A    htmlparser\trunk\lexer\src\main\java
A    htmlparser\trunk\lexer\src\main\java\org
A    htmlparser\trunk\lexer\src\main\java\org\htmlparser
A    htmlparser\trunk\lexer\src\main\java\org\htmlparser\http
A    htmlparser\trunk\lexer\src\main\java\org\htmlparser\http\
A    htmlparser\trunk\lexer\src\main\java\org\htmlparser\http\
A    htmlparser\trunk\parser\pom.xml
A    htmlparser\trunk\parser\build.xml
A    htmlparser\trunk\build.xml
Checked out revision 5.
The head revision number will differ from the example above.

The sources are laid out in the standard structure:

          htmlparser               subversion directory for htmlparser
            trunk                  directory for head revision
              pom.xml              main project maven project object model
              build.xml            generated ant build script
              build.cmd            helper command file for deployment
              src                  main project sources
              lexer                lexer component subdirectory
              parser               parser component subdirectory
              filterbuilder        FilterBuilder example application subdirectory
              sitecapturer         SiteCapturer example application subdirectory
              thumbelina           Thumbelina example application subdirectory
            branches               directory for branches
            tags                   directory for tags


Each project can be built separately, but to build everything change to the trunk directory and issue the 'install' target:
  $ cd htmlparser/trunk
  $ mvn install
This generates a 'target' directory for each component/application and installs the built artifacts into your local maven repository:
    windows:    C:\Documents and Settings\username\.m2\repository
    linux:      $HOME/.m2/repository
You may want to generate the site and documentation by adding the 'site' target:
  $ mvn install site
To avoid running the unit tests add the maven.test.skip switch to the maven command line:
    mvn -Dmaven.test.skip=true ..
To build the distribution components use the assembly:assembly target:
  $ mvn install assembly:assembly
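
The build variants above differ only in the goals and switches passed to mvn. As a rough sketch, a small helper can assemble the command line from short option words. The function name mvn_command and the option words site, dist and skip-tests are invented for this example. It also assumes that the goals can be combined in a single invocation, which the text above only shows in pairs.

```shell
#!/bin/sh
# Print the Maven command line for the requested build variant.
# mvn_command and its option words are hypothetical names for this sketch:
#   site        also generate the site and documentation
#   dist        also build the distribution components
#   skip-tests  do not run the unit tests
mvn_command() {
    cmd="mvn"
    goals="install"
    for opt in "$@"; do
        case "$opt" in
            skip-tests) cmd="$cmd -Dmaven.test.skip=true" ;;
            site)       goals="$goals site" ;;
            dist)       goals="$goals assembly:assembly" ;;
            *)          echo "unknown option: $opt" >&2; return 1 ;;
        esac
    done
    # The helper only prints the command; it does not run Maven.
    echo "$cmd $goals"
}
```

For example, mvn_command skip-tests dist prints "mvn -Dmaven.test.skip=true install assembly:assembly". Run the printed command from the trunk directory.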
In order to sign jar files for the Java Web Start components you will need to generate a self-signed certificate in a keystore:
keytool -keystore your keystore location -alias your signing key alias -genkey
Enter keystore password:  your keystore password
What is your first and last name?
  [Unknown]:  your full name
What is the name of your organizational unit?
  [Unknown]:  your organizational unit
What is the name of your organization?
  [Unknown]:  your organization
What is the name of your City or Locality?
  [Unknown]:  your city
What is the name of your State or Province?
  [Unknown]:  your state or province
What is the two-letter country code for this unit?
  [Unknown]:  your country code
Is CN=your full name, OU=your organizational unit, O=your organization, L=your city, ST=your state or province, C=your country code correct?
  [no]:  y

Enter key password for 
        (RETURN if same as keystore password): your signing key password

keytool -keystore your keystore location -alias your signing key alias -selfcert
Enter keystore password:  your keystore password
and create a settings.xml file in the .m2 directory:
    windows:    C:\Documents and Settings\username\.m2\settings.xml
    linux:      $HOME/.m2/settings.xml
containing the following properties:
        <keystore.location>your keystore location</keystore.location>
        <keystore.storepass>your keystore password</keystore.storepass>
        <keystore.alias>your signing key alias</keystore.alias>
        <keystore.keypass>your signing key password</keystore.keypass>
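
The four property elements above are shown without their enclosing elements. One plausible arrangement, offered only as a sketch, places them in the properties section of a profile that is always active. The profile id "signing" and the function name write_settings are made up for this example, and whether the project's pom.xml expects exactly this layout is an assumption.

```shell
#!/bin/sh
# Write a minimal Maven settings.xml carrying the four keystore properties.
# $1 is the output path. write_settings and the profile id "signing" are
# hypothetical names for this sketch. Replace the placeholder values.
write_settings() {
    cat > "$1" <<'EOF'
<settings>
  <profiles>
    <profile>
      <id>signing</id>
      <properties>
        <keystore.location>your keystore location</keystore.location>
        <keystore.storepass>your keystore password</keystore.storepass>
        <keystore.alias>your signing key alias</keystore.alias>
        <keystore.keypass>your signing key password</keystore.keypass>
      </properties>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>signing</activeProfile>
  </activeProfiles>
</settings>
EOF
}
```

Calling write_settings "$HOME/.m2/settings.xml" overwrites any existing file at that path, so back up your settings first. Because the file holds passwords in clear text, restrict who can read it.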