Package org.htmlparser.parserapplications

Example applications.


Class Summary
LinkExtractor     Extracts all the links from the given web page and prints them on standard output.
SiteCapturer      Save a web site locally.
StringExtractor   Extract plain-text strings from a web page.
WikiCapturer      Save a WikiWikiWeb site locally.

Package org.htmlparser.parserapplications Description

Link Extractor
Extract links or mail addresses from a web page.
    bin/linkextractor http://website_url [-maillinks]
    The optional -maillinks argument causes mailto: links to be printed.
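
To make the distinction between ordinary links and mailto: links concrete, here is a minimal sketch using only the Java standard library. It is not the LinkExtractor implementation: it swaps in a simplified regular expression where the real tool uses the HTML parser, and the class and method names (LinkSketch, extractLinks, isMailLink) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of org.htmlparser.
final class LinkSketch {
    // Matches href="..." or href='...' inside an <a ...> tag.
    // Simplified on purpose: robust HTML handling needs a real parser.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the href values of all anchor tags, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2).trim());
        }
        return links;
    }

    // True when the link uses the mailto: scheme (case-insensitive).
    static boolean isMailLink(String link) {
        return link.regionMatches(true, 0, "mailto:", 0, 7);
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"http://example.com/\">home</a> "
                + "<a href='mailto:someone@example.com'>mail</a></p>";
        for (String link : extractLinks(html)) {
            System.out.println((isMailLink(link) ? "mail: " : "link: ") + link);
        }
    }
}
```

The sketch labels each link rather than filtering, because the text above does not say whether -maillinks adds mail links to the output or restricts the output to them.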
String Extractor
Extract text from a web page.
    bin/stringextractor http://website_url [-links]
    The optional -links argument causes hyperlinks to be shown within the text.
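
The effect of the -links option can be sketched with the standard library alone. This is an illustration under stated assumptions, not the StringExtractor implementation: it uses regular expressions instead of the HTML parser, does not decode character entities, and the bracketed "text [url]" output format as well as the names TextSketch and extractText are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of org.htmlparser.
final class TextSketch {
    // An anchor with a quoted href: group 2 is the URL, group 3 the link text.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1[^>]*>(.*?)</a\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern SPACE = Pattern.compile("\\s+");

    // Returns the plain text of the page; when showLinks is true, each link's
    // URL is kept in square brackets after its text (an assumed format).
    static String extractText(String html, boolean showLinks) {
        String s = html;
        if (showLinks) {
            s = ANCHOR.matcher(s).replaceAll("$3 [$2]");
        }
        s = TAG.matcher(s).replaceAll(" ");           // drop remaining tags
        return SPACE.matcher(s).replaceAll(" ").trim(); // collapse whitespace
    }

    public static void main(String[] args) {
        String html = "<p>Visit <a href=\"http://example.com/\">our site</a> today.</p>";
        System.out.println(extractText(html, false));
        System.out.println(extractText(html, true));
    }
}
```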
Site Capturer
Save a web site locally.
    bin/sitecapturer http://source_website /target_directory/ [true|false]
    The optional boolean argument determines whether resources such as images,
    audio and video are captured.
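
Two decisions are implied by the usage above: where a page from the source site lands under the target directory, and whether a resource is captured at all. The following standard-library sketch shows one plausible way to make them. It is not how SiteCapturer is implemented: the extension list, the index.html default, and the names CaptureSketch, isResource, shouldCapture and localPath are all assumptions made for illustration.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of org.htmlparser.
final class CaptureSketch {
    // Assumed set of image, audio and video extensions.
    private static final Set<String> RESOURCE_EXTENSIONS =
            Set.of("gif", "jpg", "jpeg", "png", "mp3", "wav", "avi", "mpg");

    // Guesses from the file extension whether the URL names a resource.
    static boolean isResource(URI url) {
        String path = url.getPath() == null ? "" : url.getPath();
        int dot = path.lastIndexOf('.');
        if (dot < 0 || dot < path.lastIndexOf('/')) {
            return false; // no extension in the last path segment
        }
        String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return RESOURCE_EXTENSIONS.contains(ext);
    }

    // Pages are always captured; resources only when the flag is true.
    static boolean shouldCapture(URI url, boolean captureResources) {
        return captureResources || !isResource(url);
    }

    // Maps a URL under the source site to a file under the target directory.
    static Path localPath(URI url, URI source, Path target) {
        URI relative = source.relativize(url);
        if (relative.isAbsolute()) {
            throw new IllegalArgumentException(url + " is not under " + source);
        }
        String path = relative.getPath();
        if (path.isEmpty() || path.endsWith("/")) {
            path += "index.html"; // assumed default file name for directories
        }
        return target.resolve(path).normalize();
    }

    public static void main(String[] args) {
        URI source = URI.create("http://example.com/docs/");
        URI image = URI.create("http://example.com/docs/img/logo.png");
        Path target = Paths.get("target_directory");
        System.out.println(localPath(image, source, target));
        System.out.println(shouldCapture(image, false));
    }
}
```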
Wiki Capturer
Save a wiki locally.
    org.htmlparser.parserapplications.WikiCapturer is a subclass of SiteCapturer (see above) that excludes specific wiki pages from the capture.

© 2005 Derrick Oswald
Jun 10, 2006

HTML Parser is an open source library released under LGPL.