
Class WikiCapturer

java.lang.Object
  extended by org.htmlparser.parserapplications.SiteCapturer
      extended by org.htmlparser.parserapplications.WikiCapturer

public class WikiCapturer
extends SiteCapturer

Save a wiki (WikiWikiWeb) locally. This is an illustrative program rather than a full-featured tool.

Field Summary
Fields inherited from class org.htmlparser.parserapplications.SiteCapturer
mCaptureResources, mCopied, mFilter, mFinished, mImages, mPages, mParser, mSource, mTarget, TRANSFER_SIZE
Constructor Summary
WikiCapturer()
          Create a wikicapturer.
Method Summary
protected boolean isToBeCaptured(String link)
          Returns true if the link is one we are interested in.
static void main(String[] args)
          Mainline to capture a web site locally.
Methods inherited from class org.htmlparser.parserapplications.SiteCapturer
capture, copy, decode, getCaptureResources, getFilter, getSource, getTarget, isHtml, makeLocalLink, process, setCaptureResources, setFilter, setSource, setTarget
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public WikiCapturer()
Create a wikicapturer.

Method Detail


protected boolean isToBeCaptured(String link)
Returns true if the link is one we are interested in.

Overrides:
isToBeCaptured in class SiteCapturer
Parameters:
link - The link to be checked.
Returns:
true if the link has the source URL as a prefix and contains neither '?' nor '#'. Links containing '?' are rejected because server-side queries cannot be reproduced in the static target directory structure. Links containing '#' are rejected because the full page that the fragment refers to has presumably already been captured. The prefix comparison is case-insensitive, which is cheating really, but it's cheap.
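
The rule described above can be sketched as a standalone class. This is a minimal illustration under stated assumptions, not the library's actual source: the class name LinkFilterSketch, its constructor, and the example.org URLs are hypothetical, and the real method reads the source URL from the inherited SiteCapturer state rather than from a field like the one shown here.

```java
import java.util.Locale;

/** Hypothetical stand-in illustrating the documented link filter rule. */
public class LinkFilterSketch {
    /** Assumed to play the role of the capturer's source URL. */
    private final String source;

    public LinkFilterSketch(String source) {
        this.source = source;
    }

    /**
     * Accept a link only if it starts with the source URL (ignoring case)
     * and contains neither a query marker '?' nor a fragment marker '#'.
     */
    public boolean isToBeCaptured(String link) {
        String lowerLink = link.toLowerCase(Locale.ROOT);
        String lowerSource = source.toLowerCase(Locale.ROOT);
        return lowerLink.startsWith(lowerSource)
            && link.indexOf('?') == -1   // server-side queries cannot be saved statically
            && link.indexOf('#') == -1;  // the page itself is assumed to be captured already
    }

    public static void main(String[] args) {
        LinkFilterSketch filter = new LinkFilterSketch("http://example.org/wiki/");
        System.out.println(filter.isToBeCaptured("http://EXAMPLE.org/wiki/FrontPage"));        // true
        System.out.println(filter.isToBeCaptured("http://example.org/wiki/Page?action=edit")); // false
        System.out.println(filter.isToBeCaptured("http://example.org/wiki/Page#section"));     // false
        System.out.println(filter.isToBeCaptured("http://other.example.com/wiki/Page"));       // false
    }
}
```

Lower-casing both strings keeps the check cheap, at the cost of also accepting links whose path differs from the source only by case.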


public static void main(String[] args)
                 throws MalformedURLException,
                        IOException
Mainline to capture a web site locally.

Parameters:
args - The command line arguments. There are three arguments: the web site to capture, the local directory to save it to, and a flag (true or false) indicating whether resources such as images and video are to be captured as well. Any argument that is not supplied is requested via a dialog box.
Throws:
MalformedURLException - If the supplied URL is invalid.
IOException - If an error occurs reading the pages or resources.
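
The three positional arguments can be read as in the following sketch. It is only an illustration of the argument order described above: the class CaptureArgsSketch and its field names are hypothetical, and where the real program prompts with dialog boxes for missing values, this sketch simply leaves them null.

```java
/** Hypothetical holder for the three command line arguments described above. */
public class CaptureArgsSketch {
    /** Web site to capture, or null if not supplied. */
    public final String source;
    /** Local directory to save to, or null if not supplied. */
    public final String target;
    /** Whether to capture images, video and other resources, or null if not supplied. */
    public final Boolean captureResources;

    public CaptureArgsSketch(String[] args) {
        source = args.length > 0 ? args[0] : null;
        target = args.length > 1 ? args[1] : null;
        // Boolean.valueOf yields true only for the string "true", ignoring case
        captureResources = args.length > 2 ? Boolean.valueOf(args[2]) : null;
    }

    public static void main(String[] args) {
        CaptureArgsSketch parsed = new CaptureArgsSketch(args);
        // A null value marks an argument the real program would ask for in a dialog
        System.out.println("source=" + parsed.source);
        System.out.println("target=" + parsed.target);
        System.out.println("captureResources=" + parsed.captureResources);
    }
}
```

With the library on the class path, an invocation would presumably take the form `java org.htmlparser.parserapplications.WikiCapturer <url> <directory> <true|false>`; the placeholders are not values from the original documentation.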

© 2006 Derrick Oswald
Sep 17, 2006

HTML Parser is an open source library released under the Common Public License.