org.htmlparser.filters (HTML Parser 2.0)

Package org.htmlparser.filters

The filters package contains example filters to select only desired nodes.

Class Summary
AndFilter	Accepts nodes matching all of its predicate filters (AND operation).
CssSelectorNodeFilter	A NodeFilter that accepts nodes based on whether they match a CSS2 selector.
HasAttributeFilter	This class accepts all tags that have a certain attribute, and optionally, with a certain value.
HasChildFilter	This class accepts all tags that have a child acceptable to the filter.
HasParentFilter	This class accepts all tags that have a parent acceptable to another filter.
HasSiblingFilter	This class accepts all tags that have a sibling acceptable to another filter.
IsEqualFilter	This class accepts only one specific node.
LinkRegexFilter	This class accepts tags of class LinkTag that contain a link matching a given regex pattern.
LinkStringFilter	This class accepts tags of class LinkTag that contain a link matching a given pattern string.
NodeClassFilter	This class accepts all tags of a given class.
NotFilter	Accepts all nodes not acceptable to its predicate filter.
OrFilter	Accepts nodes matching any of its predicate filters (OR operation).
RegexFilter	This filter accepts all string nodes matching a regular expression.
StringFilter	This class accepts all string nodes containing the given string.
TagNameFilter	This class accepts all tags matching the tag name.

Package org.htmlparser.filters Description

The filters package contains example filters to select only desired nodes. For example, to display tags having the "id" attribute, you could use:

Parser parser = new Parser ("http://yadda");
// parse() returns the nodes accepted by the filter
NodeList nodes = parser.parse (new HasAttributeFilter ("id"));
System.out.println (nodes.toHtml ());

These filters can be combined to yield powerful extraction capabilities. For example, to get a list of links whose content is an image, you could use:

NodeList list = new NodeList ();
NodeFilter filter =
    new AndFilter (
        new TagNameFilter ("A"),
        new HasChildFilter (
            new TagNameFilter ("IMG")));
for (NodeIterator e = parser.elements (); e.hasMoreNodes (); )
    e.nextNode ().collectInto (list, filter);
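
The composition above can be illustrated without the library. The following is a minimal standalone sketch, not the package's implementation: SimpleNode, tagName, hasChild, collectInto, imageLinks and sampleTree are hypothetical names invented for this example, and java.util.function.Predicate stands in for NodeFilter. It assumes hasChild looks at direct children only and that tag names compare case-insensitively.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class FilterCompositionSketch {

    /** Hypothetical stand-in for a parsed tag: a name plus its child nodes. */
    public static final class SimpleNode {
        public final String name;
        public final List<SimpleNode> children = new ArrayList<>();

        public SimpleNode(String name, SimpleNode... kids) {
            this.name = name;
            for (SimpleNode kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Accepts nodes whose name matches, ignoring case (models a tag-name filter). */
    public static Predicate<SimpleNode> tagName(String name) {
        return node -> node.name.equalsIgnoreCase(name);
    }

    /** Accepts nodes with at least one direct child accepted by childFilter. */
    public static Predicate<SimpleNode> hasChild(Predicate<SimpleNode> childFilter) {
        return node -> node.children.stream().anyMatch(childFilter);
    }

    /** Walks the tree depth-first and appends every accepted node to out. */
    public static void collectInto(SimpleNode node, Predicate<SimpleNode> filter,
                                   List<SimpleNode> out) {
        if (filter.test(node)) {
            out.add(node);
        }
        for (SimpleNode child : node.children) {
            collectInto(child, filter, out);
        }
    }

    /** Links whose content is an image: name is A AND some direct child is IMG. */
    public static List<SimpleNode> imageLinks(SimpleNode root) {
        Predicate<SimpleNode> filter = tagName("A").and(hasChild(tagName("IMG")));
        List<SimpleNode> out = new ArrayList<>();
        collectInto(root, filter, out);
        return out;
    }

    /** A small tree: one link wrapping an image, one text-only link, one bare image. */
    public static SimpleNode sampleTree() {
        return new SimpleNode("BODY",
            new SimpleNode("A", new SimpleNode("IMG")),
            new SimpleNode("A", new SimpleNode("SPAN")),
            new SimpleNode("IMG"));
    }

    public static void main(String[] args) {
        // Only the first link has an IMG child, so one node is collected.
        System.out.println(imageLinks(sampleTree()).size());
    }
}
```

In this sketch, Predicate.and plays the role of AndFilter; Predicate.or and Predicate.negate would model OrFilter and NotFilter in the same way.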