HTML Parser Home Page
Interface Summary | |
---|---|
Node | Specifies the minimum requirements for nodes returned by the Lexer or Parser. |
NodeFactory | This interface defines the methods needed to create new nodes. |
NodeFilter | Implement this interface to select particular nodes. |
Remark | This interface represents a comment in the HTML document. |
Tag | This interface represents a tag (<xxx yyy="zzz">) in the HTML document. |
Text | This interface represents a piece of the content of the HTML document. |
Class Summary | |
---|---|
Attribute | An attribute within a tag. |
Parser | The main parser class. |
PrototypicalNodeFactory | A node factory based on the prototype pattern. |

This package provides the basic API classes that most developers will use when working with the HTML Parser.

The Parser class is the main high-level class that provides simplified access to the contents of an HTML page. A wide range of methods is available to customize the operation of the Parser, as well as to access specific pieces of the page as Nodes.

The NodeFactory interface specifies the requirements for a developer to have the Parser or Lexer generate nodes. Three types of nodes are required: Text, Remark and Tag. Tags contain lists of child nodes and attributes.

The only provided implementation of the NodeFactory interface is the PrototypicalNodeFactory, which operates by holding example nodes and cloning them as needed to satisfy the Parser's requests for nodes. By default, a Lexer is its own NodeFactory, returning new TextNode, RemarkNode and undifferentiated TagNode objects (see the nodes package), but when the Parser uses a Lexer it replaces this behaviour with a PrototypicalNodeFactory to return a rich set of specific tags (see the tags package).
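
The prototype approach described above can be illustrated with a minimal, self-contained sketch. It does not use the library: `SimpleTag`, `LinkTag` and `PrototypeFactory` are hypothetical stand-ins invented for this example, and the real PrototypicalNodeFactory is considerably richer. The sketch only shows the idea of holding example nodes and cloning them on request, with an undifferentiated tag as the fallback.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class PrototypeFactorySketch {

    /** Hypothetical stand-in for an undifferentiated tag node (not the library's Tag). */
    static class SimpleTag implements Cloneable {
        private String name;

        SimpleTag(String name) { this.name = name; }

        String getName() { return name; }

        void setName(String name) { this.name = name; }

        String describe() { return "generic tag <" + name + ">"; }

        @Override
        public SimpleTag clone() {
            try {
                // Object.clone() preserves the runtime class, so subclasses clone correctly.
                return (SimpleTag) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // unreachable: this class is Cloneable
            }
        }
    }

    /** Hypothetical specific tag, standing in for a class from a richer tag set. */
    static class LinkTag extends SimpleTag {
        LinkTag() { super("A"); }

        @Override
        String describe() { return "link tag <" + getName() + ">"; }
    }

    /** Holds example nodes keyed by tag name and clones them on request. */
    static class PrototypeFactory {
        private final Map<String, SimpleTag> prototypes = new HashMap<>();
        private final SimpleTag fallback = new SimpleTag("");

        void registerTag(SimpleTag prototype) {
            prototypes.put(prototype.getName().toUpperCase(Locale.ROOT), prototype);
        }

        SimpleTag createTag(String name) {
            String key = name.toUpperCase(Locale.ROOT);
            SimpleTag prototype = prototypes.get(key);
            // Clone the registered prototype, or the generic fallback if none is registered.
            SimpleTag copy = (prototype != null ? prototype : fallback).clone();
            copy.setName(key);
            return copy;
        }
    }

    public static void main(String[] args) {
        PrototypeFactory factory = new PrototypeFactory();
        factory.registerTag(new LinkTag());
        System.out.println(factory.createTag("a").describe());   // link tag <A>
        System.out.println(factory.createTag("div").describe()); // generic tag <DIV>
    }
}
```

Because each request returns a fresh clone, the caller can modify the returned node without affecting the stored prototype, which is the main reason to clone rather than share the example nodes.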

The NodeFilter interface is used by the filtering code to determine whether a node meets certain criteria. Some generic examples of filters can be found in the filters package.
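
The filtering idea can be sketched without the library as follows. The `Node` and `NodeFilter` interfaces below are simplified stand-ins declared locally for illustration, reduced to a single method each; `TextNode`, `select` and `containing` are hypothetical names, not part of the HTML Parser API. The sketch assumes a flat list of nodes rather than a parsed page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NodeFilterSketch {

    /** Simplified stand-in for a parsed node; only the text is modelled here. */
    interface Node {
        String getText();
    }

    /** Simplified stand-in for a filter: decides whether a node is selected. */
    interface NodeFilter {
        boolean accept(Node node);
    }

    /** Hypothetical minimal node implementation used only by this sketch. */
    static final class TextNode implements Node {
        private final String text;

        TextNode(String text) { this.text = text; }

        @Override
        public String getText() { return text; }
    }

    /** Returns the nodes accepted by the filter, preserving their order. */
    static List<Node> select(List<Node> nodes, NodeFilter filter) {
        List<Node> matches = new ArrayList<>();
        for (Node node : nodes) {
            if (filter.accept(node)) {
                matches.add(node);
            }
        }
        return matches;
    }

    /** Example filter: accepts nodes whose text contains the given substring. */
    static NodeFilter containing(String needle) {
        return node -> node.getText().contains(needle);
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new TextNode("Hello world"),
                new TextNode("Goodbye"),
                new TextNode("world peace"));

        // Prints "Hello world" and then "world peace".
        for (Node match : select(nodes, containing("world"))) {
            System.out.println(match.getText());
        }
    }
}
```

Keeping the selection criterion behind a one-method interface means the traversal code never needs to change when a new kind of filter is added.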
© 2006 Derrick Oswald Sep 17, 2006
HTML Parser is an open source library released under the Common Public License.