
org.htmlparser.visitors
Class TextExtractingVisitor

java.lang.Object
  extended by org.htmlparser.visitors.NodeVisitor
      extended by org.htmlparser.visitors.TextExtractingVisitor

public class TextExtractingVisitor
extends NodeVisitor

Extracts text from a web page. Usage:

    Parser parser = new Parser(...);
    TextExtractingVisitor visitor = new TextExtractingVisitor();
    parser.visitAllNodesWith(visitor);
    String textInPage = visitor.getExtractedText();


Constructor Summary
TextExtractingVisitor()
           
 
Method Summary
 String getExtractedText()
           
 void visitEndTag(Tag tag)
          Called for each Tag visited that is an end tag.
 void visitStringNode(Text stringNode)
          Called for each StringNode visited.
 void visitTag(Tag tag)
          Called for each Tag visited.
 
Methods inherited from class org.htmlparser.visitors.NodeVisitor
beginParsing, finishedParsing, shouldRecurseChildren, shouldRecurseSelf, visitRemarkNode
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextExtractingVisitor

public TextExtractingVisitor()

Method Detail

getExtractedText

public String getExtractedText()
Returns:
the text extracted from the page by this visitor.

visitStringNode

public void visitStringNode(Text stringNode)
Description copied from class: NodeVisitor
Called for each StringNode visited.

Overrides:
visitStringNode in class NodeVisitor
Parameters:
stringNode - The string node being visited.

visitTag

public void visitTag(Tag tag)
Description copied from class: NodeVisitor
Called for each Tag visited.

Overrides:
visitTag in class NodeVisitor
Parameters:
tag - The tag being visited.

visitEndTag

public void visitEndTag(Tag tag)
Description copied from class: NodeVisitor
Called for each Tag visited that is an end tag.

Overrides:
visitEndTag in class NodeVisitor
Parameters:
tag - The end tag being visited.
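
How the three callbacks can cooperate is perhaps easiest to see in a small stand-alone model. The sketch below does not use org.htmlparser at all: VisitorSketch, SketchVisitor, TextCollector and demo() are hypothetical names, tags are reduced to plain strings, and the rule of skipping text inside a script element is an assumption chosen only to show state being carried from visitTag to visitEndTag. It should not be read as the actual behaviour of TextExtractingVisitor.

```java
// Hypothetical stand-alone model; none of these types come from org.htmlparser.
final class VisitorSketch {

    /** Simplified stand-in for NodeVisitor: one no-op callback per node kind. */
    static class SketchVisitor {
        void visitTag(String tagName) { }
        void visitStringNode(String text) { }
        void visitEndTag(String tagName) { }
    }

    /** Accumulates text nodes, ignoring text inside script elements (assumed policy). */
    static class TextCollector extends SketchVisitor {
        private final StringBuilder text = new StringBuilder();
        private int scriptDepth = 0;

        @Override
        void visitTag(String tagName) {
            // A start tag may change how later text nodes are treated.
            if (tagName.equalsIgnoreCase("script")) {
                scriptDepth++;
            }
        }

        @Override
        void visitStringNode(String s) {
            // Only text outside script elements is kept.
            if (scriptDepth == 0) {
                text.append(s);
            }
        }

        @Override
        void visitEndTag(String tagName) {
            // The matching end tag restores the previous state.
            if (tagName.equalsIgnoreCase("script") && scriptDepth > 0) {
                scriptDepth--;
            }
        }

        String getExtractedText() {
            return text.toString();
        }
    }

    /** Replays the callbacks in document order for <p>Hello, <script>var x = 1;</script>world</p>. */
    static String demo() {
        TextCollector collector = new TextCollector();
        collector.visitTag("p");
        collector.visitStringNode("Hello, ");
        collector.visitTag("script");
        collector.visitStringNode("var x = 1;");
        collector.visitEndTag("script");
        collector.visitStringNode("world");
        collector.visitEndTag("p");
        return collector.getExtractedText();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Hello, world"
    }
}
```

In this model the collector is driven by hand; with the real library the parser makes the equivalent calls when the visitor is passed to visitAllNodesWith.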

© 2006 Derrick Oswald
Sep 17, 2006

HTML Parser is an open source library released under the Common Public License.