Packages that use ParserException | |
org.htmlparser | The basic API classes which will be used by most developers when working with the HTML Parser. |
org.htmlparser.beans | The beans package contains Java Beans using the HTML Parser. |
org.htmlparser.http | The http package is responsible for HTTP connections to servers. |
org.htmlparser.lexer | The lexer package is the base level I/O subsystem. |
org.htmlparser.lexerapplications.thumbelina | Extract the images behind thumbnail images. |
org.htmlparser.nodes | The nodes package has the concrete node implementations. |
org.htmlparser.parserapplications | |
org.htmlparser.sax | The sax package implements a SAX (Simple API for XML) parser for HTML. |
org.htmlparser.scanners | The scanners package contains classes responsible for the tertiary identification of tags. |
org.htmlparser.tags | The tags package contains specific tags. |
org.htmlparser.util |
Uses of ParserException in org.htmlparser |
Methods in org.htmlparser that throw ParserException | |
Remark |
NodeFactory.createRemarkNode(Page page, int start, int end)
Create a new remark node. |
Text |
NodeFactory.createStringNode(Page page, int start, int end)
Create a new text node. |
Tag |
NodeFactory.createTagNode(Page page, int start, int end, Vector attributes)
Create a new tag node. |
void |
Perform the meaning of this tag. |
NodeIterator |
Returns an iterator (enumeration) over the html nodes. |
NodeList |
Parser.extractAllNodesThatMatch(NodeFilter filter)
Extract all nodes matching the given filter. |
NodeList |
Parser.parse(NodeFilter filter)
Parse the given resource, using the filter provided. |
void |
Parser.postConnect(HttpURLConnection connection)
Called just after calling connect. |
void |
Parser.preConnect(HttpURLConnection connection)
Called just prior to calling connect. |
void |
Parser.setConnection(URLConnection connection)
Set the connection for this parser. |
void |
Parser.setEncoding(String encoding)
Set the encoding for the page this parser is reading from. |
void |
Parser.setInputHTML(String inputHTML)
Initializes the parser with the given input HTML String. |
void |
Parser.setResource(String resource)
Set the resource to HTML text, a URL, or a file. |
void |
Parser.setURL(String url)
Set the URL for this parser. |
void |
Parser.visitAllNodesWith(NodeVisitor visitor)
Apply the given visitor to the current page. |
Constructors in org.htmlparser that throw ParserException | |
Parser(String resource)
Creates a Parser object with the location of the resource (URL or file). |
Parser(String resource, ParserFeedback feedback)
Creates a Parser object with the location of the resource (URL or file). You would typically create a DefaultHTMLParserFeedback object and pass it in. |
Parser(URLConnection connection)
Construct a parser using the provided URLConnection. |
Parser(URLConnection connection, ParserFeedback fb)
Constructor for custom HTTP access. |
Uses of ParserException in org.htmlparser.beans |
Methods in org.htmlparser.beans that throw ParserException | |
protected NodeList |
Apply each of the filters. |
protected URL[] |
Internal routine to extract all the links from the parser. |
protected String |
Extract the text from a page. |
Uses of ParserException in org.htmlparser.http |
Methods in org.htmlparser.http that throw ParserException | |
URLConnection |
ConnectionManager.openConnection(String string)
Opens a connection based on a given string. |
URLConnection |
ConnectionManager.openConnection(URL url)
Opens a connection using the given url. |
void |
ConnectionMonitor.postConnect(HttpURLConnection connection)
Called just after calling connect. |
void |
ConnectionMonitor.preConnect(HttpURLConnection connection)
Called just prior to calling connect. |
Uses of ParserException in org.htmlparser.lexer |
Methods in org.htmlparser.lexer that throw ParserException | |
char |
Page.getCharacter(Cursor cursor)
Read the character at the given cursor position. |
static void |
Lexer.main(String[] args)
Mainline for command line operation. |
protected Node |
Lexer.makeRemark(int start, int end)
Create a remark node based on the current cursor and the one provided. |
protected Node |
Lexer.makeString(int start, int end)
Create a string node based on the current cursor and the one provided. |
protected Node |
Lexer.makeTag(int start, int end, Vector attributes)
Create a tag node based on the current cursor and the one provided. |
Node |
Get the next node from the source. |
Node |
Lexer.nextNode(boolean quotesmart)
Get the next node from the source. |
Node |
Return CDATA as a text node. |
Node |
Lexer.parseCDATA(boolean quotesmart)
Return CDATA as a text node. |
protected Node |
Lexer.parseJsp(int start)
Parse a java server page node. |
protected Node |
Lexer.parsePI(int start)
Parse an XML processing instruction. |
protected Node |
Lexer.parseRemark(int start, boolean quotesmart)
Parse a comment. |
protected Node |
Lexer.parseString(int start, boolean quotesmart)
Parse a string node. |
protected Node |
Lexer.parseTag(int start)
Parse a tag. |
protected void |
Lexer.scanJIS(Cursor cursor)
Advance the cursor through a JIS escape sequence. |
void |
Page.setConnection(URLConnection connection)
Set the URLConnection to be used by this page. |
void |
StringSource.setEncoding(String character_set)
Set the encoding to the given character set. |
abstract void |
Source.setEncoding(String character_set)
Set the encoding to the given character set. |
void |
Page.setEncoding(String character_set)
Begins reading from the source with the given character set. |
void |
InputStreamSource.setEncoding(String character_set)
Begins reading from the source with the given character set. |
void |
Page.ungetCharacter(Cursor cursor)
Return a character. |
Constructors in org.htmlparser.lexer that throw ParserException | |
Lexer(URLConnection connection)
Creates a new instance of a Lexer. |
Page(URLConnection connection)
Construct a page reading from a URL connection. |
Uses of ParserException in org.htmlparser.lexerapplications.thumbelina |
Methods in org.htmlparser.lexerapplications.thumbelina that throw ParserException | |
protected URL[][] |
Thumbelina.extractImageLinks(Lexer lexer, URL docbase)
Get the links of an element of a document. |
Uses of ParserException in org.htmlparser.nodes |
Methods in org.htmlparser.nodes that throw ParserException | |
void |
Perform the meaning of this tag. |
Uses of ParserException in org.htmlparser.parserapplications |
Methods in org.htmlparser.parserapplications that throw ParserException | |
protected boolean |
SiteCapturer.isHtml(String link)
Returns true if the link contains text/html content. |
protected void |
SiteCapturer.process(NodeFilter filter)
Process a single page. |
Uses of ParserException in org.htmlparser.sax |
Methods in org.htmlparser.sax with parameters of type ParserException | |
void |
Feedback.error(String message, ParserException e)
Error message. |
Methods in org.htmlparser.sax that throw ParserException | |
protected void |
XMLReader.doSAX(Node node)
Process nodes recursively on the DocumentHandler. |
Uses of ParserException in org.htmlparser.scanners |
Methods in org.htmlparser.scanners that throw ParserException | |
protected Tag |
CompositeTagScanner.createVirtualEndTag(Tag tag, Lexer lexer, Page page, int position)
Creates an end tag with the same name as the given tag. |
static String |
ScriptDecoder.Decode(Page page, Cursor cursor)
Decode script encoded by the Microsoft obfuscator. |
protected void |
CompositeTagScanner.finishTag(Tag tag, Lexer lexer)
Finish off a tag. |
Tag |
StyleScanner.scan(Tag tag, Lexer lexer, NodeList stack)
Scan for style definitions. |
Tag |
ScriptScanner.scan(Tag tag, Lexer lexer, NodeList stack)
Scan for script. |
Tag |
CompositeTagScanner.scan(Tag tag, Lexer lexer, NodeList stack)
Collect the children. |
Tag |
TagScanner.scan(Tag tag, Lexer lexer, NodeList stack)
Scan the tag. |
Tag |
Scanner.scan(Tag tag, Lexer lexer, NodeList stack)
Scan the tag. |
Uses of ParserException in org.htmlparser.tags |
Methods in org.htmlparser.tags that throw ParserException | |
void |
Perform the META tag semantic action. |
void |
Perform the meaning of this tag. |
Uses of ParserException in org.htmlparser.util |
Subclasses of ParserException in org.htmlparser.util | |
class |
The encoding is changed invalidating already scanned characters. |
Methods in org.htmlparser.util with parameters of type ParserException | |
void |
ParserFeedback.error(String message, ParserException e)
static void |
FeedbackManager.error(String message, ParserException e)
void |
DefaultParserFeedback.error(String message, ParserException exception)
Print an error message. |
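The `error(String message, ParserException e)` methods listed above are callback hooks: a problem is handed to a feedback object instead of being thrown to the caller. The following is a minimal sketch of that pattern, not the library's implementation. So that it compiles without the library on the classpath, it declares hypothetical local stand-ins (`ParserException`, `ErrorSink`, `CollectingSink`, `processAll`) that only mirror the signature shown in the listing; in real code the exception type would be `org.htmlparser.util.ParserException`.

```java
import java.util.ArrayList;
import java.util.List;

public class FeedbackSketch {
    /** Hypothetical stand-in for org.htmlparser.util.ParserException. */
    static class ParserException extends Exception {
        ParserException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in mirroring the error(String, ParserException) signature. */
    interface ErrorSink {
        void error(String message, ParserException e);
    }

    /** Collects reported errors in memory instead of printing them. */
    static class CollectingSink implements ErrorSink {
        final List<String> messages = new ArrayList<>();

        @Override
        public void error(String message, ParserException e) {
            // Keep both the caller's context and the exception's own message.
            messages.add(message + ": " + e.getMessage());
        }
    }

    /** Simulated worker: reports a problem to the sink and carries on. */
    static int processAll(String[] inputs, ErrorSink sink) {
        int processed = 0;
        for (String input : inputs) {
            try {
                if (input.isEmpty()) {
                    throw new ParserException("empty input");
                }
                processed++;
            } catch (ParserException e) {
                sink.error("skipping input", e);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        CollectingSink sink = new CollectingSink();
        int ok = processAll(new String[] {"<p>a</p>", "", "<p>b</p>"}, sink);
        System.out.println(ok + " processed, errors: " + sink.messages);
    }
}
```

Collecting messages in a list rather than printing them makes the reported errors easy to inspect after a run; a sink that writes to a log would have the same shape.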
Methods in org.htmlparser.util that throw ParserException | |
static Parser |
ParserUtils.createParserParsingAnInputString(String input)
Create a Parser object that takes a String as input (instead of a URL or a string representing the URL location). |
boolean |
Check if more nodes are available. |
boolean |
Check if more nodes are available. |
Node |
Get the next node. |
Node |
Get the next node. |
static String[] |
ParserUtils.splitTags(String input, Class nodeType)
Split the input string into a string array, using the tags as delimiters. |
static String[] |
ParserUtils.splitTags(String input, Class nodeType, boolean recursive, boolean insideTag)
Split the input string into a string array, using the tags as delimiters. |
static String[] |
ParserUtils.splitTags(String input, NodeFilter filter)
Split the input string into a string array, using the tags as delimiters. |
static String[] |
ParserUtils.splitTags(String input, NodeFilter filter, boolean recursive, boolean insideTag)
Split the input string into a string array, using the tags as delimiters. |
static String[] |
ParserUtils.splitTags(String input, String[] tags)
Split the input string into a string array, using the tags as delimiters. |
static String[] |
ParserUtils.splitTags(String input, String[] tags, boolean recursive, boolean insideTag)
Split the input string into a string array, using the tags as delimiters. |
static String |
ParserUtils.trimTags(String input, Class nodeType)
Trim all tags in the input string and return a copy of the input without the tags and their content. |
static String |
ParserUtils.trimTags(String input, Class nodeType, boolean recursive, boolean insideTag)
Trim all tags in the input string and return a copy of the input without the tags and, optionally, their content. |
static String |
ParserUtils.trimTags(String input, NodeFilter filter)
Trim all tags in the input string and return a copy of the input without the tags and their content. |
static String |
ParserUtils.trimTags(String input, NodeFilter filter, boolean recursive, boolean insideTag)
Trim all tags in the input string and return a copy of the input without the tags and, optionally, their content. |
static String |
ParserUtils.trimTags(String input, String[] tags)
Trim all tags in the input string and return a copy of the input without the tags and their content. |
static String |
ParserUtils.trimTags(String input, String[] tags, boolean recursive, boolean insideTag)
Trim all tags in the input string and return a copy of the input without the tags and, optionally, their content. |
void |
NodeList.visitAllNodesWith(NodeVisitor visitor)
Utility to apply a visitor to a node list. |
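Because the "check if more nodes are available" and "get the next node" methods in this list are both declared to throw `ParserException`, a loop that drains an iterator has to sit inside a `try` block or in a method that propagates the exception. The sketch below shows the shape of such a loop under stated assumptions: `NodeSource`, `ArraySource`, `collect`, and the local `ParserException` are hypothetical stand-ins written for this example so that it runs without the library, and they are not the library's own types.

```java
import java.util.ArrayList;
import java.util.List;

public class IterationSketch {
    /** Hypothetical stand-in for org.htmlparser.util.ParserException. */
    static class ParserException extends Exception {
        ParserException(String message) {
            super(message);
        }
    }

    /** Hypothetical iterator shape: both methods may throw the checked exception. */
    interface NodeSource {
        boolean hasMoreNodes() throws ParserException;

        String nextNode() throws ParserException;
    }

    /** Toy source over an array; a null element simulates a parse failure. */
    static class ArraySource implements NodeSource {
        private final String[] nodes;
        private int index = 0;

        ArraySource(String... nodes) {
            this.nodes = nodes;
        }

        @Override
        public boolean hasMoreNodes() {
            return index < nodes.length;
        }

        @Override
        public String nextNode() throws ParserException {
            String node = nodes[index++];
            if (node == null) {
                throw new ParserException("bad node at index " + (index - 1));
            }
            return node;
        }
    }

    /** Drains the source, letting ParserException propagate to the caller. */
    static List<String> collect(NodeSource source) throws ParserException {
        List<String> result = new ArrayList<>();
        while (source.hasMoreNodes()) {
            result.add(source.nextNode());
        }
        return result;
    }

    public static void main(String[] args) {
        try {
            System.out.println(collect(new ArraySource("<p>", "text", "</p>")));
            // The second source fails part-way through the iteration.
            collect(new ArraySource("<p>", null));
        } catch (ParserException e) {
            System.err.println("parse failed: " + e.getMessage());
        }
    }
}
```

Declaring `throws ParserException` on `collect` keeps the loop body free of error handling and leaves the decision about recovery to the caller, which is one reasonable choice; catching inside the loop is the alternative when a single bad node should not end the traversal.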
© 2006 Derrick Oswald Sep 17, 2006
HTML Parser is an open source library released under Common Public License. | |