
org.htmlparser.lexer
Class Source

java.lang.Object
  extended by java.io.Reader
      extended by org.htmlparser.lexer.Source
All Implemented Interfaces:
Closeable, Serializable, Readable
Direct Known Subclasses:
InputStreamSource, StringSource

public abstract class Source
extends Reader
implements Serializable

A buffered source of characters. A Source is very similar to a Reader such as:

 new InputStreamReader (connection.getInputStream (), charset)
 
It differs from the above in three ways:

See Also:
Serialized Form

Field Summary
static int EOF
          Return value when the source is exhausted.
 
Fields inherited from class java.io.Reader
lock
 
Constructor Summary
Source()
           
 
Method Summary
abstract  int available()
          Get the number of available characters.
abstract  void close()
          Does nothing.
abstract  void destroy()
          Close the source.
abstract  char getCharacter(int offset)
          Retrieve a character again.
abstract  void getCharacters(char[] array, int offset, int start, int end)
          Retrieve characters again.
abstract  void getCharacters(StringBuffer buffer, int offset, int length)
          Append characters already read into a StringBuffer.
abstract  String getEncoding()
          Get the encoding being used to convert characters.
abstract  String getString(int offset, int length)
          Retrieve a string comprised of characters already read.
abstract  void mark(int readAheadLimit)
          Mark the present position.
abstract  boolean markSupported()
          Tell whether this source supports the mark() operation.
abstract  int offset()
          Get the position (in characters).
abstract  int read()
          Read a single character.
abstract  int read(char[] cbuf)
          Read characters into an array.
abstract  int read(char[] cbuf, int off, int len)
          Read characters into a portion of an array.
abstract  boolean ready()
          Tell whether this source is ready to be read.
abstract  void reset()
          Reset the source.
abstract  void setEncoding(String character_set)
          Set the encoding to the given character set.
abstract  long skip(long n)
          Skip characters.
abstract  void unread()
          Undo the read of a single character.
 
Methods inherited from class java.io.Reader
read
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EOF

public static final int EOF
Return value when the source is exhausted. Has a value of -1.

See Also:
Constant Field Values
Constructor Detail

Source

public Source()
Method Detail

getEncoding

public abstract String getEncoding()
Get the encoding being used to convert characters.

Returns:
The current encoding.

setEncoding

public abstract void setEncoding(String character_set)
                          throws ParserException
Set the encoding to the given character set. If the current encoding is the same as the requested encoding, this method is a no-op. Otherwise any subsequent characters read from this source will have been decoded using the given character set.

If characters have already been consumed from this source, an exception is expected to be thrown if the characters read so far would have been different had the new encoding been used from the start.

Parameters:
character_set - The character set to use to convert characters.
Throws:
ParserException - If a character mismatch occurs between characters already provided and those that would have been returned had the new character set been in effect from the beginning. An exception is also thrown if the character set is not recognized.
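
The consistency check described above can be pictured as decoding the same bytes under both character sets and comparing the prefix that has already been handed out. The following is a minimal sketch using only java.nio.charset. The class EncodingCheck and the method prefixMatches are hypothetical names made up for illustration, not part of org.htmlparser, and the library's own implementation may work differently.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingCheck {
    /**
     * Hypothetical helper (not library API): returns true if the first
     * 'consumed' characters decode identically under both character sets.
     */
    static boolean prefixMatches(byte[] bytes, Charset current,
                                 Charset requested, int consumed) {
        String before = new String(bytes, current);
        String after = new String(bytes, requested);
        if (before.length() < consumed || after.length() < consumed) {
            return false;
        }
        return before.regionMatches(0, after, 0, consumed);
    }

    public static void main(String[] args) {
        byte[] page = "<html>caf\u00e9</html>".getBytes(StandardCharsets.ISO_8859_1);
        // Only the nine ASCII characters "<html>caf" have been consumed.
        System.out.println(prefixMatches(page, StandardCharsets.ISO_8859_1,
                StandardCharsets.UTF_8, 9));
        // The non-ASCII byte has been consumed as well.
        System.out.println(prefixMatches(page, StandardCharsets.ISO_8859_1,
                StandardCharsets.UTF_8, 10));
    }
}
```

In this sketch a switch of encoding would be safe while only ASCII characters have been consumed, and would correspond to the mismatch case once a byte that decodes differently has been read.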

close

public abstract void close()
                    throws IOException
Does nothing. Although close() would normally be expected to close the source, call destroy() for that instead.

Specified by:
close in interface Closeable
Specified by:
close in class Reader
Throws:
IOException - not used
See Also:
destroy()

read

public abstract int read()
                  throws IOException
Read a single character. This method will block until a character is available, an I/O error occurs, or the source is exhausted.

Overrides:
read in class Reader
Returns:
The character read, as an integer in the range 0 to 65535 (0x0000-0xffff), or EOF if the source is exhausted.
Throws:
IOException - If an I/O error occurs.
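
A typical single-character read loop runs until the EOF value is returned. The sketch below is written against java.io.Reader, the superclass of Source, so that it is self-contained. The class ReadLoop and its local EOF constant (mirroring the documented value of -1) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ReadLoop {
    // Mirrors Source.EOF, which is documented as -1.
    static final int EOF = -1;

    /** Counts occurrences of 'target' by reading one character at a time. */
    static int count(Reader in, char target) throws IOException {
        int n = 0;
        int c;
        // Keep the result as an int so EOF can be told apart from real characters.
        while ((c = in.read()) != EOF) {
            if (c == target) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(count(new StringReader("<a><b></b></a>"), '<'));
    }
}
```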

read

public abstract int read(char[] cbuf,
                         int off,
                         int len)
                  throws IOException
Read characters into a portion of an array. This method will block until some input is available, an I/O error occurs, or the source is exhausted.

Specified by:
read in class Reader
Parameters:
cbuf - Destination buffer
off - Offset at which to start storing characters
len - Maximum number of characters to read
Returns:
The number of characters read, or EOF if the source is exhausted.
Throws:
IOException - If an I/O error occurs.
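
Because a read into an array may return fewer characters than requested, callers usually loop on the return value. The following sketch shows that pattern against java.io.Reader. The class ChunkedRead and the method readAll are hypothetical names, and the end-of-input value is written as -1 to match the documented value of EOF.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ChunkedRead {
    /** Reads everything from 'in' through a small buffer, honoring partial reads. */
    static String readAll(Reader in, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        char[] cbuf = new char[bufferSize];
        StringBuilder out = new StringBuilder();
        int n;
        // Each call may fill only part of cbuf; -1 signals that the input is exhausted.
        while ((n = in.read(cbuf, 0, cbuf.length)) != -1) {
            out.append(cbuf, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(new StringReader("<html><body></body></html>"), 4));
    }
}
```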

read

public abstract int read(char[] cbuf)
                  throws IOException
Read characters into an array. This method will block until some input is available, an I/O error occurs, or the source is exhausted.

Overrides:
read in class Reader
Parameters:
cbuf - Destination buffer.
Returns:
The number of characters read, or EOF if the source is exhausted.
Throws:
IOException - If an I/O error occurs.

ready

public abstract boolean ready()
                       throws IOException
Tell whether this source is ready to be read.

Overrides:
ready in class Reader
Returns:
true if the next read() is guaranteed not to block for input, false otherwise. Note that returning false does not guarantee that the next read will block.
Throws:
IOException - If an I/O error occurs.

reset

public abstract void reset()
Reset the source. Repositions the read point to begin at zero.

Overrides:
reset in class Reader

markSupported

public abstract boolean markSupported()
Tell whether this source supports the mark() operation.

Overrides:
markSupported in class Reader
Returns:
true if and only if this source supports the mark operation.

mark

public abstract void mark(int readAheadLimit)
                   throws IOException
Mark the present position. Subsequent calls to reset() will attempt to reposition the source to this point. Not all sources support the mark() operation.

Overrides:
mark in class Reader
Parameters:
readAheadLimit - The minimum number of characters that can be read before this mark becomes invalid.
Throws:
IOException - If an I/O error occurs.
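
The mark and reset pairing can be illustrated with the general java.io.Reader contract, using a StringReader as a stand-in. This is only a sketch: the class MarkResetDemo is hypothetical, and a concrete Source may behave differently, since reset() on a Source is documented as repositioning the read point to zero.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class MarkResetDemo {
    /** Reads up to n characters, rewinds to the mark, and reads them again. */
    static String[] readTwice(Reader in, int n) throws IOException {
        if (!in.markSupported()) {
            throw new IOException("mark() is not supported by this reader");
        }
        char[] buf = new char[n];
        in.mark(n);                       // ask for at least n characters to stay replayable
        int first = in.read(buf, 0, n);
        String a = new String(buf, 0, Math.max(first, 0));
        in.reset();                       // return to the marked position
        int second = in.read(buf, 0, n);
        String b = new String(buf, 0, Math.max(second, 0));
        return new String[] { a, b };
    }

    public static void main(String[] args) throws IOException {
        Reader in = new StringReader("<html><body>");
        in.skip(6);                       // move past "<html>" before marking
        String[] twice = readTwice(in, 6);
        System.out.println(twice[0] + " " + twice[1]);
    }
}
```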

skip

public abstract long skip(long n)
                   throws IOException
Skip characters. This method will block until some characters are available, an I/O error occurs, or the source is exhausted. Note: n is treated as an int.

Overrides:
skip in class Reader
Parameters:
n - The number of characters to skip.
Returns:
The number of characters actually skipped.
Throws:
IOException - If an I/O error occurs.

unread

public abstract void unread()
                     throws IOException
Undo the read of a single character.

Throws:
IOException - If the source is closed or no characters have been read.
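
A common use of unread() is one-character lookahead: read a character, inspect it, and push it back. The sketch below uses a hypothetical in-memory stand-in, UnreadSketch, that mimics only the read(), unread() and offset() behavior described on this page. It does not extend Source, and its peek() helper is an assumption added for illustration.

```java
import java.io.IOException;

final class UnreadSketch {
    private final String text;
    private int position;

    UnreadSketch(String text) {
        this.text = text;
    }

    /** Returns the next character, or -1 when the text is exhausted. */
    int read() {
        return position < text.length() ? text.charAt(position++) : -1;
    }

    /** Steps back one character; fails if nothing has been read yet. */
    void unread() throws IOException {
        if (position == 0) {
            throw new IOException("no characters have been read");
        }
        position--;
    }

    /** Number of characters read so far. */
    int offset() {
        return position;
    }

    /** Hypothetical helper: looks at the next character without consuming it. */
    int peek() throws IOException {
        int c = read();
        if (c != -1) {
            unread();
        }
        return c;
    }

    public static void main(String[] args) throws IOException {
        UnreadSketch source = new UnreadSketch("<p>");
        System.out.println((char) source.peek() + " at offset " + source.offset());
    }
}
```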

getCharacter

public abstract char getCharacter(int offset)
                           throws IOException
Retrieve a character again.

Parameters:
offset - The offset of the character.
Returns:
The character at offset.
Throws:
IOException - If the source is closed or the offset is beyond offset().

getCharacters

public abstract void getCharacters(char[] array,
                                   int offset,
                                   int start,
                                   int end)
                            throws IOException
Retrieve characters again.

Parameters:
array - The array of characters.
offset - The starting position in the array where characters are to be placed.
start - The starting position, zero based.
end - The ending position (exclusive, i.e. the character at the ending position is not included), zero based.
Throws:
IOException - If the source is closed or the start or end is beyond offset().
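
The four parameters are easy to mix up, so the sketch below maps them onto String.getChars, which copies the half-open range from start to end into the destination array beginning at offset. GetCharactersSketch is a hypothetical stand-in that takes the already-read text as a plain String. It is not the library's implementation.

```java
final class GetCharactersSketch {
    /**
     * Hypothetical stand-in: copies consumed[start, end) into 'array',
     * placing the first copied character at array[offset].
     */
    static void getCharacters(String consumed, char[] array,
                              int offset, int start, int end) {
        // String.getChars takes (srcBegin, srcEnd, dst, dstBegin).
        consumed.getChars(start, end, array, offset);
    }

    public static void main(String[] args) {
        char[] array = "......".toCharArray();
        // Characters 7 through 10 of the text are "Home"; index 11 is excluded.
        getCharacters("<title>Home</title>", array, 1, 7, 11);
        System.out.println(new String(array));
    }
}
```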

getString

public abstract String getString(int offset,
                                 int length)
                          throws IOException
Retrieve a string comprised of characters already read.

Parameters:
offset - The offset of the first character.
length - The number of characters to retrieve.
Returns:
A string containing length characters starting at offset.
Throws:
IOException - If the source is closed.

getCharacters

public abstract void getCharacters(StringBuffer buffer,
                                   int offset,
                                   int length)
                            throws IOException
Append characters already read into a StringBuffer.

Parameters:
buffer - The buffer to append to.
offset - The offset of the first character.
length - The number of characters to retrieve.
Throws:
IOException - If the source is closed or the offset or (offset + length) is beyond offset().
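
The bounds rule above (neither offset nor offset + length may lie beyond offset()) can be sketched as follows. AppendConsumed is a hypothetical helper in which the full text and the current read position are passed in explicitly. A real Source tracks both internally.

```java
import java.io.IOException;

final class AppendConsumed {
    /**
     * Hypothetical stand-in: appends 'length' characters starting at 'offset'
     * to 'buffer', but only if they lie within the 'consumed' characters
     * that have already been read.
     */
    static void getCharacters(String text, int consumed, StringBuffer buffer,
                              int offset, int length) throws IOException {
        if (offset < 0 || length < 0 || offset + length > consumed) {
            throw new IOException("range is beyond the characters already read");
        }
        buffer.append(text, offset, offset + length);
    }

    public static void main(String[] args) throws IOException {
        StringBuffer buffer = new StringBuffer("tag: ");
        // Assume the first 6 characters, "<html>", have been read so far.
        getCharacters("<html><body>", 6, buffer, 1, 4);
        System.out.println(buffer);
    }
}
```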

destroy

public abstract void destroy()
                      throws IOException
Close the source. Once a source has been closed, further read, ready, mark, reset, skip, unread, getCharacter or getString invocations will throw an IOException. Closing a previously-closed source, however, has no effect.

Throws:
IOException - If an I/O error occurs.
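
The difference between close() and destroy() can be pictured with a small in-memory stand-in. LifecycleSketch is hypothetical and does not extend Source. It only mimics the documented behavior: close() does nothing, destroy() may be called repeatedly, read() fails after destroy(), and offset() then reports EOF.

```java
import java.io.IOException;

final class LifecycleSketch {
    // Mirrors Source.EOF, which is documented as -1.
    static final int EOF = -1;

    private String text;   // null once the sketch has been destroyed
    private int position;

    LifecycleSketch(String text) {
        this.text = text;
    }

    int read() throws IOException {
        if (text == null) {
            throw new IOException("source is closed");
        }
        return position < text.length() ? text.charAt(position++) : EOF;
    }

    /** Intentionally does nothing, as documented for close(). */
    void close() {
    }

    /** Releases the text; calling it again has no further effect. */
    void destroy() {
        text = null;
    }

    /** Characters read so far, or EOF once destroyed. */
    int offset() {
        return text == null ? EOF : position;
    }

    public static void main(String[] args) throws IOException {
        LifecycleSketch source = new LifecycleSketch("ab");
        source.read();
        source.close();                       // still readable afterwards
        System.out.println(source.offset());
        source.destroy();
        System.out.println(source.offset());
    }
}
```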

offset

public abstract int offset()
Get the position (in characters).

Returns:
The number of characters that have already been read, or EOF if the source is closed.

available

public abstract int available()
Get the number of available characters.

Returns:
The number of characters that can be read without blocking.

© 2006 Derrick Oswald
Sep 17, 2006

HTML Parser is an open source library released under the Common Public License.