org.htmlparser.scanners
Class ScriptScanner
java.lang.Object
org.htmlparser.scanners.TagScanner
org.htmlparser.scanners.CompositeTagScanner
org.htmlparser.scanners.ScriptScanner
- All Implemented Interfaces:
- Serializable, Scanner
public class ScriptScanner
- extends CompositeTagScanner
The ScriptScanner handles script CDATA.
- See Also:
- Serialized Form
Field Summary
static boolean
STRICT
Strict parsing of CDATA flag.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
STRICT
public static boolean STRICT
- Strict parsing of CDATA flag.
If this flag is set to true, script is parsed without
regard to quotes. This means that erroneous script such as:
document.write("</script>");
will be parsed in strict accordance with appendix
B.3.2 Specifying non-HTML data of the
HTML 4.01 Specification and
hence will be split into two or more nodes. Correct JavaScript would
escape the ETAGO:
document.write("<\/script>");
If true, CDATA parsing stops at the first ETAGO ("</"), whether
or not it is quoted. If false, balanced quotes (either single or
double) shield an ETAGO. Because quotes can also appear within
single-line or multiline comments, comments are parsed as well. In most cases,
users prefer non-strict handling because there is so much broken script
in the wild.
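The difference between the two modes can be illustrated with a small standalone sketch. This is not the library's implementation: the class EtagoSketch and the method findEnd are hypothetical names introduced here, the sketch treats an ETAGO as "</" followed by an ASCII letter, and it ignores comments entirely, which the real scanner also parses in non-strict mode.

```java
public class EtagoSketch {

    /** True for a-z and A-Z only. */
    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /**
     * Hypothetical helper: returns the index of the ETAGO that would end the
     * script CDATA, or -1 if there is none. In strict mode quotes are ignored;
     * in non-strict mode an ETAGO inside balanced quotes is skipped.
     */
    public static int findEnd(String text, boolean strict) {
        char quote = 0; // 0 means "not inside a quoted string"
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!strict && quote != 0) {
                if (c == '\\') {
                    i++; // skip the character following a backslash
                } else if (c == quote) {
                    quote = 0; // closing quote found
                }
                continue;
            }
            if (!strict && (c == '"' || c == '\'')) {
                quote = c; // opening quote shields what follows
                continue;
            }
            if (c == '<' && i + 2 < text.length()
                    && text.charAt(i + 1) == '/'
                    && isAsciiLetter(text.charAt(i + 2))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String broken = "document.write(\"</script>\");</script>";
        String escaped = "document.write(\"<\\/script>\");</script>";
        System.out.println("broken, strict:      " + findEnd(broken, true));
        System.out.println("broken, non-strict:  " + findEnd(broken, false));
        System.out.println("escaped, strict:     " + findEnd(escaped, true));
        System.out.println("escaped, non-strict: " + findEnd(escaped, false));
    }
}
```

For the unescaped input, strict mode stops inside the string literal, while non-strict mode stops at the real end tag. For the escaped input, both modes agree.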
ScriptScanner
public ScriptScanner()
- Create a script scanner.
scan
public Tag scan(Tag tag,
Lexer lexer,
NodeList stack)
throws ParserException
- Scan for script.
Accumulates text from the page, until </[a-zA-Z] is encountered.
- Specified by:
scan
in interface Scanner
- Overrides:
scan
in class CompositeTagScanner
- Parameters:
tag
- The tag this scanner is responsible for.
lexer
- The source of CDATA.
stack
- The parse stack, not used.
- Returns:
- The resultant tag (may be unchanged).
- Throws:
ParserException
- if an unrecoverable problem occurs.