Lexer States, HTML and the <SCRIPT> tag

Posted By:   John_Clarke
Posted On:   Sunday, June 16, 2002 03:28 PM


Basically I am trying to develop a simple HTML parser that understands what a
Tag, Comment, Document Definition, Script and style block look like.


The problem that I am having is as follows. I have a lexer that starts in a
state where it only identifies text. When it sees a < it changes state so
that it processes tags. This is OK until it reaches the SCRIPT tag. This tag can
have attributes, and code between the begin and end tags, e.g.:



			<script>

			var cimage;

			cImage = new Image;

			</script>


Basically I need it to keep the text between the begin and end script tags.



How can I do this? I would be grateful for any advice.



Thanks



John


Re: Lexer States, HTML and the <SCRIPT> tag

Posted By:   Monty_Zukowski  
Posted On:   Monday, June 17, 2002 06:39 AM

Seems like you need a parser too. After your script start tag you should be back to lexing text, right? Then you will lex the script end tag. So you could have a parser that understands those tags like so:

script: SCRIPT_BEGIN (TEXT)* SCRIPT_END;
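
To make the idea concrete, here is a minimal hand-written sketch in Python rather than ANTLR. It is not generated from the rule above and is only one way to do it. The token names SCRIPT_BEGIN, TEXT and SCRIPT_END come from the rule; the TAG token, the regular expressions, and the function names lex and parse_script are assumptions made for illustration. The lexer keeps a single flag as its state: once it has seen a script start tag, everything up to the matching end tag is returned as TEXT, so a < inside the script code is not mistaken for the start of a tag.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (token type, matched text)

# Hypothetical patterns: simplified on purpose, not a full HTML tokenizer.
SCRIPT_BEGIN = re.compile(r"<script\b[^>]*>", re.IGNORECASE)
SCRIPT_END = re.compile(r"</script\s*>", re.IGNORECASE)
TAG = re.compile(r"<[^>]*>")


def lex(source: str) -> List[Token]:
    """Split source into TEXT, TAG, SCRIPT_BEGIN and SCRIPT_END tokens."""
    tokens: List[Token] = []
    pos = 0
    in_script = False  # lexer state: True between the script start and end tags
    while pos < len(source):
        if in_script:
            # Inside a script only the end tag is special; the rest is TEXT.
            end = SCRIPT_END.search(source, pos)
            stop = end.start() if end else len(source)
            if stop > pos:
                tokens.append(("TEXT", source[pos:stop]))
            if end:
                tokens.append(("SCRIPT_END", end.group()))
                pos = end.end()
            else:
                pos = len(source)  # unterminated script: stop at end of input
            in_script = False
            continue
        begin = SCRIPT_BEGIN.match(source, pos)
        if begin:
            tokens.append(("SCRIPT_BEGIN", begin.group()))
            pos = begin.end()
            in_script = True
            continue
        tag = TAG.match(source, pos)
        if tag:
            tokens.append(("TAG", tag.group()))
            pos = tag.end()
            continue
        # Plain text runs up to the next '<' (or the end of input).
        nxt = source.find("<", pos + 1)
        stop = nxt if nxt != -1 else len(source)
        tokens.append(("TEXT", source[pos:stop]))
        pos = stop
    return tokens


def parse_script(tokens: List[Token], i: int) -> Tuple[str, int]:
    """Apply 'script: SCRIPT_BEGIN (TEXT)* SCRIPT_END;' starting at index i.

    Returns the script body and the index of the token after SCRIPT_END.
    """
    if i >= len(tokens) or tokens[i][0] != "SCRIPT_BEGIN":
        raise SyntaxError("expected SCRIPT_BEGIN")
    i += 1
    body = []
    while i < len(tokens) and tokens[i][0] == "TEXT":
        body.append(tokens[i][1])
        i += 1
    if i >= len(tokens) or tokens[i][0] != "SCRIPT_END":
        raise SyntaxError("expected SCRIPT_END")
    return "".join(body), i + 1


page = "<p>hi</p><script type='text/javascript'>if (a < b) { x = 1; }</script>"
tokens = lex(page)
script_body, next_index = parse_script(tokens, 3)  # token 3 is the SCRIPT_BEGIN
```

In ANTLR itself the same effect would come from switching lexer state (or lexer) when the script start tag is matched and switching back at the end tag; the flag above stands in for that mechanism.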