how can one lex/parse tokens that are not delimited

Posted By:   richard_koch
Posted On:   Wednesday, May 23, 2001 12:23 AM

how do i parse the following example:

11111112222222333333abcdefg


i want to extract the 1's, the 2's etc. this problem arises from old cobol output that we cannot alter. the 1's, 2's and 3's can be any numbers and are not distinguishable by pattern, only as a sequence of length x.

thanx in advance

richard koch

Re: how can one lex/parse tokens that are not delimited

Posted By:   Monty_Zukowski  
Posted On:   Thursday, May 24, 2001 06:34 AM

The only way for ANTLR to lex exactly X characters as one token is for you to keep your own counter and use a semantic predicate. You might find it easier to do the initial lexing by hand, or with another lexer toolkit, and then hand the result off to ANTLR.



If I were doing it myself, based only on the example you give, I would make my own table of lengths and token types and write my own object implementing TokenStream (which has just one method, nextToken()). Then I would just read the specified number of characters, create a token object with the correct type, and bump my table pointer to the next position for the next token.
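
A minimal sketch of that table-driven idea, in plain Java so it compiles without the ANTLR jar. FixedToken and FixedTokenStream are hypothetical stand-ins for ANTLR's own token and TokenStream types, and the field widths, token type numbers and EOF value are made up for illustration; the real ones would have to come from the COBOL record layout and your grammar.

```java
// Hypothetical stand-in for an ANTLR token: just a type and its text.
final class FixedToken {
    final int type;
    final String text;

    FixedToken(int type, String text) {
        this.type = type;
        this.text = text;
    }
}

// Hypothetical stand-in for ANTLR's TokenStream: a single nextToken() method.
interface FixedTokenStream {
    FixedToken nextToken();
}

// Cuts the input into fields using a table of widths and token types.
final class FixedWidthLexer implements FixedTokenStream {
    // Arbitrary end-of-input marker chosen for this sketch.
    static final int EOF_TYPE = -1;

    private final String input;
    private final int[] widths; // length of each field, in order
    private final int[] types;  // token type to assign to each field
    private int pos = 0;        // current offset into the input
    private int field = 0;      // table pointer: index of the next field

    FixedWidthLexer(String input, int[] widths, int[] types) {
        if (widths.length != types.length) {
            throw new IllegalArgumentException("widths and types must have the same length");
        }
        this.input = input;
        this.widths = widths;
        this.types = types;
    }

    @Override
    public FixedToken nextToken() {
        // Past the last table entry, or out of characters: report end of input.
        if (field >= widths.length || pos >= input.length()) {
            return new FixedToken(EOF_TYPE, "");
        }
        // Read the specified number of characters; a short final field is truncated.
        int end = Math.min(pos + widths[field], input.length());
        FixedToken token = new FixedToken(types[field], input.substring(pos, end));
        pos = end;
        field++; // bump the table pointer for the next call
        return token;
    }
}
```

With widths {3, 2, 4} and the input "aaabbcccc", successive calls to nextToken() return tokens with the text "aaa", "bb" and "cccc", followed by the EOF token. To feed a real ANTLR parser, the class would implement ANTLR's TokenStream and return ANTLR token objects instead of these stand-ins.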
