Is Antlr able to sensibly handle 200000 regular expressions?

Posted By:   Harald_Kirsch
Posted On:   Friday, June 18, 2004 06:20 AM

200000 regular expressions generalized from
200000 protein names must be matched in text in
order to annotate the text with links into
a protein database.

My first guess to try this with antlr would
be to rewrite the regexps as an antlr lexer
grammar. Each regexp would be a rule, and the
rule would rewrite the token to be a link
to the database.

I know antlr only from the overview and
quick examples on its homepage. Before I dig
any deeper, I would appreciate hints as to
whether there is any chance of success.
Concerns arising from my limited knowledge
of antlr include:

"warning:lexical nondeterminism between rules"

copying "unmatched" input unchanged to the output

size (200000 rules in a single lexer grammar)
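
To make the task concrete, here is a toy sketch of the annotation step. It
is plain Python using the standard re module, not antlr, and it says nothing
about whether the approach scales to 200000 patterns. The three patterns,
the database IDs, and the example.org URL scheme are all made up for
illustration. Text that matches no pattern is copied through unchanged, and
where two patterns could match at the same position, the one listed first
wins.

```python
import re

# Hypothetical (database id, regexp) pairs, invented for illustration only.
PATTERNS = [
    ("P001", r"p53"),
    ("P002", r"BRCA[- ]?1"),
    ("P003", r"BRCA[- ]?2"),
]

# Hypothetical link format; the real database URL scheme is not known here.
LINK = '<a href="https://example.org/protein/{id}">{text}</a>'


def build_matcher(patterns):
    """Join all regexps into one alternation, one named group per pattern.

    At a given position the leftmost alternative that matches wins, so
    the order of PATTERNS decides ties between overlapping patterns.
    The individual regexps must not define named groups of their own
    that clash with the generated g0, g1, ... names.
    """
    parts = [f"(?P<g{i}>{rx})" for i, (_, rx) in enumerate(patterns)]
    return re.compile("|".join(parts))


def annotate(text, patterns=PATTERNS):
    """Replace every match with a link; leave all other text untouched."""
    matcher = build_matcher(patterns)
    ids = {f"g{i}": pid for i, (pid, _) in enumerate(patterns)}

    def to_link(match):
        # lastgroup names the alternative that produced this match.
        return LINK.format(id=ids[match.lastgroup], text=match.group(0))

    # re.sub copies the unmatched stretches of input to the output as-is.
    return matcher.sub(to_link, text)


if __name__ == "__main__":
    print(annotate("BRCA1 binds p53 in vitro."))
```

The question for antlr is then whether a generated lexer can do the same
job, with one rule per regexp, without drowning in nondeterminism warnings
or producing an unmanageably large lexer.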

Thanks,
Harald.
