Document Fingerprint

Posted By:   nawaz_ahmed
Posted On:   Sunday, April 25, 2004 11:09 PM

Hi,

I am designing a web crawler and want to add a filter for "content seen" verification. I understand this relates to the "document fingerprint set" or "copy catch" concepts. Can anyone point me to an API or library that deals with these concepts?

Thanks in advance

Regards,
Nawaz Ahmed R

Re: Document Fingerprint

Posted By:   Anonymous  
Posted On:   Monday, April 26, 2004 10:45 PM

I wrote a selector for Ant so that only files that have really changed are handled. Basically, it uses MD5 and a local cache file. Maybe you'll find something interesting.
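
The MD5-plus-cache-file idea can be sketched roughly as follows. This is not the Ant selector's code, only an illustration under assumptions: the class name `ChangedFileCache`, the method names, and the choice of a `java.util.Properties` file as the cache format are all hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Properties;

// Hypothetical helper: remembers the MD5 of each file in a local cache file
// and reports whether a file's content differs from the last recorded value.
public class ChangedFileCache {
    private final Path cacheFile;
    private final Properties cache = new Properties();

    public ChangedFileCache(Path cacheFile) throws IOException {
        this.cacheFile = cacheFile;
        // Load fingerprints recorded by an earlier run, if there are any.
        if (Files.exists(cacheFile)) {
            try (InputStream in = Files.newInputStream(cacheFile)) {
                cache.load(in);
            }
        }
    }

    // Returns the MD5 digest of the data as 32 lowercase hex characters.
    public static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True if the file is new or its content changed since it was last recorded.
    // The cache file is rewritten whenever a new fingerprint is stored.
    public boolean isChanged(Path file) throws IOException {
        String key = file.toAbsolutePath().toString();
        String current = md5Hex(Files.readAllBytes(file));
        boolean changed = !current.equals(cache.getProperty(key));
        if (changed) {
            cache.setProperty(key, current);
            try (OutputStream out = Files.newOutputStream(cacheFile)) {
                cache.store(out, "MD5 fingerprints");
            }
        }
        return changed;
    }
}
```

Calling `isChanged` twice on the same unmodified file returns true the first time and false the second, so only new or modified files get processed.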

Re: Document Fingerprint

Posted By:   Christopher_Koenigsberg  
Posted On:   Monday, April 26, 2004 01:37 PM

MD5 is commonly used as a hash or "fingerprint" that is unique for practical purposes, though it can be slower to compute than simpler algorithms that produce more collisions.
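
For the crawler's "content seen" test, a minimal sketch using MD5 fingerprints might look like the following. The class name `ContentSeenFilter` and its methods are hypothetical, and the in-memory `HashSet` is an assumption made for brevity; a real crawler would probably need a persistent or disk-backed set.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical "content seen" filter: stores one MD5 fingerprint per document
// instead of the document itself.
public class ContentSeenFilter {
    private final Set<String> fingerprints = new HashSet<>();

    // Returns the MD5 of the content (UTF-8 bytes) as 32 lowercase hex characters.
    public static String fingerprint(String content) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Records the content's fingerprint and returns true if it was already present.
    public boolean seenBefore(String content) {
        // Set.add returns false when the element is already in the set.
        return !fingerprints.add(fingerprint(content));
    }
}
```

Note that this only catches byte-identical content: two pages that differ by a single character get different fingerprints.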
