Thursday, December 04, 2008

Pubget now on Solr

The latest version of Pubget has been rolled out, and it is now based on its own search index. It has over 18 million medical and scientific papers indexed, with over 6 million PDF paths ready for use.

The index is now Solr-based and thus uses a new Lucene-based query syntax. To search only open access articles, add access:open to your query string. Alternatively, you can limit your results by institution (e.g. access:ucsf, access:harvard, access:mit, etc.).
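
As a rough illustration of how the access: restriction combines with ordinary search terms, here is a minimal Python sketch. The helper names (build_query, search_params) are made up for this example, and the q and rows parameter names are the standard Solr ones, not necessarily what Pubget's own search page uses.

```python
from urllib.parse import urlencode


def build_query(terms, access=None):
    """Combine free-text terms with an optional access: field restriction.

    `access` is a value such as "open" or "ucsf" (taken from the post);
    the function itself is a hypothetical helper, not part of Pubget.
    """
    query = terms.strip()
    if not access:
        return query
    clause = f"access:{access}"
    # Group the free-text terms so the restriction applies to all of them.
    return f"({query}) AND {clause}" if query else clause


def search_params(terms, access=None, rows=10):
    """URL-encode the query using the standard Solr `q` and `rows` parameters."""
    return urlencode({"q": build_query(terms, access), "rows": rows})


if __name__ == "__main__":
    print(build_query("malaria vaccine", "open"))
    print(search_params("", "ucsf"))
```

The explicit AND is there because Lucene's default operator is configurable, so spelling it out avoids depending on how the index happens to be set up.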

I have been really impressed with Solr and the distributed search (sharding) feature that is new in version 1.3. It has allowed the use of low-cost machines along with Amazon EC2 and S3 services.
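
For anyone curious what a distributed query looks like, Solr 1.3 fans a request out to several shards when you pass a shards parameter listing each shard as host:port/path. The sketch below only builds such a URL. The host names are placeholders and the function name is hypothetical, so treat it as an illustration of the mechanism and not as Pubget's actual setup.

```python
from urllib.parse import urlencode


def distributed_select_url(coordinator, shards, query, rows=10):
    """Build a Solr select URL that queries several shards at once.

    coordinator: "host:port/path" of the node that receives the request
                 and merges results (placeholder value in the example).
    shards:      list of "host:port/path" entries, with no "http://" prefix,
                 as the Solr `shards` parameter expects.
    """
    params = urlencode({
        "q": query,
        "shards": ",".join(shards),  # comma-separated shard list
        "rows": rows,
    })
    return f"http://{coordinator}/select?{params}"


if __name__ == "__main__":
    # Example hosts are assumptions for illustration only.
    nodes = ["solr1.example.com:8983/solr", "solr2.example.com:8983/solr"]
    print(distributed_select_url(nodes[0], nodes, "access:open"))
```

Any node can act as the coordinator, which is part of what makes it practical to spread an index over a handful of inexpensive machines.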
