RDF Search


Lost Boy asks what RDF storage and retrieval solutions people are using. I offer some quick notes on our setup.

Currently we are storing a couple of million RDF instances, each with about 5-10 triples.

For search and retrieval, we are using Mark Logic as an XML index engine. Generally we don't fetch a whole RDF instance, just a collection of relevant URIs. Mark Logic uses XQuery, and we have built some RESTful query interfaces on top of it.
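To give a feel for what a client of such a RESTful interface might look like, here is a rough Python sketch. The endpoint URL, the `q` and `limit` parameter names, and the newline-separated response format are all assumptions made up for illustration; none of them come from our actual setup.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real URL and parameter names are not given here.
BASE_URL = "http://search.example.org/rdf/query"


def build_query_url(terms, limit=20, base_url=BASE_URL):
    """Build a GET URL for a (hypothetical) URI-search endpoint."""
    # Sort the terms so that equivalent queries yield byte-identical URLs,
    # which helps any HTTP cache sitting in front of the server.
    params = [("q", term) for term in sorted(terms)] + [("limit", str(limit))]
    return base_url + "?" + urlencode(params)


def parse_uri_list(body):
    """Parse an assumed newline-separated response body into a list of URIs."""
    return [line.strip() for line in body.splitlines() if line.strip()]
```

The point of returning only URIs is that the response stays small; a caller that really needs the full RDF instance can dereference a URI in a second request.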

So that we can handle 20+ query requests per second (often much more), we have placed Squid in front of the Mark Logic server (actually on the same hardware). Stale query results are perfectly OK, especially when new RDF isn't added for a week.
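The caching idea is simple enough to show in a few lines. The sketch below is not Squid configuration; it is a small in-process Python stand-in for the same principle: serve a stored answer until it is older than some time-to-live, and only then go back to the backend. The class name, the `fetch` callback, and the injectable clock are all illustrative assumptions.

```python
import time


class TTLCache:
    """Tiny time-to-live cache: a stand-in for what an HTTP cache does."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._store = {}    # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) only on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh enough: the backend is not touched
        value = fetch(key)   # miss or expired: ask the backend once
        self._store[key] = (now + self.ttl, value)
        return value
```

When the underlying data changes about once a week, even a generous time-to-live means nearly every repeated query is answered without reaching the index engine at all.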

Note that Mark Logic can handle the current load fine, but there isn't any point in keeping the utilization of that process high if we don't need to. This way we can use Mark Logic in more applications.

Our RDF is based on PRISM.

Anyway, we have not yet found much need for any 'reasoning' about RDF triples. If we do, we will probably use an offline relational storage system and render new RDF from it, pushing that to Mark Logic in batch.
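As a rough idea of what the "render new RDF from relational storage" step could look like, here is a Python sketch that reads rows from SQLite and writes N-Triples. The table layout, the column names, and the `http://example.org/terms/` namespace are placeholders invented for the example (they are not PRISM's real terms), and the push to Mark Logic is left out entirely.

```python
import sqlite3


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def render_ntriples(conn, ns="http://example.org/terms/"):
    """Render every row of the (hypothetical) articles table as N-Triples."""
    rows = conn.execute(
        "SELECT uri, title, publication FROM articles ORDER BY uri")
    lines = []
    for uri, title, publication in rows:
        lines.append(f'<{uri}> <{ns}title> "{escape_literal(title)}" .')
        lines.append(
            f'<{uri}> <{ns}publication> "{escape_literal(publication)}" .')
    return "\n".join(lines) + "\n"


def make_demo_db():
    """Build a small in-memory database with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (uri TEXT, title TEXT, publication TEXT)")
    conn.executemany(
        "INSERT INTO articles VALUES (?, ?, ?)",
        [("http://example.org/a/1", 'A "quoted" title', "Journal X"),
         ("http://example.org/a/2", "Second", "Journal Y")])
    return conn


if __name__ == "__main__":
    print(render_ntriples(make_demo_db()), end="")
```

Any inference would happen as SQL over the relational tables before this step; the renderer only serialises whatever rows it finds, so the output file can be loaded in one batch.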
