Scott Penrose

SPARQL PERL

Scott is an expert software developer with over 30 years experience, specialising in education, automation and remote data.

I am using a lot of SPARQL at the moment, and no doubt I still have a learning curve ahead of me. So some of my assessments and code may be ignorant... but when has that ever stopped me having an opinion :-)

Perl Modules

Firstly, take a look at:

These modules try to do an excellent job of just about everything to do with RDF, including (but not limited to) local stores, their own SPARQL implementation that lets you search your own data formats, SPARQL endpoints and LOTS more.

However... after 4 hours I can't get them to work with 2 out of 3 of the SPARQL endpoints I am using. They work great with example endpoints like DBpedia. One of my endpoints is even a Virtuoso server.

The bug/reason is the MIME type. The modules cleverly use the MIME type of the response to work out how to parse the output, and I just can't get the endpoints to return the MIME type I need. Also, I couldn't find any parser that handled either SPARQL JSON or SPARQL XML results, because they are not RDF. On some endpoints you can ask for RDF instead (either XML or JSON), and that then parses, apart from the MIME type problem.

The endpoints are no doubt partly at fault here, sometimes returning text/plain or similar.
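One way around a misleading MIME type is to look at the response body itself and only fall back to the Content-Type header when the body is inconclusive. The following is a minimal sketch of that idea, not something taken from any of the modules discussed here; the function name guess_sparql_format is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the serialisation of a SPARQL response.
# The body is checked first, because the endpoint may label JSON or XML
# as text/plain; the Content-Type header is only a fallback.
sub guess_sparql_format {
    my ($content_type, $body) = @_;
    $content_type //= '';
    $body         //= '';

    # First non-whitespace character of the body, if there is one.
    my ($first) = $body =~ /\A\s*(\S)/;

    if (defined $first) {
        return 'json' if $first eq '{' || $first eq '[';
        return 'xml'  if $first eq '<';
    }

    # Body was empty or inconclusive: trust the header as a last resort.
    return 'json' if $content_type =~ /json/i;
    return 'xml'  if $content_type =~ /xml/i;
    return 'unknown';
}
```

A caller could then choose a JSON or XML decoder based on the returned string rather than on the header alone.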

I recommend looking at these modules rather than mine, but if you get stuck, keep reading.

Simplicity

My module works on all the endpoints I am currently using, and is tiny. As stated in my opening paragraph, it is no doubt naive or just plain ignorant, but it provides the bare minimum of what I need.

#!Perl
#!/usr/bin/perl
use perl5i::2;
use Experimental::SPARQL;
use Data::Dumper;

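my $sparql = Experimental::SPARQL->new(endpoint => 'http://localhost:8890/sparql/');

# Read the whole SPARQL query from STDIN (or from files named on the command line)
my $sparql_query = "";
while (<>) {
	$sparql_query .= $_;
}

# Returns an arrayref of hashrefs, one per result row
my $data = $sparql->query($sparql_query);

print Dumper($data);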

Now straight away, if you know SPARQL you will see some issues:

  • The object is tied to a single endpoint, but SPARQL itself should not be limited to one endpoint. This is why you often see the endpoint named in the query portion.
    • I chose that structure so I can later build something like a DBD connection string
    • That string would include not only the endpoint, but also how to decode the response and any other special cases
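To make the DBD-style string idea concrete, here is a rough sketch of what parsing one might look like. The string layout ("sparql:" followed by key=value pairs separated by semicolons), the option names endpoint and format, and the function parse_sparql_dsn are all assumptions made up for illustration; the module as described only accepts a URL.

```perl
use strict;
use warnings;

# Hypothetical DSN layout, loosely modelled on DBI connection strings:
#   sparql:endpoint=http://localhost:8890/sparql/;format=json
sub parse_sparql_dsn {
    my ($dsn) = @_;

    # Split off the scheme at the first colon only; the URL has its own colons.
    my ($scheme, $rest) = split /:/, $dsn, 2;
    die "Not a sparql DSN: $dsn\n"
        unless defined $rest && lc($scheme) eq 'sparql';

    my %opt = (format => 'json');    # assumed default decoder
    for my $pair (split /;/, $rest) {
        next unless length $pair;
        # Limit of 2 so that '=' inside a value (e.g. a URL) is preserved.
        my ($key, $value) = split /=/, $pair, 2;
        die "Malformed DSN option '$pair'\n" unless defined $value;
        $opt{$key} = $value;
    }

    die "DSN has no endpoint\n" unless defined $opt{endpoint};
    return \%opt;
}
```

The returned hashref could then be handed to the constructor, leaving room for per-endpoint quirks (such as which decoder to use) without changing the calling code.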

If it is useful, I am happy to release this to CPAN (note: it lives in the Experimental namespace, just for now).

Essence

This is what the module can do:

  • new - accept endpoint details - currently only the URL
  • query - accept a SPARQL query

This is how it works:

  • Encodes the parameters (CGI escape etc)
  • Creates an LWP user agent
  • Does a GET, asking for the SPARQL JSON results format in return (more formats to be added in future)
  • Creates an array of results, each one a hash of field => value pairs from the query
    • This is like doing a fetchall_arrayref({}) in DBI - i.e. an arrayref of hashrefs is returned. Very useful.
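The steps above can be sketched roughly as follows. This is not the module's actual source: the function names are hypothetical, the escaping is done by hand to keep the example self-contained, and requesting JSON through an Accept header (rather than, say, a format parameter) is an assumption. The body is decoded as JSON whatever MIME type the endpoint claims.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Percent-encode a string for use in a URL query component.
sub uri_escape_simple {
    my ($text) = @_;
    utf8::encode($text);    # operate on UTF-8 bytes
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

# Append the escaped query to the endpoint URL.
sub build_query_url {
    my ($endpoint, $query) = @_;
    my $sep = $endpoint =~ /\?/ ? '&' : '?';
    return $endpoint . $sep . 'query=' . uri_escape_simple($query);
}

# Turn decoded SPARQL JSON results into an arrayref of hashrefs,
# one hashref per row, mapping each variable name to its plain value.
sub flatten_results {
    my ($decoded) = @_;
    my @vars = @{ $decoded->{head}{vars} || [] };
    my @rows;
    for my $binding (@{ $decoded->{results}{bindings} || [] }) {
        my %row;
        for my $var (@vars) {
            # Unbound variables are simply absent from the binding.
            $row{$var} = exists $binding->{$var} ? $binding->{$var}{value} : undef;
        }
        push @rows, \%row;
    }
    return \@rows;
}

# Full round trip: GET the query, then decode and flatten the answer.
sub sparql_query {
    my ($endpoint, $query) = @_;
    require LWP::UserAgent;    # loaded lazily; only needed for live requests
    my $ua       = LWP::UserAgent->new(timeout => 30);
    my $response = $ua->get(
        build_query_url($endpoint, $query),
        Accept => 'application/sparql-results+json',
    );
    die 'SPARQL request failed: ' . $response->status_line . "\n"
        unless $response->is_success;
    # Ignore the Content-Type header and parse the body as JSON.
    return flatten_results(decode_json($response->content));
}
```

Note that this flattening throws away the type, datatype and language information that each binding carries, which is part of what keeps it small.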

Often I can just take that output and serialise it as JSON to return to the browser. Of course I could fetch the JSON directly from the endpoint in JavaScript - but often I can't because of cross-site rules, or because I need to embed other information from the server into the query etc.
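As an illustration of passing the rows on, the sketch below wraps the arrayref of hashrefs together with some extra server-side values and encodes the lot as JSON. The function name rows_to_json_response and the rows key are hypothetical choices, not part of the module.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper: combine query rows with extra server-side values
# and serialise them as one JSON document for the browser.
sub rows_to_json_response {
    my ($rows, %extra) = @_;
    # canonical() sorts keys so the output is stable between runs.
    my $json = JSON::PP->new->canonical->utf8;
    return $json->encode({ %extra, rows => $rows });
}
```

A handler would call it as rows_to_json_response($data, user => $username) and send the result with an application/json content type.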

TODO

  • Keep LWP instance for performance
  • Consider POST instead of GET
  • More serialisation formats
  • Keep updating as my knowledge increases - maybe get to the point where I can throw it away because it needs to be as complicated as RDF::Query::Client
