The Limitations of SPARQL

Recently, I have been looking at RDF model and try to compare that with the property graph model that I mention in a previous post. I also look at the SPARQL query model. While I think it is a very powerful query language based by variable bindings, I also observe a couple of limitations that it doesn't handle well.

Note that I haven't used SPARQL in very simple examples and don't claim to be expert in this area. I am hoping my post here can invite other SPARQL experts to share their experience.

Here are the limitations that I have seen.

Support of Negation

Because of the “Open World” assumption, SPARQL doesn’t support “negation” well, this means expressing "negation" in SPARQL is not easy.
  • Find all persons who is Bob’s friends but doesn’t know Java
  • Find all persons who know Bob but doesn't know Alice
Support of Path Expression

In SPARQL, expressing a variable length path is not easy.
  • Find all posts written by Bob’s direct and indirect friends (everyone reachable from Bob)
Predicates cannot have Properties

This may be a RDF limitation that SPARQL inherits. Since RDF represents everything in Triples. It is easy to implement properties of a Node using extra Triples, but it is very difficult to implement properties in Edges.

In SPARQL, there is no way to attaching a property to a “predicate”.
  • Bob knows Peter for 5 years
RDF inference Rule

Inference rules are build around RDFS and OWL which is focusing mainly on type and set relationships and is implemented using a Rule: (conditions => derived triple) expression. But it is not easy to express a derived triples whose object’s value is an expression of existing triples.
  • Family income is the sum of all individual member’s income
Support of Fuzzy Matches with Ranked results

SPARQL is based on a boolean query model which is designed for exact match. Express a fuzzy match with ranked result is very difficult.
  • Find the top 20 posts that is “similar” to this post ranked by degree of similarity (lets say similarity is measured by the number of common tags that the 2 posts share)

I am also very interested to see if there is any large scale deployment of RDF graph in real-life scenarios. I am not aware of any popular social network sites are using RDF to store the social graph or social activities. I guess this may be due to scalability of the RDF implementation today. I may be wrong though.

Check out this stream