Algebra equivalence or superset #79

kjetilk · 2016-03-09T15:38:23Z

@RubenVerborgh and I had a discussion today about different abstraction levels, and their consequences further down the stack, and I came to the conclusion that it would be interesting to support something motivated from the following.

Say that you had some kind of a service that could efficiently answer things like

?s dct:identifier "965363" ;
    foaf:name ?name .

or

?s dct:title ?title .
FILTER regex(str(?title), "^Dahuts")

The main point being that there is a clear pattern that can be detected and that a plan can be generated for just this pattern. Of course, we can do that now, but I was thinking of ways to do it in less code and more concisely.

My idea was then that even though query containment is complex, it is likely to be used in very constrained ways, so a simple method to check if two patterns are the same, so that I could do stuff in plans_for_algebra like

my $bgp = Attean::Algebra::BGP->new(triples => [triplepattern(variable('s') ... , ...), ...]);
if ($algebra->isa('Attean::Algebra::BGP') && $algebra->equiv($bgp)) {
  push(@plans, AtteanX::Plan::FooBar->new(...));
}

Does this sound like a good idea?

The text was updated successfully, but these errors were encountered:

kasei · 2016-03-09T16:14:03Z

Yeah, that sounds good. The one challenge to making this really useful in planning is that you won't always get the algebra you want, even if it's equivalent to what you want. Expanding your second example a bit:

?s a ?type ; dct:title ?title .
FILTER regex(str(?title), "^Dahuts")

Let's say your system can handle just the dct:title triple and the FILTER, but not the rdf:type triple. Your planner is going to get a 2-triple BGP wrapped in a filter instead of the equivalent plan that you might prefer, a 1-triple BGP wrapped in a filter joined with another 1-triple BGP.

This is a general problem and one that's hard to solve. There are just so many equivalent rewritings of a query. I'm not sure if we should be looking towards letting extension code try to express a preference for what patterns they want and try to rewrite the query to support that, or if the extension might pick apart the algebra, handle what it can, and hand back the remaining partial algebra for re-planning.

Doing the simple stuff should be doable, though. I'll think about it and try to hack something together.

kjetilk · 2016-03-10T00:27:16Z

Right!

So, wrapping it in curly brackets, like:

{ 
  ?s a ?type . 
  { 
    ?s dct:title ?title .
    FILTER regex(str(?title), "^Dahuts")
  }
}

might that help? It would require the query author to understand what backends would be available, though.

kasei · 2016-03-10T05:01:45Z

Yeah, that would probably work, but not in all cases, and is exactly the sort of thing I want to avoid (requiring users to understand the implementation choices).

kasei · 2016-03-10T05:03:19Z

How "equal" do you want this equality test to be? Identical structures? Canonical term values are equal? Equal after canonical variable naming and triple pattern ordering?

kjetilk · 2016-03-10T20:49:10Z

It sounds to me that canonical variable naming and triple pattern ordering is the easiest to implement, so we can go for that, I think…

kasei · 2016-03-10T21:01:31Z

Well... the easiest would be identical structures (triples in the same order, variables named the same). That probably isn't super useful, though...

kjetilk · 2016-03-10T21:29:22Z

Ah, right! No, using the same variable names wouldn't be terribly useful ;-)

kjetilk · 2017-07-25T10:07:16Z

Just a little loudthinking around this subject:

Since the main problem with the curly brackets solution is that the query author has to understand what the query engine can do, I was thinking about heuristics that might often help on the engine. Then, I thought about some heuristics, say that we had a dictionary that recorded parts of the query that could be answered efficiently, like for example, for

    ?s dct:title ?title .
    FILTER regex(str(?title), "^Dahuts")

we would record

    ?s dct:title ?title .

and

    FILTER regex(str(?title), "^Dahuts")

separately in the dictionary. Then, as early as possible, for example at parse time, the parser would look up in the dictionary and see if any other part of the query matches the rest of the pattern, and then construct the algebra appropriately, to make it easier for the planner to construct the wanted query plan.

Might that work?

RubenVerborgh · 2017-08-01T13:20:01Z

Makes me think of the Bucket Algorithm by Levy et al., where you also consider other constraints instead of only relations.

kjetilk · 2019-05-13T21:09:36Z

We've started to talk about shapes now, and I'm wondering if using shapes to group patterns in such a way that this becomes feasible is an option... Like, the shapes could be used to choose certain algebras over others?

kasei · 2019-05-13T21:45:24Z

@kjetilk I think shapes probably have a use here. Also possibly relevant is the framing work that's been going on with JSON-LD. I'm not very swapped in on that, but I wonder if there might be applications of framing on query patterns. That could allow a general solution for the problem of planners expressing how they expect/prefer the query patterns.

kasei added a commit that referenced this issue Mar 12, 2016

Add isomorphic_with method to many algebra classes (proposal for #79).

d8fa444

kasei self-assigned this Mar 30, 2016

kasei assigned kasei and unassigned kasei Jan 9, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Algebra equivalence or superset #79

Algebra equivalence or superset #79

kjetilk commented Mar 9, 2016

kasei commented Mar 9, 2016

kjetilk commented Mar 10, 2016

kasei commented Mar 10, 2016

kasei commented Mar 10, 2016

kjetilk commented Mar 10, 2016

kasei commented Mar 10, 2016

kjetilk commented Mar 10, 2016

kjetilk commented Jul 25, 2017

RubenVerborgh commented Aug 1, 2017

kjetilk commented May 13, 2019

kasei commented May 13, 2019

Algebra equivalence or superset #79

Algebra equivalence or superset #79

Comments

kjetilk commented Mar 9, 2016

kasei commented Mar 9, 2016

kjetilk commented Mar 10, 2016

kasei commented Mar 10, 2016

kasei commented Mar 10, 2016

kjetilk commented Mar 10, 2016

kasei commented Mar 10, 2016

kjetilk commented Mar 10, 2016

kjetilk commented Jul 25, 2017

RubenVerborgh commented Aug 1, 2017

kjetilk commented May 13, 2019

kasei commented May 13, 2019