-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Algebra equivalence or superset #79
Comments
Yeah, that sounds good. The one challenge to making this really useful in planning is that you won't always get the algebra you want, even if it's equivalent to what you want. Expanding your second example a bit:
Let's say your system can handle just the This is a general problem and one that's hard to solve. There are just so many equivalent rewritings of a query. I'm not sure if we should be looking towards letting extension code try to express a preference for what patterns they want and try to rewrite the query to support that, or if the extension might pick apart the algebra, handle what it can, and hand back the remaining partial algebra for re-planning. Doing the simple stuff should be doable, though. I'll think about it and try to hack something together. |
Right! So, wrapping it in curly brackets, like: {
?s a ?type .
{
?s dct:title ?title .
FILTER regex(str(?title), "^Dahuts")
}
} might that help? It would require the query author to understand what backends would be available, though. |
Yeah, that would probably work, but not in all cases, and is exactly the sort of thing I want to avoid (requiring users to understand the implementation choices). |
How "equal" do you want this equality test to be? Identical structures? Canonical term values are equal? Equal after canonical variable naming and triple pattern ordering? |
It sounds to me that canonical variable naming and triple pattern ordering is the easiest to implement, so we can go for that, I think… |
Well... the easiest would be identical structures (triples in the same order, variables named the same). That probably isn't super useful, though... |
Ah, right! No, using the same variable names wouldn't be terribly useful ;-) |
Just a little loudthinking around this subject: Since the main problem with the curly brackets solution is that the query author has to understand what the query engine can do, I was thinking about heuristics that might often help on the engine. Then, I thought about some heuristics, say that we had a dictionary that recorded parts of the query that could be answered efficiently, like for example, for ?s dct:title ?title .
FILTER regex(str(?title), "^Dahuts") we would record ?s dct:title ?title . and FILTER regex(str(?title), "^Dahuts") separately in the dictionary. Then, as early as possible, for example at parse time, the parser would look up in the dictionary and see if any other part of the query matches the rest of the pattern, and then construct the algebra appropriately, to make it easier for the planner to construct the wanted query plan. Might that work? |
Makes me think of the Bucket Algorithm by Levy et al., where you also consider other constraints instead of only relations. |
We've started to talk about shapes now, and I'm wondering if using shapes to group patterns in such a way that this becomes feasible is an option... Like, the shapes could be used to choose certain algebras over others? |
@kjetilk I think shapes probably have a use here. Also possibly relevant is the framing work that's been going on with JSON-LD. I'm not very swapped in on that, but I wonder if there might be applications of framing on query patterns. That could allow a general solution for the problem of planners expressing how they expect/prefer the query patterns. |
@RubenVerborgh and I had a discussion today about different abstraction levels, and their consequences further down the stack, and I came to the conclusion that it would be interesting to support something motivated from the following.
Say that you had some kind of a service that could efficiently answer things like
or
The main point being that there is a clear pattern that can be detected and that a plan can be generated for just this pattern. Of course, we can do that now, but I was thinking of ways to do it in less code and more concisely.
My idea was then that even though query containment is complex, it is likely to be used in very constrained ways, so a simple method to check if two patterns are the same, so that I could do stuff in
plans_for_algebra
likeDoes this sound like a good idea?
The text was updated successfully, but these errors were encountered: