| 1 | +## Query idempotence |
| 2 | + |
| 3 | +A query is *idempotent* if it can be applied multiple times without changing the result of the initial application. For |
| 4 | +example: |
| 5 | + |
| 6 | +* `update my_table set list_col = [1] where pk = 1` is idempotent: no matter how many times it gets executed, `list_col` |
| 7 | + will always end up with the value `[1]`; |
| 8 | +* `update my_table set list_col = [1] + list_col where pk = 1` is not idempotent: if `list_col` was initially empty, |
| 9 | + it will contain `[1]` after the first execution, `[1, 1]` after the second, etc. |
| 10 | + |
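To make the difference concrete, here is a minimal sketch in plain Java that mimics the two updates above on an in-memory list. It does not use the driver or Cassandra; the class `IdempotenceDemo` and its methods `assign` and `prepend` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in for the list_col value of one row (no driver involved).
final class IdempotenceDemo {

    // Mirrors "set list_col = [1]": overwrites whatever was there before.
    static List<Integer> assign(List<Integer> listCol) {
        return new ArrayList<>(Arrays.asList(1));
    }

    // Mirrors "set list_col = [1] + list_col": prepends to the current value.
    static List<Integer> prepend(List<Integer> listCol) {
        List<Integer> result = new ArrayList<>();
        result.add(1);
        result.addAll(listCol);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> empty = new ArrayList<>();
        // Applying the assignment twice gives the same list as applying it once.
        System.out.println(assign(assign(empty)));
        // Applying the prepend twice gives a longer list than applying it once.
        System.out.println(prepend(prepend(empty)));
    }
}
```

Applying `assign` any number of times leaves the list at `[1]`, while each extra `prepend` adds another element, which is why a blind retry of the second update is unsafe.
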
| 11 | +Idempotence matters for [retries](../retries/) and [speculative query executions](../speculative_execution/). The |
| 12 | +corresponding policies inspect the [Statement#isIdempotent()][isIdempotent] flag. |
| 13 | + |
| 14 | +In most cases, you must set that flag manually. The driver does not parse query strings, so it can't infer the flag |
| 15 | +automatically (except for statements built with the query builder; see below). |
| 16 | + |
| 17 | +Statements start out as non-idempotent by default. You can override the flag on each statement: |
| 18 | + |
| 19 | +```java |
| 20 | +Statement s = new SimpleStatement("SELECT * FROM users WHERE id = 1"); |
| 21 | +s.setIdempotent(true); |
| 22 | +``` |
| 23 | + |
| 24 | +The default is also configurable with [setDefaultIdempotence]. To make all statements start out as idempotent: |
| 25 | + |
| 26 | +```java |
| 27 | +// Make all statements idempotent by default: |
| 28 | +cluster.getConfiguration().getQueryOptions().setDefaultIdempotence(true); |
| 29 | +``` |
| 30 | + |
| 31 | +Any statement on which you didn't call `setIdempotent` gets this default value. |
| 32 | + |
| 33 | +Bound statements inherit the flag from the prepared statement they were created from: |
| 34 | + |
| 35 | +```java |
| 36 | +PreparedStatement pst = session.prepare("SELECT * FROM users WHERE id = ?"); |
| 37 | +// This cast is for backward-compatibility reasons. On 3.0+, you can do pst.setIdempotent(true) directly |
| 38 | +((IdempotenceAwarePreparedStatement) pst).setIdempotent(true); |
| 39 | + |
| 40 | +BoundStatement bst = pst.bind(); |
| 41 | +assert bst.isIdempotent(); |
| 42 | +``` |
| 43 | + |
| 44 | +### Idempotence in the query builder |
| 45 | + |
| 46 | +The [QueryBuilder] DSL tries to infer the `isIdempotent` flag on the statements it generates. The following statements |
| 47 | +will be marked **non-idempotent**: |
| 48 | + |
| 49 | +* counter updates: |
| 50 | + |
| 51 | + ```java |
| 52 | + update("mytable").with(incr("c")).where(eq("k", 1)); |
| 53 | + ``` |
| 54 | +* prepend, append or deletion operations on lists: |
| 55 | + |
| 56 | + ```java |
| 57 | + update("mytable").with(append("l", 1)).where(eq("k", 1)); |
| 58 | + delete().listElt("l", 1).from("mytable").where(eq("k", 1)); |
| 59 | + ``` |
| 60 | +* queries that insert the result of a function call or a "raw" string in a column (or as an element in a collection |
| 61 | + column): |
| 62 | + |
| 63 | + ```java |
| 64 | + update("mytable").with(set("v", now())).where(eq("k", 1)); |
| 65 | + update("mytable").with(set("v", fcall("myCustomFunc"))).where(eq("k", 1)); |
| 66 | + update("mytable").with(set("v", raw("myCustomFunc()"))).where(eq("k", 1)); |
| 67 | + ``` |
| 68 | + |
| 69 | + This is a conservative approach, since the driver can't guess whether a function is idempotent, or what a raw string |
| 70 | + contains. It might yield false negatives, which you'll have to fix manually. |
| 71 | + |
| 72 | +* lightweight transactions (see the next section for a detailed explanation): |
| 73 | + |
| 74 | + ```java |
| 75 | + insertInto("mytable").value("k", 1).value("v", 2).ifNotExists(); |
| 76 | + ``` |
| 77 | + |
| 78 | +If these rules produce a false negative, you can manually override the flag on the built statement: |
| 79 | + |
| 80 | +```java |
| 81 | +BuiltStatement s = update("mytable").with(set("v", fcall("anIdempotentFunc"))).where(eq("k", 1)); |
| 82 | + |
| 83 | +// False negative because the driver can't guess that anIdempotentFunc() is safe |
| 84 | +assert !s.isIdempotent(); |
| 85 | + |
| 86 | +// Fix it |
| 87 | +s.setIdempotent(true); |
| 88 | +``` |
| 89 | + |
| 90 | + |
| 91 | +### Idempotence and lightweight transactions |
| 92 | + |
| 93 | +As explained in the previous section, the query builder considers lightweight transactions as non-idempotent. This might |
| 94 | +sound counter-intuitive, as these queries can sometimes be safe to execute multiple times. For example, consider the |
| 95 | +following query: |
| 96 | + |
| 97 | +``` |
| 98 | +UPDATE mytable SET v = 4 WHERE k = 1 IF v = 1 |
| 99 | +``` |
| 100 | + |
| 101 | +If we execute it twice, the `IF` condition will fail the second time, so the second execution will do nothing and `v` |
| 102 | +will still have the value 4. |
| 103 | + |
| 104 | +However, a problem appears when we consider multiple clients executing the query with retries: |
| 105 | + |
| 106 | +1. `v` has the value 1; |
| 107 | +2. client 1 executes the query above, performing a CAS (compare and set) from 1 to 4; |
| 108 | +3. client 1's connection drops, but the query completes successfully. `v` now has the value 4; |
| 109 | +4. client 2 executes a CAS from 4 to 2; |
| 110 | +5. client 2's transaction succeeds. `v` now has the value 2; |
| 111 | +6. since client 1 lost its connection, it considers the query as failed, and transparently retries the CAS from 1 to 4. |
| 112 | + But since the column now has value 2, it receives a "not applied" response. |
| 113 | + |
| 114 | +One important aspect of lightweight transactions is [linearizability]: given a set of concurrent operations on a column |
| 115 | +from different clients, there must be a way to reorder them to yield a sequential history that is correct. From our |
| 116 | +clients' point of view, there were two operations: |
| 117 | + |
| 118 | +* client 1 executed a CAS from 1 to 4, which was not applied; |
| 119 | +* client 2 executed a CAS from 4 to 2, which was applied. |
| 120 | + |
| 121 | +But overall the column changed from 1 to 2. There is no ordering of the two operations that can explain that change. We |
| 122 | +broke linearizability by doing a transparent retry at step 6. |
| 123 | + |
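The six-step timeline can be replayed in a small single-process sketch. It is only a model under stated assumptions: a `java.util.concurrent.atomic.AtomicInteger` stands in for column `v`, `compareAndSet` stands in for the conditional update, and the class `LwtRetryDemo` with its `Outcome` holder is hypothetical (no driver or cluster is involved).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model: an AtomicInteger plays the role of column v, and
// compareAndSet plays the role of "UPDATE ... SET v = <new> ... IF v = <expected>".
final class LwtRetryDemo {

    static final class Outcome {
        final boolean firstAttemptApplied; // what really happened to client 1's first CAS
        final boolean client1Observed;     // the only result client 1 ever sees (from the retry)
        final boolean client2Observed;     // the result client 2 sees
        final int finalValue;

        Outcome(boolean firstAttemptApplied, boolean client1Observed,
                boolean client2Observed, int finalValue) {
            this.firstAttemptApplied = firstAttemptApplied;
            this.client1Observed = client1Observed;
            this.client2Observed = client2Observed;
            this.finalValue = finalValue;
        }
    }

    static Outcome run() {
        AtomicInteger v = new AtomicInteger(1);             // step 1: v = 1
        boolean firstAttempt = v.compareAndSet(1, 4);       // steps 2-3: applied, response lost
        boolean client2 = v.compareAndSet(4, 2);            // steps 4-5: applied
        boolean retry = v.compareAndSet(1, 4);              // step 6: transparent retry, v is 2
        return new Outcome(firstAttempt, retry, client2, v.get());
    }

    public static void main(String[] args) {
        Outcome o = run();
        System.out.println("first attempt actually applied: " + o.firstAttemptApplied);
        System.out.println("client 1 observed applied:      " + o.client1Observed);
        System.out.println("client 2 observed applied:      " + o.client2Observed);
        System.out.println("final value of v:               " + o.finalValue);
    }
}
```

In this model client 1 only ever observes "not applied" and client 2 observes "applied", yet the value moved from 1 to 2, which matches the inconsistency described above.
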
| 124 | +To avoid this, the driver treats lightweight transactions as non-idempotent, and provides a |
| 125 | +[retry policy](../retries/) that doesn't retry non-idempotent statements. If linearizability is important to you, you |
| 126 | +should use that policy, and ensure that lightweight transactions are appropriately flagged. |
| 127 | + |
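The idea of retrying only what is flagged idempotent can be sketched independently of the driver. The helper below is a hypothetical illustration in plain Java, not the driver's retry policy or its API: `IdempotentRetrySketch.execute`, its `idempotent` parameter, and `maxRetries` are all names assumed for this example.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of the driver: retries a failed operation
// only when the caller has declared it idempotent.
final class IdempotentRetrySketch {

    static <T> T execute(Supplier<T> operation, boolean idempotent, int maxRetries) {
        int retries = 0;
        while (true) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                // A non-idempotent operation may already have been applied,
                // so the failure is surfaced to the caller instead of retried.
                if (!idempotent || retries >= maxRetries) {
                    throw e;
                }
                retries++;
            }
        }
    }
}
```

With this shape, a lightweight transaction would be passed with `idempotent` set to `false`, so a lost response reaches the application as an error instead of triggering a silent second CAS.
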
| 128 | +[isIdempotent]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/Statement.html#isIdempotent-- |
| 129 | +[setDefaultIdempotence]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/QueryOptions.html#setDefaultIdempotence-boolean- |
| 130 | +[QueryBuilder]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/querybuilder/QueryBuilder.html |
| 131 | + |
| 132 | +[linearizability]: https://en.wikipedia.org/wiki/Linearizability#Definition_of_linearizability |