Commit 0281d65

Move idempotence to its own page.
1 parent 17545ad commit 0281d65

2 files changed, +133 -49 lines changed

Diff for: manual/idempotence/README.md

+132
@@ -0,0 +1,132 @@
## Query idempotence

A query is *idempotent* if it can be applied multiple times without changing the result of the initial application. For
example:

* `update my_table set list_col = [1] where pk = 1` is idempotent: no matter how many times it gets executed, `list_col`
  will always end up with the value `[1]`;
* `update my_table set list_col = [1] + list_col where pk = 1` is not idempotent: if `list_col` was initially empty,
  it will contain `[1]` after the first execution, `[1, 1]` after the second, etc.
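The difference between the two updates can be reproduced without a cluster. The following is only a minimal plain-Java model, not driver code: `IdempotenceDemo`, `assign` and `prepend` are hypothetical names that stand in for the two CQL statements above.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the two CQL updates above; no driver or cluster is involved.
class IdempotenceDemo {

    // Models "set list_col = [1]": overwrites the column with a fixed value.
    static List<Integer> assign(List<Integer> current) {
        List<Integer> result = new ArrayList<>();
        result.add(1);
        return result;
    }

    // Models "set list_col = [1] + list_col": prepends to the current value.
    static List<Integer> prepend(List<Integer> current) {
        List<Integer> result = new ArrayList<>();
        result.add(1);
        result.addAll(current);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> empty = new ArrayList<>();
        // Applying the assignment twice gives the same list as applying it once.
        System.out.println(assign(assign(empty)).equals(assign(empty)));
        // Applying the prepend twice gives a longer list than applying it once.
        System.out.println(prepend(prepend(empty)).equals(prepend(empty)));
    }
}
```

An operation `f` is idempotent when `f(f(x))` equals `f(x)` for every `x`, which holds for `assign` but not for `prepend`.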

Idempotence matters for [retries](../retries/) and [speculative query executions](../speculative_execution/). The
corresponding policies inspect the [Statement#isIdempotent()][isIdempotent] flag.

In most cases, you must set that flag manually. The driver does not parse query strings, so it can't infer it
automatically (except for statements coming from the query builder, see below).

Statements start out as non-idempotent by default. You can override the flag on each statement:

```java
Statement s = new SimpleStatement("SELECT * FROM users WHERE id = 1");
s.setIdempotent(true);
```

The default is also configurable: if you want all statements to start out as idempotent, do this:

```java
// Make all statements idempotent by default:
cluster.getConfiguration().getQueryOptions().setDefaultIdempotence(true);
```

Any statement on which you didn't call `setIdempotent` gets this default value.

Bound statements inherit the flag from the prepared statement they were created from:

```java
PreparedStatement pst = session.prepare("SELECT * FROM users WHERE id = ?");
// This cast is for backward-compatibility reasons. On 3.0+, you can do pst.setIdempotent(true) directly
((IdempotenceAwarePreparedStatement) pst).setIdempotent(true);

BoundStatement bst = pst.bind();
assert bst.isIdempotent();
```

### Idempotence in the query builder

The [QueryBuilder] DSL tries to infer the `isIdempotent` flag on the statements it generates. The following statements
will be marked **non-idempotent**:

* counter updates:

```java
update("mytable").with(incr("c")).where(eq("k", 1));
```
* prepend, append or deletion operations on lists:

```java
update("mytable").with(append("l", 1)).where(eq("k", 1));
delete().listElt("l", 1).from("mytable").where(eq("k", 1));
```
* queries that insert the result of a function call or a "raw" string in a column (or as an element in a collection
column):

```java
update("mytable").with(set("v", now())).where(eq("k", 1));
update("mytable").with(set("v", fcall("myCustomFunc"))).where(eq("k", 1));
update("mytable").with(set("v", raw("myCustomFunc()"))).where(eq("k", 1));
```

This is a conservative approach, since the driver can't guess whether a function is idempotent, or what a raw string
contains. It might yield false negatives, which you'll have to fix manually.

* lightweight transactions (see the next section for a detailed explanation):

```java
insertInto("mytable").value("k", 1).value("v", 2).ifNotExists();
```

If these rules produce a false negative, you can manually override the flag on the built statement:

```java
BuiltStatement s = update("mytable").with(set("v", fcall("anIdempotentFunc"))).where(eq("k", 1));

// False negative because the driver can't guess that anIdempotentFunc() is safe
assert !s.isIdempotent();

// Fix it
s.setIdempotent(true);
```

### Idempotence and lightweight transactions

As explained in the previous section, the query builder treats lightweight transactions as non-idempotent. This might
sound counter-intuitive, as these queries can sometimes be safe to execute multiple times. For example, consider the
following query:

```
UPDATE mytable SET v = 4 WHERE k = 1 IF v = 1
```

If we execute it twice, the `IF` condition will fail the second time, so the second execution will do nothing and `v`
will still have the value 4.

However, a problem appears when we consider multiple clients executing the query with retries:

1. `v` has the value 1;
2. client 1 executes the query above, performing a CAS (compare and set) from 1 to 4;
3. client 1's connection drops, but the query completes successfully. `v` now has the value 4;
4. client 2 executes a CAS from 4 to 2;
5. client 2's transaction succeeds. `v` now has the value 2;
6. since client 1 lost its connection, it considers the query as failed, and transparently retries the CAS from 1 to 4.
   But since the column now has value 2, it receives a "not applied" response.

One important aspect of lightweight transactions is [linearizability]: given a set of concurrent operations on a column
from different clients, there must be a way to reorder them to yield a sequential history that is correct. From our
clients' point of view, there were two operations:

* client 1 executed a CAS from 1 to 4, which was not applied;
* client 2 executed a CAS from 4 to 2, which was applied.

But overall the column changed from 1 to 2. There is no ordering of the two operations that can explain that change. We
broke linearizability by doing a transparent retry at step 6.
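The claim that no ordering explains the observed history can be checked mechanically. The sketch below is a plain-Java model under simplifying assumptions (a single integer register, two operations); `CasHistoryCheck`, `Cas`, `explains` and `linearizable` are hypothetical names and are not part of the driver.

```java
import java.util.Arrays;
import java.util.List;

// Brute-force check of the two-client scenario above on a single register.
class CasHistoryCheck {

    // One CAS as seen by a client: expected value, new value, reported outcome.
    static final class Cas {
        final int expected;
        final int update;
        final boolean applied;

        Cas(int expected, int update, boolean applied) {
            this.expected = expected;
            this.update = update;
            this.applied = applied;
        }
    }

    // Replays the operations sequentially in the given order and reports whether
    // every reported outcome and the final value are reproduced.
    static boolean explains(List<Cas> order, int initial, int observedFinal) {
        int v = initial;
        for (Cas op : order) {
            boolean applies = (v == op.expected);
            if (applies != op.applied) {
                return false;
            }
            if (applies) {
                v = op.update;
            }
        }
        return v == observedFinal;
    }

    // With two operations there are only two sequential orders to try.
    static boolean linearizable(Cas a, Cas b, int initial, int observedFinal) {
        return explains(Arrays.asList(a, b), initial, observedFinal)
            || explains(Arrays.asList(b, a), initial, observedFinal);
    }

    public static void main(String[] args) {
        Cas client2 = new Cas(4, 2, true);
        // Outcome client 1 observed after the transparent retry: "not applied".
        System.out.println(linearizable(new Cas(1, 4, false), client2, 1, 2)); // false
        // Outcome client 1 would have observed had the first response arrived.
        System.out.println(linearizable(new Cas(1, 4, true), client2, 1, 2));  // true
    }
}
```

The only thing that differs between the two calls is the outcome reported to client 1, which is exactly what the retry at step 6 changed.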

To avoid this, the driver considers lightweight transactions as non-idempotent, and provides a
[retry policy](../retries/) that doesn't retry non-idempotent statements. If linearizability is important for you, you
should use that policy, and ensure that lightweight transactions are appropriately flagged.
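The decision such a policy makes can be sketched in a few lines. This is an illustrative model only and does not reproduce the driver's retry policy interface: `IdempotenceAwareRetry`, `Decision`, `decide` and the `maxRetries` parameter are all hypothetical. It assumes that an unset per-statement flag (modelled as `null`) falls back to the configured default, as described earlier on this page.

```java
// Illustrative decision logic: never retry a statement that is not known to be idempotent.
class IdempotenceAwareRetry {

    enum Decision { RETRY, RETHROW }

    // idempotent: the statement's own flag, or null if setIdempotent was never called.
    // defaultIdempotence: the default that applies when the flag is unset.
    static Decision decide(Boolean idempotent, boolean defaultIdempotence,
                           int retriesSoFar, int maxRetries) {
        boolean effective = (idempotent != null) ? idempotent : defaultIdempotence;
        if (!effective) {
            // A lightweight transaction flagged non-idempotent ends up here:
            // the error is surfaced to the caller instead of being retried.
            return Decision.RETHROW;
        }
        return (retriesSoFar < maxRetries) ? Decision.RETRY : Decision.RETHROW;
    }

    public static void main(String[] args) {
        System.out.println(decide(Boolean.FALSE, true, 0, 1)); // RETHROW
        System.out.println(decide(Boolean.TRUE, false, 0, 1)); // RETRY
        System.out.println(decide(null, false, 0, 1));         // RETHROW
    }
}
```

With this rule, the transparent retry at step 6 of the scenario above would not happen, and client 1 would see the connection error and decide for itself how to proceed.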

[isIdempotent]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/Statement.html#isIdempotent--
[setDefaultIdempotence]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/QueryOptions.html#setDefaultIdempotence-boolean-
[QueryBuilder]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/querybuilder/QueryBuilder.html

[linearizability]: https://en.wikipedia.org/wiki/Linearizability#Definition_of_linearizability

Diff for: manual/speculative_execution/README.md

+1-49
@@ -60,57 +60,9 @@ sections cover the practical details and how to enable them.

 ### Query idempotence

-One important aspect to consider is whether queries are idempotent, i.e.
-whether they can be applied multiple times without changing the result
-beyond the initial application. **If a query is not idempotent, the
-driver will never schedule speculative executions for it**, because
+If a query is [not idempotent](../idempotence/), the driver will never schedule speculative executions for it, because
 there is no way to guarantee that only one node will apply the mutation.

-As of Cassandra 2.1.4, the only queries that are *not* idempotent are:
-
-* counter operations;
-* prepending or appending to a list column;
-* using non-idempotent CQL functions, like `now()` or `uuid()`.
-
-In the driver, this is determined by
-[Statement#isIdempotent()][isIdempotent]. Unfortunately, the driver
-doesn't parse query strings, so in most cases it has no information
-about what the query actually does. Therefore:
-
-* **`Statement#isIdempotent()` is only computed automatically for
-statements built with [QueryBuilder][QueryBuilder]**.
-Note that the driver takes a rather conservative approach with uses
-of `fcall()` or `raw()`: whenever they appear in a value to be
-inserted in the database (like the values of an `Insert` or the
-right-hand side of an assignment in an `Update`), the statement
-will be considered non-idempotent by default. If you know that your
-CQL functions or expressions are safe, force idempotence to `true`
-on the statement manually (see below);
-* **for all other types of statements, it defaults to `false`.** You'll
-need to set it manually, with one of the mechanism described below.
-
-You can override the value on each statement:
-
-```java
-Statement s = new SimpleStatement("SELECT * FROM users WHERE id = 1");
-s.setIdempotent(true);
-```
-
-Note that this will also work for built statements (and override the
-computed value).
-
-Additionally, if you know for a fact that your application does not use
-any of the non-idempotent CQL queries listed above, you can change the
-default cluster-wide:
-
-```java
-// Make all statements idempotent by default:
-cluster.getConfiguration().getQueryOptions().setDefaultIdempotence(true);
-```
-
-[isIdempotent]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/Statement.html#isIdempotent--
-[QueryBuilder]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/querybuilder/QueryBuilder.html
-
 ### Enabling speculative executions

 Speculative executions are controlled by an instance of
