
Close pending statements on connection close #170

Open · wants to merge 1 commit into `main`

Conversation


@staticlibs staticlibs commented Mar 20, 2025

This change adds synchronization between accessing and deleting the
underlying native objects for `Connection`, `Statement` and
`ResultSet`. All synchronization is done in the JNI part; Java-level
synchronization is removed from the few places where it was used.
`volatile` fields are used to check whether an object is closed.

`Connection`'s underlying native object maintains a list of `Statement`s
currently open on this `Connection`. These statements are closed when
the connection is closed. Running queries are cancelled (interrupted)
automatically when the `Connection` is closed.

Note: `Statement.close()` blocks if a long-running query started from
this statement is still executing. `Statement.cancel()` must be called
manually before `close()` for the close to complete promptly. This
cannot be done automatically, because `cancel()` is implemented in the
engine as a `Connection`-level operation, so calling `cancel()` on a
`Statement` can interrupt a query running on another `Statement` on the
same `Connection`.
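As a client-side illustration of the note above, shutdown code might cancel before closing. This is a minimal sketch; `StatementCloser.cancelAndClose` is a hypothetical helper, not part of the driver:

```java
import java.sql.SQLException;
import java.sql.Statement;

class StatementCloser {
    // Cancels any in-flight query on the statement, then closes it,
    // so that close() does not block on a long-running query.
    // Caveat (per the note above): cancel() is connection-level in the
    // engine, so this can interrupt queries on sibling statements.
    static void cancelAndClose(Statement stmt) {
        try {
            stmt.cancel();  // interrupt the running query, if any
        } catch (SQLException ignored) {
            // cancel() may fail if nothing is running; proceed to close
        }
        try {
            stmt.close();   // now returns promptly
        } catch (SQLException ignored) {
        }
    }
}
```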

Synchronization for Appender is going to be added in a separate PR.

Testing: new tests added for various sequential and concurrent closure
scenarios.

Fixes: #101

Edit: the description has been updated to match the updated implementation.

}
// Closing remaining statements is not required by the JDBC spec,
// but it is a reasonable expectation from the client's point of view.
List<DuckDBPreparedStatement> stmtList = new ArrayList<>(statements);


Why create a new ArrayList to iterate over statements?
What if statements is already empty?

Contributor Author


Thanks for the review! The set is modified by the statements themselves when they are closed, so we take a local copy to iterate over. If it is empty, there is nothing to close.
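For illustration, the defensive-copy pattern described here could look like the following minimal sketch (class and field names are hypothetical, not the PR's actual code):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class NativeConnection {
    // Insertion-ordered set; statements remove themselves on close().
    private final Set<NativeStatement> openStatements = new LinkedHashSet<>();

    synchronized void register(NativeStatement s) { openStatements.add(s); }
    synchronized void unregister(NativeStatement s) { openStatements.remove(s); }

    void close() {
        // Copy first: each stmt.close() below calls back into unregister(),
        // which would throw ConcurrentModificationException if we were
        // iterating over the live set directly.
        List<NativeStatement> copy;
        synchronized (this) {
            copy = new ArrayList<>(openStatements);
        }
        for (NativeStatement s : copy) {
            s.close();
        }
    }
}

class NativeStatement {
    private final NativeConnection conn;
    private boolean closed;

    NativeStatement(NativeConnection conn) {
        this.conn = conn;
        conn.register(this);
    }

    void close() {
        if (!closed) {
            closed = true;
            conn.unregister(this); // mutates the connection's set
        }
    }

    boolean isClosed() { return closed; }
}
```

An empty set simply yields an empty copy, so the loop does nothing.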



Would it be better to use a concurrent set (like ConcurrentHashMap.newKeySet()) instead of using a non-concurrent one and making a defensive copy for iteration? (Assuming iteration order does not need to be maintained.)

Contributor Author


Thanks for the review, another pair of eyes on concurrency topics is always appreciated! A concurrent set was considered, but without additional locking it is not "synchronized enough" (we don't want new elements added while removal is running; handling of this scenario currently seems to be incomplete and needs to be improved). And with additional locking we don't need the extra synchronization that happens inside the ConcurrentHashMap. Also, the ordered destruction of statements is a nice property (perhaps it should be reversed to follow the "last created, first deleted" convention from C++); a separate list would still be needed for that with a concurrent set.


@Mytherin Mytherin left a comment


Thanks for the PR!

Can we perhaps add some tests with multi-threading as well? I'm not sure how this works in the Java world but I can imagine there being some potential problems when one thread is using a prepared statement and the other closes the connection.

@staticlibs

> Can we perhaps add some tests with multi-threading as well?

Thanks for the review! While, in general, client code is not expected to use Connection or Statement instances concurrently from different threads (the common case is taking a connection from a pool, using it in a single thread, and then returning it to the pool), close() calls can realistically happen from other threads (for example, in shutdown cleanup code). So in this change only the closing logic is synchronized for potential concurrent usage. The behaviour of such a concurrent call is a valid concern: per the JDBC spec it is "implementation-defined" what happens when an active connection is closed while queries are running. At minimum we should not crash when a native statement is deleted while still in use. Will add the concurrent closure test coverage.

@jonathanswenson

The big one that we use (primarily for MotherDuck) is statement.cancel() from a different thread for query cancellation.

In one thread we use the standard JDBC flow for running a query.

  • create connection
  • create statement (and stash it somewhere)
  • executeQuery on statement
  • Iterate through results.

In another thread we may need to kill the query:

  • detect that the query needs to be killed, grab the stashed statement
  • call cancel on that statement to cancel the inflight query
  • [optionally] close the statement to try to prevent new queries from starting -- we have this disabled for duckdb / motherduck now due to causing a variety of SIGSEGVs.

The blocking nature of the JDBC API makes this frustratingly tough to make reasonably threadsafe. It is nice if the statement close also cancels queries, but that isn't the case with all JDBC drivers 😭
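The watchdog side of this flow might be sketched as follows, against the plain `java.sql.Statement` API (the stash and class names are hypothetical; closing is deliberately skipped, mirroring the SIGSEGV caveat above):

```java
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.atomic.AtomicReference;

class QueryKiller {
    // The worker thread stashes its statement here before executeQuery().
    private final AtomicReference<Statement> stashed = new AtomicReference<>();

    void stash(Statement stmt) {
        stashed.set(stmt);
    }

    // Called from a watchdog thread: cancel the in-flight query.
    // close() is intentionally not called here, per the comment above
    // about SIGSEGVs when close() races with a running query.
    void kill() {
        Statement stmt = stashed.getAndSet(null); // at most one kill per stash
        if (stmt != null) {
            try {
                stmt.cancel();
            } catch (SQLException ignored) {
            }
        }
    }
}
```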

if (conn_ref == null) {
    return true;
}
synchronized (this) {


The use of "synchronized" might cause thread-pinning when used with virtual threads (at least for Java versions 19 to 23). Would it be better to use a ReentrantLock instead?

Similar discussion in pgjdbc: pgjdbc/pgjdbc#1951


@staticlibs staticlibs Mar 22, 2025


Hm, thread pinning here is either very short (when the synchronized block only flips a flag) or unavoidable, when it goes into a native call. Unlike Postgres, there is no I/O done from Java in DuckDB: the pinned thread is used to do the actual work in the DB engine code. At the same time, ReentrantLocks (with a few volatile fields) should not be in any way worse than synchronized blocks. So perhaps it makes sense to use them consistently instead of synchronized.
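A minimal sketch of this suggestion, assuming a `ReentrantLock` plus a `volatile` closed flag (class and field names are hypothetical):

```java
import java.util.concurrent.locks.ReentrantLock;

class CloseGuard {
    private final ReentrantLock lock = new ReentrantLock();
    private volatile boolean closed; // allows cheap unlocked reads of the flag

    boolean isClosed() { return closed; }

    void close() {
        lock.lock(); // a waiting virtual thread parks instead of pinning its carrier
        try {
            if (closed) {
                return; // idempotent: a second close() is a no-op
            }
            closed = true;
            // ... release native resources here ...
        } finally {
            lock.unlock();
        }
    }
}
```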


@staticlibs

@jonathanswenson

Thanks for the details!

With concurrent closing, despite there being no synchronization at all between close() and other operations, it turned out to be non-trivial to trigger a SIGSEGV at will. The only long operation, execute(), is carefully written to gather all required data at the beginning, and it does not touch the statement state at all after execution begins. So to get a SIGSEGV it is necessary to call close() on a statement after the JNI execute() call has been entered, but before execution begins in the engine, which is a pretty narrow window.

I now have a SIGSEGV reproducer and am going to add synchronization in JNI (perhaps moving all synchronization from Java there too). Cancelling of queries seems to work reliably, so I am going to add cancellation before closing the statements in the connection cleanup.

The hang is also reproducible: when a connection is closed while some query is still running, it happens even if the corresponding statement was closed beforehand. Cancelling queries before closing statements seems to solve the hang as well.

@staticlibs staticlibs mentioned this pull request Mar 25, 2025
@staticlibs staticlibs force-pushed the statement_close branch 2 times, most recently from 897e47f to a2fc047 Compare March 27, 2025 01:06

staticlibs commented Mar 27, 2025

@Mytherin

I've added concurrent tests and implemented synchronization in the native part so that these tests neither crash nor hang. Now all operations on connections, statements and results are performed only while holding a lock specific to that object. I've ended up using global registries to keep the locks for objects shared with the Java part (a longer description of the registries and their usage was added to holders.hpp). These registries are clunky, but I was unable to come up with anything more elegant (like weak_ptr): the main problem is that bare pointers come from the Java side, and the contents behind these pointers can be deleted concurrently, so some external lock is required to dereference such a pointer.

Locking is done as straightforwardly as possible; only scoped std::lock_guards are used (no passing locks between calls, no recursive locks, no atomics, etc.). Their usage requires a multi-step dance on every access to a connection/statement/result set:

  • check that object is alive
  • get its shared_ptr mutex into a local var
  • lock this mutex
  • re-check that object is still alive
  • dereference the object and do the work

This is very verbose, but at least should be straightforward to maintain if used consistently.
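The actual dance lives in the JNI C++ code using scoped `std::lock_guard`s over the registry mutexes; purely as an illustration, the same check/lock/re-check shape can be sketched in Java (all names here are hypothetical):

```java
import java.util.concurrent.locks.ReentrantLock;

class GuardedHandle {
    private volatile boolean alive = true;
    private final ReentrantLock lock = new ReentrantLock();
    private long nativePtr = 0x1L; // stand-in for the native pointer

    long use() {
        // 1. check that the object is alive (cheap fast-fail)
        if (!alive) {
            throw new IllegalStateException("already destroyed");
        }
        // 2-3. take and lock the object's mutex
        lock.lock();
        try {
            // 4. re-check under the lock: another thread may have
            // destroyed the object between the first check and lock()
            if (!alive) {
                throw new IllegalStateException("already destroyed");
            }
            // 5. dereference the object and do the work
            return nativePtr;
        } finally {
            lock.unlock();
        }
    }

    void destroy() {
        lock.lock();
        try {
            alive = false;
            nativePtr = 0;
        } finally {
            lock.unlock();
        }
    }
}
```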

Another thing is that long query execution is done while holding the statement lock. I considered releasing the lock while the query is running (and re-locking to prepare the result to pass to Java), but decided to keep this part simple (at least for now). Query interrupt seems effective at quickly stopping running queries; this interrupt is used on the connection when it is closed.

Also, I did not touch synchronization in Appender and in Arrow - going to address these parts separately.

PS: the CI run is failing on a linking/symbols problem that seems to be unrelated to this change; will look at it tomorrow.
Edit: fixed, it was a missing .cpp entry in CMakeLists.txt.in.

Successfully merging this pull request may close these issues.

Deadlock when opening a 2nd connection with an unclosed statement from previous connection
5 participants