Skip to content

Honor intent.prepare in _exec_insert to avoid OCI8 half-duplex deadlock#2620

Closed
yahonda wants to merge 1 commit into
rsim:masterfrom
yahonda:oci8-returning-deadlock-2619
Closed

Honor intent.prepare in _exec_insert to avoid OCI8 half-duplex deadlock#2620
yahonda wants to merge 1 commit into
rsim:masterfrom
yahonda:oci8-returning-deadlock-2619

Conversation

@yahonda
Copy link
Copy Markdown
Collaborator

@yahonda yahonda commented Apr 27, 2026

Summary

Re-gate the prepared-statement cache in _exec_insert on intent.prepare, the Rails 8.2 QueryIntent
successor to without_prepared_statement?. PostgreSQL uses the same gate (rails/rails@3d7458f
postgresql/database_statements.rb:167).

This restores the behavior _exec_insert had before the prepared_statements config was dropped from
the cache decision, and incidentally avoids reaching a SQL*Net half-duplex deadlock in the OCI / Oracle
Instant Client client path that surfaces under a specific combination of conditions (see #2619).

Where the deadlock comes from, where this PR sits

Issue #2619 narrows the deadlock to behavior in the OCI / Instant Client layer that has not been
classified or acknowledged by the upstream owner; whether it is a defect, a documented limitation, or a
configuration interaction is still open. All of the following must hold for it to manifest:

  1. The Oracle Database server is amd64 (x86_64). The arm64 build of the same release does not
    reproduce it.
  2. The session has CURSOR_SHARING = FORCE (the adapter default).
  3. A single SQL statement contains both a literal that the server will rewrite and a RETURNING ... INTO :out_bind clause.
  4. The same parsed cursor is executed at least twice.

Removing any one of (2)–(4), or running against an arm64 server, avoids the deadlock.

Cross-version server reproduction. The same conditions reproduce unchanged on every amd64 Oracle
Database server tested:

Server Edition Image source
11.2.0.2.0 11g Express gvenzl/oracle-xe:11
19.3.0.0.0 19c Enterprise Edition built from oracle/docker-images + Oracle-provided
LINUX.X64_193000_db_home.zip
21.3.0.0.0 21c Enterprise Edition built from oracle/docker-images + Oracle-provided
LINUX.X64_213000_db_home.zip
23.26.1.0.0 26ai Free gvenzl/oracle-free:23.26.1

The 19c / 21c images were built directly from Oracle's oracle/docker-images repository using
Oracle-provided installer ZIPs, so the behavior is not specific to community-maintained packaging.
Reproduction across releases from 2014 to 2025 indicates this is not a recent regression.

Client-side dimensions checked and ruled out as discriminators. Verified across Instant Client 23.3 /
23.4 / 23.6 / 23.7 / 23.8 / 23.9 / 23.26.0 / 23.26.1 and ruby-oci8 2.2.14 (gem release) / master tip —
none of those affect reproduction.

Negative control (same OCI library, no reproduction). The same INSERT ... RETURNING ... INTO :rid
issued twice from SQL*Plus under CURSOR_SHARING=FORCE returns normally each time. SQL*Plus is itself
an OCI client, but it parses each statement freshly per command — at the protocol level the second
INSERT is a new parse + execute, not a re-execute of the existing statement handle, so condition (4)
above is not met. This is the same mechanism by which the workarounds below — and this PR — avoid the
deadlock.

#2619 lists four independent workarounds (W1–W4) that each break one of the conditions above:

  • W1 — set CURSOR_SHARING = EXACT for the session
  • W2 — bind every literal as an explicit IN bind
  • W3 — close and re-parse the cursor between successive exec calls
  • W4 — adapter-level: take the connection.insert raw-SQL path through a fresh cursor on every call

This PR is W4. It fixes the cache decision in _exec_insert and only incidentally side-steps the OCI
deadlock for typical Rails callers; the underlying OCI / Instant Client behavior itself is not addressed
here and is best tracked / classified at that layer.

How it surfaced

Three things lined up:

  1. A set of pre-2018 specs that issue the same raw-SQL INSERT with RETURNING ... INTO :returning_id
    more than once on the same adapter were removed in 2017 (commit
    ef321a15 "Drop trigger based primary key
    support").

  2. While those specs were absent, the Rails 8 support work landed via
    #2425 (merged 2025-06-21). That PR included commit
    baf9fd48 "Rails 8 support", which replaced
    _exec_insert's cache gate

    if without_prepared_statement?(binds)  # fresh cursor
    else                                   # cached cursor
    end

    with the predicate

    if binds.nil? || binds.empty?  # fresh cursor
    else                            # cached cursor
    end

    The PR description and commit message record this as following the upstream removal of
    without_prepared_statement? in
    rails/rails@2306c10e;
    the practical effect on the cache predicate (no longer factoring in prepared_statements) is not
    discussed there. After the change, any caller with non-empty binds goes through the cached-cursor branch.

  3. The same pre-2018 spec patterns were restored in 2026 as part of porting the trigger-based primary-key
    option (Restore trigger-based primary key generation as opt-in primary_key_trigger: option #2615), and ran against the post-baf9fd48 cache decision for the first time.

Under that combination, plus the four conditions in the section above, the second exec on the cached
cursor produced the SQL*Net half-duplex deadlock — server stuck on SQL*Net more data from client,
client stuck in OCIStmtExecute → ... → nttfprd → read().

Fix

Reintroduce the prepared-statements check using intent.prepare, the Rails 8.2 QueryIntent successor to
without_prepared_statement?. PostgreSQL gates on the same flag (rails-3d7458fecf49
postgresql/database_statements.rb:167).

For raw SQL paths, intent.prepare is left as its default false (set by connection.insert in
abstract/database_statements.rb), so the fresh-cursor path runs and the cached-cursor combination above
is not reached.

Behavior after this change:

  • klass.create!(attrs) (Arel insert, prepared_statements=true) keeps the cached path.
  • connection.insert("INSERT ...", nil, "id") (raw SQL) takes the fresh-cursor path and never reuses a
    cursor with a stale OCI bind state.
  • Connections with prepared_statements: false always use fresh cursors, matching the pre-baf9fd48
    behavior.

Tradeoff

This moves raw-SQL connection.insert paths to a fresh cursor on every call, giving up the cached-cursor
reuse benefit for those calls. Acceptable because:

  • Issuing the exact same raw SQL string through connection.insert twice in a row is uncommon in Rails
    application code; model paths go through Arel and parameterise their bind values, so the cached path
    still applies to them.
  • A SQL*Net deadlock that wedges the rest of the suite is a larger problem than losing per-call cursor
    reuse on a rarely exercised path.

The cached path is preserved unchanged for ORM (Arel-based) inserts, which are the high-volume callers.

Regression spec

spec/active_record/connection_adapters/oracle_enhanced/exec_insert_caching_spec.rb issues the same
INSERT ... NEXTVAL ... RETURNING "ID" INTO :returning_id SQL twice through connection.insert. Without
the fix, the second exec hits the deadlock and trips the spec's 5-second Timeout; with the fix, it
completes well under 100ms. A short comment in the spec documents the dependency on cursor_sharing = force being the adapter default plus an amd64 Oracle Database server, so a future change in defaults —
or running CI exclusively against arm64 servers — does not silently turn the regression test into a
no-op.

Test plan

  • bundle exec rspec spec/active_record/connection_adapters/oracle_enhanced/exec_insert_caching_spec.rb passes (~0.1s) with
    the fix applied.
  • Same spec fails after 5 seconds (Timeout::ExitException inside OCI8::Cursor#__execute) when the
    fix is reverted, confirming the test exercises the regression.
  • Full CI on Oracle Free 23ai (amd64 server image).

Closes #2619.

@yahonda yahonda force-pushed the oci8-returning-deadlock-2619 branch 2 times, most recently from 2c62dd4 to 974015b Compare April 27, 2026 22:15
yahonda added a commit to yahonda/ruby-oci8 that referenced this pull request Apr 27, 2026
Re-executing the same parsed cursor under CURSOR_SHARING=FORCE (or
SIMILAR) when the SQL contains BOTH a literal AND a RETURNING ... INTO
:bind clause hangs forever on the second exec. The server reports
"SQL*Net more data from client"; the client sits in
OCIStmtExecute -> kpuexec -> ... -> nttfprd -> read().

A SQL*Net wire trace shows OCIStmtExecute constructs a 31-byte fast-path
"execute already-parsed cursor" packet, and that packet differs from the
working (CURSOR_SHARING=EXACT) case by exactly one byte computed inside
libclntsh from cursor state set by exec kubo#1's response. ruby-oci8 calls
OCIStmtExecute with identical arguments in both runs, so the defect is
in the Oracle Client TTC marshalling, not ruby-oci8.

Add an opt-in Cursor#reparse_between_execs accessor (default false) that
releases and re-prepares the underlying statement handle before every
exec after the first, forcing OCI onto the full parse+exec path and
avoiding the buggy fast path. Bind values are preserved.

- ext/oci8/stmt.c: add private __reprepare(sql), and let oci8_bind cleanly
  detach a bind object from a previous statement so the same valuep/indp
  buffers can be re-attached to a fresh statement handle.
- lib/oci8/cursor.rb: remember the parsed SQL on initialize, expose
  reparse_between_execs, and re-bind every entry in @bind_handles after
  __reprepare.
- test/test_reparse_between_execs.rb: regression test covering the
  literal + RETURNING + same-cursor scenario plus a plain repeated
  INSERT to ensure the flag does not affect ordinary cursors.

The flag is off by default; existing cursors keep their fast-path
behavior. Leave it false unless you actually hit the deadlock - the
extra round trip costs measurable latency on hot paths.

Refs: rsim/oracle-enhanced#2619, rsim/oracle-enhanced#2620.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
yahonda added a commit to yahonda/oracle-enhanced that referenced this pull request Apr 27, 2026
The "does not raise NoMethodError for :returning_id Symbol when logging"
example exposes a SQL*Net half-duplex deadlock when ruby-oci8 +
cursor_sharing=force are combined with RETURNING ... INTO :returning_id;
see rsim#2619 for the full diagnosis and minimal repro. The fix is tracked
on the adapter side in rsim#2620.

Skip the example until rsim#2619 / rsim#2620 lands; restore in a follow-up
commit on this branch once those merge.
…ehavior

Background
----------

Before the Rails 8 adoption commit (baf9fd4 "Rails 8 support",
2025-02-23), `_exec_insert` decided whether to reuse a cached prepared
statement using `without_prepared_statement?(binds)`, which factored
in the `prepared_statements` config:

  if without_prepared_statement?(binds)
    cursor = _connection.prepare(sql)            # fresh cursor
  else
    @Statements[sql] ||= _connection.prepare(sql) # cached cursor
    cursor = @Statements[sql]
    cursor.bind_params(...)
    ...
  end

`baf9fd48` replaced that helper with the following predicate, in line
with the broader Rails 8 changes that retired `without_prepared_statement?`:

  if binds.nil? || binds.empty?
    cursor = _connection.prepare(sql)            # fresh cursor
  else
    @Statements[sql] ||= _connection.prepare(sql) # cached cursor
    ...
  end

After that change, the `prepared_statements` config is no longer part
of the cache decision in `_exec_insert`. Calls that arrive with
non-empty binds — including raw
`connection.insert("INSERT ... VALUES ('literal')", nil, "id")`, where
`sql_for_insert` appends a single `:returning_id` placeholder — go
through the cached-cursor branch on every invocation.

How it surfaced
---------------

Three things lined up:

  1. A set of pre-2018 specs that issue the same raw-SQL INSERT with
     `RETURNING ... INTO :returning_id` more than once on the same
     adapter were removed in 2017 (commit ef321a1).
  2. While those specs were absent, baf9fd4 (Rails 8 support, Feb
     2025) widened `_exec_insert`'s cached-cursor branch as described
     above.
  3. The same spec patterns were restored in 2026 as part of porting
     the trigger-based primary-key option (rsim#2615), and ran against the
     post-baf9fd48 cache decision for the first time.

Under that combination, plus `cursor_sharing = FORCE` (the adapter
default), executing the same cached cursor a second time produces a
SQL*Net half-duplex deadlock between ruby-oci8 and the Oracle 23ai
server: the server waits in `SQL*Net more data from client` while
ruby-oci8 stays in `OCIStmtExecute -> nttfprd -> read()`. See rsim#2619
for the full diagnosis and a 5-line standalone repro.

Fix
---

Reintroduce the prepared-statements check using the Rails 8.2
QueryIntent successor to `without_prepared_statement?`: `intent.prepare`.
PostgreSQL gates on the same flag (rails-3d7458fecf49
postgresql/database_statements.rb:167).

For raw SQL paths, `intent.prepare` is left as its default `false`
(set by `connection.insert` in abstract/database_statements.rb), so
the fresh-cursor path runs and the cached-cursor combination above is
not reached.

Behavior after this change:

  - `klass.create!(attrs)` (Arel insert, prepared_statements=true)
    keeps the cached path.
  - `connection.insert("INSERT ...", nil, "id")` (raw SQL) takes the
    fresh-cursor path and never reuses a cursor with a stale OCI bind
    state.
  - Connections with `prepared_statements: false` always use fresh
    cursors, matching the pre-`baf9fd48` behavior.

Tradeoff
--------

The change moves raw-SQL `connection.insert` paths to a fresh cursor
on every call, giving up the cached-cursor reuse benefit for those
calls. That tradeoff is acceptable because:

  - Issuing the exact same raw SQL string through `connection.insert`
    twice in a row is uncommon in Rails application code (model paths
    go through Arel and parameterise their bind values).
  - A SQL*Net deadlock that wedges the rest of the suite is a larger
    problem than losing per-call cursor reuse on a rarely exercised
    path.

The cached path is preserved unchanged for ORM (Arel-based) inserts,
which are the high-volume callers.

Regression spec
---------------

`spec/.../exec_insert_caching_spec.rb` issues the same
`INSERT ... NEXTVAL ... RETURNING "ID" INTO :returning_id` SQL twice
through `connection.insert`. Without the fix the second exec hits the
half-duplex deadlock and trips the spec's 5-second `Timeout`; with the
fix it completes well under 100ms. A short comment notes the
dependency on the default `cursor_sharing = force` so a future change
in defaults does not silently turn the regression test into a no-op.

Closes rsim#2619.
@yahonda yahonda force-pushed the oci8-returning-deadlock-2619 branch from 974015b to cb44e74 Compare April 28, 2026 02:25
yahonda added a commit to yahonda/oracle-enhanced that referenced this pull request Apr 28, 2026
Introduces :default as the new effective default for the :cursor_sharing
connection option. With :default (or unset) the adapter does not run
ALTER SESSION SET cursor_sharing and the server's instance-level setting
is kept. Explicit values ('exact', 'force', 'similar') continue to issue
the corresponding ALTER SESSION.

The previous behavior unconditionally applied 'force', a default that
predates AR's prepared statement support (commit ebb60f3, 2009). With
prepared_statements: true (the AbstractAdapter default for adapters that
do not override it -- includes oracle_enhanced) AR is binding literals
already, and 'force' offered no additional cursor reuse on that path
while exposing connections to the rsim#2619 hang combination on amd64
servers (cached cursor + RETURNING INTO bind + a literal the server
rewrites). rsim#2620 already removed that exposure for the typical Rails
ORM path; dropping the implicit force default removes it for raw-SQL
callers as well.

A code comment near the new logic flags the dictionary-introspection
trade-off: AR's schema queries (all_tab_columns / all_indexes / etc.)
embed owner/table/column names as literals, so without 'force' each
unique combination produces a fresh shared cursor on the server. If
that becomes a measurable shared-pool problem for installs with very
large schemas, the default can be reconsidered or an explicit
`cursor_sharing: 'force'` can be documented as the recommendation.

Tests use a separate SYSTEM AR connection (SystemObserver) only to
read v$parameter / v$ses_optimizer_env; the test sessions themselves
use the regular CONNECTION_PARAMS user.

Refs rsim#2622.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
yahonda added a commit to yahonda/oracle-enhanced that referenced this pull request Apr 28, 2026
Introduces :default as the new effective default for the :cursor_sharing
connection option. With :default (or unset) the adapter does not run
ALTER SESSION SET cursor_sharing and the server's instance-level setting
is kept. Explicit values ('exact', 'force', 'similar') continue to issue
the corresponding ALTER SESSION.

The previous behavior unconditionally applied 'force', a default that
predates AR's prepared statement support (commit ebb60f3, 2009). With
prepared_statements: true (the AbstractAdapter default since Rails 7.1
via rails/rails#45945) AR is binding literals already, and 'force'
offered no additional cursor reuse on that path while exposing
connections to the rsim#2619 hang combination on amd64 servers (cached
cursor + RETURNING INTO bind + a literal the server rewrites). rsim#2620
already removed that exposure for the typical Rails ORM path; dropping
the implicit force default removes it for raw-SQL callers as well.

The doc comment for :cursor_sharing notes the dictionary-introspection
trade-off: AR's schema queries (all_tab_columns / all_indexes / etc.)
embed owner/table/column names as literals, so without 'force' each
unique combination produces a fresh shared cursor on the server. Installs
hitting shared-pool pressure from very large schemas can opt back in by
setting cursor_sharing: 'force' explicitly.

Tests use a separate SYSTEM AR connection (SystemObserver) only to read
v$parameter / v$ses_optimizer_env (normalized to upper-case to match
across the two views); the test sessions themselves use the regular
CONNECTION_PARAMS user.

Refs rsim#2622.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
yahonda added a commit to yahonda/oracle-enhanced that referenced this pull request Apr 29, 2026
Stops the adapter from issuing `ALTER SESSION SET cursor_sharing = force`
at session login. Introduces :default as the new effective default for
the :cursor_sharing connection option; with :default (or unset) the
adapter does not run any ALTER SESSION and the database's instance-level
setting is what the session sees. Explicit values ('exact', 'force')
continue to issue the corresponding ALTER SESSION.

Oracle's documented software default for CURSOR_SHARING is EXACT
(https://docs.oracle.com/en/database/oracle/oracle-database/26/refrn/CURSOR_SHARING.html).
Most installations have not changed it, and many users have not been
aware that oracle_enhanced has been silently overriding it to FORCE at
the session level since 2009 (commit ebb60f3).

The implicit `force` made sense at the time it was added: ActiveRecord
did not yet bind literals, so the adapter sent SQL with literals inlined
and `force` was the only way the server could collapse that into a
single shared cursor. That premise no longer holds:

  * The AbstractAdapter default for prepared_statements is true since
    Rails 7.1 (rails/rails#45945, commit a0fd15ee7e), inherited by
    oracle_enhanced. Application SQL is bound at the AR layer, so 'force'
    has nothing left to rewrite for that traffic.
  * All all_* / user_* dictionary queries inside oracle_enhanced are
    now bind-driven (rsim#2629 finished the last four), so dictionary
    access doesn't depend on 'force' for shared-cursor reuse either.
  * Keeping 'force' on by default also keeps connections exposed to the
    rsim#2619 hang combination on amd64 Oracle Database (cached cursor +
    a literal the server rewrites + RETURNING ... INTO). rsim#2620 already
    removed the exposure for the typical Rails ORM path; dropping the
    implicit force default removes it for raw-SQL callers as well.

After this PR, sessions run with whatever CURSOR_SHARING the database
is configured with — for almost all installations that means Oracle's
documented default EXACT, which is what users were already (unknowingly)
running on the server side. The adapter simply stops imposing its own
value on top.

Anyone who relied on FORCE — typically connections still configured
with prepared_statements: false, where AR interpolates application-SQL
literals at the visitor level — can opt back in with
cursor_sharing: 'force' in database.yml.

Tests use a separate SYSTEM AR connection (SystemObserver) only to read
v$parameter / v$ses_optimizer_env (normalized to upper-case to match
across the two views); the test sessions themselves use the regular
CONNECTION_PARAMS user.

Refs rsim#2622.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
yahonda added a commit to yahonda/oracle-enhanced that referenced this pull request Apr 29, 2026
Stops the adapter from issuing `ALTER SESSION SET cursor_sharing = force`
at session login. Introduces :default as the new effective default for
the :cursor_sharing connection option; with :default (or unset) the
adapter does not run any ALTER SESSION and the database's instance-level
setting is what the session sees. Explicit values ('exact', 'force')
continue to issue the corresponding ALTER SESSION.

Oracle's documented software default for CURSOR_SHARING is EXACT
(https://docs.oracle.com/en/database/oracle/oracle-database/26/refrn/CURSOR_SHARING.html).
Most installations have not changed it, and many users have not been
aware that oracle_enhanced has been silently overriding it to FORCE at
the session level since 2009 (commit ebb60f3).

The implicit `force` made sense at the time it was added: ActiveRecord
did not yet bind literals, so the adapter sent SQL with literals inlined
and `force` was the only way the server could collapse that into a
single shared cursor. That premise no longer holds:

  * The AbstractAdapter default for prepared_statements is true since
    Rails 7.1 (rails/rails#45945, commit a0fd15ee7e), inherited by
    oracle_enhanced. Application SQL is bound at the AR layer, so 'force'
    has nothing left to rewrite for that traffic.
  * All all_* / user_* dictionary queries inside oracle_enhanced are
    now bind-driven (rsim#2629 finished the last four), so dictionary
    access doesn't depend on 'force' for shared-cursor reuse either.
  * Keeping 'force' on by default also keeps connections exposed to the
    rsim#2619 hang combination on amd64 Oracle Database (cached cursor +
    a literal the server rewrites + RETURNING ... INTO). rsim#2620 already
    removed the exposure for the typical Rails ORM path; dropping the
    implicit force default removes it for raw-SQL callers as well.

After this PR, sessions run with whatever CURSOR_SHARING the database
is configured with — for almost all installations that means Oracle's
documented default EXACT, which is what users were already (unknowingly)
running on the server side. The adapter simply stops imposing its own
value on top.

Anyone who relied on FORCE — typically connections still configured
with prepared_statements: false, where AR interpolates application-SQL
literals at the visitor level — can opt back in with
cursor_sharing: 'force' in database.yml.

Tests use a separate SYSTEM AR connection (SystemObserver) only to read
v$parameter / v$ses_optimizer_env (normalized to upper-case to match
across the two views); the test sessions themselves use the regular
CONNECTION_PARAMS user.

Refs rsim#2622.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@yahonda
Copy link
Copy Markdown
Collaborator Author

yahonda commented Apr 29, 2026

Closing without merging.

The deadlock combination this PR worked around (#2619: ruby-oci8 + cursor_sharing=force × RETURNING ... INTO bind, server-side) is no longer reachable from the adapter's default configuration after #2626 stopped imposing cursor_sharing = force at session login. Users who still opt back in with cursor_sharing: 'force' retain the underlying server-side exposure, but that is a deliberate choice on their side, not something the adapter should silently work around.

The companion specs that were re-enabled in #2653 confirm the deadlock no longer triggers under the default config.

@yahonda yahonda closed this Apr 29, 2026
@yahonda yahonda deleted the oci8-returning-deadlock-2619 branch April 29, 2026 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OCI8 + cursor_sharing=force x INSERT ... RETURNING INTO :returning_id hangs in SQL*Net half-duplex deadlock

1 participant