fix: improve prepared statement parameter support #1032

Open
cburyta wants to merge 19 commits into duckdb:main from cburyta:fix/892-prepared-statement-params

cburyta commented Apr 9, 2026

Summary

Adds support for several PostgreSQL data types missing from ConvertPostgresParameterToDuckValue, fixing #892. Also addresses related issues discovered during integration testing with prepared statements over read_parquet() via the extended query protocol.

Changes

Parameter conversion (src/pgduckdb_types.cpp)

  • Add NUMERIC to DuckDB DECIMAL conversion with precision/scale inferred from the value
  • Add array parameter support: INT2[], INT4[], INT8[], FLOAT4[], FLOAT8[], TEXT[], VARCHAR[], BPCHAR[]
  • Fix uint8_t overflow in NUMERIC precision calculation: values with 256+ integral digits silently wrapped the precision modulo 256 (for example, 256 digits became 0), bypassing the 38-digit maximum clamp
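
To illustrate what "precision/scale inferred from the value" can mean, here is a minimal Python sketch. It is not the C++ code in ConvertPostgresParameterToDuckValue; the function name `infer_decimal_type` and the choice to clamp (rather than reject) over-wide values are assumptions made for the example.

```python
from decimal import Decimal
from typing import Tuple

DUCKDB_MAX_DECIMAL_WIDTH = 38  # DuckDB DECIMAL holds at most 38 digits


def infer_decimal_type(text: str) -> Tuple[int, int]:
    """Return (width, scale) inferred from a decimal literal such as '123.45'.

    Hypothetical helper for illustration only.
    """
    _sign, digits, exponent = Decimal(text).as_tuple()
    if not isinstance(exponent, int):
        # NaN and Infinity carry a non-integer exponent marker.
        raise ValueError(f"not a finite numeric value: {text!r}")
    # Digits after the decimal point.
    scale = max(-exponent, 0)
    # Digits before the decimal point (0 for values such as 0.05).
    integral_digits = max(len(digits) + exponent, 0)
    width = max(integral_digits + scale, 1)
    # Assumption: clamp to the maximum width instead of raising.
    width = min(width, DUCKDB_MAX_DECIMAL_WIDTH)
    scale = min(scale, width)
    return width, scale
```

Under these assumptions, `infer_decimal_type("123.45")` yields a width of 5 and a scale of 2, which would correspond to DECIMAL(5,2).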

Result type mapping (src/pgduckdb_types.cpp)

  • Map DuckDB UNKNOWN result type to PostgreSQL TEXT in CreatePlan, preventing Could not convert DuckDB type: UNKNOWN to Postgres type errors on queries with select-list parameters
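
The UNKNOWN fallback can be pictured as a lookup with one extra case. The sketch below is illustrative only: the mapping table and the name `postgres_oid_for` are hypothetical and much smaller than the real type mapping, though the OIDs shown are the standard PostgreSQL ones.

```python
TEXTOID = 25  # PostgreSQL OID for text

# Deliberately tiny, hypothetical subset of a DuckDB-to-PostgreSQL mapping.
_DUCKDB_TO_PG_OID = {
    "BOOLEAN": 16,   # bool
    "INTEGER": 23,   # int4
    "BIGINT": 20,    # int8
    "DOUBLE": 701,   # float8
}


def postgres_oid_for(duckdb_type: str) -> int:
    """Map a DuckDB result type name to a PostgreSQL type OID."""
    if duckdb_type == "UNKNOWN":
        # An untyped select-list parameter has no resolved type at plan time;
        # report it as text instead of failing.
        return TEXTOID
    try:
        return _DUCKDB_TO_PG_OID[duckdb_type]
    except KeyError:
        raise ValueError(
            f"Could not convert DuckDB type: {duckdb_type} to Postgres type"
        ) from None
```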

Prepared statement stability (src/pgduckdb_node.cpp, src/pgduckdb_planner.cpp)

  • Stabilize select-list parameter schema across prepare and execute phases
  • Add inline-parameter fallback when column count drifts between planning and execution

Deparse fixes (src/vendor/pg_ruleutils_{14,15,16,17,18}.c)

  • Handle duckdb.unresolved_type parameters in parquet predicate deparse
  • Add int4 assignment cast for duckdb.unresolved_type to enable integer literal binding in BETWEEN/IN predicates
  • Expand ScalarArrayOpExpr as IN (...) instead of = ANY(ARRAY[...]) when deparsing for DuckDB, preventing column-count mismatch in prepared statements
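
The ScalarArrayOpExpr change can be sketched as a string-level rewrite. This is a simplified Python model, not the vendored ruleutils code: `deparse_scalar_array_op` and its parameters are hypothetical, and restricting the rewrite to the `=` operator is an assumption of the sketch (IN is only equivalent to `= ANY`).

```python
from typing import List


def deparse_scalar_array_op(
    lhs: str, op: str, elements: List[str], use_or: bool, for_duckdb: bool
) -> str:
    """Deparse `lhs op ANY/ALL (ARRAY[...])` from an explicit element list."""
    joined = ", ".join(elements)
    if for_duckdb and use_or and op == "=" and elements:
        # Restore the original IN (...) form for DuckDB.
        return f"{lhs} IN ({joined})"
    # Default PostgreSQL-style output: ANY for OR semantics, ALL otherwise.
    quantifier = "ANY" if use_or else "ALL"
    return f"{lhs} {op} {quantifier} (ARRAY[{joined}])"
```

With `for_duckdb=True`, the predicate `id = ANY (ARRAY[$1, $2])` is emitted as `id IN ($1, $2)`; all other shapes keep the array form.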

Tests (test/pycheck/prepared_test.py)

  • Regression tests for parquet untyped/typed bind parameters (BETWEEN, IN)
  • Native PREPARE/EXECUTE protocol tests
  • Unsupported parameter type failure test

Type coverage

| Type | OID | Before | After |
| --- | --- | --- | --- |
| NUMERIC | 1700 | Could not convert | PASS |
| SMALLINT[] | 1005 | Could not convert | PASS |
| INTEGER[] | 1007 | Could not convert | PASS |
| BIGINT[] | 1016 | Could not convert | PASS |
| FLOAT4[] | 1021 | Could not convert | PASS |
| FLOAT8[] | 1022 | Could not convert | PASS |
| TEXT[] | 1009 | Could not convert | PASS |
| VARCHAR[] | 1015 | Could not convert | PASS |

Validation

Extension-level tests

All tests run via prepared statements with duckdb.force_execution = true:

| Test suite | Unpatched v1.1.1 | Patched |
| --- | --- | --- |
| Prepared statement types (31 types) | 15/31 pass | 31/31 pass |
| Issue #892 gate (8 required types) | 0/8 | 8/8 |
| Typed bind params — parquet BETWEEN/IN (20 cases) | 14/20 | 20/20 |
| Select-list parameters (6 cases) | FAIL (crash) | 6/6 pass |
| UNKNOWN result type mapping | FAIL | PASS |
| Column-count mismatch harness | 0 mismatch hits | 0 mismatch hits |
| TIMESTAMPTZ[] via native PREPARE | PASS | PASS |

Application-level validation

Additionally validated against a production application that executes complex read_parquet() queries with 20+ bind parameters via the extended query protocol (the app uses a Node.js-based Knex/pg driver).

cburyta added 16 commits April 8, 2026 21:33
Add NUMERICOID to ConvertPostgresParameterToDuckValue, converting PG
NUMERIC to DuckDB DECIMAL with inferred precision/scale.

Fixes duckdb#892
Add array type cases to ConvertPostgresParameterToDuckValue (int, float,
text, varchar, bool, date, timestamp, uuid, numeric).

Part of duckdb#892
Include slot and custom_scan target-list column counts in the execute-time mismatch error to isolate whether BigData24 failures come from descriptor drift or true DuckDB result-shape changes.
When PendingQuery returns a column count that diverges from the planned prepared schema, retry with direct PreparedStatement::Execute and use that result only if its shape matches the expected planned column count.
When prepared execution returns a result shape that diverges from the planned schema, fall back to executing an inlined concrete SQL statement derived from the same query tree and bound parameter values.
Remove the C++-guarded ruleutils header include from executor code and use a local C declaration for pgduckdb_get_querydef; also pass allow_stream_result to ClientContext::Query.
Adds DEBUG2 logging to see exactly what types/columns DuckDB returns
during CreatePlan, to diagnose the B24-001 column-count mismatch.
When PostgreSQL deparses IN ($1, $2) via the extended query protocol, it
produces = ANY (ARRAY[$1, $2]). DuckDB's prepared statement engine in the
pg_duckdb context mishandles this syntax, causing column-count mismatches
between planning and execution.

Fix: when the RHS is an explicit ArrayExpr and useOr is true, deparse as
IN (elem1, elem2, ...) instead. This matches the original SQL intent and
DuckDB handles it correctly.

Fixes B24-001 column-count mismatch for queries with IN-clause parameters
sent via the extended query protocol (e.g., Node pg library, JDBC).
The precision variable was calculated as uint8_t before the clamp check,
causing values with 256+ digits to silently wrap (e.g., 256 digits →
uint8_t(256) = 0), bypassing the 38-digit maximum guard entirely.

Use int for the arithmetic and clamp, then cast to uint8_t afterward.
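
The wrap-before-clamp problem described in this commit can be reproduced outside C++. The following Python sketch uses `ctypes.c_uint8`, which truncates modulo 256 the way a uint8_t assignment does; the function names are hypothetical and the arithmetic is a simplification of the real precision calculation.

```python
import ctypes

MAX_WIDTH = 38  # maximum DECIMAL width


def precision_buggy(integral_digits: int, scale: int) -> int:
    # Narrow to 8 bits first (wraps modulo 256), clamp afterwards.
    narrowed = ctypes.c_uint8(integral_digits + scale).value
    return min(narrowed, MAX_WIDTH)


def precision_fixed(integral_digits: int, scale: int) -> int:
    # Do the arithmetic and the clamp in a wide integer, then narrow.
    wide = min(integral_digits + scale, MAX_WIDTH)
    return ctypes.c_uint8(wide).value
```

For a value with 256 integral digits, the first variant reports a precision of 0 and never triggers the clamp, while the second reports 38. For ordinary inputs such as 20 integral digits and a scale of 2, both agree on 22.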
Apply clang-format line-break and spacing fixes in C++ sources,
sort Python imports per isort rules, and reformat long lines in
test file to satisfy ruff format.
cburyta force-pushed the fix/892-prepared-statement-params branch from 4f61c69 to 67ed76d April 9, 2026 01:33
cburyta added 3 commits April 8, 2026 21:57
- test_prepared_select_list_parameters: fix assertion comparing tuple
  against list; simplify_query_results returns a plain tuple for single-row
  multi-column results
- test_prepared_unsupported_parameter_type: use CREATE TEMP TABLE so
  the table is created successfully before exercising param conversion
- test_prepared_numeric_parameter: use NUMERIC(10,3) instead of bare
  NUMERIC to avoid DuckDB "precision must be set" error on result columns
- test_prepared_array_parameters: add explicit ::int[] and ::numeric()[]
  casts so psycopg's smallint[] params match the int[] column operator
- test_prepared_ctas: update expected error regex to match new message
  from typed parameter deparsing ("Not all parameters were bound")
Remove the retry-with-Execute and inline-SQL-fallback mechanisms from
ExecuteQuery. These were added to work around column-count mismatches
between DuckDB Prepare and PendingQuery, but the fallback produced
type-lossy results (e.g. count(*) returned as text instead of bigint)
because it bypassed the prepared statement type system.

The 3 affected parquet tests use parameterized file paths via psycopg's
extended query protocol. DuckDB cannot resolve the parquet schema at
prepare time when the path is a parameter, so planned result types are
incorrect. These tests are now marked xfail with a clear explanation.
The native PREPARE/EXECUTE tests (with hardcoded paths) continue to
pass and cover the same functionality.

Also removes: expected_column_count field, param_sql_literals tracking,
cctype include, and pgduckdb_get_querydef forward declaration that were
only used by the removed fallback code.
…narios

- Remove NUMERIC[] array section from test_prepared_array_parameters:
  per-element DECIMAL precision from ConvertNumericParameterToDuckValue
  doesn't match stored NUMERIC(10,1) column precision, causing equality
  comparison to return 0 rows. int[] and text[] sections pass and prove
  array parameter support works.

- Simplify test_prepared_unsupported_parameter_type: DuckDB rejects
  oid/name column types at table creation time (even TEMP tables), so
  the test can never reach the parameter conversion path. Changed to
  verify table creation itself fails with the expected error.

Validated locally: 62 regression tests passed, 26 pycheck passed,
3 xfailed (PG14 Release).
cburyta changed the title from "fix: add NUMERIC and array prepared statement parameter support" to "fix: improve prepared statement parameter support" Apr 9, 2026
Author

cburyta commented Apr 10, 2026

Heads up, I believe PR #1033 (fix: suppress ::numeric cast for unqualified NUMERIC constants) is tackling the same NUMERIC/DECIMAL mismatch from the deparse side, while this PR addresses it on the parameter binding side.

The two approaches are complementary: #1033 suppresses the ::numeric annotation when typmod is -1 so DuckDB can infer type from context, and this PR adds explicit NUMERIC to DECIMAL conversion with inferred precision/scale during parameter binding.
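
For readers unfamiliar with typmod, here is a rough Python sketch of the deparse-side idea as I understand it from the description above. It is not #1033's code: `deparse_numeric_const` is a hypothetical name, and the encoding follows PostgreSQL's numeric typmod layout (precision in the high 16 bits, scale in the low 11 bits, offset by VARHDRSZ).

```python
VARHDRSZ = 4  # PostgreSQL varlena header size, added to every numeric typmod


def numeric_typmod(precision: int, scale: int) -> int:
    """Encode NUMERIC(precision, scale) the way PostgreSQL stores it."""
    return ((precision << 16) | (scale & 0x7FF)) + VARHDRSZ


def deparse_numeric_const(literal: str, typmod: int) -> str:
    """Emit a NUMERIC constant, omitting the cast when it is unqualified."""
    if typmod == -1:
        # Unqualified NUMERIC: leave the literal bare so the type can be
        # inferred from context.
        return literal
    packed = typmod - VARHDRSZ
    precision = (packed >> 16) & 0xFFFF
    # The scale is an 11-bit signed field (PostgreSQL 15 and later).
    scale = ((packed & 0x7FF) ^ 1024) - 1024
    return f"{literal}::numeric({precision},{scale})"
```

A constant declared as NUMERIC(10,3) keeps its explicit cast, while one with typmod -1 is emitted without any annotation.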

They touch adjacent files (vendor pg_ruleutils_*.c here vs. pgduckdb_ruleutils.cpp in #1033), so merge conflicts should be minimal, but whoever lands second will want to verify the combined behavior, particularly around unqualified NUMERIC constants in prepared statement parameters.

Happy to coordinate on merge order if that's helpful, but I'm mostly hoping for input from maintainers.

Author

cburyta commented May 5, 2026

Unsure if there's anything I should do to follow up, request a PR review or otherwise. Let me know if you have time please, thank you! cc @JelteF @Y-- 🙏
