
[FLINK-32609] Support Projection Pushdown #174

Open
fqshopify wants to merge 3 commits into apache:main from Shopify:support_projection_pushdown

Conversation

@fqshopify

@fqshopify fqshopify commented May 9, 2025

Implements SupportsProjectionPushDown for KafkaDynamicSource, allowing queries to skip deserializing columns they don't need.

Configuration

| Option | Default | Values |
| --- | --- | --- |
| key.format-projection-pushdown-level | NONE | NONE, TOP_LEVEL, ALL |
| value.format-projection-pushdown-level | NONE | NONE, TOP_LEVEL, ALL |

Benefits

  1. Schema evolution tolerance - Queries only fail on breaking changes to fields they actually use. For example, if column a changes from INT to STRING, the query SELECT b FROM kafka still succeeds because a is never deserialized. (This only applies to formats that decode fields independently, e.g. json, avro-confluent, debezium-avro-confluent.)

  2. Performance - Unneeded columns are filtered at the TableSourceScan node rather than downstream. Note: Kafka itself does not support projection pushdown, so the optimization happens during deserialization.
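The schema-evolution benefit could look roughly like this in Flink SQL; the table name, topic, and connection properties below are illustrative, not taken from this PR:

```sql
-- Hypothetical table: topic and bootstrap servers are placeholders.
CREATE TABLE kafka (
  a INT,     -- later changed to STRING by the producer
  b STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'example-topic',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json',
  'value.format-projection-pushdown-level' = 'ALL'
);

-- With pushdown enabled, only `b` is deserialized, so the breaking
-- type change to `a` does not fail this query.
SELECT b FROM kafka;
```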

Why opt-in?

Formats have varying support levels, and incorrect configuration can cause runtime errors or wrong results. Users must explicitly enable the appropriate level for their format. Recommended configuration:

| Format | Recommended Level |
| --- | --- |
| avro | NONE (has known issues, see FLINK-35324) |
| avro-confluent | TOP_LEVEL |
| debezium-avro-confluent | TOP_LEVEL |
| csv | TOP_LEVEL (CSV technically cannot have nested fields, so ALL would also work) |
| json | ALL |

@fqshopify fqshopify force-pushed the support_projection_pushdown branch 3 times, most recently from 218aec0 to 1192c9b on May 9, 2025 13:00
@fqshopify fqshopify force-pushed the support_projection_pushdown branch from 1192c9b to e22ed76 on June 15, 2025 10:17
@fqshopify fqshopify force-pushed the support_projection_pushdown branch from e22ed76 to 3dc9030 on June 25, 2025 18:00
@fqshopify fqshopify changed the base branch from main to v4.0 on June 25, 2025 18:00
@fqshopify fqshopify force-pushed the support_projection_pushdown branch from 3dc9030 to acc9e2e on June 25, 2025 18:04
@fqshopify fqshopify marked this pull request as ready for review on June 25, 2025 20:35
@fqshopify fqshopify force-pushed the support_projection_pushdown branch 2 times, most recently from cc64369 to 722df65 on June 25, 2025 20:38
@fqshopify fqshopify force-pushed the support_projection_pushdown branch from 722df65 to 04e0bd6 on July 25, 2025 21:44
@fqshopify fqshopify changed the base branch from v4.0 to main on July 25, 2025 21:44
@fqshopify fqshopify force-pushed the support_projection_pushdown branch from 04e0bd6 to e9fa8ff on July 25, 2025 21:45
@fqshopify fqshopify force-pushed the support_projection_pushdown branch from e9fa8ff to dda90a7 on October 16, 2025 10:12
@github-actions

This PR is being marked as stale since it has not had any activity in the last 90 days.
If you would like to keep this PR alive, please leave a comment asking for a review.
If the PR has merge conflicts, update it with the latest from the base branch.

If you are having difficulty finding a reviewer, please reach out to the
community; contact details can be found here: https://flink.apache.org/what-is-flink/community/

If this PR is no longer valid or desired, please feel free to close it.
If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions github-actions bot added the stale label Jan 15, 2026
@ryanvanhuuksloot

bumping to keep alive

@github-actions github-actions bot removed the stale label Jan 16, 2026
@fqshopify fqshopify force-pushed the support_projection_pushdown branch 4 times, most recently from 2b1c998 to 95374fd on January 22, 2026 14:55
Contributor

@Savonitar Savonitar left a comment


Hi.
Thanks for working on this feature, it is a useful optimization and the implementation looks solid. I've started reviewing the PR and posted some questions/comments.

TOP_LEVEL,

/** The format supports projection pushdown for top-level and nested fields. */
ALL
Contributor


Do we validate format compatibility with projection pushdown levels, to prevent unsupported combinations?

Author

@fqshopify fqshopify Feb 2, 2026


No, we don't validate format compatibility at configuration time; this was intentional. Although I've done some work to figure out what level of projection is supported by certain formats (avro, csv, json, avro-confluent, debezium-avro-confluent), there are more formats out there beyond those maintained by Flink, or even in the open-source ecosystem (we have some custom formats at my company, for example). Even for the formats I've figured out, support could vary by version, e.g. if the avro format's projection pushdown bugs are fixed.

Something we could consider long-term is adding a method to ProjectableDecodingFormat to indicate its supported projection level. However, even that wouldn't fully solve the problem; e.g. the avro format implements ProjectableDecodingFormat but doesn't actually support all projection pushdowns.

In the short term, I've chosen to simply document the appropriate combinations in the flink-connector-kafka config docs here.


@Override
public void applyProjection(final int[][] projectedFields, final DataType producedDataType) {
this.projectedPhysicalFields = projectedFields;
Contributor


Should we also update this.producedDataType here?

Author

@fqshopify fqshopify Feb 2, 2026


Good question! We don't need to update producedDataType in applyProjection because the javadoc for applyReadableMetadata explicitly addresses this scenario:

    /**
     * ...
     *
     * <p>Note: Use the passed data type instead of {@link ResolvedSchema#toPhysicalRowDataType()}
     * for describing the final output data type when creating {@link TypeInformation}. If the
     * source implements {@link SupportsProjectionPushDown}, the projection is already considered in
     * the given output data type, use the {@code producedDataType} provided by this method instead
     * of the {@code producedDataType} provided by {@link
     * SupportsProjectionPushDown#applyProjection(int[][], DataType)}.
     * 
     * ...
     */
    void applyReadableMetadata(List<String> metadataKeys, DataType producedDataType);

So we're following the recommended approach, i.e. using the producedDataType from applyReadableMetadata rather than the one from applyProjection.

@Override
public boolean supportsMetadataProjection() {
return false;
throw new IllegalStateException(
Contributor

@Savonitar Savonitar Jan 30, 2026


Should this throw an exception instead of returning a boolean? Could you please elaborate on why you chose to throw an exception?

E.g. what if this method is called for logging purposes?

I've checked the other flink-connectors and they either return false or use the default (return true); none throw an exception.

Author


From the javadoc for supportsMetadataProjection:

    /**
     * ...
     *
     * <p>This method is only called if the source does <em>not</em> implement {@link
     * SupportsProjectionPushDown}.
     *
     * ...
     */
    default boolean supportsMetadataProjection() {

Since KafkaDynamicSource now implements SupportsProjectionPushDown, this method should never be called by the planner. Originally, my reasoning for throwing an exception here was to catch any unexpected invocations that would indicate a bug in the planner.

That said, I take your point about consistency with other connectors and potential issues if it's called for logging/debugging purposes. I've now removed the method entirely (so it will by default return true) 👍

* org.apache.flink.streaming.connectors.kafka.table.KafkaDynamicSource}.
*/
@Internal
public enum FormatProjectionPushdownLevel {
Contributor


This enum (NONE, TOP_LEVEL, ALL) requires users to understand:

  1. Whether the format supports projection pushdown
  2. Whether the format supports nested projections
  3. Known bugs in specific formats (e.g. as was mentioned in the PR description "Avro FLINK-35324")

Maybe we can replace per-format level configuration with a single boolean flag:

'projection-pushdown.enabled' = 'true'

and the format will internally decide which projection level it can use.

Would this approach address the use cases you had in mind? Or is this added intentionally as a potential workaround for problems with formats?

Author

@fqshopify fqshopify Feb 2, 2026


Or this is added intentionally as a potential workaround for problems with formats?

Yep, that's one of the main reasons, but also because there are more formats out there than I can account for. See my response to your first question here for more details.

I'm open to discussing this more. This is the only part of this PR I don't like, but I felt it was a necessary compromise in the short term.


/** A {@link ScanTableSource} for {@link DynamicKafkaSource}. */
@Internal
public class DynamicKafkaTableSource
Contributor


Could you please clarify why DynamicKafkaTableSource does not implement SupportsProjectionPushDown?

Author


DynamicKafkaTableSource is a new source that was added only a couple of weeks ago. I'm trying to keep this PR as small as possible, so I haven't added SupportsProjectionPushDown support here yet, but I'd be happy to implement it in the future.

Let me know if you'd prefer I include it in this PR or follow up separately.

Contributor


I agree with keeping PRs small and focused. A follow-up PR is fine. I think it makes sense to create a follow-up Jira ticket after the approvals/merge so we don't lose it.

@fqshopify fqshopify force-pushed the support_projection_pushdown branch 2 times, most recently from 7248a83 to 1f283ee on February 2, 2026 20:41
@Savonitar
Contributor

Sorry for the delay. I want to take another look and complete my review next week.

Contributor

@Savonitar Savonitar left a comment


Thanks for the updates/replies!

}

@Override
public Set<ConfigOption<?>> optionalOptions() {
Contributor


  1. Shouldn't we add KEY_PROJECTION_PUSHDOWN_LEVEL and VALUE_PROJECTION_PUSHDOWN_LEVEL to the optionalOptions() method?
    The factory reads these options at lines 190-191 (in createDynamicTableSource) and passes them to KafkaDynamicSource, but they aren't registered as optional options.
    As a result, when a user sets key.format-projection-pushdown-level = TOP_LEVEL on an upsert-kafka table, FactoryHelper will throw

ValidationException: Unsupported options found for 'upsert-kafka'

because the key isn't in the consumed set.

Or am I missing something?

  2. Could we add a test that creates an upsert-kafka table with a non-default pushdown level to catch this?

Author

@fqshopify fqshopify Feb 20, 2026


Sorry, this was a miss, great catch! I haven't had to use projection pushdown with upsert yet.

Added the options and a test in the latest commit.

Contributor


Thanks for adding the pushdown options to optionalOptions().

However, it looks like format-level projection pushdown doesn't work for upsert-kafka. Regardless of the configured pushdown level, all value fields are always fully deserialized, and projection happens only afterwards via the Projector.

The root cause is DecodingFormatWrapper -> it only implements DecodingFormat, not ProjectableDecodingFormat. When Decoder.create() checks decodingFormat instanceof ProjectableDecodingFormat, the wrapper always fails the check, so the projectInsideDeserializer path is never taken for upsert-kafka.

I implemented a failing test to verify this hypothesis and to provide a basis for the fix: Savonitar@8e2c1aed

The test writes data with a breaking schema change on a non-projected value field (name changes from INT to STRING), then reads selecting only user_id and payload. With working format-level pushdown, name would be skipped during deserialization and the query would succeed. Instead, it fails with JsonParseException because all fields are deserialized.

I took into account the style of existing tests in UpsertKafkaTableITCase and your KafkaTableITCase#testProjectionPushdownWithJsonFormatAndBreakingSchemaChange to align. I kept the commit in my fork rather than pushing to your branch -> feel free to cherry-pick it as a regression test, or reimplement it entirely if you prefer. The test should pass when the fix is implemented.

A possible fix: make DecodingFormatWrapper also implement ProjectableDecodingFormat and delegate createRuntimeDecoder(context, producedDataType, projections) to the inner format when it's a ProjectableDecodingFormat.

Could you please take a look and implement the fix if you agree with the analysis?
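The delegation pattern suggested above could be sketched roughly as follows. The interfaces here are simplified stand-ins for Flink's actual DecodingFormat / ProjectableDecodingFormat (String return values replace DeserializationSchema and DataType, and the names in main are illustrative), so this only shows the shape of the fix, not a drop-in implementation:

```java
import java.util.Arrays;

// Stand-in for Flink's DecodingFormat: a plain decode-everything path.
interface DecodingFormat {
    String createRuntimeDecoder(String producedType);
}

// Stand-in for ProjectableDecodingFormat: adds a projection-aware path.
interface ProjectableDecodingFormat extends DecodingFormat {
    String createRuntimeDecoder(String producedType, int[][] projections);
}

// The wrapper now also implements the projectable interface and delegates
// the projection-aware call to the inner format when it supports projection.
class DecodingFormatWrapper implements ProjectableDecodingFormat {
    private final DecodingFormat inner;

    DecodingFormatWrapper(DecodingFormat inner) {
        this.inner = inner;
    }

    @Override
    public String createRuntimeDecoder(String producedType) {
        return inner.createRuntimeDecoder(producedType);
    }

    @Override
    public String createRuntimeDecoder(String producedType, int[][] projections) {
        if (inner instanceof ProjectableDecodingFormat) {
            // Projection happens inside deserialization.
            return ((ProjectableDecodingFormat) inner)
                    .createRuntimeDecoder(producedType, projections);
        }
        // Fall back: decode everything; projection happens downstream.
        return inner.createRuntimeDecoder(producedType);
    }
}

public class Demo {
    public static void main(String[] args) {
        // An inner format that supports projection pushdown.
        ProjectableDecodingFormat json = new ProjectableDecodingFormat() {
            @Override
            public String createRuntimeDecoder(String t) {
                return "full:" + t;
            }

            @Override
            public String createRuntimeDecoder(String t, int[][] p) {
                return "projected:" + t + ":" + Arrays.deepToString(p);
            }
        };
        DecodingFormatWrapper wrapper = new DecodingFormatWrapper(json);
        // The projection-aware path is now taken instead of full decoding.
        System.out.println(wrapper.createRuntimeDecoder("ROW", new int[][] {{0}}));
        // prints projected:ROW:[[0]]
    }
}
```

The instanceof check mirrors the existing check in Decoder.create() described above, so a non-projectable inner format keeps its current decode-then-project behavior.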

@fqshopify fqshopify force-pushed the support_projection_pushdown branch from 1f283ee to e49caa8 on February 20, 2026 18:55
@fqshopify fqshopify force-pushed the support_projection_pushdown branch from e49caa8 to a525328 on February 20, 2026 19:20