Conversation

renovate bot commented Apr 6, 2025

This PR contains the following updates:

Package | Change
github.com/twmb/franz-go | v1.3.1 -> v1.19.5

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.


Release Notes

twmb/franz-go (github.com/twmb/franz-go)

v1.19.5

Compare Source

===

Fixes a bug introduced in 1.19.3 that caused batched FindCoordinator requests
to no longer work against older brokers (Kafka brokers before 2.4, or any
Redpanda version).

All credit to @douglasbouttell for precisely diagnosing the bug.

  • 06272c66 bugfix kgo: bugfix batched FindCoordinator requests against older brokers

v1.19.4

Compare Source

===

Fixes one bug introduced in the prior release (an obvious data race in
retrospect), and one data race introduced in 1.19.0. I've looped the tests more
in this release and am not seeing further races. I don't mean to downplay the
severity here, but these are races on pointer-sized variables where reading the
before or after state makes little difference. One of the read/write races is
on a context.Context, so there are actually two pointer-sized reads & writes --
but reading the (effectively) type vtable for the new context and then the data
pointer for the old context doesn't really break things here. Anyway, you
should upgrade.

This also adds a workaround for Azure EventHubs, which does not handle
ApiVersions correctly when the broker does not recognize the version we are
sending. The broker should reply with an UNSUPPORTED_VERSION error and the
highest version the broker can handle. Instead, Azure resets the connection.
To work around this, we detect a connection reset twice and then downgrade the
request we send client-side to version 0.

  • 7910f6b6 kgo: retry connection reset by peer from ApiVersions to work around EventHubs
  • d310cabd kgo: fix data read/write race on ctx variable
  • 7a5ddcec kgo bugfix: guard sink batch field access more

v1.19.3

Compare Source

===

This release fully fixes (and has a positive field report) the KIP-890 problem
that was meant to be fixed in v1.19.2. See the commit description for more
details.

  • a13f633b kgo: remove pinReq wrapping request

v1.19.2

Compare Source

===

This release fixes two bugs, a data race and a misunderstanding in some of the
implementation of KIP-890.

The data race has existed for years and has only been caught once. It could
only be encountered in a specific section of decoding a fetch response WHILE a
metadata response was concurrently being handled, and the metadata response
indicated a partition changed leaders. The race was benign; it was a read race,
and the decoded response is always discarded because a metadata change
happened. Regardless, metadata handling and fetch response decoding are no
longer concurrent.

For KIP-890, some things were not called out all too clearly (IMO) in the KIP.
If your 4.0 cluster had not yet enabled the transaction.version feature v2+,
then transactions would not work in this client. As it turns out, Kafka 4
finally started using the "features" field introduced in v2.6 in a way that is
important to clients. In short: I opted into KIP-890 behavior based on whether
a broker could handle requests (produce v12+, end txn v5+, etc). I also needed
to check whether "transaction.version" was v2+. Features handling is now
supported in the client, and this single client-relevant feature is now
implemented.

See the commits for more details.

  • dda08fd9 kgo: fix KIP-890 handling of the transaction.version feature
  • 8a364819 kgo: fix data race in fetch response handling

v1.19.1

Compare Source

===

This release fixes a very old bug that finally started being possible to hit in
v1.19.0. The v1.19.0 release does not work for Kafka versions pre-4.0. This
release fixes that (by fixing the bug that has existed since Kafka 2.4) and
adds a GH action to test against Kafka 3.8 to help prevent regressions against
older brokers as this library marches forward.

  • 50aa74f1 kgo bugfix: ApiVersions replies only with key 18, not all keys

v1.19.0

Compare Source

===

This is the largest release of franz-go yet. The last patch release was Jan 20, '25.
The last minor release was Oct 14, '24.

A big reason for the delay over the past few months has been spin-looping tests
and investigating any issue that popped up. Another big delay is that Kafka has
a full company adding features -- some questionable -- and I'm one person that
spent a significant amount of time catching this library up with the latest
Kafka release. Lastly, Kafka released v3.9 three weeks after my last
major release, and simultaneously, a few requests came in for new features in
this library that required a lot of time. I wanted a bit of a break and only
resumed development more seriously in late Feb. This release is likely >100hrs
of work over the last ~4mo, from understanding new features and implementing
them, reviewing PRs, and debugging rare test failures.

The next Kafka release is poised to implement more large features (share
groups), which unfortunately will mean even more heads down time trying to bolt
in yet another feature to an already large library. I hope that Confluent
chills with introducing massive client-impacting changes; they've introduced
more in the past year than has been introduced from 2019-2023.

Bug fixes / changes / deprecations

  • The BasicLogger will no longer panic if only a single key (no val) is used. Thanks @​vicluq!

  • An internal coding error around managing fetch concurrency was fixed. Thanks @​iimos!

  • Some off-by-ones with retries were fixed (tl;dr: we retried one fewer time than configured)

  • AllowAutoTopicCreation and ConsumeRegex can now be used together.
    Previously, topics would not be created if you were producing and consuming
    from the same client AND if you used the ConsumeRegex option.

  • A data race in the consumer code path has been fixed. The race is hard to
    encounter (which is why it never came up even in my weeks of spin-looping
    tests with -race). See PR #984 for more details.

  • EndBeginTxnUnsafe is deprecated and unused. EndAndBeginTransaction now
    flushes, and you cannot produce while the function is in progress (the
    function will just be stuck flushing). As of KIP-890, the behavior that the
    library relied on is now completely unsupported. Trying to produce while
    ending & beginning a transaction very occasionally leads to duplicate
    messages. The function is now just a shortcut for flush, end, begin.

  • The kversion package guts have been entirely reimplemented; version guessing
    should be more reliable.

  • OnBrokerConnect now encompasses the entire SASL flow (if using SASL) rather
    than just connection dialing. This allows you more visibility into successful
    or broken connections, as well as visibility into how long it actually takes
    to initialize a connection. The dialDur arg has been renamed to initDur.
    You may see the duration increase in your metrics. If enough feedback comes
    in that this is confusing or unacceptable, I may issue a patch to revert
    the change and instead introduce a separate hook in the next minor release.
    I do not aim to create another minor release for a while.

Features / improvements

  • This release adds support for user-configurable memory pooling to a few select
    locations. See any "Pool" suffixed interface type in the documentation. You can
    use this to add bucketed pooling (or whatever strategy you choose) to cut down
    on memory waste in a few areas. As well, a few allocations that were
    previously many tiny allocs have been converted to slab allocations (slice
    backed). Lastly, if you opt into kgo.Record pooling, the Record type has a
    new Recycle method to send it and all other pooled slices back to their pools.
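
    For illustration, here is a minimal hypothetical sketch of a PoolRecords
    implementation backed by sync.Pool (user code, not the library's; whether
    GetRecords should return a length-0 or length-n slice is worth verifying
    in the docs):

      // recordPool implements kgo.PoolRecords (GetRecords/PutRecords).
      type recordPool struct{ p sync.Pool }

      func (r *recordPool) GetRecords(n int) []kgo.Record {
          if s, ok := r.p.Get().(*[]kgo.Record); ok && cap(*s) >= n {
              return (*s)[:0] // reuse a pooled slice that has enough capacity
          }
          return make([]kgo.Record, 0, n)
      }

      func (r *recordPool) PutRecords(s []kgo.Record) { r.p.Put(&s) }

      // Opt in when constructing the client:
      // cl, err := kgo.NewClient(kgo.WithPools(&recordPool{}), ...)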

  • You can now completely override how compression or decompression is done via
    the new WithCompressor and WithDecompressor options. This allows you to
    use libraries or options that franz-go does not automatically support,
    perhaps opting for higher-performance libraries or using more memory
    pooling behind the scenes.
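
    For example, a hypothetical Decompressor that only accepts zstd, assuming
    github.com/klauspost/compress/zstd (the codec constants come from the new
    API listed below):

      type zstdOnly struct{ dec *zstd.Decoder }

      func (z *zstdOnly) Decompress(src []byte, t kgo.CompressionCodecType) ([]byte, error) {
          switch t {
          case kgo.CodecNone:
              return src, nil
          case kgo.CodecZstd:
              return z.dec.DecodeAll(src, nil)
          default:
              return nil, fmt.Errorf("unsupported codec %v", t)
          }
      }

      // dec, _ := zstd.NewReader(nil)
      // cl, err := kgo.NewClient(kgo.WithDecompressor(&zstdOnly{dec}), ...)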

  • ConsumeResetOffset has been split into two options, ConsumeResetOffset and
    ConsumeStartOffset. The documentation has been cleaned up. I personally always
    found it confusing to use the reset offset for both what to start consuming from
    and what to reset to when the client sees an offset out of range error. The start
    offset defaults to the reset offset (and vice versa) if you only set one.
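
    For example, a sketch using the existing kgo.NewOffset helpers:

      cl, err := kgo.NewClient(
          kgo.ConsumeTopics("events"),
          // begin at the log start the first time we consume...
          kgo.ConsumeStartOffset(kgo.NewOffset().AtStart()),
          // ...but jump to the log end on an offset-out-of-range reset
          kgo.ConsumeResetOffset(kgo.NewOffset().AtEnd()),
      )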

  • For users that produce infrequently but want low latency when producing,
    the client now has an EnsureProduceConnectionIsOpen method. You can call this
    before producing to force connections to be open.
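
    A sketch of the intended flow (the method's variadic int32 arguments are
    presumably broker IDs to target; check the docs):

      // Warm the produce path before a latency-sensitive produce.
      if err := cl.EnsureProduceConnectionIsOpen(ctx); err != nil {
          log.Printf("produce connections not ready: %v", err)
      }
      cl.Produce(ctx, record, nil)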

  • The client now has a RequestCachedMetadata function, which can be used to
    request metadata only if the information you're requesting is not cached,
    or is cached but is too stale. This can be very useful for admin packages that
    need metadata to do anything else -- rather than requesting metadata for every
    single admin operation, you can have metadata requested once and use that
    repeatedly. Notably, I'll be switching kadm to using this function.
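
    A sketch, assuming the time.Duration argument is the maximum staleness you
    are willing to accept from the cache:

      req := kmsg.NewPtrMetadataRequest() // nil Topics requests all topics
      resp, err := cl.RequestCachedMetadata(ctx, req, time.Minute)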

  • KIP-714 support: the client now internally aggregates a small set of metrics
    and sends them to the broker by default. This client implements all required
    metrics and a subset of recommended metrics (the ones that make more sense).
    To opt out of the default metrics collection & sending to the broker, you
    can use the new DisableClientMetrics option. You can also provide your own
    metrics to send to the broker via the new UserMetricsFn option. The client
    does not attempt to sanitize any user provided metric names; be sure you provide
    the names in the correct format (see docs).
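
    A hypothetical sketch wiring one counter through the option (the metric
    name and the atomic counter are made up):

      var produceAttempts atomic.Int64

      opt := kgo.UserMetricsFn(func() iter.Seq[kgo.Metric] {
          return func(yield func(kgo.Metric) bool) {
              yield(kgo.Metric{
                  Name:     "org.example.app.produce.attempts",
                  Type:     kgo.MetricTypeSum,
                  ValueInt: produceAttempts.Load(),
              })
          }
      })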

  • KIP-848 support: this exists but is hidden. You must explicitly opt in by using
    the new WithContext option, and the context must have a special string key,
    opt_in_kafka_next_gen_balancer_beta. I noticed while testing that if you
    repeat ConsumerGroupHeartbeat requests (i.e. what can happen when clients
    are on unreliable networks), group members repeatedly get fenced. This is
    recoverable, but it happens way way more than it should and I don't believe
    the broker implementation to be great at the moment. Confluent historically
    ignores any bug reports I create on the KAFKA issue tracker, but if you
    would like to follow along or perhaps nudge to help get a reply, please
    chime in on KAFKA-19222, KAFKA-19233, and KAFKA-19235.
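
    A hypothetical sketch of the opt-in; the notes do not say what value the
    key must map to, so verify against the option's docs before relying on this:

      ctx := context.WithValue(context.Background(),
          "opt_in_kafka_next_gen_balancer_beta", "true") // plain string key, per the notes
      cl, err := kgo.NewClient(
          kgo.WithContext(ctx),
          kgo.ConsumerGroup("my-group"),
          kgo.ConsumeTopics("my-topic"),
      )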

  • A few other more niche APIs have been added. See the full breadth of new APIs
    below and check pkg.go.dev for docs for any API you're curious about.

API additions

This section contains all net-new APIs in this release. See the documentation
on pkg.go.dev.

const (
        CodecNone CompressionCodecType = iota
        CodecGzip
        CodecSnappy
        CodecLz4
        CodecZstd
        CodecError = -1
)

const CompressDisableZstd CompressFlag = 1 + iota

const (
        MetricTypeSum = 1 + iota
        MetricTypeGauge
)

type CompressFlag uint16
type CompressionCodecType int8
type Compressor interface {
    Compress(dst *bytes.Buffer, src []byte, flags ...CompressFlag) ([]byte, CompressionCodecType)
}
type Decompressor interface {
    Decompress(src []byte, codecType CompressionCodecType) ([]byte, error)
}
type Metric struct {
        Name string
        Type MetricType
        ValueInt int64
        ValueFloat float64
        Attrs map[string]any
}
type MetricType uint8
type Pool any
type PoolDecompressBytes interface {
        GetDecompressBytes(compressed []byte, codec CompressionCodecType) []byte
        PutDecompressBytes([]byte)
}
type PoolKRecords interface {
        GetKRecords(n int) []kmsg.Record
        PutKRecords([]kmsg.Record)
}
type PoolRecords interface {
        GetRecords(n int) []Record
        PutRecords([]Record)
}
type ProcessFetchPartitionOpts struct {
        KeepControlRecords bool
        DisableCRCValidation bool
        Offset int64
        IsolationLevel IsolationLevel
        Topic string
        Partition int32
        Pools []Pool
}

func DefaultCompressor(...CompressionCodec) (Compressor, error)
func DefaultDecompressor(...Pool) Decompressor
func IsRetryableBrokerErr(error) bool
func ProcessFetchPartition(ProcessFetchPartitionOpts, *kmsg.FetchResponseTopicPartition, Decompressor, func(FetchBatchMetrics)) (FetchPartition, int64)

func DisableClientMetrics() Opt
func OnRebootstrapRequired(func() ([]string, error)) Opt
func UserMetricsFn(fn func() iter.Seq[Metric]) Opt
func WithContext(ctx context.Context) Opt
func WithPools(pools ...Pool) Opt

func ConsumeStartOffset(Offset) ConsumerOpt
func DisableFetchCRCValidation() ConsumerOpt
func RecheckPreferredReplicaInterval(time.Duration) ConsumerOpt
func WithDecompressor(decompressor Decompressor) ConsumerOpt

func DefaultProduceTopicAlways() ProducerOpt
func WithCompressor(Compressor) ProducerOpt

func (*Client) Context() context.Context
func (*Client) EnsureProduceConnectionIsOpen(context.Context, ...int32) error
func (*Client) RequestCachedMetadata(context.Context, *kmsg.MetadataRequest, time.Duration) (*kmsg.MetadataResponse, error)

func (*Record) Recycle()

Relevant commits

This is a small selection of what I think are the most pertinent commits in
this release. This release is very large, though. Many commits and PRs have
been left out that introduce or change smaller things.

  • 07e57d3e kgo: remove all EndAndBeginTransaction internal "optimizations"
  • a54ffa96 kgo: add ConsumeStartOffset, expand offset docs, update readme KIPs
  • PR #988 kgo: add support for KIP-714 (client metrics)
  • 7a17a03c kgo: fix data race in consumer code path
  • ae96af1d kgo: expose IsRetryableBrokerErr
  • 1eb82fee kgo: add EnsureProduceConnectionIsOpen
  • fc778ba8 kgo: fix AllowAutoTopicCreation && ConsumeRegex when used together
  • ae7eea7c kgo: add DisableFetchCRCValidation option
  • 6af90823 kgo: add the ability to pool memory in a few places while consuming
  • 8c7a36db kgo: export utilities for decompressing and parsing partition fetch responses
  • 33400303 kgo: do a slab allocation for Record's when processing a batch
  • 39c2157a kgo: add WithCompressor and WithDecompressor options
  • 9252a6b6 kgo: export Compressor and Decompressor
  • be15c285 kgo: add Client.RequestCachedMetadata
  • fc040bc0 kgo: add OnRebootstrapRequired
  • c8aec00a kversion: document changes through 4.0
  • 718c5606 kgo: remove all code handling EndBeginTxnUnsafe, make it a no-op
  • 5494c59e kversions: entirely reimplement internals
  • 9d266fcd kgo: allow outstanding produce requests to be context canceled if the user disables idempotency
  • c60bf4c2 kgo: add DefaultProduceTopicAlways ProducerOpt
  • 50cfe060 kgo: fix off-by-one with retries accounting
  • e9ba83a6, 05099ba0 kgo: add WithContext, Client.Context()
  • ddb0c0c3 kgo: fix cancellation of a fetch in manageFetchConcurrency
  • 83843a53 kgo: fixed panic when keyvals len equals 1

v1.18.1

Compare Source

===

This patch release contains a myriad of fixes for relatively minor bugs, a
few improvements, and updates all dependencies. Both pkg/kadm and pkg/sr
are also being released as minors in tandem with a few quality of life APIs.

Bug fixes

  • Previously, if records were successfully produced but returned with an
    invalid offset (-1), the client would erroneously return bogus offsets
    to the end users. This has been fixed to return -1. (Note this was never
    encountered in the wild).

  • Pausing topics & partitions while using PollRecords previously could result
    in incorrect accounting in BufferedFetchRecords and BufferedFetchBytes,
    permanently causing the numbers returned to be larger than reality. That is,
    it is possible the functions would return non-zero numbers even though nothing
    was buffered.

  • When consuming from a follower (i.e. you were using the Rack option and your
    cluster is configured with follower fetching), if the follower you consumed from
    had a higher log start offset than the leader, and if you were trying to consume
    from an early offset that exists on the leader but not on the follower, the client
    would enter a permanent spinloop trying to list offsets against the follower.
    This is due to KIP-392 case 3, which mentions that clients should send a ListOffsets
    request to the follower -- this is not the case; Kafka actually returns NotLeaderOrFollower
    when that request is sent to the follower. Now the client clears the preferred replica
    and sends the next fetch request to the leader, at which point the leader can either
    serve the request or redirect back to a different preferred replica.

Improvements

  • When finishing batches, if any records were blocked in Produce because the
    client hit the maximum number of buffered records, the client would broadcast
    to the waiters one message at a time, for every message finished, until no
    other goroutines were waiting to produce. When lingering is enabled, linger
    occurs except when the client has reached the maximum number of buffered
    records; once at max, the client tries to flush until more records can be
    buffered. If you have a few concurrent producers, they all hang trying to
    buffer. As soon as one is signaled, it grabs the free spot, enters the client
    as buffered, sees the client is again at max buffered, and immediately creates
    a batch rather than lingering. Thus, signaling one waiter at a time would
    create many small single-record batches, each causing a round trip to the
    cluster, resulting in slow performance. Now, by finishing a batch at a time,
    the client opens many slots at once for any waiting producers, and ideally
    they can all buffer without hitting max buffered and clearing any linger
    state. Note that single-message batches can still cause the original
    behavior, but there is not much more that can be done.

  • Decompression errors encountered while consuming are now returned to the end user, rather
    than being stripped internally. Previously, stripping the error internally would result in
    the client spinlooping: it could never make forward progress and nothing ever signaled the
    end user that something was going wrong.

Relevant commits

  • 13584b5 feature kadm: always request authorized operations
  • 847095b bugfix kgo: redirect back to the leader on KIP-392 case 3 failure
  • d6d3015 feature pkg/sr: add PreReq option (and others by @​mihaitodor, thank you!)
  • 1473778 improvement kgo: return decompression errors while consuming
  • 3e9beae bugfix kgo: fix accounting when topics/partitions are {,un}paused for PollRecords
  • ead18d3 improvement kgo: broadcast batch finishes in one big blast
  • aa1c73c feature kadm: add func to decode AuthorizedOperations (thanks @​weeco!)
  • f66d495 kfake: do not listen until the cluster is fully set up
  • 2eed36e bugfix pkg/kgo: fix handling of invalid base offsets (thanks @​rodaine!)

v1.18.0

Compare Source

===

This release adds support for Kafka 3.7, adds a few community requested APIs,
some internal improvements, and fixes two bugs. One of the bugfixes is for a
deadlock; it is recommended to bump to this release to ensure you do not run
into the deadlock. The features in this release are relatively small.

This adds protocol support for KIP-890 and KIP-994, and
adds further protocol support for KIP-848. If you are using
transactions, you may see a new kerr.TransactionAbortable error, which
signals that your ongoing transaction should be aborted and will not be
successful if you try to commit it.

Lastly, there have been a few improvements to pkg/sr that are not mentioned
in these changelog notes.

Bug fixes

  • If you canceled the context used while producing while your client was
    at the maximum buffered records or bytes, it was possible to experience
    deadlocks. This has been fixed. See #832 for more details.

  • Previously, if using GetConsumeTopics while regex consuming, the function
    would return all topics ever discovered. It now returns only the topics that
    are being consumed.

Improvements

  • The client now internally ignores OutOfOrderSequenceNumber errors
    encountered when producing, where possible. If a producer produces very infrequently,
    it is possible the broker forgets the producer by the next time the producer
    produces. In this case, the producer receives an OutOfOrderSequenceNumber error.
    The client now internally resets properly so that you do not see the error.

Features

  • AllowRebalance and CloseAllowingRebalance have been added to GroupTransactSession.
  • The FetchTopic type now includes the topic's TopicID.
  • The ErrGroupSession internal error field is now public, allowing you to test how you handle the internal error.
  • You may now receive a kerr.TransactionAbortable error from many functions while using transactions.

Relevant commits

  • 0fd1959d kgo: support Kafka 3.8's kip-890 modifications
  • 68163c55 bugfix kgo: do not add all topics to internal tps map when regex consuming
  • 3548d1f7 improvement kgo: ignore OOOSN where possible
  • 6a759401 bugfix kgo: fix potential deadlock when reaching max buffered (records|bytes)
  • 4bfb0c68 feature kgo: add TopicID to the FetchTopic type
  • 06a9c47d feature kgo: export the wrapped error from ErrGroupSession
  • 4affe8ef feature kgo: add AllowRebalance and CloseAllowingRebalance to GroupTransactSession

v1.17.1

Compare Source

===

This patch release fixes four bugs (two are fixed in one commit), contains two
internal improvements, and adds two other minor changes.

Bug fixes

  • If you were using the MaxBufferedBytes option and ever hit the max, odds
    are that you would eventually experience a deadlock. That has been fixed.

  • If you ever produced a record with no topic field and without using DefaultProduceTopic,
    or if you produced a transactional record while not in a transaction, AND if the client
    was at the maximum buffered records, odds are you would eventually deadlock.
    This has been fixed.

  • It was previously not possible to set lz4 compression levels.

  • There was a data race on a boolean field if a produce request was being
    written at the same time a metadata update happened, and if the metadata
    update has an error on the topic or partition that is actively being written.
    Note that the race was unlikely and if you experienced it, you would have noticed
    an OutOfOrderSequenceNumber error. See this comment
    for more details.

Improvements

  • Canceling the context you pass to Produce now propagates in two more areas:
    the initial InitProducerID request that occurs the first time you produce,
    and if the client is internally backing off due to a produce request failure.
    Note that there is no guarantee on which context is used for cancelation if
    you produce many records, and the client does not allow canceling if it is
    currently unsafe to do so. However, this does mean that if your cluster is
    somewhat down such that InitProducerID is failing on your new client, you
    can now actually cause the Produce to quit. See this comment
    for what it means for a record to be "safe" to fail.

  • The client now ignores aborted records while consuming only if you have
    configured FetchIsolationLevel(ReadCommitted()). Previously, the client relied
    entirely on the FetchResponse AbortedTransactions field, but it's possible
    that brokers could send aborted transactions even when not using read committed.
    Specifically, this was a behavior difference in Redpanda, and the KIP that introduced
    transactions and all relevant documents do not mention what the broker behavior
    actually should be here. Redpanda itself was also changed to not send aborted
    transactions when using read committed, but we may as well improve franz-go as well.

  • Decompression now better reuses buffers under the hood, reducing allocations.

  • Brokers that return preferred replicas to fetch from now cause an info-level
    log in the client.

Relevant commits

  • 305d8dc kgo: allow record ctx cancelation to propagate a bit more
  • 24fbb0f bugfix kgo: fix deadlock in Produce when using MaxBufferedBytes
  • 1827add bugfix kgo sink: fix read/write race for recBatch.canFailFromLoadErrs
  • d7ea2c3 bugfix fix setting lz4 compression levels (thanks @​asg0451!)
  • 5809dec optimise: use byteBuffer pool in decompression (thanks @​kalbhor!)
  • cda897d kgo: add log for preferred replicas
  • e62b402 improvement kgo sink: do not back off on certain edge case
  • 9e32bf9 kgo: ignore aborted txns if using READ_UNCOMMITTED

v1.17.0

Compare Source

===

This long-coming release, four months after v1.16.0, adds support for Kafka 3.7
and adds a few community added or requested APIs. There will be a kadm release
shortly following this one, and maybe a plugin release.

This adds full support for KIP-951, as well as protocol support for
KIP-919 (which has no client facing features) and KIP-848
(protocol only, not the feature!). KIP-951 should make the client faster at
handling when the broker moves partition leadership to a different broker.

There are two fairly minor bug fixes in the kgo package in this release, both
described below. There is also one bugfix in the independent (and currently
untagged) pkg/sr module. Because pkg/sr is untagged, the bugfix was released
a long time ago, but the relevant commit is still mentioned below.

Bug fixes

  • Previously, upgrading a consumer group from non-cooperative to cooperative
    while the group was running did not work. This is now fixed (by @​hamdanjaveed, thank you!).

  • Previously, if a cooperative consumer group member rebalanced while fetching
    offsets for partitions, if those partitions were not lost in the rebalance,
    the member would call OnPartitionsAssigned with those partitions again.
    Now, those partitions are passed to OnPartitionsAssigned only once (the first time).

Improvements

  • The client will now stop lingering if you hit max buffered records or bytes.
    Previously, if your linger was long enough, you could stall once you hit
    either of the Max options; that is no longer the case.

  • If you are issuing admin APIs on the same client you are using for consuming
    or producing, you may see fewer metadata requests being issued.

There are a few other even more minor improvements in the commit list if you
wish to go spelunking :).

Features

  • The Offset type now has a new method AtCommitted(), which causes the
    consumer to not fetch any partitions that do not have a previous commit.
    This mirrors Kafka's auto.offset.reset=none option.
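
    A sketch of the new option, mirroring auto.offset.reset=none:

      cl, err := kgo.NewClient(
          kgo.ConsumerGroup("g"),
          kgo.ConsumeTopics("t"),
          // do not fetch partitions that lack a previous commit
          kgo.ConsumeResetOffset(kgo.NewOffset().AtCommitted()),
      )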

  • KIP-951, linked above and the commit linked below, improves latency around
    partition leader transfers on brokers.

  • Client.GetConsumeTopics allows you to query what topics the client is
    currently consuming. This may be useful if you are consuming via regex.

  • Client.MarkCommitOffsets allows you to mark offsets to be committed in
    bulk, mirroring the non-mark API CommitOffsets.

Relevant commits

franz-go
  • a7caf20 feature kgo.Offset: add AtCommitted()
  • 55dc7a0 bugfix kgo: re-add fetch-canceled partitions AFTER the user callback
  • db24bbf improvement kgo: avoid / wakeup lingering if we hit max bytes or max records
  • 993544c improvement kgo: Optimistically cache mapped metadata when cluster metadata is periodically refreshed (thanks @​pracucci!)
  • 1ed02eb feature kgo: add support for KIP-951
  • 2fbbda5 bugfix fix: clear lastAssigned when revoking eager consumer
  • d9c1a41 pkg/kerr: add new errors
  • 54d3032 pkg/kversion: add 3.7
  • 892db71 pkg/sr bugfix sr SubjectVersions calls pathSubjectVersion
  • ed26ed0 feature kgo: adds Client.GetConsumeTopics (thanks @​UnaffiliatedCode!)
  • 929d564 feature kgo: adds Client.MarkCommitOffsets (thanks @​sudo-sturbia!)
kfake

kfake as well has a few improvements worth calling out:

  • 18e2cc3 kfake: support committing to non-existing groups
  • b05c3b9 kfake: support KIP-951, fix OffsetForLeaderEpoch
  • 5d8aa1c kfake: fix handling ListOffsets with requested timestamp

v1.16.1

Compare Source

===

This patch release fixes one bug and un-deprecates SaramaHasher.

SaramaHasher, while not identical to Sarama's partitioner, actually is
identical to some other partitioners in the Kafka client ecosystem. So, the old
function is now un-deprecated, but the documentation correctly points you to
SaramaCompatHasher and mentions why you may still want to use SaramaHasher.

For the bug: if you tried using CommitOffsetsSync during a group rebalance, and
you canceled your context while the group was still rebalancing, then
CommitOffsetsSync would enter a deadlock and never return. That has been fixed.

v1.16.0

Compare Source

===

This release contains a few minor APIs and internal improvements and fixes two
minor bugs.

One new API that is introduced also fixes a bug. API-wise, the SaramaHasher
was actually not a 1:1 compatible hasher. The logic was identical, but there
was a rounding error because Sarama uses int32 modulo arithmetic, whereas kgo
used int (which is likely int64), which caused a different hash result. A new
SaramaCompatHasher has been introduced and the old SaramaHasher has been
deprecated.

The other bugfix is that OptValue on the kgo.Logger option panicked if you
were not using a logger. That has been fixed.

The only other APIs that are introduced are in the kversions package; they
are minor, see the commit list below.

If you issue a sharded request and any of the responses has a retryable error
in the response, this is no longer returned as a top-level shard error. The
shard error is now nil, and you can properly inspect the response fully.

Lastly (besides other internal minor improvements not worth mentioning),
metadata fetches can now inject fake fetches if the metadata response has topic
or partition load errors. This is unconditionally true for non-retryable
errors. If you use KeepRetryableFetchErrors, you can now also see when
metadata fetching is showing unknown topic errors or other retryable errors.

  • a2340eb improvement pkg/kgo: inject fake fetches on metadata load errors
  • d07efd9 feature kversion: add VersionStrings, FromString, V3_6_0
  • 8d30de0 bugfix pkg/kgo: fix OptValue with no logger set
  • 012cd7c improvement kgo: do not return response ErrorCode's as shard errors
  • 1dc3d40 bugfix: actually have correct sarama compatible hasher (thanks @​C-Pro)

v1.15.4

Compare Source

===

This patch release fixes a bug that is difficult to encounter but fatal for
group consuming.

The sequence of events to trigger this bug:

  • OffsetCommit is issued before Heartbeat
  • The coordinator for the group needs to be loaded (so, likely, a previous NOT_COORDINATOR error was received)
  • OffsetCommit triggers the load
  • a second OffsetCommit happens while the first is still running, canceling the first OffsetCommit's context

In this sequence of events, FindCoordinator will fail with context.Canceled
and, importantly, also return that error to Heartbeat. In the guts of the
client, a context.Canceled error should only happen when a group is being
left, so this error is recognized as a group-is-leaving error and the group
management goroutine exits. Thus, the group is never rejoined.

This likely requires a system to be overloaded to begin with, because
FindCoordinator requests are usually very fast.

The fix is to use the client context when issuing FindCoordinator, rather than
the parent request's context. The parent request can still quit, but FindCoordinator
continues. No parent request can affect any other waiting request.

This patch also includes a dep bump for everything but klauspost/compress;
klauspost/compress changed go.mod to require go1.19, while this repo still
requires 1.18. v1.16 will change to require 1.19 and then this repo will bump
klauspost/compress.

There were multiple additions to the yet-unversioned kfake package, so that an
advanced "test" could be written to trigger the behavior for this patch and
then ensure it is fixed. To see the test, please check the comment on PR
#650.

  • 7d050fc kgo: do not cancel FindCoordinator if the parent context cancels

v1.15.3

Compare Source

===

This patch release fixes one minor bug, reduces allocations on gzip and lz4
decompression, and contains a behavior improvement when OffsetOutOfRange is
received while consuming.

For the bugfix: previously, if the client was using a fetch session (as is the
default when consuming), and all partitions for a topic transfer to a different
broker, the client would not properly unregister the topic from the prior
broker's fetch session. This could result in more data being consumed and
discarded than necessary (although, it's possible the broker just reset the
fetch session anyway, I'm not entirely positive).

  • fdf371c use bytes buffer instead of ReadAll (thanks @​kalbhor!)
  • e6ed69f consuming: reset to nearest if we receive OOOR while fetching
  • 1b6a721 bugfix kgo source: use the proper topic-to-id map when forgetting topics

v1.15.2

Compare Source

===

This patch release fixes two bugs and changes Mark functions to be no-ops when
not using AutoCommitMarks to avoid confusion. This also includes a minor commit
further improving the sticky balancer. See the commits for more details.

  • 72778cb behavior change kgo: no-op mark functions when not using AutoCommitMarks
  • e209bb6 bugfix kgo: pin AddPartitionsToTxn to v3 when using one transaction
  • 36b4437 sticky: further improvements
  • af5bc1f bugfix kgo: be sure to use topics when other topics are paused

v1.15.1

Compare Source

===

This patch release contains a bunch of internal improvements to kgo and
includes a bugfix for a very hard to encounter logic race. Each improvement
is a bit focused on a specific use case, so I recommend reading any relevant-to-you
commit message below.

As well, the kversion package now detects Kafka 3.6, and the kgo package now
handles AddPartitionsToTxn v4 (however, you will probably not be issuing this
request).

Lastly, this release is paired with a minor kadm release, which adds the
ErrMessage field to CreateTopicsResponse and DeleteTopicsResponse, and,
importantly, fixes a data race in the ApiVersions request.

franz-go
  • 2a3b6bd improvement kversion: detect 3.6
  • fe5a660 improvement kgo: add sharding for AddPartitionsToTxn for KIP-890
  • b2ccc2f improvement kgo: reintroduce random broker iteration
  • 54a7418 improvement kgo: allow PreTxnCommitFnContext to modify empty offsets
  • c013050 bugfix kgo: avoid rare panic
  • 0ecb52b improvement kgo: do not rotate the consumer session when pausing topics/partitions
  • 1429d47 improvement sticky balancer: try for better topic distribution among members
kadm
  • 1955938 bugfix kadm: do not reuse ApiVersions in many concurrent requests
  • 66974e8 feature kadm: include ErrMessage in topic response

v1.15.0

Compare Source

===

This release comes 74 days (just over two months) since the last minor release.
This mostly contains new features, with one relatively minor but important bug
addressed (and one very minor bug fixed).

Bug fixes

  • Fetch sessions now properly send forgotten topics if we forget the entire
    topic (not just individual partitions while other partitions are still
    consumed on the broker). For long-running clients where partitions move
    around the cluster a bunch over time, this ensures we are not sending requests
    with null topics / null topic IDs. See #535 for more details.
  • If the client talks to an http endpoint, we now properly detect the bytes
    'HTTP' and send a more relevant error message. This previously existed, but
    used the wrong int32 internally so 'HTTP' was not properly detected (thanks
    @​alistairking!).

Features

  • RecordReader now supports %v{json} to parse json inputs
  • RecordReader now supports %v{}, an empty no-op formatting directive
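
    For illustration, a sketch using the first directive with the existing
    NewRecordReader constructor:

      rr, _ := kgo.NewRecordReader(os.Stdin, "%v{json}\n")
      rec, err := rr.ReadRecord() // each stdin line must be valid JSON; it becomes the value
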
  • Adds PurgeTopicsFromProducing and PurgeTopicsFromConsuming to purge either
    the producing or consuming half of the client, if you are producing to and
    consuming from the same topics in the same client.
  • The new ConsiderMissingTopicDeletedAfter option allows finer grained
    control for how long a regexp-discovered topic can be missing on the cluster
    before the topic is considered deleted (and then internally purged in the
    client)
  • Adds NewErrFetch to create a single-error Fetches, for use in end-user
    tests / stubs.
  • The new MaxBufferedBytes option can control how many bytes are buffered
    when producing, an option similar to MaxBufferedRecords.
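
    For example (a sketch):

      cl, err := kgo.NewClient(
          kgo.MaxBufferedBytes(64 << 20), // block Produce once ~64 MiB is buffered
      )
      // ...produce for a while...
      fmt.Println(cl.BufferedProduceBytes()) // keys+values+headers currently buffered
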
  • Adds BufferedFetchBytes and BufferedProduceBytes to return the total
    bytes in records buffered (note this counts only keys, values, and headers).
  • Adds PreTxnCommitFnContext to allow custom Metadata annotation for
    transactional commits.
  • Adds LeaveGroupContext to control how long leaving a group can take, as
    well as to return any leave group error.

Relevant commits

franz-go
  • 4dcfb06 feature kgo: add LeaveGroupContext
  • 910e91d and 60b601a feature kgo: add PreTxnCommitFnContext
  • c80d6f4 feature kgo: add Buffered{Fetch,Produce}Bytes
  • 304559f feature kgo: support MaxBufferedBytes
  • 310a5da feature kgo: add NewErrFetch
  • 504a9d7 feature kgo: expose ConsiderMissingTopicDeletedAfter
  • 253e1a9 feature kgo: add PurgeTopicsFrom{Producing,Consuming}
  • 055e2d8 improvement kgo record formatter: accept %v{} to be a no-op (plain read/format)
  • 37edfb9 improvement kgo.RecordReader: support %v{json} to read json values
  • 8a9a459 bugfix kgo: track topic IDs in the fetch session
  • 9d25d3a bugfix kgo: fix typo in parseReadSize to properly detect and warn about talking to HTTP endpoints (thanks @​alistairking!)
kadm

This release comes with a corresponding kadm release that contains a few
behavior changes and improvements.

  • bfd07b2 kadm: fix occasionally empty topic/partitions in Lag
  • 00ac608 kadm: change FetchOffsetsForTopics to only return requested topics by default

v1.14.4

Compare Source

===

This small patch fixes kversion.VersionGuess to properly guess versions against
zookeeper broker versions v2.7 through 3.4. See the commit for details.

  • 5978156 bugfix kversion: fix version detection for Kafka v2.7 through 3.4

v1.14.3

Compare Source

===

This patch fixes an unending internal loop of metadata reloading that occurred
when a regex-consumed topic was deleted (the client kept trying to discover
where the topic went).

  • 627d39a bugfix kgo: fix / improve handling deleted topics while regex consuming

v1.14.2

Compare Source

===

This patch fixes an internal logic race that can be easily encountered when
specifying exact offsets to consume from. If you encountered this bug, your
consumer could just stop consuming for an indeterminate amount of time. This
bug has existed for a long time; hitting it requires both a slow client and a
fast broker.

  • 1f696ca bugfix kgo: avoid a consumer logic race where the consumer stops consuming

v1.14.1

Compare Source

===

This patch release, quick on the heels of v1.14.0, fixes a race condition
introduced in v1.14 in the PauseFetchTopics and PauseFetchPartitions
functions, a second race condition that can occur when purging a topic, and
fully addresses #493, which was
not completely addressed in v1.14.0.

  • 8c785fa bugfix kgo: fix race between client closing and purging
  • dc5283e kgo: re-fix #​493, supporting other buggy clients, and add a test
  • 32ac27f bugfix kgo: ensure assignPartitions is locked when pausing topics/partitions

v1.14.0

Compare Source

===

This release contains a few new APIs, one behavior change, and one minor bugfix.

Bug fixes

Previously, HookBrokerRead and HookBrokerE2E could not be used at the same
time. This has been fixed.

Behavior changes

PauseFetch{Topics,Partitions} now causes the client to drop all buffered
fetches and kill all in-flight fetch requests. Importantly, this also means
that once you pause, it is no longer possible for what you paused to be
returned while polling. Previously, the client made no attempt to clear
internal buffers / in flight requests, meaning you could receive paused data
for a while.

Seed brokers now show up in logs as seed_### rather than seed ### (an
underscore has been added).

Features

  • kgo.Offset now has an EpochOffset getter function that allows access
    to the actual epoch and offset that are inside the opaque Offset type.
  • AddConsumePartitions allows adding individual partitions to consume, and
    the new counterpart RemoveConsumePartitions allows removing individual
    partitions from being consumed. Removing is different from purging, please
    see the docs.
  • KeepRetryableFetchErrors bubbles up retryable errors to the end user that
    are encountered while fetching. By default, these errors are stripped.
  • kversion now supports Kafka 3.5
  • kversion now supports version guessing against KRaft by default
  • kgo.DialTLS now exists to even more easily opt into TLS.
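
    For example (a sketch):

      cl, err := kgo.NewClient(
          kgo.SeedBrokers("broker-1.example.com:9093"),
          kgo.DialTLS(), // dial all connections with a default tls.Config
      )
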
  • kgo.Client.Opts now exists to return the original options that were used
    to configure the client, making initializing new clients easier.
  • kgo.NodeName returns a string form of a broker node name. Internally, seed
    brokers use math.MinInt32 for node IDs, which shows up as massively negative
    numbers in logs sometimes. NodeName can help convert that to seed_<#>.

Relevant commits

  • c3b083b improvement kgo: do not returned paused topics/partitions after pausing
  • e224e90 bugfix kgo: allow HookBrokerRead and HookBrokerE2E to both be called
  • 875761a feature kgo Offset: add EpochOffset getter field
  • c5d0fc5 kgo: add a debug log for stripping retryable errors from fetches
  • b45d663 kgo: add more context to opportunistic metadata loads while fetching
  • 9dae366 kgo: allow retries on dial timeouts
  • 00e4e76 kgo: tolerate buggy v1 group member metadata
  • 34c8b3d feature kgo: add AddConsumePartitions and RemoveConsumePartitions
  • b5cafba sasl: validate non-empty user/pass/token
  • 76d2e71 feature kgo: add KeepRetryableFetchErrors
  • 0df3ec0 kgo: fix new niche CI problem against Kafka 3.5
  • 8ff1d0d feature pkg/kversion: attempt to guess KRaft by default as well
  • e92f5d9 feature pkg/kversion: detect v3.5
  • f1b923e feature kgo: add DialTLS option
  • 9667967

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.


renovate bot commented Apr 6, 2025

ℹ Artifact update notice

File name: go.mod

In order to perform the update(s) described in the table above, Renovate ran the go get command, which resulted in the following additional change(s):

  • 3 additional dependencies were updated
  • The go directive was updated for compatibility reasons

Details:

Package | Change
go | 1.17 -> 1.23.8
github.com/twmb/franz-go/pkg/kmsg | v0.0.0-20220114004744-91b30863ac2f -> v1.11.2
github.com/klauspost/compress | v1.13.6 -> v1.18.0
github.com/pierrec/lz4/v4 | v4.1.11 -> v4.1.22

@renovate renovate bot force-pushed the renovate/github.com-twmb-franz-go-1.x branch 2 times, most recently from 322b45e to e3fcfce Compare May 8, 2025 19:30
@renovate renovate bot changed the title fix(deps): update module github.com/twmb/franz-go to v1.18.1 fix(deps): update module github.com/twmb/franz-go to v1.19.0 May 8, 2025
@renovate renovate bot force-pushed the renovate/github.com-twmb-franz-go-1.x branch from e3fcfce to f5e604a Compare May 9, 2025 16:13
@renovate renovate bot changed the title fix(deps): update module github.com/twmb/franz-go to v1.19.0 fix(deps): update module github.com/twmb/franz-go to v1.19.1 May 9, 2025
@renovate renovate bot force-pushed the renovate/github.com-twmb-franz-go-1.x branch from f5e604a to 458d93a Compare May 15, 2025 06:15
@renovate renovate bot changed the title fix(deps): update module github.com/twmb/franz-go to v1.19.1 fix(deps): update module github.com/twmb/franz-go to v1.19.2 May 15, 2025
@renovate renovate bot force-pushed the renovate/github.com-twmb-franz-go-1.x branch from 458d93a to 1564705 Compare May 15, 2025 22:31
@renovate renovate bot changed the title fix(deps): update module github.com/twmb/franz-go to v1.19.2 fix(deps): update module github.com/twmb/franz-go to v1.19.3 May 15, 2025
@renovate renovate bot force-pushed the renovate/github.com-twmb-franz-go-1.x branch from 1564705 to 7a8ca5f Compare May 20, 2025 17:24
@renovate renovate bot changed the title fix(deps): update module github.com/twmb/franz-go to v1.19.3 fix(deps): update module github.com/twmb/franz-go to v1.19.4 May 20, 2025
@renovate renovate bot force-pushed the renovate/github.com-twmb-franz-go-1.x branch from 7a8ca5f to 4b72c22 Compare June 3, 2025 00:55
@renovate renovate bot changed the title fix(deps): update module github.com/twmb/franz-go to v1.19.4 fix(deps): update module github.com/twmb/franz-go to v1.19.5 Jun 3, 2025
@renovate renovate bot force-pushed the renovate/github.com-twmb-franz-go-1.x branch from 4b72c22 to ac89016 Compare August 10, 2025 12:37
@renovate renovate bot force-pushed the renovate/github.com-twmb-franz-go-1.x branch from ac89016 to 0cfda7e Compare October 9, 2025 13:32