pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test: TestTenantLogic_drop_role_with_default_privileges failed #140494

Closed
github-actions bot opened this issue Feb 5, 2025 · 6 comments · Fixed by #140607
Assignees
Labels
branch-master Failures and bugs on the master branch. branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 branch-release-25.1 C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@github-actions
github-actions bot commented Feb 5, 2025

pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test.TestTenantLogic_drop_role_with_default_privileges failed on master @ d276833ff08ef11d7894b7b0124f1883d90a3377:

=== RUN   TestTenantLogic_drop_role_with_default_privileges
    test_log_scope.go:165: test logs captured to: outputs.zip/logTestTenantLogic_drop_role_with_default_privileges3370584135
    test_log_scope.go:76: use -show-logs to present logs inline
[05:11:25] setting distsql_workmem='34597B';
[05:11:25] setting distsql_workmem='34597B';
[05:11:34] --- done: /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges with config 3node-tenant: 25 tests, 0 failures
    logic.go:4251: 
        /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges:88: error while processing
    logic.go:4251: 
        /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges:88: 
        expected success, but found
        (40001) restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_CLIENT_REJECT): "sql txn" meta={id=ec5e6a90 key=/Tenant/10/Table/23/1/"testuser2"/"root"/0 iso=Serializable pri=0.00968929 epo=0 ts=1738732288.015870128,1 min=1738732288.015870128,0 seq=17} lock=true stat=PENDING rts=1738732288.015870128,0 wto=false gul=1738732288.515870128,0
    panic.go:626: -- test log scope end --
test logs left over in: outputs.zip/logTestTenantLogic_drop_role_with_default_privileges3370584135
--- FAIL: TestTenantLogic_drop_role_with_default_privileges (10.55s)

Parameters:

  • attempt=1
  • run=1
  • shard=39
Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

Jira issue: CRDB-47188

@github-actions github-actions bot added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-sql-queries SQL Queries Team labels Feb 5, 2025
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Feb 5, 2025
@yuzefovich yuzefovich added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-queries SQL Queries Team labels Feb 5, 2025
@yuzefovich yuzefovich removed this from SQL Queries Feb 5, 2025
@yuzefovich
Member

@rafiss looks like this happened after #140400 merged

@rafiss rafiss self-assigned this Feb 5, 2025
@rafiss rafiss removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Feb 5, 2025
@cockroach-teamcity
Member

pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test.TestTenantLogic_drop_role_with_default_privileges failed on master @ d276833ff08ef11d7894b7b0124f1883d90a3377:

=== RUN   TestTenantLogic_drop_role_with_default_privileges
    test_log_scope.go:165: test logs captured to: outputs.zip/logTestTenantLogic_drop_role_with_default_privileges3859805891
    test_log_scope.go:76: use -show-logs to present logs inline
[18:56:19] --- progress: /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges: 24 statements
[18:56:25] --- done: /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges with config 3node-tenant: 25 tests, 0 failures
    logic.go:4251: 
        /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges:88: error while processing
    logic.go:4251: 
        /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges:88: 
        expected success, but found
        (40001) restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_CLIENT_REJECT): "sql txn" meta={id=8318a158 key=/Tenant/10/Table/23/1/"testuser3"/"root"/0 iso=Serializable pri=0.01498107 epo=0 ts=1738781779.389503497,1 min=1738781779.389503497,0 seq=17} lock=true stat=PENDING rts=1738781779.389503497,0 wto=false gul=1738781779.889503497,0
    panic.go:626: -- test log scope end --
test logs left over in: outputs.zip/logTestTenantLogic_drop_role_with_default_privileges3859805891
--- FAIL: TestTenantLogic_drop_role_with_default_privileges (10.58s)

Parameters:

  • attempt=1
  • run=17
  • shard=39
Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

Author

github-actions bot commented Feb 5, 2025

pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test.TestTenantLogic_drop_role_with_default_privileges failed on master @ f18cf53e67c1c6eab453eb7be40743b86cdbf349:

=== RUN   TestTenantLogic_drop_role_with_default_privileges
    test_log_scope.go:165: test logs captured to: outputs.zip/logTestTenantLogic_drop_role_with_default_privileges1939526212
    test_log_scope.go:76: use -show-logs to present logs inline
[21:58:29] --- done: /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges with config 3node-tenant: 25 tests, 0 failures
    logic.go:4251: 
        /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges:88: error while processing
    logic.go:4251: 
        /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges:88: 
        expected success, but found
        (40001) restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_ABORT_SPAN): "sql txn" meta={id=15aac21a key=/Tenant/10/Table/23/1/"testuser3"/"root"/0 iso=Serializable pri=0.01574010 epo=0 ts=1738792705.859997708,2 min=1738792702.863063315,0 seq=17} lock=true stat=ABORTED rts=1738792702.863063315,0 wto=false gul=1738792703.363063315,0
    panic.go:626: -- test log scope end --
test logs left over in: outputs.zip/logTestTenantLogic_drop_role_with_default_privileges1939526212
--- FAIL: TestTenantLogic_drop_role_with_default_privileges (10.57s)

Parameters:

  • attempt=1
  • run=1
  • shard=39
Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

@rafiss
Collaborator

rafiss commented Feb 6, 2025

I found another cause of this kind of flake. Similar to my discoveries in #140400, there's another background job that is contending with the foreground schema change transactions.

This time it's the removeClaimsFromDeadSessions query that runs in a loop in the job registry:

// removeClaimsFromDeadSessions queries the jobs table for non-terminal
// jobs and nullifies their claims if the claims are owned by known dead sessions.
removeClaimsFromDeadSessions := func(ctx context.Context, s sqlliveness.Session) {
	if err := r.db.Txn(ctx, func(ctx context.Context, txn isql.Txn) error {
		// Run the expiration transaction at low priority to ensure that it does
		// not contend with foreground reads. Note that the adoption and cancellation
		// queries also use low priority so they will interact nicely.
		if err := txn.KV().SetUserPriority(roachpb.MinUserPriority); err != nil {
			return errors.WithAssertionFailure(err)
		}
		_, err := txn.ExecEx(
			ctx, "expire-sessions", txn.KV(),
			sessiondata.NodeUserSessionDataOverride,
			removeClaimsForDeadSessionsQuery,
			s.ID().UnsafeBytes(),
			cancellationsUpdateLimitSetting.Get(&r.settings.SV),
		)
		return err
	}); err != nil {
		log.Errorf(ctx, "error expiring job sessions: %s", err)
	}
}
// servePauseAndCancelRequests queries the pause-requested and cancel-requested

This seems to be a problem only in multitenant configs, which appear to put the TestCluster into an overloaded state. When that happens, the foreground schema change transaction doesn't heartbeat quickly enough, so the background job is able to abort the foreground transaction even though the background transaction is running at MinUserPriority.

Author

github-actions bot commented Feb 6, 2025

pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test.TestTenantLogic_drop_role_with_default_privileges failed on master @ 15a3216d965f778f64345a56ef2d606a4dc4b311:

=== RUN   TestTenantLogic_drop_role_with_default_privileges
    test_log_scope.go:165: test logs captured to: outputs.zip/logTestTenantLogic_drop_role_with_default_privileges1934119231
    test_log_scope.go:76: use -show-logs to present logs inline
[17:47:20] setting distsql_workmem='82269B';
[17:47:20] setting distsql_workmem='82269B';
[17:47:28] --- done: /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges with config 3node-tenant: 25 tests, 0 failures
    logic.go:4251: 
        /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges:88: error while processing
    logic.go:4251: 
        /var/lib/engflow/worker/work/0/exec/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/drop_role_with_default_privileges:88: 
        expected success, but found
        (40001) restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_ABORT_SPAN): "sql txn" meta={id=cb5cbf19 key=/Tenant/10/Table/23/1/"testuser3"/"root"/3/1 iso=Serializable pri=0.00941576 epo=0 ts=1738864042.597867977,1 min=1738864042.597867977,0 seq=21} lock=true stat=ABORTED rts=1738864042.597867977,1 wto=false gul=1738864043.097867977,0
    panic.go:626: -- test log scope end --
test logs left over in: outputs.zip/logTestTenantLogic_drop_role_with_default_privileges1934119231
--- FAIL: TestTenantLogic_drop_role_with_default_privileges (10.47s)

Parameters:

  • attempt=1
  • run=1
  • shard=39
Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

This test on roachdash | Improve this report!

@craig craig bot closed this as completed in 215d16b Feb 6, 2025

blathers-crl bot commented Feb 6, 2025

Based on the specified backports for linked PR #140607, I applied the following new label(s) to this issue: branch-release-24.1, branch-release-24.2, branch-release-24.3, branch-release-25.1. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 branch-release-25.1 labels Feb 6, 2025
3 participants