
Commit 3671865

craig[bot], renatolabs, erikgrinaker, aadityasondhi, and maryliag committed
118293: roachtest: add multi-region mixed-version test r=srosenberg,herkolategan a=renatolabs

This commit adds a `multi-region/mixed-version` test that creates a larger-sized cluster (4 regions, 10 nodes per region) and runs a constant background TPCC workload (with `tolerate-errors`), along with a short TPCC workload that runs in mixed-version state (without `tolerate-errors`).

With this test, we exercise our ability to perform several cluster upgrades in MR clusters. In the future, we plan to extend this test to include other kinds of randomized testing as well.

Fixes: #114803

Release note: None

**roachtest: change TPCC functions to take a logger instance**

This makes it easier to organize logs for a complex test by allowing the caller to inject the logger instance that should be used in a particular call to `runTPCC`.

More immediately, we make use of this change in the recently introduced `multi-region/mixed-version` test, and pass the step's logger to those functions.

118827: roachpb: stop reading `Lease.DeprecatedStartStasis` r=erikgrinaker a=erikgrinaker

This will allow removing the field in a later version. We still have to populate the field for backwards compatibility with 23.2.

Epic: none
Release note: None

118850: server: enable continuous CPU profiler r=aadityasondhi a=aadityasondhi

This patch enables the CPU profile with a threshold of 75% and a max frequency of 20min.

The motivation for this change is that recently, I have noticed a few cases during investigations where CPU utilization spikes but we lack profiles after the fact. This hinders our ability to dig deeper into the source of high CPU usage. Having profiles can help inform us of future AC integrations that we may need, or other performance improvements we can do elsewhere.

Informs #97699.

Release note (ops change): CRDB will now automatically generate CPU profiles if there is an increase in CPU utilization. This can help investigate possible issues after the fact.

118865: technotes: SQL statistics r=maryliag a=maryliag

Tech notes on SQL Statistics.

Part Of [CRDB-35839](https://cockroachlabs.atlassian.net/browse/CRDB-35839)

Images of how the diagrams generated using mermaid will be rendered:
<img width="814" alt="Screenshot 2024-02-06 at 4 10 05 PM" src="https://github.com/cockroachdb/cockroach/assets/1017486/9dfbd2b1-4454-41f1-8840-dc39d41059e5">
<img width="285" alt="Screenshot 2024-02-06 at 4 42 39 PM" src="https://github.com/cockroachdb/cockroach/assets/1017486/bf570d55-1f5d-454b-b693-fbcf5e082a39">

Release note: None

118905: pgwire: increase timeout and add Error() call r=rafiss a=rickystewart

This test seems to deadlock sporadically under remote execution. I believe the connection is timing out, which, due to the way this test was written, causes it to deadlock. I'm increasing the timeout to attempt to increase reliability, and separately, we also add a call to `Error()` to mark the test as failed should the connection fail.

Closes #118741.

Epic: None
Release note: None

118914: roachtest: print issue number after test failure r=srosenberg a=renatolabs

This commit updates the GitHub issue poster so that information about the issue is returned when an issue is created or a comment added. The roachtest test runner uses this information in the TeamCity output so that we can easily see the issue corresponding to a test failure directly in the TC overview page.

Epic: none

Release note: None

118928: backupccl: skip TestDataDriven_multiregion under duress r=rickystewart a=msbutler

Fixes #118567

Release note: none

118960: cli: update docs url for sql shell r=lunevalex a=lunevalex

The SQL shell help function redirects the user to use-the-built-in-sql-client.html, but this page no longer exists. Instead the SQL shell should point to cockroach-sql.html.

Epic: None
Release note (cli change): Change the SQL shell help URL to point to cockroach-sql.html.

118970: explain: fix overflow when printing estimated row count r=yuzefovich a=yuzefovich

Previously, when the optimizer estimated a very large row count (which is float64), we would first cast it to uint64 (which would work ok in case the count is large - we'd get `max int64 + 1`), and then we would cast it to int64, which can result in a negative number. This is now fixed by adding a check before casting to int64 to cap the value at max int64.

Epic: None

Release note: None

118975: changefeedccl: skip flaky tests for pulsar r=wenyihu6 a=wenyihu6

This patch skips flaky tests for pulsar sinks. We will add them back when pulsar is fully supported.

Release note: none
Fixes: #118938, #118937, #118936, #118935

Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Aaditya Sondhi <[email protected]>
Co-authored-by: maryliag <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Alex Lunev <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Wenyi Hu <[email protected]>
11 parents 6657a85 + 480791e + b468d1a + 9036430 + 7b52d39 + 63a324a + fa69450 + fe070a6 + 1c84dba + 10cad92 + 3ce4741 commit 3671865

71 files changed, 750 insertions(+), 180 deletions(-)

docs/tech-notes/observability/sql_stats.md

Lines changed: 226 additions & 0 deletions
Large diffs are not rendered by default.

pkg/ccl/backupccl/datadriven_test.go

Lines changed: 4 additions & 0 deletions
@@ -512,6 +512,10 @@ func runTestDataDriven(t *testing.T, testFilePathFromWorkspace string) {
 			skip.WithIssue(t, issue)
 			return ""

+		case "skip-under-duress":
+			skip.UnderDuress(t)
+			return ""
+
 		case "reset":
 			ds.cleanup(ctx, t)
 			ds = newDatadrivenTestState()

pkg/ccl/backupccl/testdata/backup-restore/multiregion

Lines changed: 3 additions & 0 deletions
@@ -1,5 +1,8 @@
 # disabled to run within tenant because multiregion primitives are not supported within tenant

+skip-under-duress
+----
+
 new-cluster name=s1 allow-implicit-access disable-tenant localities=us-east-1,us-west-1,eu-central-1
 ----

pkg/ccl/changefeedccl/helpers_test.go

Lines changed: 1 addition & 1 deletion
@@ -881,7 +881,7 @@ func randomSinkTypeWithOptions(options feedTestOptions) string {
 		"pubsub": 1,
 		"sinkless": 2,
 		"cloudstorage": 0,
-		"pulsar": 1,
+		"pulsar": 0,
 	}
 	if options.externalIODir != "" {
 		sinkWeights["cloudstorage"] = 3

pkg/cli/clisqlshell/sql.go

Lines changed: 1 addition & 1 deletion
@@ -276,7 +276,7 @@ func (c *cliState) printCliHelp() {
 	fmt.Fprintf(c.iCtx.stdout, helpMessageFmt,
 		demoHelpStr,
 		docs.URL("sql-statements.html"),
-		docs.URL("use-the-built-in-sql-client.html"),
+		docs.URL("cockroach-sql.html"),
 	)
 	fmt.Fprintln(c.iCtx.stdout)
 }

pkg/cmd/bazci/githubpost/githubpost.go

Lines changed: 2 additions & 1 deletion
@@ -104,7 +104,8 @@ func DefaultIssueFilerFromFormatter(
 			}
 			req.ExtraParams["stress"] = "true"
 		}
-		return issues.Post(ctx, log.Default(), fmter, req, opts)
+		_, err := issues.Post(ctx, log.Default(), fmter, req, opts)
+		return err
 	}

 }

pkg/cmd/internal/issues/issues.go

Lines changed: 40 additions & 8 deletions
@@ -337,7 +337,34 @@ func buildIssueQueries(
 	return existingIssueQuery, relatedIssuesQuery
 }

-func (p *poster) post(origCtx context.Context, formatter IssueFormatter, req PostRequest) error {
+type TestFailureType string
+
+const (
+	TestFailureNewIssue = TestFailureType("new_issue")
+	TestFailureIssueComment = TestFailureType("comment")
+)
+
+// TestFailureIssue encapsulates data about an issue created or
+// changed in order to report a test failure.
+type TestFailureIssue struct {
+	Type TestFailureType
+	ID int
+}
+
+func (tfi TestFailureIssue) String() string {
+	switch tfi.Type {
+	case TestFailureNewIssue:
+		return fmt.Sprintf("created new GitHub issue #%d", tfi.ID)
+	case TestFailureIssueComment:
+		return fmt.Sprintf("commented on existing GitHub issue #%d", tfi.ID)
+	default:
+		return fmt.Sprintf("[unrecognized test failure type %q, ID=%d]", tfi.Type, tfi.ID)
+	}
+}
+
+func (p *poster) post(
+	origCtx context.Context, formatter IssueFormatter, req PostRequest,
+) (*TestFailureIssue, error) {
 	ctx := &postCtx{Context: origCtx}
 	data := p.templateData(
 		ctx,
@@ -402,6 +429,7 @@ func (p *poster) post(origCtx context.Context, formatter IssueFormatter, req Pos
 	createLabels := []string{RobotLabel}
 	createLabels = append(createLabels, req.labels()...)
 	createLabels = append(createLabels, releaseLabel(p.Branch))
+	var result TestFailureIssue
 	if foundIssue == nil {
 		issueRequest := github.IssueRequest{
 			Title: &title,
@@ -411,11 +439,13 @@ func (p *poster) post(origCtx context.Context, formatter IssueFormatter, req Pos
 		}
 		issue, _, err := p.createIssue(ctx, p.Org, p.Repo, &issueRequest)
 		if err != nil {
-			return errors.Wrapf(err, "failed to create GitHub issue %s",
+			return nil, errors.Wrapf(err, "failed to create GitHub issue %s",
 				github.Stringify(issueRequest))
 		}

-		p.l.Printf("created GitHub issue #%d", *issue.Number)
+		result.Type = TestFailureNewIssue
+		result.ID = *issue.Number
+		p.l.Printf("%s", result)
 		if req.ProjectColumnID != 0 {
 			_, _, err := p.createProjectCard(ctx, int64(req.ProjectColumnID), &github.ProjectCardOptions{
 				ContentID: *issue.ID,
@@ -433,14 +463,16 @@ func (p *poster) post(origCtx context.Context, formatter IssueFormatter, req Pos
 		comment := github.IssueComment{Body: github.String(body)}
 		if _, _, err := p.createComment(
 			ctx, p.Org, p.Repo, *foundIssue, &comment); err != nil {
-			return errors.Wrapf(err, "failed to update issue #%d with %s",
+			return nil, errors.Wrapf(err, "failed to update issue #%d with %s",
 				*foundIssue, github.Stringify(comment))
 		} else {
-			p.l.Printf("created comment on existing GitHub issue (#%d)", *foundIssue)
+			result.Type = TestFailureIssueComment
+			result.ID = *foundIssue
+			p.l.Printf("%s", result)
 		}
 	}

-	return nil
+	return &result, nil
 }

 func (p *poster) teamcityURL(tab, fragment string) *url.URL {
@@ -559,9 +591,9 @@ type Logger interface {
 // will be returned.
 func Post(
 	ctx context.Context, l Logger, formatter IssueFormatter, req PostRequest, opts *Options,
-) error {
+) (*TestFailureIssue, error) {
 	if !opts.CanPost() {
-		return errors.Newf("GITHUB_API_TOKEN env variable is not set; cannot post issue")
+		return nil, errors.Newf("GITHUB_API_TOKEN env variable is not set; cannot post issue")
 	}

 	client := github.NewClient(oauth2.NewClient(ctx, oauth2.StaticTokenSource(

pkg/cmd/internal/issues/issues_test.go

Lines changed: 7 additions & 2 deletions
@@ -404,15 +404,19 @@ test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTe
 				// Override the default.
 				req.Labels = []string{}
 			}
-			require.NoError(t, p.post(context.Background(), UnitTestFormatter, req))
+			issue, err := p.post(context.Background(), UnitTestFormatter, req)
+			require.NoError(t, err)
+			require.Equal(t, issueNumber, issue.ID)

 			switch foundIssue {
 			case foundNoIssue, foundOnlyRelatedIssue:
 				require.True(t, createdIssue)
 				require.False(t, createdComment)
+				require.Equal(t, TestFailureNewIssue, issue.Type)
 			case foundOnlyMatchingIssue, foundMatchingAndRelatedIssue:
 				require.False(t, createdIssue)
 				require.True(t, createdComment)
+				require.Equal(t, TestFailureIssueComment, issue.Type)
 			default:
 				t.Errorf("unhandled: %s", foundIssue)
 			}
@@ -460,7 +464,8 @@ func TestPostEndToEnd(t *testing.T) {
 		HelpCommand: UnitTestHelpCommand(""),
 	}

-	require.NoError(t, Post(context.Background(), log.Default(), UnitTestFormatter, req, opts))
+	_, err := Post(context.Background(), log.Default(), UnitTestFormatter, req, opts)
+	require.NoError(t, err)
 }

 // setEnv overrides the env variables corresponding to the input map. The

pkg/cmd/roachtest/github.go

Lines changed: 6 additions & 4 deletions
@@ -34,7 +34,7 @@ type githubIssues struct {
 	disable bool
 	cluster *clusterImpl
 	vmCreateOpts *vm.CreateOpts
-	issuePoster func(context.Context, issues.Logger, issues.IssueFormatter, issues.PostRequest, *issues.Options) error
+	issuePoster func(context.Context, issues.Logger, issues.IssueFormatter, issues.PostRequest, *issues.Options) (*issues.TestFailureIssue, error)
 	teamLoader func() (team.Map, error)
 }

@@ -301,11 +301,13 @@ func (g *githubIssues) createPostRequest(
 	}, nil
 }

-func (g *githubIssues) MaybePost(t *testImpl, l *logger.Logger, message string) error {
+func (g *githubIssues) MaybePost(
+	t *testImpl, l *logger.Logger, message string,
+) (*issues.TestFailureIssue, error) {
 	doPost, skipReason := g.shouldPost(t)
 	if !doPost {
 		l.Printf("skipping GitHub issue posting (%s)", skipReason)
-		return nil
+		return nil, nil
 	}

 	var metamorphicBuild bool
@@ -319,7 +321,7 @@ func (g *githubIssues) MaybePost(t *testImpl, l *logger.Logger, message string)
 	}
 	postRequest, err := g.createPostRequest(t.Name(), t.start, t.end, t.spec, t.failures(), message, metamorphicBuild, t.goCoverEnabled)
 	if err != nil {
-		return err
+		return nil, err
 	}
 	opts := issues.DefaultOptionsFromEnv()

pkg/cmd/roachtest/test_runner.go

Lines changed: 12 additions & 5 deletions
@@ -750,7 +750,7 @@ func (r *testRunner) runWorker(
 		handleClusterCreationFailure := func(err error) {
 			t.Error(errClusterProvisioningFailed(err))

-			if err := github.MaybePost(t, l, t.failureMsg()); err != nil {
+			if _, err := github.MaybePost(t, l, t.failureMsg()); err != nil {
 				shout(ctx, l, stdout, "failed to post issue: %s", err)
 			}
 		}
@@ -1016,6 +1016,17 @@ func (r *testRunner) runTest(
 			}
 			output := fmt.Sprintf("%s\ntest artifacts and logs in: %s", failureMsg, t.ArtifactsDir())

+			issue, err := github.MaybePost(t, l, output)
+			if err != nil {
+				shout(ctx, l, stdout, "failed to post issue: %s", err)
+			}
+
+			// If an issue was created (or comment added) on GitHub,
+			// include that information in the output so that it can be
+			// easily inspected on the TeamCity overview page.
+			if issue != nil {
+				output += "\n" + issue.String()
+			}
 			if roachtestflags.TeamCity {
 				// If `##teamcity[testFailed ...]` is not present before `##teamCity[testFinished ...]`,
 				// TeamCity regards the test as successful.
@@ -1024,10 +1035,6 @@ func (r *testRunner) runTest(
 			}

 			shout(ctx, l, stdout, "--- FAIL: %s (%s)\n%s", testRunID, durationStr, output)
-
-			if err := github.MaybePost(t, l, output); err != nil {
-				shout(ctx, l, stdout, "failed to post issue: %s", err)
-			}
 		} else {
 			shout(ctx, l, stdout, "--- PASS: %s (%s)", testRunID, durationStr)
 		}
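
The reordering above relies on `MaybePost` returning `nil, nil` when posting is skipped, so the caller has to nil-check the issue before appending it to the output. A self-contained sketch of that contract follows. The two types are copied from the `issues.go` hunk in this commit, while `maybePost`, `failureOutput`, and the issue numbers are hypothetical stand-ins rather than the roachtest code.

```go
package main

import "fmt"

// TestFailureType and TestFailureIssue mirror the definitions added to
// issues.go in this commit.
type TestFailureType string

const (
	TestFailureNewIssue     = TestFailureType("new_issue")
	TestFailureIssueComment = TestFailureType("comment")
)

type TestFailureIssue struct {
	Type TestFailureType
	ID   int
}

func (tfi TestFailureIssue) String() string {
	switch tfi.Type {
	case TestFailureNewIssue:
		return fmt.Sprintf("created new GitHub issue #%d", tfi.ID)
	case TestFailureIssueComment:
		return fmt.Sprintf("commented on existing GitHub issue #%d", tfi.ID)
	default:
		return fmt.Sprintf("[unrecognized test failure type %q, ID=%d]", tfi.Type, tfi.ID)
	}
}

// maybePost is a hypothetical stand-in for githubIssues.MaybePost. It returns
// (nil, nil) when posting is skipped, a comment result when an existing issue
// number is supplied, and a new-issue result otherwise.
func maybePost(shouldPost bool, existingIssue int) (*TestFailureIssue, error) {
	if !shouldPost {
		return nil, nil
	}
	if existingIssue != 0 {
		return &TestFailureIssue{Type: TestFailureIssueComment, ID: existingIssue}, nil
	}
	// 1 is a placeholder for whatever number GitHub would assign.
	return &TestFailureIssue{Type: TestFailureNewIssue, ID: 1}, nil
}

// failureOutput (hypothetical) appends the issue description only when an
// issue was actually created or commented on.
func failureOutput(base string, issue *TestFailureIssue) string {
	if issue != nil {
		base += "\n" + issue.String()
	}
	return base
}

func main() {
	issue, err := maybePost(true, 42)
	if err != nil {
		fmt.Println("failed to post issue:", err)
	}
	fmt.Println(failureOutput("test failed", issue))

	skipped, _ := maybePost(false, 0)
	fmt.Println(failureOutput("test failed", skipped)) // prints only "test failed"
}
```

Posting before the `--- FAIL` line is printed is what lets the issue reference appear in the same output block that TeamCity displays.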
