Eval: agent validates TeamCity DSL with raw Maven instead of CLI workflow #253

@sugdyzhekov

Description

Prompt

Introduce a new build configuration into this TeamCity Kotlin DSL project, including the runner dependency wiring

What the agent did

While working on the DSL change, the agent decided to validate by running raw Maven instead of using the TeamCity CLI workflow that the teamcity-cli skill points to for project settings validation:

• I’m running a local DSL compile now. That should catch Kotlin syntax or TeamCity DSL issues in the new build type and runner dependency before this goes upstream.

• Ran mvn -q -DskipTests package
  └ [ERROR] Internal error: java.io.UncheckedIOException: java.nio.file.FileSystemException: /Users/sugdyzhekov/.m2/repository/org/jetbrains/teamcity/configs-dsl-kotlin-parent/1.0-SNAPSHOT/resolver-status.properties: Operation not permitted

After the sandbox error, it kept pursuing the Maven path and requested elevation for the same raw Maven validation instead of switching to the TeamCity CLI validation flow.

The user then had to redirect it explicitly:

› use teamcity project settings validate

The agent checked the help for the correct command, but then ran Maven again before finally using the requested CLI command:

• Ran teamcity project settings validate --help

• Ran mvn -q -DskipTests package
  └ (no output)

Only after that did it run the correct validation command:

• Ran teamcity project settings validate . --verbose
  └ ✓ Configuration valid
    Projects: 7, Build configurations: 909

Correct behavior

  1. After changing TeamCity project DSL, use the TeamCity CLI validation workflow: teamcity project settings validate (or check teamcity project settings validate --help first if needed)
  2. Do not default to mvn -q -DskipTests package as the primary DSL validation command for this task
  3. The teamcity-cli skill explicitly lists project settings validate under project commands, so the agent should use that knowledge instead of improvising a Maven-based workflow
  4. After the first Maven sandbox error, pivot to teamcity project settings validate rather than retrying raw Maven as the validation path
  5. After the user explicitly says use teamcity project settings validate, run that command directly and do not invoke Maven again
  6. If elevated permissions are needed because the CLI shells out to Maven internally, request them for teamcity project settings validate . --verbose
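The ordering rules above could be checked automatically against a recorded list of the commands the agent ran. The following is a minimal sketch, not part of the existing eval harness: the `grade` function, its `redirect_index` parameter, and the violation messages are all hypothetical names introduced for illustration, and it only covers rules 1, 2, and 5.

```python
import shlex

# Leading tokens of the CLI validation command named in this issue.
VALIDATE = ("teamcity", "project", "settings", "validate")


def is_maven(cmd: str) -> bool:
    """True if the command invokes Maven directly."""
    tokens = shlex.split(cmd)
    return bool(tokens) and tokens[0] in ("mvn", "./mvnw")


def is_cli_validate(cmd: str) -> bool:
    """True if the command is an actual validation run, not a --help lookup."""
    tokens = shlex.split(cmd)
    return tuple(tokens[:4]) == VALIDATE and "--help" not in tokens


def grade(commands, redirect_index=None):
    """Return a list of violations (hypothetical grader).

    commands: commands the agent ran, in order.
    redirect_index: index of the first command run after the user's
        explicit redirect, or None if the user never redirected.
    """
    violations = []
    maven_runs = [i for i, c in enumerate(commands) if is_maven(c)]
    validate_runs = [i for i, c in enumerate(commands) if is_cli_validate(c)]

    if not validate_runs:
        violations.append("never ran teamcity project settings validate")
    if maven_runs and (not validate_runs or maven_runs[0] < validate_runs[0]):
        violations.append("used raw Maven before the CLI validation")
    if redirect_index is not None and any(i >= redirect_index for i in maven_runs):
        violations.append("ran Maven after the user redirect")
    return violations


# The command sequence reported in this issue.
observed = [
    "mvn -q -DskipTests package",
    "teamcity project settings validate --help",
    "mvn -q -DskipTests package",
    "teamcity project settings validate . --verbose",
]
result = grade(observed, redirect_index=1)
```

For the sequence reported here, `result` contains two violations: raw Maven before the CLI validation, and Maven after the user redirect. A run consisting only of `teamcity project settings validate . --verbose` yields an empty list.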

What went wrong?

  • Hallucinated flags or commands that don't exist
  • Used wrong command for the task
  • Missed a critical step in the workflow
  • Gave incorrect diagnosis or explanation
  • Failed to recover from an error
  • Didn't use the CLI at all (just talked about it)

TeamCity server URL

No response

Metadata

Labels

eval (AI skill eval scenario)
