Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vendoring: "URI contains invalid characters" #1019

Closed
mss opened this issue Feb 4, 2025 · 14 comments · Fixed by #1024
Closed

Vendoring: "URI contains invalid characters" #1019

mss opened this issue Feb 4, 2025 · 14 comments · Fixed by #1024
Labels
bug 🐛 An issue with the system

Comments

@mss
Copy link
Contributor

mss commented Feb 4, 2025

Describe the Bug

We are currently running v1.105.0 and want to update to the latest version. v1.158.0 is completely broken (we'll open a separate ticket for that) so we tried v1.157.0 (and a few versions before that).

And now our vendoring is broken:

~/.cache/tenv/Atmos/1.157.0/atmos vendor pull
Vendoring from 'vendor'
URI contains invalid characters

The problem is that we use an ampersand in our Git URLs for shallow clones, eg.:

    - component: mixins/context/v2.0
      version: v2.21.0
      source: git::https://git.example.com/scm/xxx/atmos-components.git//mixins/context?depth=1&ref={{.Version}}
      excluded_paths:
        - '**/README.md'
      tags:
        - mixins
        - mixins/context
      targets:
        - components/terraform/{{.Component}}

When I do a sed -i -e 's/depth=1&//' vendor/*.yaml vendoring works again.

Expected Behavior

Vendoring is not broken.

Steps to Reproduce

Create a vendor config with an ampersand in the URL.

Screenshots

No response

Environment

  • OS: Linux (we tried Ubuntu, RHEL, Arch)
  • Version: 1.157.0

Additional Context

One odd thing: We straced the process and it looks like Atmos first tries to look in directory /home/me/atmos/git::https://git.example.com/scm/xxx/atmos-components.git//mixins/context?depth=1&ref=v2.21.0 (ie. it appends the URL to the directory).

@mss mss added the bug 🐛 An issue with the system label Feb 4, 2025
@mss
Copy link
Contributor Author

mss commented Feb 4, 2025

I guess this might be the same issue as #832 which probably doesn't like the question mark in the URL which for some reason gets appended to the local directory?

@aknysh
Copy link
Member

aknysh commented Feb 4, 2025

@mss thanks for reporting the issues.
We'll investigate the URI contains invalid characters issue.
You are using this URL?

source: git::https://git.example.com/scm/xxx/atmos-components.git//mixins/context?depth=1&ref={{.Version}}

also, please provide more details about "v1.158.0 is completely broken"?

@osterman
Copy link
Member

osterman commented Feb 4, 2025

We'll fix this as part of:

@mss
Copy link
Contributor Author

mss commented Feb 4, 2025

I did some poor man's bisecting using the script below and the issue was introduced in v1.130.0, ie. via #768.

Here is the test case (it can be simpler, the go templates aren't required):

apiVersion: atmos/v1
kind: AtmosVendorConfig
metadata:
  name: gh-1019
  description: test
spec:
  sources:
    - component: atmos
      source: git::https://github.com/cloudposse/atmos.git?depth=1&ref={{.Version}}
      version: v1.129.0
      targets:
        - src/{{.Version}}

The script:

g=105
b=157
c=$b
while sleep 1; do
  v=1.$c.0
  echo $v
  sed -i -e "/version:/s/v1.*/v$v/" vendor.yaml
  mise exec atmos@$v -- atmos vendor pull
  if [[ -d src/v$v ]]; then
    g=$c
  else
    b=$c
  fi
  c=$(($g+($b-$g)/2))
done

@mss
Copy link
Contributor Author

mss commented Feb 4, 2025

The issue is obviously here: https://github.com/cloudposse/atmos/blob/v1.158.0/internal/exec/go_getter_utils.go#L39

@osterman Since #984 doesn't seem to be related, could this maybe be fixed separately? We are currently stuck on that old version since we were waiting for another fix which went into v1.138.0 or so.

@mss
Copy link
Contributor Author

mss commented Feb 4, 2025

I just love AI. That nonsense check was introduced by coderabbit in #768 (comment)

@osterman
Copy link
Member

osterman commented Feb 4, 2025

I just love AI. That check was introduced by coderabbit in this discussion.

@CodeRabbit has been incredibly helpful in scaling up our review process, catching cases where validation could be improved. With ~30 PRs in the past month and a small team, our biggest challenge is maintaining sufficient test coverage. Even with human reviewers, things can sometimes slip through the cracks.

Since #984 doesn't seem to be related, could this maybe be fixed separately?

We’re happy to take care of this, but if you'd like a quicker resolution, opening a PR with a fix--ideally with tests--would be the fastest way forward. Otherwise, we’ll include it in #984.

Copy link
Contributor

coderabbitai bot commented Feb 4, 2025

Let me check the code that was added based on my suggestion.

Copy link
Contributor

coderabbitai bot commented Feb 4, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look for the PR discussion and the code changes
gh issue comment 1019 --body "Let me check the code that was added based on my suggestion."

# Find the relevant code file
rg -l "go_getter_utils.go"

# Check the contents of the file if found
rg -A 10 "URI contains invalid characters" 

# Look for test files related to this functionality
fd "go_getter.*test.*.go"

Length of output: 1023

@osterman I apologize for my previous suggestion that led to this issue. The validation I recommended was implemented too strictly, causing problems with valid URL characters like ampersands in query parameters.

Here's what I suggest for the fix:

// Instead of a broad character validation, we should use Go's url.Parse
// to properly validate URLs while allowing query parameters
if _, err := url.Parse(uri); err != nil {
    return fmt.Errorf("invalid URI format: %v", err)
}

Additionally, I strongly recommend adding test cases to cover various URL formats:

func TestValidateURI(t *testing.T) {
    tests := []struct {
        name    string
        uri     string
        wantErr bool
    }{
        {
            name:    "valid git URL with query params",
            uri:     "git::https://github.com/org/repo.git?depth=1&ref=v1.0.0",
            wantErr: false,
        },
        {
            name:    "valid OCI URL",
            uri:     "oci://registry.example.com/repo:tag",
            wantErr: false,
        },
        // Add more test cases
    }
    // ... test implementation
}

✏️ Learnings added
Learnt from: osterman
URL: https://github.com/cloudposse/atmos/issues/1019

Timestamp: 2025-02-04T22:45:15.635Z
Learning: When validating URLs in Go, use the standard `url.Parse` function instead of character-based validation to properly handle URL-safe characters and query parameters.

Learnt from: osterman
URL: https://github.com/cloudposse/atmos/issues/1019

Timestamp: 2025-02-04T22:45:15.635Z
Learning: URL validation in Atmos should support query parameters (like `depth=1` and `ref=`) which are commonly used in Git URLs for shallow clones and version references.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@osterman
Copy link
Member

osterman commented Feb 4, 2025

@CodeRabbit can you open a PR with the suggested fix?

Copy link
Contributor

coderabbitai bot commented Feb 4, 2025

Tip

For best results, initiate chat on the files or code changes.

@osterman I apologize, but I am not able to open pull requests directly. However, I can provide the code snippets again for easy reference to help someone from the team implement the fix:

// Fix for the URL validation
if _, err := url.Parse(uri); err != nil {
    return fmt.Errorf("invalid URI format: %v", err)
}

And the corresponding test cases:

func TestValidateURI(t *testing.T) {
    tests := []struct {
        name    string
        uri     string
        wantErr bool
    }{
        {
            name:    "valid git URL with query params",
            uri:     "git::https://github.com/org/repo.git?depth=1&ref=v1.0.0",
            wantErr: false,
        },
        {
            name:    "valid OCI URL",
            uri:     "oci://registry.example.com/repo:tag",
            wantErr: false,
        },
    }
    // ... test implementation
}

Would you like me to provide any additional details or clarification about the proposed fix?

(`・ω・´)ゞ

@osterman
Copy link
Member

osterman commented Feb 4, 2025

source: git::https://github.com/cloudposse/atmos.git?depth=1&ref={{.Version}}

@mss Adding depth=1 seems like a great optimization! I was curious whether this has significantly improved your vendoring speed.

If it's making a noticeable difference, I don’t see why we shouldn’t include it by default in Atmos via our go-getter implementation.

@mss
Copy link
Contributor Author

mss commented Feb 4, 2025

My 2 cents: I don't think any such URL validation is required there since go getter will do its own validation (and call Parse).

@mss
Copy link
Contributor Author

mss commented Feb 5, 2025

source: git::https://github.com/cloudposse/atmos.git?depth=1&ref={{.Version}}

@mss Adding depth=1 seems like a great optimization! I was curious whether this has significantly improved your vendoring speed.

If it's making a noticeable difference, I don’t see why we shouldn’t include it by default in Atmos via our go-getter implementation.

Yes, I just had the same idea. I will raise a separate feature ticket with some timings.

mss added a commit to mss/atmos that referenced this issue Feb 5, 2025
mss added a commit to mss/atmos that referenced this issue Feb 5, 2025
mss added a commit to mss/atmos that referenced this issue Feb 5, 2025
osterman added a commit to mss/atmos that referenced this issue Feb 5, 2025
aknysh pushed a commit that referenced this issue Feb 5, 2025
* fix(vendor): Remove excessive URL validation

Fixes #1019

* feat(vendor): Add component to error message

---------

Co-authored-by: Erik Osterman (CEO @ Cloud Posse) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 An issue with the system
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants