Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add tags to most Glue tables #2402

Merged
merged 3 commits into from
Feb 10, 2025
Merged

Conversation

pdecat
Copy link
Contributor

@pdecat pdecat commented Feb 6, 2025

Integration test logs

Logs
Add passing integration test logs here

Example query results

Results
> SELECT
  name,
  akas,
  tags
FROM
  aws_glue_catalog_database
+----------------------------+-----------------------------------------------------------------------------+-------------------+
| name                       | akas                                                                        | tags              |
+----------------------------+-----------------------------------------------------------------------------+-------------------+
| default                    | ["arn:aws:glue:eu-west-3:012345678901:database/default"]                    | {}                |
| ec2_marketplace_offers     | ["arn:aws:glue:eu-west-3:012345678901:database/ec2_marketplace_offers"]     | {}                |
| 3cr-data                   | ["arn:aws:glue:eu-west-3:012345678901:database/3cr-data"]                   | {"test":"pdecat"} |
| proxy_metrics_lab          | ["arn:aws:glue:eu-west-3:012345678901:database/proxy_metrics_lab"]          | {}                |
| aws_usage_queries_database | ["arn:aws:glue:eu-west-3:012345678901:database/aws_usage_queries_database"] | {}                |
| carbon_emissions           | ["arn:aws:glue:eu-west-3:012345678901:database/carbon_emissions"]           | {}                |
| aws_backup_reporting       | ["arn:aws:glue:eu-west-3:012345678901:database/aws_backup_reporting"]       | {}                |
| default                    | ["arn:aws:glue:eu-west-1:012345678901:database/default"]                    | {}                |
| 3cr-data                   | ["arn:aws:glue:eu-west-1:012345678901:database/3cr-data"]                   | {}                |
+----------------------------+-----------------------------------------------------------------------------+-------------------+
> SELECT
  name,
  akas,
  tags
FROM
  aws_glue_connection
+-------------+----------------------------------------------------------------+-------------------+
| name        | akas                                                           | tags              |
+-------------+----------------------------------------------------------------+-------------------+
| test-pdecat | ["arn:aws:glue:eu-west-3:012345678901:connection/test-pdecat"] | {"test":"pdecat"} |
+-------------+----------------------------------------------------------------+-------------------+
> SELECT
  name,
  akas,
  tags
FROM
  aws_glue_crawler
+---------------------------+---------------------------------------------------------------------------+-------------------+
| name                      | akas                                                                      | tags              |
+---------------------------+---------------------------------------------------------------------------+-------------------+
| test-pdecat               | ["arn:aws:glue:eu-west-3:012345678901:crawler/test-pdecat"]               | {"test":"pdecat"} |
| 3cr-data-crawler          | ["arn:aws:glue:eu-west-1:012345678901:crawler/3cr-data-crawler"]          | {}                |
| ec2_marketplace_ri_offers | ["arn:aws:glue:eu-west-3:012345678901:crawler/ec2_marketplace_ri_offers"] | {}                |
+---------------------------+---------------------------------------------------------------------------+-------------------+
> SELECT
  akas,
  tags
FROM
  aws_glue_job where region = 'eu-west-3'
+----------------------------------------------------------------+-------------------+
| akas                                                           | tags              |
+----------------------------------------------------------------+-------------------+
| ["arn:aws:glue:eu-west-3:706303552994:job/3cr-data-transform"] | {"test":"pdecat"} |
+----------------------------------------------------------------+-------------------+

@misraved misraved requested a review from ParthaI February 6, 2025 16:04
@misraved
Copy link
Contributor

misraved commented Feb 6, 2025

Thanks @pdecat for the contribution 👍!!
Could you please add example queries for other tables where tags have been added?

@pdecat
Copy link
Contributor Author

pdecat commented Feb 6, 2025

Hi @misraved, I've added examples for all tables except aws_glue_dev_endpoint for which I have no idea how to create items 😅 and aws_glue_security_configuration for which it does not work, maybe ARN construction is broken.

I've removed the tags from aws_glue_security_configuration, should I also remove it from aws_glue_dev_endpoint?

@ParthaI
Copy link
Contributor

ParthaI commented Feb 7, 2025

Hey @pdecat,

I really appreciate your contributions to Steampipe! I noticed that the aws_glue_* tables include tag details, and I had a couple of thoughts regarding this:

  1. Using ARN Instead of akas

    • Would it be possible to use the ARN column instead of akas?
    • The akas column follows Steampipe's standard format, but in some cases, AWS does not treat certain resources as having an ARN, which may cause mismatches.
    • Additionally, akas is an array of strings, and we're currently selecting only the first index value.
    • Although I don't have a concrete example of a resource with multiple akas values, I'm curious—how would the implementation handle a scenario where multiple values exist?
  2. Adding tags_src Column for Consistency

    • Can we add the tags_src column to maintain consistency across tables?
    • We include the tags_src column when the API response returns tags in the following structure:
      [{ "Key": "Tag1", "Value": "Value1" }, { "Key": "Tag2", "Value": "Value2" }]
    • However, we do not include tags_src when the API response is structured like:
      { "Tag1": "Value1", "Tag2": "Value2" }

Let me know if you need any further clarification. Looking forward to your thoughts!

Thanks!

@pdecat
Copy link
Contributor Author

pdecat commented Feb 7, 2025

Hi @ParthaI,

Hey @pdecat,

I really appreciate your contributions to Steampipe! I noticed that the aws_glue_* tables include tag details, and I had a couple of thoughts regarding this:

  1. Using ARN Instead of akas

    • Would it be possible to use the ARN column instead of akas?
    • The akas column follows Steampipe's standard format, but in some cases, AWS does not treat certain resources as having an ARN, which may cause mismatches.
    • Additionally, akas is an array of strings, and we're currently selecting only the first index value.
    • Although I don't have a concrete example of a resource with multiple akas values, I'm curious—how would the implementation handle a scenario where multiple values exist?

None of the APIs return the ARN so it has to be constructed just like it is for AKAs.
It seems to me that only aws_glue_connection is having its ARN constructing function named after AKAs, maybe getGlueCatalogDatabaseAkas should be renamed to getGlueCatalogDatabaseArn?

  1. Adding tags_src Column for Consistency

    • Can we add the tags_src column to maintain consistency across tables?
    • We include the tags_src column when the API response returns tags in the following structure:
      [{ "Key": "Tag1", "Value": "Value1" }, { "Key": "Tag2", "Value": "Value2" }]
    • However, we do not include tags_src when the API response is structured like:
      { "Tag1": "Value1", "Tag2": "Value2" }

All tags are returned by the API in the second form { "Tag1": "Value1", "Tag2": "Value2" } so I guess tags_src is not needed here.

@ParthaI
Copy link
Contributor

ParthaI commented Feb 7, 2025

Thanks, @pdecat, for the quick response!

None of the APIs return the ARN so it has to be constructed just like it is for AKAs.
It seems to me that only aws_glue_connection is having its ARN constructing function named after AKAs, maybe getGlueCatalogDatabaseAkas should be renamed to getGlueCatalogDatabaseArn?

That makes sense! It would be great if you could incorporate these changes as part of your updates. It looks like we might have initially constructed it incorrectly. Here’s what needs to be updated:

  • Rename the getGlueCatalogDatabaseAkas function to getGlueCatalogDatabaseArn.
  • Modify getGlueCatalogDatabaseArn to return a string instead of []string, aligning with the approach used for other tables. (Reference: AWS Glue Crawler Table
  • Update the transform field to use:
    Transform:   transform.FromValue().Transform(transform.EnsureStringArray),
  • Use the getGlueCatalogDatabaseArn function to retrieve the ARN value wherever needed.

All tags are returned by the API in the second form { "Tag1": "Value1", "Tag2": "Value2" } so I guess tags_src is not needed here.

Yes, we don't need the tags_src column anymore.

Thanks again for your help!😊

…nd add arn column to aws_glue_catalog_database
@pdecat
Copy link
Contributor Author

pdecat commented Feb 7, 2025

Done!

Copy link
Contributor

@ParthaI ParthaI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@misraved misraved merged commit 1002c1e into turbot:main Feb 10, 2025
1 check passed
@pdecat pdecat deleted the feat/glue_tags branch February 10, 2025 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants