Search before asking
- I have searched the existing feature requests and found no similar feature requirement.
Description
In Apache SeaTunnel's task configuration, sensitive information such as data source usernames and passwords is written directly into task scripts. This approach has the following issues:
- Security Risks: Sensitive information exposed in scripts may lead to data source leaks.
- Maintenance Challenges: When data source configurations change, manual modifications are required across all related task scripts, resulting in inefficiency and a high risk of errors.
To address these problems, this issue proposes integrating a metalake to centralize the storage and management of data source information. With a data source ID mapping mechanism, a task script references a data source by ID, so a configuration change is made once in the metalake instead of in every script. The goal is to support mainstream data catalogs such as Apache Gravitino while providing extensible interfaces for future integration with third-party services (e.g., Unity Catalog or DataHub).
- Metalake Configuration Adaptation:
Define the metalake configuration in seatunnel-env.sh and load it during startup.
- Source/Sink Configuration Refactoring:
Add a sourceId to the source/sink config and use placeholders for sensitive fields (e.g. username:${username}); at runtime, fetch the data source info from the metalake by sourceId and replace the placeholders with the fetched values.
- Plugin-Based Metalake Support:
Define a generic MetalakeClient interface and implement a GravitinoClient that fetches data source info over HTTP.
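A minimal sketch of how the three points above could fit together, in Java. Everything beyond the names MetalakeClient, sourceId, and the ${...} placeholder syntax is an assumption made for illustration: the method getDataSourceProperties, the class InMemoryMetalakeClient (a stand-in for GravitinoClient, which would make an HTTP call instead), the source ID "mysql-prod", and the helper methods resolve and resolveConfig are all hypothetical and not part of SeaTunnel or Gravitino.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetalakeSketch {

    /** Hypothetical shape of the generic interface; the issue names it but does not define its methods. */
    interface MetalakeClient {
        Map<String, String> getDataSourceProperties(String sourceId);
    }

    /** In-memory stand-in for GravitinoClient; a real client would fetch these properties over HTTP. */
    static class InMemoryMetalakeClient implements MetalakeClient {
        private final Map<String, Map<String, String>> store = new HashMap<>();

        void register(String sourceId, Map<String, String> props) {
            store.put(sourceId, new HashMap<>(props));
        }

        @Override
        public Map<String, String> getDataSourceProperties(String sourceId) {
            Map<String, String> props = store.get(sourceId);
            if (props == null) {
                throw new IllegalArgumentException("Unknown sourceId: " + sourceId);
            }
            return props;
        }
    }

    // Matches ${name} and captures the name between the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces every ${key} in value with props.get(key); fails if a key is missing. */
    static String resolve(String value, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String replacement = props.get(key);
            if (replacement == null) {
                throw new IllegalStateException("No value in metalake for placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in secrets from being treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Resolves placeholders in a flat source/sink config using the data source named by sourceId. */
    static Map<String, String> resolveConfig(Map<String, String> config, MetalakeClient client) {
        String sourceId = config.get("sourceId");
        if (sourceId == null) {
            return new HashMap<>(config); // no metalake mapping requested; leave config untouched
        }
        Map<String, String> props = client.getDataSourceProperties(sourceId);
        Map<String, String> resolved = new HashMap<>();
        for (Map.Entry<String, String> e : config.entrySet()) {
            resolved.put(e.getKey(), resolve(e.getValue(), props));
        }
        return resolved;
    }

    public static void main(String[] args) {
        InMemoryMetalakeClient client = new InMemoryMetalakeClient();
        Map<String, String> secrets = new HashMap<>();
        secrets.put("username", "alice");
        secrets.put("password", "s3cr$t");
        client.register("mysql-prod", secrets);

        Map<String, String> sourceConfig = new HashMap<>();
        sourceConfig.put("sourceId", "mysql-prod");
        sourceConfig.put("username", "${username}");
        sourceConfig.put("password", "${password}");

        // Print only the keys; the values are secrets.
        System.out.println(resolveConfig(sourceConfig, client).keySet());
    }
}
```

Keeping the lookup behind an interface means the placeholder replacement does not depend on which catalog is in use, so a Unity Catalog or DataHub client could be added later without touching the resolution step. Failing on an unresolved placeholder, rather than leaving ${...} in place, is a design choice made here so that a misconfigured task stops before it tries to connect.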
Usage Scenario
No response
Related issues
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct