Custom IAM Role for Driver and Executor Pods? #10

Open
batCoder95 opened this issue Aug 24, 2020 · 3 comments

Comments


batCoder95 commented Aug 24, 2020

Hi all,

I wanted to check whether it is possible to define an AWS IAM role to be attached to the driver and executor pods in the Spark application YAML file. As per my understanding, these pods currently inherit the role from the EKS node group, but I would like to specify my own custom role in the YAML file. Can somebody please advise whether this is possible?

I tried passing the IAM role / OIDC service account in the manner below, but it did not work. It fails with an S3 access denied (403) error.

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: test-pyspark-app
  namespace: dev
spec:
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: "coqueirotree/spark-py"
  imagePullPolicy: Always
  mainApplicationFile: s3a://my-bucket/TestFile.py
  sparkVersion: "3.0.0"
  sparkConf:
    "spark.kubernetes.driverEnv.serviceAccount": "svc-spark-account"
    "spark.kubernetes.driverEnv.serviceAccountName": "svc-spark-account"
    "spark.kubernetes.authenticate.driver.serviceAccount": "svc-spark-account"
    "spark.kubernetes.authenticate.driver.serviceAccountName": "svc-spark-account"
    "spark.kubernetes.executorEnv.serviceAccount": "svc-spark-account"
    "spark.kubernetes.executorEnv.serviceAccountName": "svc-spark-account"
    "spark.kubernetes.authenticate.executor.serviceAccount": "svc-spark-account"
    "spark.kubernetes.authenticate.executor.serviceAccountName": "svc-spark-account"
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 3.0.0
    serviceAccount: svc-spark-account
    serviceAccountName: svc-spark-account
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 3.0.0
    serviceAccount: svc-spark-account
    serviceAccountName: svc-spark-account

Can someone please advise if I should be doing something differently here? Thanks in advance :)
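For reference, the usual way to attach a custom role on EKS is IRSA (IAM Roles for Service Accounts): annotate the Kubernetes service account with the role's ARN and let the pods assume it. A minimal sketch, assuming an OIDC provider is already configured for the cluster; the account ID and role name below are placeholders:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: svc-spark-account
  namespace: dev
  annotations:
    # Placeholder role ARN; the role's trust policy must allow the cluster's
    # OIDC provider to assume it via sts:AssumeRoleWithWebIdentity.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-spark-s3-role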

Jeffwan (Contributor) commented Sep 17, 2020

IRSA requires AWS SDK support for assume-role-with-web-identity. I have not checked the dependencies; if you have time to help, feel free to take a look. Additionally, S3 is a little different because Spark uses the hadoop-aws package, so you'd better check the version mapping and find a compatible version.
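A hedged sketch of the Spark conf one might try once the hadoop-aws / AWS SDK pairing is new enough; WebIdentityTokenCredentialsProvider only ships with recent AWS Java SDKs, so its availability depends on the SDK version bundled with hadoop-aws:

sparkConf:
  # Point S3A at the web-identity (IRSA) credentials provider instead of the
  # default chain; requires an AWS Java SDK that includes this class.
  "spark.hadoop.fs.s3a.aws.credentials.provider": "com.amazonaws.auth.WebIdentityTokenCredentialsProvider"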

Jeffwan (Contributor) commented Sep 17, 2020

@batCoder95

@hiro-o918

I was able to run SparkApplications by setting an IAM role on the service account.

dependencies

  • Spark: 3.0.0
  • Hadoop: 3.2.1
  • AWS SDK: 1.11.82

I did not check other combinations of versions, but I think Hadoop 3.2.1 or later is required, because hadoop-aws 3.2.0 is built against an older AWS SDK that does not support assume-role-with-web-identity.
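With that combination, the service account is wired in through the operator's own CRD fields rather than the sparkConf env entries from the original post. A minimal sketch, reusing the names from this thread:

driver:
  # The operator creates the driver pod under this Kubernetes service
  # account; with IRSA, the pod then assumes the role annotated on it.
  serviceAccount: svc-spark-account
executor:
  serviceAccount: svc-spark-account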

