Resolve NPE Caused by Missing EXECUTOR_INSTANCES in Spark Settings. #422
Conversation
@carsonwang @pang-wu Could you please take a look at this PR? Thank you very much!
Can you share the program that can reproduce the issue?
@cawangyz Thank you for your response! Our use case: currently, a large number of our Ray tasks are used primarily for running XGBoost, but some of the data-preparation work has not yet been integrated into the RayJob, and we are trying to bring these ETL tasks into the RayJob process. These tasks are pre-written scripts, and we would like to execute them directly with RayDpSubmit, similar to how SparkSubmit is used on YARN. The scripts we submit look similar to the sketch below.
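A hedged sketch of the kind of legacy script involved (all names, paths, and logic are hypothetical, not the actual code); note that it never sets `spark.executor.instances` itself, since that value was always expected to come from the submit command:

```python
# prepare_features.py -- hypothetical legacy ETL script written for
# spark-submit-style execution; names and paths are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("legacy-etl").getOrCreate()

# Typical data-preparation step feeding the XGBoost training jobs.
df = spark.read.parquet("hdfs:///data/raw/events")
(df.filter(df["label"].isNotNull())
   .write.mode("overwrite")
   .parquet("hdfs:///data/prepared/events"))

spark.stop()
```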
Previously, when we ran the scripts this way, we encountered an exception if the `spark.executor.instances` parameter was not set. To validate this, we can use the example case directly: removing the `spark.executor.instances` setting from raydp/examples/raydp-submit.py (lines 1 to 28 in 374f858) triggers the same exception.
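The snippet below is a stand-in sketch for the relevant part of that example, not the actual file contents; removing the marked line reproduces the NPE on versions without this fix:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("raydp-submit-example")
    # Removing this line reproduces the NullPointerException
    # on RayDP versions without this fix.
    .config("spark.executor.instances", "1")
    .getOrCreate()
)

print(spark.range(100).count())  # trivial job to exercise the executors
spark.stop()
```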
From a user's perspective, for newly written code we would follow the official documentation to create new applications; the examples provided by RayDp are already quite comprehensive. For older code, however, we would prefer to run it directly, in a compatible way, and in practice we might forget to set `spark.executor.instances`.
My two cents: I think our behavior should align with Spark's default, which should be 2 on YARN?
@pang-wu Thank you for your suggestion! The default value of 2 is indeed reasonable. Regarding how to get the number of executors, I also agree that we should first retrieve the value from the `spark.executor.instances` configuration and only fall back to the default when it is not set.
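To make the agreed behavior concrete, here is a minimal Python sketch of the fallback logic; the helper name is hypothetical, and the actual change lives in RayDP's JVM-side code:

```python
from pyspark import SparkConf

DEFAULT_EXECUTOR_INSTANCES = 2  # mirrors Spark's default on YARN

def executor_instances(conf: SparkConf) -> int:
    # Prefer the user-supplied value; fall back to the default instead
    # of failing when spark.executor.instances is absent.
    return int(conf.get("spark.executor.instances",
                        str(DEFAULT_EXECUTOR_INSTANCES)))

assert executor_instances(SparkConf()) == 2  # unset key -> default
```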
The change looks good; can we also write a test?
Thank you for your response! I will add a unit test.
Looks good. Thank you all! Merging this.
During our testing, we found that when using RayDP, a null pointer exception is thrown if the `spark.executor.instances` parameter is not specified. I believe we should set a default value for this parameter: if the user does not specify it, we should use the default value instead of throwing an exception. This would improve the user experience.
The error message is as follows: