diff --git a/doc/mpi.md b/doc/mpi.md
index 1e2b4b85..342c0d7d 100644
--- a/doc/mpi.md
+++ b/doc/mpi.md
@@ -79,7 +79,7 @@ with create_mpi_job(job_name="example",
 
 ### Specify the MPI script and environments
 
-You could customize the MPI job environments and MPI scritps with `mpi_script_prepare_fn` argument.
+You can customize the MPI job environments and MPI scripts with the `mpi_script_prepare_fn` argument.
 
 ```python
 def script_prepare_fn(context: MPIJobContext):
diff --git a/doc/spark_on_ray.md b/doc/spark_on_ray.md
index 85b1b0f1..59d6a17c 100644
--- a/doc/spark_on_ray.md
+++ b/doc/spark_on_ray.md
@@ -1,7 +1,7 @@
 ### Spark master actors node affinity
 
 RayDP will create a ray actor called `RayDPSparkMaster`, which will then launch the java process,
-acting like a Master in a tradtional Spark cluster.
+acting as the Master in a traditional Spark cluster.
 By default, this actor could be scheduled to any node in the ray cluster.
 If you want it to be on a particular node, you can assign some custom resources to that node,
 and request those resources when starting `RayDPSparkMaster` by setting
@@ -83,7 +83,7 @@ spark = raydp.init_spark(app_name='RayDP Oversubscribe Example',
 
 ### External Shuffle Service & Dynamic Resource Allocation
 
-RayDP supports External Shuffle Serivce. To enable it, you can either set `spark.shuffle.service.enabled` to `true` in `spark-defaults.conf`, or you can provide a config to `raydp.init_spark`, as shown below:
+RayDP supports the External Shuffle Service. To enable it, either set `spark.shuffle.service.enabled` to `true` in `spark-defaults.conf`, or pass the config to `raydp.init_spark`, as shown below:
 
 ```python
 raydp.init_spark(..., configs={"spark.shuffle.service.enabled": "true"})
@@ -144,13 +144,27 @@ with open(conf_path, "w") as f:
 3. Run your application, such as `raydp-submit --ray-conf /path/to/ray.conf --class org.apache.spark.examples.SparkPi --conf spark.executor.cores=1 --conf spark.executor.instances=1 --conf spark.executor.memory=500m $SPARK_HOME/examples/jars/spark-examples.jar`. Note that `--ray-conf` must be specified right after raydp-submit, and before any spark arguments.
 
 ### Placement Group
-RayDP can leverage Ray's placement group feature and schedule executors onto spcecified placement group. It provides better control over the allocation of Spark executors on a Ray cluster, for example spreading the spark executors onto seperate nodes or starting all executors on a single node. You can specify a created placement group when init spark, as shown below:
+RayDP can leverage Ray's placement group feature and schedule executors onto a specified placement group. This gives you finer control over how Spark executors are allocated on a Ray cluster, for example spreading the executors across separate nodes or packing them all onto a single node. You can pass an existing placement group when initializing Spark, as shown below:
 
 ```python
 raydp.init_spark(..., placement_group=pg)
 ```
 
-Or you can just specify the placement group strategy. RayDP will create a coreesponding placement group and manage its lifecycle, which means the placement group will be created together with SparkSession and removed when calling `raydp.stop_spark()`. Strategy can be "PACK", "SPREAD", "STRICT_PACK" or "STRICT_SPREAD". Please refer to [Placement Groups document](https://docs.ray.io/en/latest/placement-group.html#pgroup-strategy) for details.
+Alternatively, you can specify only the placement group strategy. RayDP will then create a corresponding placement group and manage its lifecycle: the group is created together with the SparkSession and removed when `raydp.stop_spark()` is called. The strategy can be "PACK", "SPREAD", "STRICT_PACK" or "STRICT_SPREAD"; please refer to the [Placement Groups document](https://docs.ray.io/en/latest/placement-group.html#pgroup-strategy) for details.
 
 ```python
 raydp.init_spark(..., placement_group_strategy="SPREAD")
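+
+# A fuller sketch, assuming a running Ray cluster; the app name and
+# executor sizing below are illustrative, not required values.
+import ray
+import raydp
+
+ray.init()
+spark = raydp.init_spark(app_name="placement_group_example",
+                         num_executors=2,
+                         executor_cores=1,
+                         executor_memory="1g",
+                         placement_group_strategy="SPREAD")
+# ... run your Spark jobs ...
+raydp.stop_spark()  # also removes the placement group RayDP created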
diff --git a/docker/README.md b/docker/README.md
index 4c07757b..695c145d 100644
--- a/docker/README.md
+++ b/docker/README.md
@@ -35,7 +35,24 @@ image:
 
 You can also change other fields in this file to specify number of workers, etc.
 
-Then, you need to deploy the KubeRay operator first, plese refer to [here](https://docs.ray.io/en/latest/cluster/kubernetes/getting-started.html#kuberay-quickstart) for instructions. You can now deploy a Ray cluster with RayDP installed via `helm install ray-cluster PATH_to_CHART`.
+Then, you need to deploy the KubeRay operator first; please refer to the instructions [here](https://docs.ray.io/en/latest/cluster/kubernetes/getting-started.html#kuberay-quickstart). You can now deploy a Ray cluster with RayDP installed via `helm install ray-cluster PATH_to_CHART`.
 
 ## Access the cluster
-Check here [here](https://docs.ray.io/en/master/cluster/kubernetes/getting-started.html#running-applications-on-a-ray-cluster) to see how to run applications on the cluster you just deployed.
+Check [here](https://docs.ray.io/en/master/cluster/kubernetes/getting-started.html#running-applications-on-a-ray-cluster) to see how to run applications on the cluster you just deployed.
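+
+As a minimal sketch, assuming the Ray client port is exposed by your KubeRay head service (the address below is a placeholder), a RayDP application could be submitted like this:
+
+```python
+import ray
+import raydp
+
+# Connect to the deployed cluster through the Ray client.
+ray.init(address="ray://<head-node-service>:10001")
+
+spark = raydp.init_spark(app_name="smoke_test",
+                         num_executors=1,
+                         executor_cores=1,
+                         executor_memory="500m")
+print(spark.range(100).count())  # should print 100
+raydp.stop_spark()
+```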