revise documentation for plugins requiring flyte backend setup (#1062)
* revise documentation for plugins requiring flyte backend setup

Signed-off-by: Samhita Alla <[email protected]>

* nit

Signed-off-by: Samhita Alla <[email protected]>

* update docs

Signed-off-by: Samhita Alla <[email protected]>

* update requirements

Signed-off-by: Samhita Alla <[email protected]>

* downgrade pydantic

Signed-off-by: Samhita Alla <[email protected]>

* revert requirements update

Signed-off-by: Samhita Alla <[email protected]>

* revert requirements update

Signed-off-by: Samhita Alla <[email protected]>

* generate requirements in databricks

Signed-off-by: Samhita Alla <[email protected]>

* add envd

Signed-off-by: Samhita Alla <[email protected]>

* ray isort

Signed-off-by: Samhita Alla <[email protected]>

* sort imports

Signed-off-by: Samhita Alla <[email protected]>

* lint

Signed-off-by: Samhita Alla <[email protected]>

* add envd to plugin requirements

Signed-off-by: Samhita Alla <[email protected]>

* lint

Signed-off-by: Samhita Alla <[email protected]>

* add requirements.txt file

Signed-off-by: Samhita Alla <[email protected]>

* fix requirements

Signed-off-by: Samhita Alla <[email protected]>

* modify imagespec registry

Signed-off-by: Samhita Alla <[email protected]>

* lint

Signed-off-by: Samhita Alla <[email protected]>

* tf dependencies

Signed-off-by: Samhita Alla <[email protected]>

* modify deps and add registry

Signed-off-by: Samhita Alla <[email protected]>

* modify registry

Signed-off-by: Samhita Alla <[email protected]>

* mpi protobuf

Signed-off-by: Samhita Alla <[email protected]>

* add placeholder

Signed-off-by: Samhita Alla <[email protected]>

* incorporate suggestions

Signed-off-by: Samhita Alla <[email protected]>

---------

Signed-off-by: Samhita Alla <[email protected]>
samhita-alla authored Aug 18, 2023
1 parent bcd5624 commit 20a7f1f
Showing 41 changed files with 2,019 additions and 2,770 deletions.
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -20,7 +20,7 @@
# -- Project information -----------------------------------------------------

project = "Flytesnacks"
copyright = "2022, Flyte"
copyright = "2023, Flyte"
author = "Flyte"

# The full version, including alpha/beta/rc tags
71 changes: 33 additions & 38 deletions docs/index.md
@@ -170,8 +170,8 @@ of data between tasks and, more generally, the dependencies between tasks 🔀.
:animate: fade-in-slide-down

Flyte `@task` and `@workflow` decorators are designed to work seamlessly with
-your code-base, provided that the *decorated function is at the top-level scope
-of the module*.
+your code-base, provided that the _decorated function is at the top-level scope
+of the module_.

This means that you can invoke tasks and workflows as regular Python methods and
even import and use them in other Python modules or scripts.
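
As an editorial aside, the top-level-scope requirement is easy to see in a minimal sketch; the `double` and `my_workflow` names are illustrative, not from this commit:

```python
# A minimal sketch, assuming only flytekit itself: tasks and workflows
# defined at module top level can be imported and called like regular
# Python functions.
from flytekit import task, workflow


@task
def double(x: int) -> int:
    return x * 2


@workflow
def my_workflow(x: int) -> int:
    # Within a workflow, tasks are called with keyword arguments.
    return double(x=double(x=x))


if __name__ == "__main__":
    # Local execution: invoked like an ordinary function call.
    print(my_workflow(x=3))  # prints 12
```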
@@ -189,15 +189,15 @@ only supports a subset of Python's semantics. Learn more in the

## Running Flyte Workflows in Python

-You can run the workflow in ``example.py`` on a local Python by using `pyflyte`,
+You can run the workflow in `example.py` on a local Python by using `pyflyte`,
the CLI that ships with `flytekit`.

```{prompt} bash $
pyflyte run example.py training_workflow \
    --hyperparameters '{"C": 0.1}'
```

-:::::{dropdown} {fa}`info-circle` Running into shell issues? 
+:::::{dropdown} {fa}`info-circle` Running into shell issues?
:title: text-muted
:animate: fade-in-slide-down

@@ -212,32 +212,30 @@ set -gx PATH $PATH ~/.local/bin
:::
:::::

-
-
:::::{dropdown} {fa}`info-circle` Why use `pyflyte run` rather than `python example.py`?
:title: text-muted
:animate: fade-in-slide-down

`pyflyte run` enables you to execute a specific workflow using the syntax
`pyflyte run <path/to/script.py> <workflow_or_task_function_name>`.

-Keyword arguments can be supplied to ``pyflyte run`` by passing in options in
-the format ``--kwarg value``, and in the case of ``snake_case_arg`` argument
-names, you can pass in options in the form of ``--snake-case-arg value``.
+Keyword arguments can be supplied to `pyflyte run` by passing in options in
+the format `--kwarg value`, and in the case of `snake_case_arg` argument
+names, you can pass in options in the form of `--snake-case-arg value`.

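As a brief aside, that argument mapping can be illustrated with a hypothetical task; `plot_histogram`, `num_bins`, and `plot.py` are illustrative names, not part of `example.py`:

```python
# Hypothetical sketch of the snake_case -> --snake-case-arg mapping.
from flytekit import task


@task
def plot_histogram(num_bins: int = 10) -> str:
    return f"histogram with {num_bins} bins"


# Assuming this task lives in plot.py, the CLI invocation would be:
#   pyflyte run plot.py plot_histogram --num-bins 20
```
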
::::{note}
If you want to run a workflow with `python example.py`, you would have to write
a `main` module conditional at the end of the script to actually run the
workflow:

-:::{code-block} python
+```python
if __name__ == "__main__":
    training_workflow(hyperparameters={"C": 0.1})
-:::
+```

This becomes even more verbose if you want to pass in arguments:

-:::{code-block} python
+```python
if __name__ == "__main__":
    import json
    from argparse import ArgumentParser
@@ -248,7 +246,7 @@

    args = parser.parse_args()
    training_workflow(hyperparameters=args.hyperparameters)
-:::
+```

::::

@@ -315,7 +313,6 @@ Where ``<execution_name>`` is a unique identifier for the workflow execution.
````

-
## Inspect the Results

Navigate to the URL produced by `pyflyte run`. This will take you to
@@ -324,7 +321,6 @@ workflows, and executions.

![getting started console](https://github.com/flyteorg/static-resources/raw/main/flytesnacks/getting_started/getting_started_console.gif)

-
```{note}
There are a few features about FlyteConsole worth pointing out in the GIF above:
@@ -337,13 +333,12 @@ There are a few features about FlyteConsole worth pointing out in the GIF above:

## Summary

-🎉 **Congratulations! In this introductory guide, you:** 
+🎉 **Congratulations! In this introductory guide, you:**

1. 📜 Created a Flyte script, which trains a binary classification model.
2. 🚀 Spun up a demo Flyte cluster on your local system.
3. 👟 Ran a workflow locally and on a demo Flyte cluster.

-
## What's Next?

Follow the rest of the sections in the documentation to get a better
@@ -439,33 +434,33 @@ flyte_lab
:hidden:
Integrations <integrations>
auto_examples/sql_plugin/index
auto_examples/greatexpectations_plugin/index
auto_examples/papermill_plugin/index
auto_examples/pandera_plugin/index
auto_examples/modin_plugin/index
auto_examples/dolt_plugin/index
auto_examples/airflow_plugin/index
auto_examples/athena_plugin/index
auto_examples/aws_batch_plugin/index
auto_examples/sagemaker_pytorch_plugin/index
auto_examples/sagemaker_training_plugin/index
auto_examples/bigquery_plugin/index
auto_examples/k8s_dask_plugin/index
auto_examples/databricks_plugin/index
auto_examples/dbt_plugin/index
auto_examples/whylogs_plugin/index
auto_examples/mlflow_plugin/index
auto_examples/onnx_plugin/index
auto_examples/dolt_plugin/index
auto_examples/duckdb_plugin/index
auto_examples/greatexpectations_plugin/index
auto_examples/hive_plugin/index
auto_examples/k8s_pod_plugin/index
auto_examples/k8s_dask_plugin/index
auto_examples/k8s_spark_plugin/index
auto_examples/kfpytorch_plugin/index
auto_examples/kftensorflow_plugin/index
auto_examples/mlflow_plugin/index
auto_examples/modin_plugin/index
auto_examples/kfmpi_plugin/index
auto_examples/onnx_plugin/index
auto_examples/papermill_plugin/index
auto_examples/pandera_plugin/index
auto_examples/kfpytorch_plugin/index
auto_examples/ray_plugin/index
auto_examples/sagemaker_training_plugin/index
auto_examples/sagemaker_pytorch_plugin/index
auto_examples/athena_plugin/index
auto_examples/aws_batch_plugin/index
auto_examples/hive_plugin/index
auto_examples/snowflake_plugin/index
auto_examples/databricks_plugin/index
auto_examples/bigquery_plugin/index
auto_examples/airflow_plugin/index
auto_examples/k8s_spark_plugin/index
auto_examples/sql_plugin/index
auto_examples/kftensorflow_plugin/index
auto_examples/whylogs_plugin/index
```

```{toctree}
1 change: 1 addition & 0 deletions examples/basics/basics/named_outputs.py
@@ -58,6 +58,7 @@ def say_hello() -> hello_output:
# which are tuples that need to be de-referenced.
# :::

+
# %%
@workflow
def my_wf() -> wf_outputs:
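
The de-referencing comment in the diff context above is easier to follow with a minimal sketch of named outputs, loosely mirroring `named_outputs.py`; the exact definitions in that file may differ:

```python
# A minimal sketch, assuming only flytekit; the NamedTuple and field
# names are illustrative approximations of named_outputs.py.
import typing

from flytekit import task, workflow

hello_output = typing.NamedTuple("OP", [("greet", str)])
wf_outputs = typing.NamedTuple("WfOutputs", [("greeting", str)])


@task
def say_hello() -> hello_output:
    return hello_output(greet="hello world")


@workflow
def my_wf() -> wf_outputs:
    # The task call returns a NamedTuple-shaped promise; de-reference
    # the field by attribute before wiring it into the workflow output.
    return wf_outputs(greeting=say_hello().greet)
```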
11 changes: 2 additions & 9 deletions examples/databricks_plugin/Dockerfile
@@ -1,6 +1,5 @@
-FROM databricksruntime/standard:11.3-LTS
+FROM databricksruntime/standard:12.2-LTS
LABEL org.opencontainers.image.source=https://github.com/flyteorg/flytesnacks
-# To build this dockerfile, run "make docker_build".

ENV VENV /opt/venv
ENV LANG C.UTF-8
@@ -11,12 +10,6 @@ USER 0

RUN sudo apt-get update && sudo apt-get install -y make build-essential libssl-dev git

-# Install custom package
-RUN /databricks/python3/bin/pip install awscli
-WORKDIR /opt
-RUN curl https://sdk.cloud.google.com > install.sh
-RUN bash /opt/install.sh --install-dir=/opt
-
# Install Python dependencies
COPY ./requirements.txt /databricks/driver/requirements.txt
RUN /databricks/python3/bin/pip install -r /databricks/driver/requirements.txt
@@ -27,6 +20,6 @@ WORKDIR /databricks/driver
COPY . /databricks/driver/

# This tag is supplied by the build script and will be used to determine the version
-# when registering tasks, workflows, and launch plans
+# when registering tasks, workflows and launch plans.
ARG tag
ENV FLYTE_INTERNAL_IMAGE $tag
61 changes: 28 additions & 33 deletions examples/databricks_plugin/README.md
@@ -4,52 +4,47 @@
.. tags:: Spark, Integration, DistributedComputing, Data, Advanced
```

-Flyte backend can be connected with Databricks service. Once enabled it can allow you to submit a spark job to Databricks platform.
-This section will provide how to use the Databricks Plugin using flytekit python.
+Flyte can be seamlessly integrated with the [Databricks](https://www.databricks.com/) service,
+enabling you to effortlessly submit Spark jobs to the Databricks platform.

-## Installation
+## Install the plugin

-The flytekit Databricks plugin is bundled into its Spark plugin, so to use, simply run the following:
+The Databricks plugin comes bundled with the Spark plugin.
+To execute it locally, run the following command:

-```{eval-rst}
-.. prompt:: bash
-    pip install flytekitplugins-spark
+```
+pip install flytekitplugins-spark
```

-## How to Build Your Dockerfile for Spark on Databricks
+If you intend to run the plugin on the Flyte cluster, you must first set it up on the backend.
+Please refer to the
+{std:ref}`Databricks plugin setup guide <flyte:deployment-plugin-setup-webapi-databricks>`
+for detailed instructions.

-Using Spark on Databricks is extremely easy and provides full versioning using the custom-built Spark container. The built container can also execute regular Spark tasks.
-For Spark, the image must use a base image built by Databricks and the workflow code must copy to `/databricks/driver`
+## Run the example on the Flyte cluster

-```{literalinclude} ../../../examples/databricks_plugin/Dockerfile
-:emphasize-lines: 20-32
-:language: docker
-:linenos: true
+To run the provided example on the Flyte cluster, use the following command:

```
+pyflyte run --remote \
+  --image ghcr.io/flyteorg/flytecookbook:databricks_plugin-latest \
+  https://raw.githubusercontent.com/flyteorg/flytesnacks/master/examples/databricks_plugin/databricks_plugin/databricks_job.py \
+  my_databricks_job
+```

-## Configuring the backend to get Databricks plugin working
+:::{note}
+Using Spark on Databricks is incredibly simple and offers comprehensive versioning through a
+custom-built Spark container. This built container also facilitates the execution of standard Spark tasks.

-1. Make sure to add "databricks" in `tasks.task-plugins.enabled-plugin` in [enabled_plugins.yaml](https://github.com/flyteorg/flyte/blob/master/deployment/sandbox/flyte_generated.yaml#L2296)
-2. Add Databricks access token to Flytepropeller. [here](https://docs.databricks.com/administration-guide/access-control/tokens.html#enable-or-disable-token-based-authentication-for-the-workspace) to see more detail to create Databricks access token.
+To utilize Spark, the image should employ a base image provided by Databricks,
+and the workflow code must be copied to `/databricks/driver`.

-```bash
-kubectl edit secret -n flyte flyte-propeller-auth
+```{literalinclude} ../../../examples/databricks_plugin/Dockerfile
+:language: docker
+:emphasize-lines: 1,7-8,20
```

-Configuration will be like below
-
-```bash
-apiVersion: v1
-data:
-  FLYTE_DATABRICKS_API_TOKEN: <ACCESS_TOKEN>
-kind: Secret
-metadata:
-  annotations:
-    meta.helm.sh/release-name: flyte
-    meta.helm.sh/release-namespace: flyte
-...
-```
+:::
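
As an aside for readers of this diff, here is a hedged sketch of what a Databricks-targeted Spark task can look like with `flytekitplugins-spark`; the `Databricks` task config is the plugin's, while the cluster values (`spark_version`, `node_type_id`, worker count) are illustrative assumptions rather than tested settings:

```python
# A sketch under assumptions: submits a trivial Spark job to Databricks
# via the Databricks task config; cluster spec values are placeholders.
import flytekit
from flytekit import task
from flytekitplugins.spark import Databricks


@task(
    task_config=Databricks(
        spark_conf={"spark.driver.memory": "1g"},
        databricks_conf={
            "run_name": "flyte databricks example",
            "new_cluster": {
                "spark_version": "12.2.x-scala2.12",
                "node_type_id": "r5.large",  # placeholder instance type
                "num_workers": 2,
            },
            "timeout_seconds": 3600,
        },
    )
)
def count_fraction() -> float:
    # The Spark plugin injects a session into the task context.
    sess = flytekit.current_context().spark_session
    return sess.sparkContext.parallelize(range(100)).count() / 100.0
```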

```{auto-examples-toc}
databricks_job
