
Commit 4d3c840

Added documentation on file run output to S3 storage and logging; mentioned the new runtime environment variable ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3 (#123)
Signed-off-by: shalberd <[email protected]>
1 parent 81d3224 commit 4d3c840

File tree

4 files changed (+69 −4 lines)
  • pipelines
    • run-generic-pipelines-on-apache-airflow/README.md
    • run-generic-pipelines-on-kubeflow-pipelines/README.md
    • run-pipelines-on-apache-airflow/README.md
    • run-pipelines-on-kubeflow-pipelines/README.md


Diff for: pipelines/run-generic-pipelines-on-apache-airflow/README.md

+18 −1
@@ -53,7 +53,24 @@ Elyra currently supports Apache Airflow deployments that utilize GitHub or GitHu
 - Branch in named repository, e.g. `test-dags`. This branch must exist.
 - [Personal access token](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token) that Elyra can use to push DAGs to the repository, e.g. `4d79206e616d6520697320426f6e642e204a616d657320426f6e64`

-Elyra utilizes S3-compatible cloud storage to make data available to notebooks and Python scripts while they are executed. Any kind of cloud storage should work (e.g. IBM Cloud Object Storage or Minio) as long as it can be accessed from the machine where JupyterLab is running and the Apache Airflow cluster. Collect the following information:
+Elyra utilizes S3-compatible cloud storage to make data available to Jupyter notebooks and R or Python scripts while they are executed. Any kind of cloud storage should work (e.g. IBM Cloud Object Storage or Minio) as long as it can be accessed from the machine where JupyterLab is running and from the Apache Airflow cluster.
+
+Elyra also writes the run output (STDOUT, including STDERR) to a file when the environment variable `ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3` is set to `true` or is not present in the runtime container, which is the default.
+This happens in addition to logging and writing to STDOUT and STDERR at runtime.
+
+`.ipynb` file execution run/STDOUT output is written to S3-compatible object storage in the following files:
+- `<notebook name>-output.ipynb`
+- `<notebook name>.html`
+
+`.r` and `.py` file execution run/STDOUT output is written to S3-compatible object storage in the following file:
+- `<r or python filename>.log`
+
+Note: If you prefer to use S3-compatible storage only to transfer files between pipeline steps and **not for logging information / run output of R, Python and Jupyter Notebook files**, either set the environment variable **`ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3`** to **`false`** in your runtime container builds or pass that value explicitly in the environment variables section of the pipeline editor, either at Pipeline Properties - Generic Node Defaults - Environment Variables or at Node Properties - Additional Properties - Environment Variables.
+
+Collect the following information:
 - S3 compatible object storage endpoint, e.g. `http://minio-service.kubernetes:9000`
 - S3 object storage username, e.g. `minio`
 - S3 object storage password, e.g. `minio123`
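
Because the variable defaults to enabled when it is absent, a notebook or script can check the effective setting at run time. The snippet below is a minimal sketch, not Elyra's own implementation; it simply assumes the documented default (anything other than `false` leaves output capture enabled):

```python
import os

# Minimal sketch (not Elyra's implementation): interpret the variable the way the
# documentation describes it, i.e. capture is on when it is "true" or unset.
raw_value = os.environ.get("ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3", "true")
capture_enabled = raw_value.strip().lower() != "false"

print("Run output will" + ("" if capture_enabled else " not") + " be copied to object storage.")
```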

Diff for: pipelines/run-generic-pipelines-on-kubeflow-pipelines/README.md

+18 −1
@@ -47,7 +47,24 @@ Collect the following information for your Kubeflow Pipelines installation:
 - Password, for a multi-user, auth-enabled Kubeflow installation, e.g. `passw0rd`
 - Workflow engine type, which should be `Argo` or `Tekton`. Contact your administrator if you are unsure which engine your deployment utilizes.

-Elyra utilizes S3-compatible cloud storage to make data available to notebooks and scripts while they are executed. Any kind of cloud storage should work (e.g. IBM Cloud Object Storage or Minio) as long as it can be accessed from the machine where JupyterLab is running and from the Kubeflow Pipelines cluster. Collect the following information:
+Elyra utilizes S3-compatible cloud storage to make data available to Jupyter notebooks and R or Python scripts while they are executed. Any kind of cloud storage should work (e.g. IBM Cloud Object Storage or Minio) as long as it can be accessed from the machine where JupyterLab is running and from the Kubeflow Pipelines cluster.
+
+Elyra also writes the run output (STDOUT, including STDERR) to a file when the environment variable `ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3` is set to `true` or is not present in the runtime container, which is the default.
+This happens in addition to logging and writing to STDOUT and STDERR at runtime.
+
+`.ipynb` file execution run/STDOUT output is written to S3-compatible object storage in the following files:
+- `<notebook name>-output.ipynb`
+- `<notebook name>.html`
+
+`.r` and `.py` file execution run/STDOUT output is written to S3-compatible object storage in the following file:
+- `<r or python filename>.log`
+
+Note: If you prefer to use S3-compatible storage only to transfer files between pipeline steps and **not for logging information / run output of R, Python and Jupyter Notebook files**, either set the environment variable **`ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3`** to **`false`** in your runtime container builds or pass that value explicitly in the environment variables section of the pipeline editor, either at Pipeline Properties - Generic Node Defaults - Environment Variables or at Node Properties - Additional Properties - Environment Variables.
+
+Collect the following information:
 - S3 compatible object storage endpoint, e.g. `http://minio-service.kubernetes:9000`
 - S3 object storage username, e.g. `minio`
 - S3 object storage password, e.g. `minio123`
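
After a pipeline run completes, you can verify that the run-output files listed above were uploaded by inspecting the bucket your runtime configuration points to. A minimal sketch using `boto3`; the endpoint and credentials reuse the example values above, while the bucket name and run prefix are placeholders for your own setup:

```python
import boto3

# Endpoint and credentials reuse the example values from this guide; bucket and
# prefix are placeholders for whatever your runtime configuration and run used.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio-service.kubernetes:9000",
    aws_access_key_id="minio",
    aws_secret_access_key="minio123",
)

response = s3.list_objects_v2(Bucket="my-pipelines-bucket", Prefix="my-pipeline-run/")
for obj in response.get("Contents", []):
    # Expect entries such as <notebook name>-output.ipynb, <notebook name>.html and
    # <r or python filename>.log next to the files transferred between pipeline steps.
    print(obj["Key"], obj["Size"])
```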

Diff for: pipelines/run-pipelines-on-apache-airflow/README.md

+16 −1
@@ -52,7 +52,22 @@ Collect the following information for your Apache Airflow installation:

 Detailed instructions for setting up a DAG repository and generating an access token can be found in [the User Guide](https://elyra.readthedocs.io/en/latest/recipes/configure-airflow-as-a-runtime.html#setting-up-a-dag-repository-on-github).

-Elyra utilizes S3-compatible cloud storage to make data available to notebooks and scripts while they are executed. Any kind of S3-based cloud storage should work (e.g. IBM Cloud Object Storage or Minio) as long as it can be accessed from the machine where JupyterLab/Elyra is running and from the Apache Airflow cluster.
+Elyra utilizes S3-compatible cloud storage to make data available to Jupyter notebooks and R or Python scripts while they are executed. Any kind of S3-based cloud storage should work (e.g. IBM Cloud Object Storage or Minio) as long as it can be accessed from the machine where JupyterLab/Elyra is running and from the Apache Airflow cluster.
+
+Elyra also writes the run output (STDOUT, including STDERR) to a file when the environment variable `ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3` is set to `true` or is not present in the runtime container, which is the default.
+This happens in addition to logging and writing to STDOUT and STDERR at runtime.
+
+`.ipynb` file execution run/STDOUT output is written to S3-compatible object storage in the following files:
+- `<notebook name>-output.ipynb`
+- `<notebook name>.html`
+
+`.r` and `.py` file execution run/STDOUT output is written to S3-compatible object storage in the following file:
+- `<r or python filename>.log`
+
+Note: If you prefer to use S3-compatible storage only to transfer files between pipeline steps and **not for logging information / run output of R, Python and Jupyter Notebook files**, either set the environment variable **`ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3`** to **`false`** in your runtime container builds or pass that value explicitly in the environment variables section of the pipeline editor, either at Pipeline Properties - Generic Node Defaults - Environment Variables or at Node Properties - Additional Properties - Environment Variables.

 Collect the following information:
 - S3 compatible object storage endpoint, e.g. `http://minio-service.kubernetes:9000`
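
The behaviour described above means console output is not redirected away: it still appears in the node's container log, and a copy is kept for upload as a file. Conceptually it is similar to the following sketch, which is illustrative only and not Elyra's actual code; the script and log file names are placeholders:

```python
import subprocess
import sys

# Illustrative only (not Elyra's implementation): run a script, stream its combined
# STDOUT/STDERR to the console and keep a copy that could later be uploaded to S3.
proc = subprocess.Popen(
    [sys.executable, "my_script.py"],               # placeholder script name
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
)
captured = []
for line in proc.stdout:
    sys.stdout.write(line)      # still visible in the container/pod log
    captured.append(line)
proc.wait()

# Placeholder file name; Elyra derives the actual name from the script file name.
with open("my_script.log", "w") as log_file:
    log_file.writelines(captured)
```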

Diff for: pipelines/run-pipelines-on-kubeflow-pipelines/README.md

+17 −1
@@ -52,7 +52,23 @@ Collect the following information for your Kubeflow Pipelines installation:
 - Password, for a multi-user, auth-enabled Kubeflow installation, e.g. `passw0rd`
 - Workflow engine type, which should be `Argo` or `Tekton`. Contact your administrator if you are unsure which engine your deployment utilizes.

-Elyra utilizes S3-compatible cloud storage to make data available to notebooks and scripts while they are executed. Any kind of S3-based cloud storage should work (e.g. IBM Cloud Object Storage or Minio) as long as it can be accessed from the machine where JupyterLab/Elyra is running and from the Kubeflow Pipelines cluster.
+Elyra utilizes S3-compatible cloud storage to make data available to Jupyter notebooks and R or Python scripts while they are executed. Any kind of S3-based cloud storage should work (e.g. IBM Cloud Object Storage or Minio) as long as it can be accessed from the machine where JupyterLab/Elyra is running and from the Kubeflow Pipelines cluster.
+
+Elyra also writes the run output (STDOUT, including STDERR) to a file when the environment variable `ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3` is set to `true` or is not present in the runtime container, which is the default.
+This happens in addition to logging and writing to STDOUT and STDERR at runtime.
+
+`.ipynb` file execution run/STDOUT output is written to S3-compatible object storage in the following files:
+- `<notebook name>-output.ipynb`
+- `<notebook name>.html`
+
+`.r` and `.py` file execution run/STDOUT output is written to S3-compatible object storage in the following file:
+- `<r or python filename>.log`
+
+Note: If you prefer to use S3-compatible storage only to transfer files between pipeline steps and **not for logging information / run output of R, Python and Jupyter Notebook files**, either set the environment variable **`ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3`** to **`false`** in your runtime container builds or pass that value explicitly in the environment variables section of the pipeline editor, either at Pipeline Properties - Generic Node Defaults - Environment Variables or at Node Properties - Additional Properties - Environment Variables.
+

 Collect the following information:
 - S3 compatible object storage endpoint, e.g. `http://minio-service.kubernetes:9000`
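
Once a run has finished, the generated HTML or log files can also be pulled down for offline review. Another minimal `boto3` sketch; the bucket, prefix and object names are placeholders that follow the naming scheme described above:

```python
import boto3

# Endpoint and credentials reuse the example values from this guide; the bucket,
# prefix and object names below are placeholders for your own run.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio-service.kubernetes:9000",
    aws_access_key_id="minio",
    aws_secret_access_key="minio123",
)

# e.g. the rendered notebook output and a script's captured run output
s3.download_file("my-pipelines-bucket", "my-pipeline-run/load_data.html", "load_data.html")
s3.download_file("my-pipelines-bucket", "my-pipeline-run/train_model.log", "train_model.log")
print("Downloaded run-output files for offline review")
```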
