* First pass through top level files under sky/
* WIP working through sky/spot/ & other files
* README: name change
* Name change in docs/source/.
* env var rename: 'SKYPILOT_DISABLE_USAGE_COLLECTION'
* env var rename to SKYPILOT_DEV
* setup.py
* Fix wheel update problem: specify <pkg>-*.whl in many places.
* pylint, constants.py, format, fix tests
* Remove debug remnants
* Markdown changes: ag -w sky **/*.md
* More renaming, mostly examples/.
-Sky is a framework to run any workload seamlessly across different cloud providers through a unified interface. No knowledge of cloud offerings is required or expected – you simply define the workload and its resource requirements, and Sky will automatically execute it on AWS, Google Cloud Platform or Microsoft Azure.
+SkyPilot is a framework to run any workload seamlessly across different cloud providers through a unified interface. No knowledge of cloud offerings is required or expected – you simply define the workload and its resource requirements, and SkyPilot will automatically execute it on AWS, Google Cloud Platform or Microsoft Azure.
 
 <!-- TODO: We need a logo here -->
 ## Getting Started
 Please refer to our [documentation](https://sky-proj-sky.readthedocs-hosted.com/en/latest/).
-IMPORTANT: Please `export SKY_DEV=1` before running the sky commands in the terminal, so that the developing log will not pollute the actual user logs.
+IMPORTANT: Please `export SKYPILOT_DEV=1` before running the CLI commands in the terminal, so that developers' usage logs do not pollute the actual user logs.
 
 
 ### Submitting pull requests
@@ -31,15 +31,15 @@ IMPORTANT: Please `export SKY_DEV=1` before running the sky commands in the term
 - Follow the [Google style guide](https://google.github.io/styleguide/pyguide.html).
 
 
-### Environment Variable Options
--`export SKY_DEV=1` to send the logs to dev space.
--`export SKY_DEBUG=1` to show debugging logs (logging.DEBUG).
--`export SKY_DISABLE_USAGE_COLLECTION=1` to disable usage logging.
--`export SKY_MINIMIZE_LOGGING=1` to minimize the sky outputs for demo purpose.
+### Environment variables for developers
+-`export SKYPILOT_DEV=1` to send usage logs to dev space.
+-`export SKYPILOT_DISABLE_USAGE_COLLECTION=1` to disable usage logging.
+-`export SKYPILOT_DEBUG=1` to show debugging logs (use logging.DEBUG level).
+-`export SKYPILOT_MINIMIZE_LOGGING=1` to minimize the logging for demo purpose.
 
 ### Dump timeline
 
-Timeline is useful for performance analysis and debugging in Sky.
+Timeline is useful for performance analysis and debugging in SkyPilot.
 
 Here are the APIs:
 
@@ -67,12 +67,12 @@ with timeline.FileLockEvent(lockpath):
     pass
 ```
 
-To dump the timeline, set environment variable `SKY_TIMELINE_FILE_PATH` to a file path.
+To dump the timeline, set environment variable `SKYPILOT_TIMELINE_FILE_PATH` to a file path.
 
 View the dumped timeline file using `Chrome` (chrome://tracing) or [Perfetto](https://ui.perfetto.dev/).
 
-### Updating the sky docker image
-1. Authenticate with sky ECR repository. Contact [email protected] for access:
+### Updating the SkyPilot docker image
+1. Authenticate with SkyPilot ECR repository. Contact [email protected] for access:
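
Note on the timeline hunk above: the renamed `SKYPILOT_TIMELINE_FILE_PATH` variable points at a file that Chrome (chrome://tracing) or Perfetto can open, which suggests the Chrome trace event JSON format. The following is a minimal sketch of that idea only, not SkyPilot's `timeline` module: the `event` and `dump` helpers and the `_events` buffer are hypothetical names introduced for illustration.

```python
import json
import os
import threading
import time
from contextlib import contextmanager

# Hypothetical in-memory buffer of trace events (not SkyPilot's internals).
_events = []


@contextmanager
def event(name):
    """Record one "complete" (ph="X") event in Chrome trace event format."""
    start_us = time.time() * 1e6  # the format uses microsecond timestamps
    try:
        yield
    finally:
        _events.append({
            'name': name,
            'ph': 'X',
            'ts': start_us,
            'dur': time.time() * 1e6 - start_us,
            'pid': os.getpid(),
            'tid': threading.get_ident(),
        })


def dump(path=None):
    """Write events to `path`, defaulting to SKYPILOT_TIMELINE_FILE_PATH."""
    path = path or os.environ.get('SKYPILOT_TIMELINE_FILE_PATH')
    if not path:
        return None  # variable unset and no path given: nothing is written
    with open(path, 'w') as f:
        json.dump({'traceEvents': _events}, f)
    return path
```

Wrapping a block in `with event('provision'):` and calling `dump()` at exit would, under these assumptions, produce a file loadable in either viewer.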
--- a/docs/source/examples/spot-jobs.rst
+++ b/docs/source/examples/spot-jobs.rst
@@ -3,10 +3,10 @@
 Managed Spot Jobs
 ================================================
 
-Sky supports managed spot jobs that can **automatically recover from preemptions**.
+SkyPilot supports managed spot jobs that can **automatically recover from preemptions**.
 This feature **saves significant cost** (e.g., up to 70\% for GPU VMs) by making preemptible spot instances practical for long-running jobs.
 
-To maximize availability, Sky automatically finds available spot resources across regions and clouds.
+To maximize availability, SkyPilot automatically finds available spot resources across regions and clouds.
 Here is an example of BERT training job failing over different regions across AWS and GCP.
 
 .. image:: ../images/spot-training.png
@@ -15,17 +15,17 @@ Here is an example of BERT training job failing over different regions across AW
 
 Below are requirements for using managed spot jobs:
 
-(1) **Mounting code and datasets**: Local file mounts/workdir are not supported. Cloud buckets should be used to hold code and datasets, which can be satisfied by using :ref:`Sky Storage <sky-storage>`.
+(1) **Mounting code and datasets**: Local file mounts/workdir are not supported. Cloud buckets should be used to hold code and datasets, which can be satisfied by using :ref:`SkyPilot Storage <sky-storage>`.
 
-(2) **Saving and loading checkpoints**: (For ML jobs) Application code should save checkpoints periodically to a :ref:`Sky Storage <sky-storage>`-mounted cloud bucket. For job recovery, the program should try to reload a latest checkpoint from that path when it starts.
+(2) **Saving and loading checkpoints**: (For ML jobs) Application code should save checkpoints periodically to a :ref:`SkyPilot Storage <sky-storage>`-mounted cloud bucket. For job recovery, the program should try to reload a latest checkpoint from that path when it starts.
 
 We explain them in detail below.
 
 
 Mounting code and datasets
 --------------------------------
 
-To launch a spot job, users should upload their codebase and data to cloud buckets through :ref:`Sky Storage <sky-storage>`.
+To launch a spot job, users should upload their codebase and data to cloud buckets through :ref:`SkyPilot Storage <sky-storage>`.
 Note that the cloud buckets can be mounted to VMs in different regions/clouds and thus enable enables transparent job relaunching without user's intervention.
 The YAML below shows an example.
 
@@ -51,7 +51,7 @@ The YAML below shows an example.
 .. note::
 
 Currently :ref:`workdir <sync-code-artifacts>` and :ref:`file mounts with local files <sync-code-artifacts>` are not
-supported for spot jobs. You can convert them to :ref:`Sky Storage <sky-storage>`.
+supported for spot jobs. You can convert them to :ref:`SkyPilot Storage <sky-storage>`.
 
 Saving and loading checkpoints
 --------------------------------
@@ -66,7 +66,7 @@ Below is an example of mounting a bucket to :code:`/checkpoint`.
 name: # NOTE: Fill in your bucket name
 mode: MOUNT
 
-The :code:`MOUNT` mode in :ref:`Sky Storage <sky-storage>` ensures the checkpoints outputted to :code:`/checkpoint` are automatically synced to a persistent bucket.
+The :code:`MOUNT` mode in :ref:`SkyPilot Storage <sky-storage>` ensures the checkpoints outputted to :code:`/checkpoint` are automatically synced to a persistent bucket.
 Note that the application code should save program checkpoints periodically and reload those states when the job is restarted.
 This is typically achieved by reloading the latest checkpoint at the beginning of your program.
 
@@ -146,7 +146,7 @@ With the above changes, you are ready to launch a spot job with ``sky spot launc
 
 $ sky spot launch -n bert-qa bert_qa.yaml
 
-Sky will launch and start monitoring the spot job. When a preemption happens, Sky will automatically
+SkyPilot will launch and start monitoring the spot job. When a preemption happens, SkyPilot will automatically
 search for resources across regions and clouds to re-launch the job.
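
Note on the checkpointing requirement in the spot-jobs.rst hunks above: "reload the latest checkpoint at the beginning of your program" can be sketched roughly as below. This is an illustration under stated assumptions, not code from the SkyPilot docs: the `ckpt-<step>.json` naming scheme, the JSON state, and the `latest_checkpoint`, `save_checkpoint`, and `train` helpers are all hypothetical. Only the `/checkpoint` mount path comes from the document.

```python
import json
import os
import re

CKPT_DIR = '/checkpoint'  # mount path from the docs (bucket-backed in MOUNT mode)
_PATTERN = re.compile(r'^ckpt-(\d+)\.json$')  # hypothetical naming scheme


def latest_checkpoint(ckpt_dir=CKPT_DIR):
    """Return (step, path) of the highest-step checkpoint, or None."""
    if not os.path.isdir(ckpt_dir):
        return None
    found = []
    for fname in os.listdir(ckpt_dir):
        match = _PATTERN.match(fname)
        if match:
            found.append((int(match.group(1)), os.path.join(ckpt_dir, fname)))
    return max(found) if found else None


def save_checkpoint(step, state, ckpt_dir=CKPT_DIR):
    """Write the state for `step` into the checkpoint directory."""
    with open(os.path.join(ckpt_dir, f'ckpt-{step}.json'), 'w') as f:
        json.dump(state, f)


def train(total_steps, ckpt_dir=CKPT_DIR, save_every=10):
    """Resume from the latest checkpoint if present, else start at step 0."""
    resume = latest_checkpoint(ckpt_dir)
    if resume:
        with open(resume[1]) as f:
            state = json.load(f)
    else:
        state = {'step': 0, 'loss': None}
    for step in range(state['step'] + 1, total_steps + 1):
        state = {'step': step, 'loss': 1.0 / step}  # stand-in for real training
        if step % save_every == 0:
            save_checkpoint(step, state, ckpt_dir)
    return state
```

With this shape, a job relaunched after a preemption calls `train` again and continues from the last saved step instead of step 0; work done since the last save is repeated.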
-Sky currently supports three major cloud providers: AWS, GCP, and Azure. If you
+SkyPilot currently supports three major cloud providers: AWS, GCP, and Azure. If you
 only have access to certain clouds, use any combination of
 :code:`".[aws,azure,gcp]"` (e.g., :code:`".[aws,gcp]"`) to reduce the
 dependencies installed.
 
 .. note::
 
-For Macs, macOS >= 10.15 is required to install Sky. Apple Silicon-based devices (e.g. Apple M1) must run :code:`conda install grpcio` prior to installing Sky.
+For Macs, macOS >= 10.15 is required to install SkyPilot. Apple Silicon-based devices (e.g. Apple M1) must run :code:`conda install grpcio` prior to installing SkyPilot.
 
 .. note::
 
-As an alternative to installing Sky on your laptop, we also provide a Docker image as a quick way to try out Sky. See instructions below on running Sky:ref:`in a container <docker-image>`.
+As an alternative to installing SkyPilot on your laptop, we also provide a Docker image as a quick way to try out SkyPilot. See instructions below on running SkyPilot:ref:`in a container <docker-image>`.
 
 .. _cloud-account-setup:
 
@@ -89,12 +89,12 @@ This will produce a summary like:
 
 .. code-block:: text
 
-Checking credentials to enable clouds for Sky.
+Checking credentials to enable clouds for SkyPilot.
 AWS: enabled
 GCP: enabled
 Azure: enabled
 
-Sky will use only the enabled clouds to run tasks. To change this, configure cloud credentials, and run sky check.
+SkyPilot will use only the enabled clouds to run tasks. To change this, configure cloud credentials, and run sky check.