From 90aec72e95c8c7b7ea7c24ad58222ffadff560c8 Mon Sep 17 00:00:00 2001
From: rumjie <diwoon95@gmail.com>
Date: Mon, 26 Aug 2024 22:07:22 +0900
Subject: [PATCH 1/6] translate until line 45

---
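Note for reviewers: the translated paragraph on gradient synchronization refers to the ring all-reduce algorithm. As a rough illustration of that idea only, the sketch below simulates it in a single process with plain Python lists. It makes no claim about how PyTorch or its communication backends implement the operation. The function name `ring_all_reduce`, the equal-sized chunking, and the snapshot-based "simultaneous send" are all assumptions made for this sketch.

```python
def ring_all_reduce(grads):
    """Toy single-process simulation of ring all-reduce (sum).

    grads[r] is the gradient list held by hypothetical worker r.
    Returns one list per worker; every list holds the element-wise sum.
    Assumes len(grads[0]) is divisible by the number of workers.
    """
    n = len(grads)
    length = len(grads[0])
    assert length % n == 0, "sketch assumes equal-sized chunks"
    chunk = length // n
    bufs = [list(g) for g in grads]  # copy so the inputs stay untouched

    def span(c):
        # Index range covered by chunk c.
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1 (scatter-reduce): after n - 1 steps, worker r holds the
    # fully summed chunk (r + 1) % n.
    for step in range(n - 1):
        # Snapshot outgoing data first so all sends look simultaneous.
        sends = []
        for r in range(n):
            c = (r - step) % n
            sends.append((c, [bufs[r][i] for i in span(c)]))
        for r, (c, data) in enumerate(sends):
            dst = (r + 1) % n  # right-hand neighbour on the ring
            for i, v in zip(span(c), data):
                bufs[dst][i] += v

    # Phase 2 (all-gather): pass the finished chunks around the ring.
    for step in range(n - 1):
        sends = []
        for r in range(n):
            c = (r + 1 - step) % n
            sends.append((c, [bufs[r][i] for i in span(c)]))
        for r, (c, data) in enumerate(sends):
            dst = (r + 1) % n
            for i, v in zip(span(c), data):
                bufs[dst][i] = v

    return bufs


if __name__ == "__main__":
    per_worker = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    # Each worker ends with [111, 222, 333].
    print(ring_all_reduce(per_worker))
```

Each worker only ever talks to its right-hand neighbour and sends one chunk per step, which is the property that keeps per-worker traffic roughly constant as the ring grows. Dividing the summed result by the number of workers would give the averaged gradient.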
 beginner_source/ddp_series_theory.rst | 33 ++++++++++++++-------------
 1 file changed, 17 insertions(+), 16 deletions(-)
diff --git a/beginner_source/ddp_series_theory.rst b/beginner_source/ddp_series_theory.rst
index bd77ed13c..a88d8785f 100644
--- a/beginner_source/ddp_series_theory.rst
+++ b/beginner_source/ddp_series_theory.rst
@@ -1,13 +1,14 @@
-`Introduction <ddp_series_intro.html>`__ \|\| **What is DDP** \|\|
-`Single-Node Multi-GPU Training <ddp_series_multigpu.html>`__ \|\|
-`Fault Tolerance <ddp_series_fault_tolerance.html>`__ \|\|
-`Multi-Node training <../intermediate/ddp_series_multinode.html>`__ \|\|
-`minGPT Training <../intermediate/ddp_series_minGPT.html>`__
+`소개 <ddp_series_intro.html>`__ \|\| **분산 데이터 병렬 처리 (DDP) 란 무엇인가?** \|\|
+`단일 노드 다중-GPU 학습 <ddp_series_multigpu.html>`__ \|\|
+`결함 내성 <ddp_series_fault_tolerance.html>`__ \|\|
+`다중 노드 학습 <../intermediate/ddp_series_multinode.html>`__ \|\|
+`minGPT 학습 <../intermediate/ddp_series_minGPT.html>`__
 
-What is Distributed Data Parallel (DDP)
+분산 데이터 병렬 처리 (DDP) 란 무엇인가?
 =======================================
 
 Authors: `Suraj Subramanian <https://github.com/suraj813>`__
+번역: `박지은 <https://github.com/rumjie>`__
 
 .. grid:: 2
 
@@ -22,7 +23,7 @@ Authors: `Suraj Subramanian <https://github.com/suraj813>`__
 
       * Familiarity with `basic non-distributed training  <https://tutorials.pytorch.kr/beginner/basics/quickstart_tutorial.html>`__ in PyTorch
 
-Follow along with the video below or on `youtube <https://www.youtube.com/watch/Cvdhwx-OBBo>`__.
+아래의 영상이나 `유투브 영상 youtube <https://www.youtube.com/watch/Cvdhwx-OBBo>`__을 따라 진행하세요.
 
 .. raw:: html
 
@@ -30,17 +31,17 @@ Follow along with the video below or on `youtube <https://www.youtube.com/watch/
      <iframe width="560" height="315" src="https://www.youtube.com/embed/Cvdhwx-OBBo" frameborder="0" allow="accelerometer; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
    </div>
 
-This tutorial is a gentle introduction to PyTorch `DistributedDataParallel <https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html>`__ (DDP)
-which enables data parallel training in PyTorch. Data parallelism is a way to
-process multiple data batches across multiple devices simultaneously
-to achieve better performance. In PyTorch, the `DistributedSampler <https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler>`__
-ensures each device gets a non-overlapping input batch. The model is replicated on all the devices;
-each replica calculates gradients and simultaneously synchronizes with the others using the `ring all-reduce
-algorithm <https://tech.preferred.jp/en/blog/technologies-behind-distributed-deep-learning-allreduce/>`__.
+이 튜토리얼은 파이토치에서 분산 데이터 병렬 학습을 가능하게 하는 `분산 데이터 병렬 <https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html>`__ (DDP)
+에 대해 소개합니다. 데이터 병렬 처리란 더 높은 성능을 달성하기 위해
+여러 개의 디바이스에서 여러 데이터 배치들을 동시에 처리하는 방법입니다. 
+파이토치에서, `분산 샘플러 <https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler>`__ 는 
+각 디바이스가 서로 다른 입력 배치를 받는 것을 보장합니다.
+모델은 모든 디바이스에 복제되며, 각 사본은 변화도를 계산하는 동시에 `링 올-리듀스
+알고리즘 <https://tech.preferred.jp/en/blog/technologies-behind-distributed-deep-learning-allreduce/>`__ 을 사용해 다른 사본과 동기화됩니다.
 
-This `illustrative tutorial <https://tutorials.pytorch.kr/intermediate/dist_tuto.html#>`__ provides a more in-depth python view of the mechanics of DDP.
+`예시 튜토리얼 <https://tutorials.pytorch.kr/intermediate/dist_tuto.html#>`__ 에서 DDP 메커니즘에 대해 파이썬 관점에서 심도 있는 설명을 볼 수 있습니다. 
 
-Why you should prefer DDP over ``DataParallel`` (DP)
+``데이터 병렬 DataParallel`` (DP) 보다 DDP가 나은 이유
 ----------------------------------------------------
 
 `DataParallel <https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html>`__

From 1f42fa225def079ea7ae30c038cc0bde1413cfc6 Mon Sep 17 00:00:00 2001
From: rumjie <diwoon95@gmail.com>
Date: Tue, 3 Sep 2024 23:54:58 +0900
Subject: [PATCH 2/6] draft PR complete

---
 beginner_source/ddp_series_theory.rst | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/beginner_source/ddp_series_theory.rst b/beginner_source/ddp_series_theory.rst
index a88d8785f..d2c8e701e 100644
--- a/beginner_source/ddp_series_theory.rst
+++ b/beginner_source/ddp_series_theory.rst
@@ -44,28 +44,27 @@ Authors: `Suraj Subramanian <https://github.com/suraj813>`__
 ``데이터 병렬 DataParallel`` (DP) 보다 DDP가 나은 이유
 ----------------------------------------------------
 
-`DataParallel <https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html>`__
-is an older approach to data parallelism. DP is trivially simple (with just one extra line of code) but it is much less performant.
-DDP improves upon the architecture in a few ways:
+`DP <https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html>`__ 는 데이터 병렬 처리의 이전 접근 방식입니다.
+DP 는 간단하지만, (한 줄만 추가하면 됨) 성능은 훨씬 떨어집니다. DDP는 아래와 같은 방식으로 아키텍처를 개선합니다.
 
 +---------------------------------------+------------------------------+
 | ``DataParallel``                      | ``DistributedDataParallel``  |
 +=======================================+==============================+
-| More overhead; model is replicated    | Model is replicated only     |
-| and destroyed at each forward pass    | once                         |
+| 작업 부하가 큼; 전파될 때마다                | 모델이 한 번만 복제됨             |
+| 모델이 복제 및 삭제됨                      |                              |
 +---------------------------------------+------------------------------+
-| Only supports single-node parallelism | Supports scaling to multiple |
-|                                       | machines                     |
+| 단일 노드 병렬 처리만 가능                  | 여러 머신으로 확장 가능            |
+|                                       |                              |
 +---------------------------------------+------------------------------+
-| Slower; uses multithreading on a      | Faster (no GIL contention)   |
-| single process and runs into Global   | because it uses              |
-| Interpreter Lock (GIL) contention     | multiprocessing              |
+| 느림; 단일 프로세스에서 멀티 스레딩            | 빠름 (no GIL contention)     |
+| (multithreading)을 사용하기 때문에 Global  | 멀티 프로세싱을 사용하기 때문에 GIL |
+| Interpreter Lock (GIL) 충돌이 발생       | 충돌 없음                      |
 +---------------------------------------+------------------------------+
 
-Further Reading
+읽을거리
 ---------------
 
--  `Multi-GPU training with DDP <ddp_series_multigpu.html>`__ (next tutorial in this series)
+-  `Multi-GPU training with DDP <ddp_series_multigpu.html>`__ (이 시리즈의 다음 튜토리얼)
 -  `DDP
    API <https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html>`__
 -  `DDP Internal

From d525f2113d6bfc9aa2fb370bb4d6b7e0cb108586 Mon Sep 17 00:00:00 2001
From: rumjie <diwoon95@gmail.com>
Date: Wed, 4 Sep 2024 00:14:42 +0900
Subject: [PATCH 3/6] table edited

---
 beginner_source/ddp_series_theory.rst | 28 ++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/beginner_source/ddp_series_theory.rst b/beginner_source/ddp_series_theory.rst
index d2c8e701e..3cf9853d6 100644
--- a/beginner_source/ddp_series_theory.rst
+++ b/beginner_source/ddp_series_theory.rst
@@ -47,19 +47,21 @@ Authors: `Suraj Subramanian <https://github.com/suraj813>`__
 `DP <https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html>`__ 는 데이터 병렬 처리의 이전 접근 방식입니다.
 DP 는 간단하지만, (한 줄만 추가하면 됨) 성능은 훨씬 떨어집니다. DDP는 아래와 같은 방식으로 아키텍처를 개선합니다.
 
-+---------------------------------------+------------------------------+
-| ``DataParallel``                      | ``DistributedDataParallel``  |
-+=======================================+==============================+
-| 작업 부하가 큼; 전파될 때마다                | 모델이 한 번만 복제됨             |
-| 모델이 복제 및 삭제됨                      |                              |
-+---------------------------------------+------------------------------+
-| 단일 노드 병렬 처리만 가능                  | 여러 머신으로 확장 가능            |
-|                                       |                              |
-+---------------------------------------+------------------------------+
-| 느림; 단일 프로세스에서 멀티 스레딩            | 빠름 (no GIL contention)     |
-| (multithreading)을 사용하기 때문에 Global  | 멀티 프로세싱을 사용하기 때문에 GIL |
-| Interpreter Lock (GIL) 충돌이 발생       | 충돌 없음                      |
-+---------------------------------------+------------------------------+
++-------------------------------------+------------------------------+
+| ``DataParallel``                    | ``DistributedDataParallel``  |
++=====================================+==============================+
+| 작업 부하가 큼; 전파될 때마다       | 모델이 한 번만 복제됨         |
+| 모델이 복제 및 삭제됨               |                              |
++-------------------------------------+------------------------------+
+| 단일 노드 병렬 처리만 가능          | 여러 머신으로 확장 가능       |
+|                                     |                              |
++-------------------------------------+------------------------------+
+| 느림; 단일 프로세스에서 멀티 스레딩  | 빠름 (no GIL contention)      |
+| (multithreading)을 사용하기 때문에  | 멀티 프로세싱을 사용하기 때문에 GIL  |
+| Global Interpreter Lock (GIL)       | 충돌 없음                    |
+| 충돌이 발생                        |                              |
++-------------------------------------+------------------------------+
+
 
 읽을거리
 ---------------

From a6d27e10632de4da491fec0f55af986e4f401e8f Mon Sep 17 00:00:00 2001
From: rumjie <diwoon95@gmail.com>
Date: Wed, 4 Sep 2024 00:24:41 +0900
Subject: [PATCH 4/6] card translation

---
 beginner_source/ddp_series_theory.rst | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/beginner_source/ddp_series_theory.rst b/beginner_source/ddp_series_theory.rst
index 3cf9853d6..60b3fdc6e 100644
--- a/beginner_source/ddp_series_theory.rst
+++ b/beginner_source/ddp_series_theory.rst
@@ -12,16 +12,16 @@ Authors: `Suraj Subramanian <https://github.com/suraj813>`__
 
 .. grid:: 2
 
-   .. grid-item-card:: :octicon:`mortar-board;1em;` What you will learn
+   .. grid-item-card:: :octicon:`mortar-board;1em;` 이 장에서 배우는 것
 
-      *  How DDP works under the hood
-      *  What is ``DistributedSampler``
-      *  How gradients are synchronized across GPUs
+      *  DDP 의 내부 작동 원리
+      *  ``DistributedSampler`` 이란 무엇인가?
+      *  GPU 간 기울기들이 동기화되는 방법
 
 
-   .. grid-item-card:: :octicon:`list-unordered;1em;` Prerequisites
+   .. grid-item-card:: :octicon:`list-unordered;1em;` 필요 사항
 
-      * Familiarity with `basic non-distributed training  <https://tutorials.pytorch.kr/beginner/basics/quickstart_tutorial.html>`__ in PyTorch
+      * 파이토치 `비분산 학습  <https://tutorials.pytorch.kr/beginner/basics/quickstart_tutorial.html>`__ 에 익숙할 것
 
 아래의 영상이나 `유투브 영상 youtube <https://www.youtube.com/watch/Cvdhwx-OBBo>`__을 따라 진행하세요.
 

From ba40331d3e3a907815c46d6c94fcfae23083db69 Mon Sep 17 00:00:00 2001
From: rumjie <diwoon95@gmail.com>
Date: Wed, 4 Sep 2024 00:36:29 +0900
Subject: [PATCH 5/6] minor correction - space/table update

---
 beginner_source/ddp_series_theory.rst | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/beginner_source/ddp_series_theory.rst b/beginner_source/ddp_series_theory.rst
index 60b3fdc6e..9906dc297 100644
--- a/beginner_source/ddp_series_theory.rst
+++ b/beginner_source/ddp_series_theory.rst
@@ -23,7 +23,7 @@ Authors: `Suraj Subramanian <https://github.com/suraj813>`__
 
       * 파이토치 `비분산 학습  <https://tutorials.pytorch.kr/beginner/basics/quickstart_tutorial.html>`__ 에 익숙할 것
 
-아래의 영상이나 `유투브 영상 youtube <https://www.youtube.com/watch/Cvdhwx-OBBo>`__을 따라 진행하세요.
+아래의 영상이나 `유투브 영상 youtube <https://www.youtube.com/watch/Cvdhwx-OBBo>`__ 을 따라 진행하세요.
 
 .. raw:: html
 
@@ -47,20 +47,17 @@ Authors: `Suraj Subramanian <https://github.com/suraj813>`__
 `DP <https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html>`__ 는 데이터 병렬 처리의 이전 접근 방식입니다.
 DP 는 간단하지만, (한 줄만 추가하면 됨) 성능은 훨씬 떨어집니다. DDP는 아래와 같은 방식으로 아키텍처를 개선합니다.
 
-+-------------------------------------+------------------------------+
-| ``DataParallel``                    | ``DistributedDataParallel``  |
-+=====================================+==============================+
-| 작업 부하가 큼; 전파될 때마다       | 모델이 한 번만 복제됨         |
-| 모델이 복제 및 삭제됨               |                              |
-+-------------------------------------+------------------------------+
-| 단일 노드 병렬 처리만 가능          | 여러 머신으로 확장 가능       |
-|                                     |                              |
-+-------------------------------------+------------------------------+
-| 느림; 단일 프로세스에서 멀티 스레딩  | 빠름 (no GIL contention)      |
-| (multithreading)을 사용하기 때문에  | 멀티 프로세싱을 사용하기 때문에 GIL  |
-| Global Interpreter Lock (GIL)       | 충돌 없음                    |
-| 충돌이 발생                        |                              |
-+-------------------------------------+------------------------------+
+.. list-table::
+   :header-rows: 1
+
+   * - ``DataParallel``
+     - ``DistributedDataParallel``
+   * - 작업 부하가 큼; 전파될 때마다 모델이 복제 및 삭제됨
+     - 모델이 한 번만 복제됨
+   * - 단일 노드 병렬 처리만 가능
+     - 여러 머신으로 확장 가능
+   * - 느림; 단일 프로세스에서 멀티 스레딩을 사용하기 때문에 Global Interpreter Lock (GIL) 충돌이 발생
+     - 빠름; 멀티 프로세싱을 사용하기 때문에 GIL 충돌 없음
 
 
 읽을거리

From f3edc12c3a518e5ae94d8db3cfb3727d38bd6d90 Mon Sep 17 00:00:00 2001
From: rumjie <diwoon95@gmail.com>
Date: Sun, 29 Sep 2024 13:22:11 +0900
Subject: [PATCH 6/6] correction based on comments

---
 beginner_source/ddp_series_theory.rst | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/beginner_source/ddp_series_theory.rst b/beginner_source/ddp_series_theory.rst
index 9906dc297..888898b56 100644
--- a/beginner_source/ddp_series_theory.rst
+++ b/beginner_source/ddp_series_theory.rst
@@ -7,7 +7,7 @@
 분산 데이터 병렬 처리 (DDP) 란 무엇인가?
 =======================================
 
-Authors: `Suraj Subramanian <https://github.com/suraj813>`__
+저자: `Suraj Subramanian <https://github.com/suraj813>`__
 번역: `박지은 <https://github.com/rumjie>`__
 
 .. grid:: 2
@@ -16,7 +16,7 @@ Authors: `Suraj Subramanian <https://github.com/suraj813>`__
 
       *  DDP 의 내부 작동 원리
       *  ``DistributedSampler`` 이란 무엇인가?
-      *  GPU 간 기울기들이 동기화되는 방법
+      *  GPU 간 변화도가 동기화되는 방법
 
 
    .. grid-item-card:: :octicon:`list-unordered;1em;` 필요 사항
@@ -36,7 +36,7 @@ Authors: `Suraj Subramanian <https://github.com/suraj813>`__
 여러 개의 디바이스에서 여러 데이터 배치들을 동시에 처리하는 방법입니다. 
 파이토치에서, `분산 샘플러 <https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler>`__ 는 
 각 디바이스가 서로 다른 입력 배치를 받는 것을 보장합니다.
-모델은 모든 디바이스에 복제되며, 각 사본은 변화도를 계산하는 동시에 `링 올-리듀스
+모델은 모든 디바이스에 복제되며, 각 사본은 변화도를 계산하는 동시에 `Ring-All-Reduce
 알고리즘 <https://tech.preferred.jp/en/blog/technologies-behind-distributed-deep-learning-allreduce/>`__ 을 사용해 다른 사본과 동기화됩니다.
 
 `예시 튜토리얼 <https://tutorials.pytorch.kr/intermediate/dist_tuto.html#>`__ 에서 DDP 메커니즘에 대해 파이썬 관점에서 심도 있는 설명을 볼 수 있습니다. 
@@ -52,12 +52,12 @@ DP 는 간단하지만, (한 줄만 추가하면 됨) 성능은 훨씬 떨어집
 
    * - ``DataParallel``
      - ``DistributedDataParallel``
-   * - 작업 부하가 큼; 전파될 때마다 모델이 복제 및 삭제됨
+   * - 작업 부하가 큼, 전파될 때마다 모델이 복제 및 삭제됨
      - 모델이 한 번만 복제됨
    * - 단일 노드 병렬 처리만 가능
      - 여러 머신으로 확장 가능
-   * - 느림; 단일 프로세스에서 멀티 스레딩을 사용하기 때문에 Global Interpreter Lock (GIL) 충돌이 발생
-     - 빠름; 멀티 프로세싱을 사용하기 때문에 GIL 충돌 없음
+   * - 느림, 단일 프로세스에서 멀티 스레딩을 사용하기 때문에 Global Interpreter Lock (GIL) 충돌이 발생
+     - 빠름, 멀티 프로세싱을 사용하기 때문에 GIL 충돌 없음
 
 
 읽을거리