Add an explanation of why fine-tuning the newer chatglm3-6b model requires more GPU memory, and a workaround #14

Open · wants to merge 1 commit into main
21 changes: 20 additions & 1 deletion chatglm/qlora_chatglm3.ipynb
@@ -8,6 +8,25 @@
"# 使用领域(私有)数据微调 ChatGLM3"
]
},
{
"cell_type": "markdown",
"id": "ff3a3927-59e9-4ea3-abcd-332147eac0ed",
"metadata": {},
"source": [
"<details>\n",
"<summary><b>注意:新版本的 chatglm3-6b 模型在微调过程中需要更多显存</b> (点击查看解决方法)</summary>\n",
"<table>\n",
"<tr><td><b>现象</b></td><td>如果在微调 chatglm3-6b 模型时,执行 trainer.train() 后瞬间出现 “OutOfMemoryError: CUDA out of memory.”(显存不足)错误,请考虑指定一个旧版本进行微调。</td></tr>\n",
"\n",
"<tr><td><b>原因</b></td><td>从2024年2月6日后发布的 chatglm3-6b 模型开始,调用 torch.utils.checkpoint.checkpoint() 方法时,use_reentrant 参数的默认值由原来的 True 改为 False。这导致在训练时保存更多的梯度参数以加快反向传播计算速度,同时占用更多显存。此外,新代码覆盖了基类 PreTrainedModel 中的 gradient_checkpointing_enable() 方法,导致无法通过梯度检查点来减少显存占用。有关详细的代码更改,请查看 <a href=\"https://huggingface.co/THUDM/chatglm3-6b/commit/37fe0008ee928af7f4e5e57693e8c7787d049af8\">Commit 37fe000</a>。</td></tr>\n",
"\n",
"<tr><td><b>方案</b></td><td>在加载 Tokenizer 和 Model 时,通过添加 revision 参数指定一个旧版本。示例代码如下:<br>\n",
"tokenizer = AutoTokenizer.from_pretrained(......, revision='b098244')<br>\n",
"model = AutoModel.from_pretrained(......, revision='b098244')</td></tr>\n",
"</table>\n",
"</details>"
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -917,7 +936,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.11.7"
}
},
"nbformat": 4,
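For readers who want to apply the workaround directly, here is a minimal, self-contained sketch of the pinned-revision loading described in the note above. It assumes the Hugging Face model id THUDM/chatglm3-6b (the repo linked in the commit); the notebook itself passes additional loading arguments in place of the elided "......".

```python
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "THUDM/chatglm3-6b"  # Hugging Face model id (assumed from the linked commit)
REVISION = "b098244"            # pre-2024-02-06 revision cited in the note above

# Pinning `revision` downloads this exact commit of the model repo, so the
# older checkpointing behavior (use_reentrant=True plus the inherited
# PreTrainedModel.gradient_checkpointing_enable) is kept during fine-tuning.
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # chatglm3-6b ships custom tokenizer/model code
    revision=REVISION,
)
model = AutoModel.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    revision=REVISION,
)
```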
29 changes: 28 additions & 1 deletion chatglm/qlora_chatglm3_timestamp.ipynb
@@ -10,6 +10,33 @@
"生成带有 epoch 和 timestamp 的模型文件"
]
},
{
"cell_type": "markdown",
"id": "e63fd22b-b00f-4643-bc8a-a7cac180e943",
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-08T16:19:38.797312Z",
"iopub.status.busy": "2024-02-08T16:19:38.796865Z",
"iopub.status.idle": "2024-02-08T16:19:38.815322Z",
"shell.execute_reply": "2024-02-08T16:19:38.813949Z",
"shell.execute_reply.started": "2024-02-08T16:19:38.797267Z"
}
},
"source": [
"<details>\n",
"<summary><b>注意:新版本的 chatglm3-6b 模型在微调过程中需要更多显存</b> (点击查看解决方法)</summary>\n",
"<table>\n",
"<tr><td><b>现象</b></td><td>如果在微调 chatglm3-6b 模型时,执行 trainer.train() 后瞬间出现 “OutOfMemoryError: CUDA out of memory.”(显存不足)错误,请考虑指定一个旧版本进行微调。</td></tr>\n",
"\n",
"<tr><td><b>原因</b></td><td>从2024年2月6日后发布的 chatglm3-6b 模型开始,调用 torch.utils.checkpoint.checkpoint() 方法时,use_reentrant 参数的默认值由原来的 True 改为 False。这导致在训练时保存更多的梯度参数以加快反向传播计算速度,同时占用更多显存。此外,新代码覆盖了基类 PreTrainedModel 中的 gradient_checkpointing_enable() 方法,导致无法通过梯度检查点来减少显存占用。有关详细的代码更改,请查看 <a href=\"https://huggingface.co/THUDM/chatglm3-6b/commit/37fe0008ee928af7f4e5e57693e8c7787d049af8\">Commit 37fe000</a>。</td></tr>\n",
"\n",
"<tr><td><b>方案</b></td><td>在加载 Tokenizer 和 Model 时,通过添加 revision 参数指定一个旧版本。示例代码如下:<br>\n",
"tokenizer = AutoTokenizer.from_pretrained(......, revision='b098244')<br>\n",
"model = AutoModel.from_pretrained(......, revision='b098244')</td></tr>\n",
"</table>\n",
"</details>"
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -923,7 +950,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.11.7"
}
},
"nbformat": 4,