Commit 0b75e20

TroyGarden authored and facebook-github-bot committed
set LD_LIBRARY_PATH for fbgemm in validate_binaries.sh
Summary:
# context
* to address the error when running the GitHub test
```
+++ conda run -n build_binary python -c 'import torch; import fbgemm_gpu; import torchrec'
+++ local cmd=run
+++ case "$cmd" in
+++ __conda_exe run -n build_binary python -c 'import torch; import fbgemm_gpu; import torchrec'
+++ /opt/conda/bin/conda run -n build_binary python -c 'import torch; import fbgemm_gpu; import torchrec'
ERROR:root:Could not load the library 'fbgemm_gpu_tbe_index_select.so': /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/fbgemm_gpu_tbe_index_select.so)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/__init__.py", line 62, in <module>
    _load_library(f"{library}.so")
  File "/opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/__init__.py", line 21, in _load_library
    raise error
  File "/opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/__init__.py", line 17, in _load_library
    main()
  File "/home/ec2-user/actions-runner/_work/torchrec/torchrec/test-infra/.github/scripts/run_with_env_secrets.py", line 98, in main
    run_cmd_or_die(f"docker exec -t {container_name} /exec")
  File "/home/ec2-user/actions-runner/_work/torchrec/torchrec/test-infra/.github/scripts/run_with_env_secrets.py", line 39, in run_cmd_or_die
    raise RuntimeError(f"Command {cmd} failed with exit code {exit_code}")
RuntimeError: Command docker exec -t d5cfe23625bf3b1538b808a1344090ae72ff3977990bc1f780c7a46435a384ec /exec failed with exit code 1
    torch.ops.load_library(os.path.join(os.path.dirname(__file__), filename))
  File "/opt/conda/envs/build_binary/lib/python3.10/site-packages/torch/_ops.py", line 1357, in load_library
    ctypes.CDLL(path)
  File "/opt/conda/envs/build_binary/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/fbgemm_gpu_tbe_index_select.so)
```
* the issue was fixed before by D67949409 [#2671](#2671) for another test
* this diff applies the same fix to the validate_binaries test.

# details
* previous failures {F1974496108}

Differential Revision: D68511145
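
The `GLIBCXX_3.4.29' not found` message means the loader picked up a `libstdc++.so.6` that does not export that symbol version. A minimal sketch of how one might check a candidate library for the version string is below; the helper name `has_symbol_version` is hypothetical and not part of the commit, and a plain string search is only a rough stand-in for inspecting the library's version definitions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the commit): succeed if the file at $1
# contains the symbol-version string $2, for example GLIBCXX_3.4.29.
has_symbol_version() {
    local lib="$1" version="$2"
    # -a: treat the binary as text, -q: no output, -F: fixed-string match
    grep -aqF -- "$version" "$lib"
}
```

Using the paths from the log as examples, `has_symbol_version /lib64/libstdc++.so.6 GLIBCXX_3.4.29` would be expected to fail on the affected runner, while the same check against `${CONDA_ENV}/lib/libstdc++.so.6` (assuming the conda environment ships its own copy) would show whether putting `${CONDA_ENV}/lib` on `LD_LIBRARY_PATH` can resolve the error.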
1 parent dd5457c commit 0b75e20

1 file changed: +14 -0 lines changed

.github/scripts/validate_binaries.sh

@@ -49,6 +49,20 @@ elif [[ ${MATRIX_CHANNEL} = 'release' ]]; then
     export PYTORCH_URL="https://download.pytorch.org/whl/${CUDA_VERSION}"
 fi
 
+
+echo "CU_VERSION: ${CU_VERSION}"
+echo "CHANNEL: ${CHANNEL}"
+echo "CONDA_ENV: ${CONDA_ENV}"
+
+if [[ $CU_VERSION = cu* ]]; then
+    # Setting LD_LIBRARY_PATH fixes the runtime error with fbgemm_gpu not
+    # being able to locate libnvrtc.so
+    echo "[NOVA] Setting LD_LIBRARY_PATH ..."
+    conda env config vars set -p ${CONDA_ENV} \
+        LD_LIBRARY_PATH="/usr/local/lib:${CUDA_HOME}/lib64:${CONDA_ENV}/lib:${LD_LIBRARY_PATH}"
+fi
+
+
 # install pytorch
 # switch back to conda once torch nightly is fixed
 # if [[ ${MATRIX_GPU_ARCH_TYPE} = 'cuda' ]]; then
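
One detail of the added `LD_LIBRARY_PATH` line: if `LD_LIBRARY_PATH` is unset when the script runs, the value ends with a trailing `:`, and the dynamic loader treats an empty entry as the current working directory. The sketch below shows one way to join only non-empty entries. The `join_paths` helper is hypothetical and not part of this commit; it is an illustration under that assumption, not a proposed change.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the commit): join the non-empty arguments
# with ':' so the result never contains an empty path entry.
join_paths() {
    local out="" entry
    for entry in "$@"; do
        # Skip empty arguments, e.g. an unset LD_LIBRARY_PATH
        [ -n "$entry" ] || continue
        # Prepend a ':' separator only when out already has content
        out="${out:+${out}:}${entry}"
    done
    printf '%s\n' "$out"
}
```

With it, the value could be built as `"$(join_paths /usr/local/lib "${CUDA_HOME}/lib64" "${CONDA_ENV}/lib" "${LD_LIBRARY_PATH:-}")"`. Note that this only drops arguments that are entirely empty; an unset `CUDA_HOME` would still produce the entry `/lib64`.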
