Commit c4623f0

Merge remote-tracking branch 'origin' into 26-add-optimization

2 parents 837a1fe + ac2f19b · commit c4623f0

12 files changed: +377 −22 lines

README.md

Lines changed: 77 additions & 22 deletions

## Installation

Removed (old setup instructions):

### Setup python virtual environment

### Development installation

```bash
export GRIDTOOLS_JL_PATH="..."
export GT4PY_PATH="..."
# create python virtual environment
# make sure to use a python version that is compatible with GT4Py
python -m venv .venv
# activate virtual env
# this command has to be run every time GridTools.jl is used
source .venv/bin/activate
# clone gt4py
git clone --branch fix_python_interp_path_in_cmake git@github.com:tehrengruber/gt4py.git
# git clone git@github.com:GridTools/gt4py.git $GT4PY_PATH
pip install -r $GT4PY_PATH/requirements-dev.txt
pip install -e $GT4PY_PATH
```

Added:

### Development Installation

As of August 2024, the recommended Python version for development is **3.10.14**.

**Important Note:** The Python virtual environment must be created in the directory specified by `GRIDTOOLS_JL_PATH/.venv`. Creating the environment in any other location will result in errors.

#### Steps to Set Up the Development Environment

1. **Set Environment Variables:**
   Set the environment variables `GRIDTOOLS_JL_PATH` and `GT4PY_PATH`. Replace `...` with the appropriate paths on your system.

   ```bash
   export GRIDTOOLS_JL_PATH="..."
   export GT4PY_PATH="..."
   ```

2. **Create a Python Virtual Environment:**
   Navigate to the `GRIDTOOLS_JL_PATH` directory and create a Python virtual environment named `.venv`. Ensure you are using a compatible Python version (i.e., 3.10.14).

   ```bash
   cd $GRIDTOOLS_JL_PATH
   python3.10 -m venv .venv
   ```

3. **Activate the Virtual Environment:**
   Activate the virtual environment. You need to run this command every time you work with GridTools.jl.

   ```bash
   source .venv/bin/activate
   ```

4. **Clone the GT4Py Repository:**
   Clone the GT4Py repository. You can use the specific branch mentioned below, or the main repository as needed.

   ```bash
   git clone --branch fix_python_interp_path_in_cmake git@github.com:tehrengruber/gt4py.git
   # Alternatively, you can clone the main repository:
   # git clone git@github.com:GridTools/gt4py.git $GT4PY_PATH
   ```

5. **Install Required Packages:**
   Install the development requirements and the GT4Py package in editable mode.

   ```bash
   pip install -r $GT4PY_PATH/requirements-dev.txt
   pip install -e $GT4PY_PATH
   ```

6. **Build PyCall:**
   With the virtual environment activated, run Julia from the `GridTools.jl` folder with the command `julia --project=.` and then build using the following commands:

   ```julia
   using Pkg
   Pkg.build()
   ```
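Once the environment is activated (step 3), a quick sanity check is that the `python` on your `PATH` resolves inside `$GRIDTOOLS_JL_PATH/.venv`; otherwise PyCall will later be built against the wrong interpreter. A self-contained sketch of that check, where a temporary directory stands in for your real `GRIDTOOLS_JL_PATH`:

```shell
# Illustrative check only: a temp dir stands in for the real GRIDTOOLS_JL_PATH.
GRIDTOOLS_JL_PATH=$(mktemp -d)
python3 -m venv "$GRIDTOOLS_JL_PATH/.venv"
source "$GRIDTOOLS_JL_PATH/.venv/bin/activate"
# After activation, `python` must come from the venv.
command -v python | grep -q "$GRIDTOOLS_JL_PATH/.venv/bin/python" && echo "venv active"
```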
## Troubleshooting

### Common Build Errors

__undefined symbol: PyObject_Vectorcall__
- Make sure to run everything in the same environment that you built `PyCall` with. A common cause of this error is that PyCall was built in a virtual environment that was then not loaded when executing stencils.

__CMake Error: Could NOT find Boost__
- GridTools.jl requires the Boost library version 1.65.1 or higher. If Boost is not installed, you can install it via your system's package manager. For example, on Ubuntu, use:
  ```bash
  sudo apt-get install libboost-all-dev
  ```
  Make sure the installed version meets the minimum required version of 1.65.1. If CMake still cannot find Boost after installation, you may need to manually specify the Boost installation path in the CMake command using the `-DBOOST_ROOT=/path/to/boost` option, where `/path/to/boost` is the directory where Boost is installed.

__Supporting GPU Backend with CUDA__

- To enable GPU acceleration and use the GPU backend features of this project, the NVIDIA CUDA Toolkit must be installed. CUDA provides the necessary compiler (`nvcc`) and libraries for developing and running applications that leverage NVIDIA GPUs.

- If the `LD_LIBRARY_PATH` environment variable is set in your current environment, it is recommended to unset it. This avoids conflicts between the paths managed by CUDA.jl and those already present on the system.
  ```julia
  julia> using CUDA
  ┌ Warning: CUDA runtime library `...` was loaded from a system path, `/usr/local/cuda/lib64/...`.
  │
  │ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
  │ environment variable, or that it does not contain paths to CUDA libraries.
  ```
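A minimal way to apply that advice is to clear the variable in the shell that launches Julia. The `julia` invocation below is commented out and illustrative (it assumes CUDA.jl is installed in the active project):

```shell
# Clear LD_LIBRARY_PATH so CUDA.jl resolves its own artifact libraries
# instead of a system CUDA under e.g. /usr/local/cuda.
unset LD_LIBRARY_PATH
echo "LD_LIBRARY_PATH is now ${LD_LIBRARY_PATH:-unset}"
# then launch Julia as usual (illustrative):
# julia --project=. -e 'using CUDA; CUDA.versioninfo()'
```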

ci/cscs.yml

Lines changed: 89 additions & 0 deletions

```yaml
stages:
  - build_base_stage0_image
  - build_base_stage1_image
  - build_base_stage2_image
  - build_image
  - ci_jobs

variables:
  GPU_ENABLED: true
  CUDA_DRIVER_VERSION: "470.57.02"
  PROJECT_NAME: gridtools_jl
  PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/pasc_kilos/${CONTAINER_RUNNER}/${PROJECT_NAME}_image:$CI_COMMIT_SHORT_SHA
  CPU_ARCH: "x86_64_v3" # use a generic architecture here instead of linux-sles15-haswell, such that it can build on zen2

include:
  - remote: 'https://gitlab.com/cscs-ci/recipes/-/raw/master/templates/v2/.ci-ext.yml'

.gt-container-builder:
  extends: .container-builder
  timeout: 2h
  before_script:
    - DOCKER_TAG=`eval cat $WATCH_FILECHANGES | sha256sum | head -c 16`
    - |
      if [[ "$CI_COMMIT_MESSAGE" =~ "Trigger container rebuild $ENV_VAR_NAME" ]]; then
        echo "Rebuild triggered."
        export CSCS_REBUILD_POLICY="always"
      fi
    - export PERSIST_IMAGE_NAME=$PERSIST_IMAGE_NAME:$DOCKER_TAG
    - echo "$ENV_VAR_NAME=$PERSIST_IMAGE_NAME" > build.env
  artifacts:
    reports:
      dotenv: build.env
  variables:
    # the variables below MUST be set to a sane value; they are listed here to show
    # which variables should be set.
    DOCKERFILE: ci/docker/Dockerfile.base # overwrite with the real path of the Dockerfile
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/base/my_base_image # Important: no version tag
    WATCH_FILECHANGES: 'ci/docker/Dockerfile.base "path/to/another/file with whitespaces.txt"'
    ENV_VAR_NAME: BASE_IMAGE

build_base_stage0_image_job:
  stage: build_base_stage0_image
  extends: .gt-container-builder
  variables:
    DOCKERFILE: docker/base/Dockerfile
    DOCKER_BUILD_ARGS: '["INSTALL_CUDA_DRIVER=$GPU_ENABLED", "CUDA_DRIVER_VERSION=$CUDA_DRIVER_VERSION", "CPU_ARCH=$CPU_ARCH"]'
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/gridtools/${CONTAINER_RUNNER}/gridtools_jl_base_image
    WATCH_FILECHANGES: 'docker/base/Dockerfile'
    ENV_VAR_NAME: BASE_IMAGE_STAGE0

build_base_stage1_image_job:
  stage: build_base_stage1_image
  extends: .gt-container-builder
  variables:
    DOCKERFILE: docker/base_spack_deps/Dockerfile
    DOCKER_BUILD_ARGS: '["BASE_IMAGE=$BASE_IMAGE_STAGE0", "PROJECT_NAME=$PROJECT_NAME", "SPACK_ENV_FILE=spack-${CONTAINER_RUNNER}.yaml"]'
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/gridtools/${CONTAINER_RUNNER}/${PROJECT_NAME}_base_stage1_image
    WATCH_FILECHANGES: 'docker/base/Dockerfile docker/base_spack_deps/Dockerfile docker/base_spack_deps/spack-daint-p100.yaml' # TODO: inherit from stage0
    ENV_VAR_NAME: BASE_IMAGE_STAGE1

build_base_stage2_image_job:
  stage: build_base_stage2_image
  extends: .gt-container-builder
  variables:
    DOCKERFILE: docker/base_deps/Dockerfile
    DOCKER_BUILD_ARGS: '["BASE_IMAGE=$BASE_IMAGE_STAGE1", "PROJECT_NAME=$PROJECT_NAME"]'
    PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/gridtools/${CONTAINER_RUNNER}/${PROJECT_NAME}_base_stage2_image
    WATCH_FILECHANGES: 'docker/base/Dockerfile docker/base_spack_deps/Dockerfile docker/base_spack_deps/spack-daint-p100.yaml docker/base_deps/Dockerfile' # TODO: inherit from stage1
    ENV_VAR_NAME: BASE_IMAGE_STAGE2

build_image:
  stage: build_image
  extends: .container-builder
  variables:
    DOCKERFILE: docker/image/Dockerfile
    DOCKER_BUILD_ARGS: '["BASE_IMAGE=$BASE_IMAGE_STAGE2", "PROJECT_NAME=$PROJECT_NAME"]'

run_tests:
  stage: ci_jobs
  image: $PERSIST_IMAGE_NAME
  extends: .container-runner-daint
  script:
    - . /opt/gridtools_jl_env/setup-env.sh
    - cd /opt/GridTools
    - julia --project=. -e 'using Pkg; Pkg.test()'
  variables:
    SLURM_JOB_NUM_NODES: 1
    SLURM_NTASKS: 1
    SLURM_TIMELIMIT: "00:30:00"
```
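The `DOCKER_TAG` computed in `.gt-container-builder`'s `before_script` is a content hash: the watched files are concatenated and the first 16 hex characters of their sha256 digest become the image tag, so an image is rebuilt only when a watched file changes (the `eval` in the original handles quoted paths containing whitespace). A standalone sketch of that computation, with invented file names:

```shell
# Reproduce the tag computation: concatenate the watched files and keep the
# first 16 hex characters of their sha256 digest as a stable image tag.
cd "$(mktemp -d)"
WATCH_FILECHANGES='a.txt b.txt'
printf 'hello' > a.txt
printf 'world' > b.txt
DOCKER_TAG=$(cat $WATCH_FILECHANGES | sha256sum | head -c 16)
echo "$DOCKER_TAG"   # → 936a185caaa266bb
```

Identical file contents always produce the same tag, which lets the CI reuse a previously built image instead of rebuilding it.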

docker/base/Dockerfile

Lines changed: 48 additions & 0 deletions

```dockerfile
# just a counter to trigger rebuilds: 3
FROM ubuntu:23.04 as builder
ARG INSTALL_CUDA_DRIVER=false
ARG CUDA_DRIVER_VERSION
ARG CPU_ARCH

SHELL ["/bin/bash", "-c"]

RUN apt-get update \
    && env DEBIAN_FRONTEND=noninteractive TZ=Europe/Zurich apt-get -yqq install --no-install-recommends build-essential ca-certificates coreutils curl environment-modules file gfortran git git-lfs gpg gpg-agent lsb-release openssh-client python3 python3-distutils python3-venv unzip zip

RUN apt-get clean

WORKDIR /opt/gridtools_jl_env

COPY ./docker/base/install_cuda_driver.sh ./install_cuda_driver.sh
RUN if [ "x$INSTALL_CUDA_DRIVER" == "xtrue" ]; then ./install_cuda_driver.sh $CUDA_DRIVER_VERSION; fi

RUN git clone --depth 1 -c feature.manyFiles=true https://github.com/spack/spack.git

# In case the driver is not installed, this fixes missing `-lcuda` errors when installing cupy.
#RUN git remote add origin_tehrengruber https://github.com/tehrengruber/spack.git
#RUN git fetch origin_tehrengruber
#RUN git checkout --track origin_tehrengruber/fix_libcuda_not_found

WORKDIR ./spack/bin

# careful: this overrides and will be overridden by other configuration to packages:all:require
RUN ./spack config add packages:all:require:target=$CPU_ARCH

RUN ./spack install gcc@11

# cleanup
RUN ./spack clean --all
RUN ./spack gc -y

# strip all the binaries
RUN find -L /opt/gridtools_jl_env/spack/opt -type f -exec readlink -f '{}' \; | \
    xargs file -i | \
    grep 'charset=binary' | \
    grep 'x-executable\|x-archive\|x-sharedlib' | \
    awk -F: '{print $1}' | xargs strip -x || true

WORKDIR /

# flatten image
FROM scratch
COPY --from=builder / /
```

docker/base/install_cuda_driver.sh

Lines changed: 21 additions & 0 deletions

```bash
#!/bin/bash
CUDA_DRIVER_VERSION=$1

echo "Installing CUDA driver version $CUDA_DRIVER_VERSION"
apt-get -yqq install --no-install-recommends kmod wget
wget -q https://us.download.nvidia.com/XFree86/Linux-x86_64/${CUDA_DRIVER_VERSION}/NVIDIA-Linux-x86_64-${CUDA_DRIVER_VERSION}.run
chmod +x NVIDIA-Linux-x86_64-${CUDA_DRIVER_VERSION}.run
./NVIDIA-Linux-x86_64-${CUDA_DRIVER_VERSION}.run -s -q -a \
    --no-nvidia-modprobe \
    --no-abi-note \
    --no-kernel-module \
    --no-distro-scripts \
    --no-opengl-files \
    --no-wine-files \
    --no-kernel-module-source \
    --no-unified-memory \
    --no-drm \
    --no-libglx-indirect \
    --no-install-libglvnd \
    --no-systemd
rm ./NVIDIA-Linux-x86_64-${CUDA_DRIVER_VERSION}.run
```

docker/base_deps/Dockerfile

Lines changed: 25 additions & 0 deletions

```dockerfile
# rebuild counter 3 # just a counter to increase when we want a new image
ARG BASE_IMAGE=gridtools_jl_spack_deps_image
FROM $BASE_IMAGE as builder
ARG PROJECT_NAME

WORKDIR /opt/${PROJECT_NAME}_env

COPY ./docker/base_deps/setup-env.sh ./setup-env.sh
RUN sed -i "s/%PROJECT_NAME%/$PROJECT_NAME/g" setup-env.sh

WORKDIR /opt/
COPY ./docker/base_deps/install_gt4py.sh ./install_gt4py.sh
RUN . /opt/${PROJECT_NAME}_env/setup-env.sh; ./install_gt4py.sh
RUN . /opt/${PROJECT_NAME}_env/setup-env.sh; pip cache purge

WORKDIR /opt/gridtools_jl_deps
COPY ./Project.toml ./Project.toml
RUN mkdir src
COPY ./docker/base_deps/dummy_module.jl ./src/GridTools.jl
RUN . /opt/${PROJECT_NAME}_env/setup-env.sh; julia --project=. -e "using Pkg; Pkg.instantiate(); Pkg.build(); Pkg.precompile()"
RUN rm -rf /opt/gridtools_jl_deps

# flatten image
FROM scratch
COPY --from=builder / /
```

docker/base_deps/dummy_module.jl

Lines changed: 2 additions & 0 deletions

```julia
module GridTools
end
```

docker/base_deps/install_gt4py.sh

Lines changed: 4 additions & 0 deletions

```bash
#!/bin/bash
git clone --branch fix_python_interp_path_in_cmake https://github.com/tehrengruber/gt4py.git
pip install -r ./gt4py/requirements-dev.txt
pip install ./gt4py
```

docker/base_deps/setup-env.sh

Lines changed: 24 additions & 0 deletions

```bash
#!/bin/bash
# note: occurrences of %PROJECT_NAME% in this file are replaced when copied into the container
export HOME=/root

. /opt/%PROJECT_NAME%_env/spack/share/spack/setup-env.sh

# gcc is installed outside the env, so load it first. If gcc is not loaded we might run
# into strange errors where partially the spack version and partially the system-installed
# version is used.
spack load gcc

spack env activate %PROJECT_NAME%_env

# use this complicated way to load packages in case multiple versions are installed.
# This was needed because two versions of py-pip were installed (one is only a build
# dependency). Since we now run `spack gc -y` this is superfluous (build-only
# dependencies are removed before we land here), but we keep it for now.
#PACKAGES_TO_LOAD=("python" "py-pip" "gcc")
#for PKG_NAME in ${PACKAGES_TO_LOAD[@]}; do
#    SHORT_SPEC=$(spack find --explicit --format "{short_spec}" $PKG_NAME)
#    SHORT_SPEC=${SHORT_SPEC%/*} # remove hash after `/` character
#    spack load $SHORT_SPEC
#done
spack load python py-pip boost julia
```

docker/base_spack_deps/Dockerfile

Lines changed: 39 additions & 0 deletions

```dockerfile
# rebuild counter 3 # just a counter to increase when we want a new image
ARG BASE_IMAGE=gridtools_jl_base_image
FROM $BASE_IMAGE as builder
ARG PROJECT_NAME=gridtools_jl
ARG SPACK_ENV_FILE=spack-daint-p100.yaml

# TODO(tehrengruber): Copy spack environment to clean image. Then we don't need to run `spack gc`
# and `spack clean` anymore. See https://spack.readthedocs.io/en/latest/containers.html for
# more information.

WORKDIR /opt/${PROJECT_NAME}_env/spack/bin

COPY ./docker/base_spack_deps/${SPACK_ENV_FILE} ./spack_env_${PROJECT_NAME}.yaml
RUN ./spack env create ${PROJECT_NAME}_env spack_env_${PROJECT_NAME}.yaml
# remove all compilers such that everything is built with the compiler we installed
RUN ./spack compiler remove -a gcc
RUN ./spack -e ${PROJECT_NAME}_env compiler find $(./spack location --install-dir gcc@11)
# using --fresh ensures the concretization does not care about the build cache (untested and not
# used right now as we don't use a build cache yet)
RUN ./spack -e ${PROJECT_NAME}_env concretize --fresh
COPY ./docker/base_spack_deps/run_until_succeed.sh ./run_until_succeed.sh
RUN ./run_until_succeed.sh ./spack -e ${PROJECT_NAME}_env install

# cleanup
RUN ./spack -e ${PROJECT_NAME}_env clean --all
RUN ./spack -e ${PROJECT_NAME}_env gc -y

# strip all the binaries
RUN find -L /opt/${PROJECT_NAME}_env/spack/opt -type f -exec readlink -f '{}' \; | \
    xargs file -i | \
    grep 'charset=binary' | \
    grep 'x-executable\|x-archive\|x-sharedlib' | \
    awk -F: '{print $1}' | xargs strip -x || true

WORKDIR /

# flatten image
FROM scratch
COPY --from=builder / /
```
docker/base_spack_deps/run_until_succeed.sh

Lines changed: 23 additions & 0 deletions

```bash
#!/bin/bash

# Set the maximum number of attempts
max_attempts=10
attempt=0

# Check if a command is provided
if [ $# -eq 0 ]; then
    echo "Usage: $0 MY_BASH_COMMAND ARGS..."
    exit 1
fi

# Loop until the command succeeds or the maximum attempts are reached
while ! "$@"; do
    attempt=$((attempt + 1))
    if [ $attempt -ge $max_attempts ]; then
        echo "Command failed after $max_attempts attempts."
        exit 1
    fi
    echo "Attempt $attempt/$max_attempts failed. Retrying..."
done

echo "Command succeeded on attempt $attempt."
```
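To see the retry loop in action, here is a self-contained demo in the same pattern; the `flaky.sh` helper and its counter file are invented for the example and fail twice before succeeding:

```shell
# Demo: a command that fails until its counter file reaches 3, driven by the
# same `while ! "$@"` retry pattern as above (flaky.sh is invented for this demo).
dir=$(mktemp -d)
cat > "$dir/flaky.sh" <<'EOF'
#!/bin/bash
n=$(cat "${COUNT_FILE}" 2>/dev/null || echo 0)
n=$((n + 1))
echo "$n" > "${COUNT_FILE}"
[ "$n" -ge 3 ]
EOF
chmod +x "$dir/flaky.sh"
export COUNT_FILE="$dir/count"

max_attempts=10
attempt=0
while ! "$dir/flaky.sh"; do
    attempt=$((attempt + 1))
    if [ $attempt -ge $max_attempts ]; then
        echo "Command failed after $max_attempts attempts."
        exit 1
    fi
    echo "Attempt $attempt/$max_attempts failed. Retrying..."
done
# Note: `attempt` counts failures, so after two failures and one success
# this prints "Command succeeded on attempt 2."
echo "Command succeeded on attempt $attempt."
```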
