DISCONTINUATION OF PROJECT.
This project will no longer be maintained by Intel.
Intel will not provide or guarantee development of or support for this project, including but not limited to, maintenance, bug fixes, new releases or updates. Patches to this project are no longer accepted by Intel. If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the community, please create your own fork of the project.
Contact: [email protected]
Productivity library for distributed and partitioned memory based on C++ Ranges.
Distributed Ranges is a C++ productivity library for distributed and partitioned memory based on C++ ranges. It offers a collection of data structures, views, and algorithms for building generic abstractions, and provides interoperability with MPI, SHMEM, SYCL, and OpenMP as well as portability across CPUs and GPUs. NUMA-aware allocators and distributed data structures facilitate the development of C++ applications on heterogeneous nodes with multiple devices, achieving excellent performance and parallel scalability by exploiting local compute and data access.
In this model one can:
- create a distributed data structure that works with all our algorithms out of the box
- create an algorithm that works with all our distributed data structures out of the box
Distributed Ranges is the glue that makes this possible.
- Usage:
- Introductory presentation: Distributed Ranges, why you need it, 2024
- Article: Get Started with Distributed Ranges, 2023
- Tutorial: Distributed Ranges Tutorial
- Design / Implementation:
- Conference paper: Distributed Ranges, A Model for Distributed Data Structures, Algorithms, and Views, 2024
- Talk: CppCon 2023; Benjamin Brock; Distributed Ranges, 2023
- Technical presentation: Intel Innovation'23, 2023
- API specification
- Linux
- cmake >=3.20
- OneAPI HPC Toolkit installed
Enable oneAPI by:

```shell
source ~/intel/oneapi/setvars.sh
```

... or by:

```shell
source /opt/intel/oneapi/setvars.sh
```

... or wherever you have the `oneapi/setvars.sh` script installed on your system.
- CUDA
- OneAPI for NVIDIA GPUs plugin
When enabling oneAPI, use the `--include-intel-llvm` option, e.g. call:

```shell
source ~/intel/oneapi/setvars.sh --include-intel-llvm
```

... instead of plain `source ~/intel/oneapi/setvars.sh`.
All tests and examples can be built by:

```shell
CXX=icpx cmake -B build
cmake --build build -- -j
```
Note
- The Distributed Ranges library works in two models:
  - Multi-Process (based on SYCL and MPI)
  - Single-Process (based on pure SYCL)

On NVIDIA GPUs, only the Multi-Process model is currently supported.
To build multi-process tests, call:

```shell
CXX=icpx cmake -B build -DENABLE_CUDA:BOOL=ON
cmake --build build --target mp-all-tests -- -j
```
Run multi-process tests:

```shell
ctest --test-dir build --output-on-failure -L MP -j 4
```

Run single-process tests:

```shell
ctest --test-dir build --output-on-failure -L SP -j 4
```

Run all tests:

```shell
ctest --test-dir build --output-on-failure -L TESTLABEL -j 4
```
Two binaries are built for benchmarks:
- `mp-bench` - for benchmarking the Multi-Process model
- `sp-bench` - for benchmarking the Single-Process model
Here are examples of running single benchmarks.
Running the GemvEq_DR strong-scaling benchmark in the Multi-Process model using two GPUs:

```shell
ONEAPI_DEVICE_SELECTOR='level_zero:gpu' I_MPI_OFFLOAD=1 I_MPI_OFFLOAD_CELL_LIST=0-11 \
  mpiexec -n 2 -ppn 2 build/benchmarks/gbench/mp/mp-bench --vector-size 1000000000 --reps 50 \
  --v=3 --benchmark_out=mp_gemv.txt --benchmark_filter=GemvEq_DR/ --sycl
```
Running the Exclusive_Scan_DR weak-scaling benchmark in the Single-Process model using two GPUs:

```shell
ONEAPI_DEVICE_SELECTOR='level_zero:gpu' KMP_AFFINITY=compact \
  build/benchmarks/gbench/sp/sp-bench --vector-size 1000000000 --reps 50 \
  --v=3 --benchmark_out=sp_exclscan.txt --benchmark_filter=Exclusive_Scan_DR/ \
  --weak-scaling --device-memory --num-devices 2
```
Check all options:

```shell
./build/benchmarks/gbench/mp/mp-bench --help    # see google test options help
./build/benchmarks/gbench/mp/mp-bench --drhelp  # see DR-specific options
```
See Distributed Ranges Tutorial for a few well explained examples.
If your project uses CMake, add the following to your `CMakeLists.txt` to download the library:

```cmake
find_package(MPI REQUIRED)
include(FetchContent)
FetchContent_Declare(
  dr
  GIT_REPOSITORY https://github.com/oneapi-src/distributed-ranges.git
  GIT_TAG main)
FetchContent_MakeAvailable(dr)
```
The above will define targets that can be included in your project:
```cmake
target_link_libraries(<application> MPI::MPI_CXX DR::mpi)
```
See Distributed Ranges Tutorial for a live example of a cmake project that imports and uses Distributed Ranges.
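Putting the fragments above together, a minimal consumer project could look like the sketch below (the `my_app` project and target names and the `main.cpp` source file are placeholders, not part of the library):

```cmake
cmake_minimum_required(VERSION 3.20)
project(my_app LANGUAGES CXX)

find_package(MPI REQUIRED)

# Fetch Distributed Ranges at configure time, as shown above.
include(FetchContent)
FetchContent_Declare(
  dr
  GIT_REPOSITORY https://github.com/oneapi-src/distributed-ranges.git
  GIT_TAG main)
FetchContent_MakeAvailable(dr)

add_executable(my_app main.cpp)
target_link_libraries(my_app MPI::MPI_CXX DR::mpi)
```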
Add the code below to your `main` function to enable logging.
If using Single-Process model:
```cpp
std::ofstream logfile("dr.log");
dr::drlog.set_file(logfile);
```
If using Multi-Process model:
```cpp
int my_mpi_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &my_mpi_rank);
std::ofstream logfile(fmt::format("dr.{}.log", my_mpi_rank));
dr::drlog.set_file(logfile);
```
Example of adding a custom log statement to your code:

```cpp
DRLOG("my debug message with varA:{} and varB:{}", a, b);
```
Contact us by opening a new issue.
We seek collaboration opportunities and welcome feedback on ways to extend the library, according to developer needs.
- CONTRIBUTING
- Fuzz Testing
- Spec Editing - Editing the API document
- Print Type - Print types at compile time
- Testing - Test system maintenance
- Security - Security policy
- Doxygen