Adding Velocity Bench's easywave version for testing #2

Open
wants to merge 2 commits into
base: main
69 changes: 69 additions & 0 deletions easywave_vb/README.md
@@ -0,0 +1,69 @@
# easyWave

easyWave is a tsunami wave generator developed by ZIB (the original source code is available [here](https://github.com/christgau/easywave-sycl)).

## Supported versions

- CUDA: The original code was obtained from [here](https://git.gfz-potsdam.de/id2/geoperil/easyWave)
- DPC++: Currently works on PVC
- HIP: Currently works on AMD Instinct MI100 and MI250 GPUs

Data files used for testing all versions can be found [here](https://git.gfz-potsdam.de/id2/geoperil/easyWave/-/tree/master/data).

# Build Instructions

## DPC++ on PVC

Use the source files from ```easywave/sycl/src```

```
mkdir build
cd build
# Configure with the oneAPI compilers
CC=/path/to/oneAPI/bin/clang CXX=/path/to/oneAPI/bin/clang++ cmake ..
# Build the configured project
cmake --build .
```
Note: To enable ahead-of-time (AOT) compilation for PVC, pass `-DGPU_AOT=pvc` to `cmake`. Without this flag, the kernels are JIT-compiled at run time.

## DPC++ using NVIDIA/AMD backend

To compile the SYCL code on NVIDIA GPUs, please use the following:

`-DUSE_NVIDIA_BACKEND=ON -DUSE_SM={80|90}`

To compile the SYCL code on AMD GPUs, please use the following:

`-DUSE_AMDHIP_BACKEND=gfx90a` for MI250 or `-DUSE_AMDHIP_BACKEND=gfx908` for MI100
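
The backend options above go on the same `cmake` line as the compiler selection. As a rough sketch, the helper below maps a backend name to the options listed in this README and prints the resulting configure command instead of running it. The function name `easywave_cmake_flags` and the backend labels are hypothetical (they are not part of the repository), and the oneAPI path is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper (not part of easyWave): map a backend name to the
# CMake options documented in this README.
easywave_cmake_flags() {
    backend="$1"   # pvc | nvidia | mi250 | mi100 (labels are assumptions)
    arch="$2"      # SM version for nvidia (80 or 90); unused otherwise
    case "$backend" in
        pvc)    echo "-DGPU_AOT=pvc" ;;
        nvidia) echo "-DUSE_NVIDIA_BACKEND=ON -DUSE_SM=${arch}" ;;
        mi250)  echo "-DUSE_AMDHIP_BACKEND=gfx90a" ;;
        mi100)  echo "-DUSE_AMDHIP_BACKEND=gfx908" ;;
        *)      echo "unknown backend: $backend" >&2; return 1 ;;
    esac
}

# Dry run: print (do not execute) the configure command for an SM 80 GPU.
echo "CXX=/path/to/oneAPI/bin/clang++ cmake .. $(easywave_cmake_flags nvidia 80)"
```

Printing the command first makes it easy to check the flags before starting a full configure and build.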

# Run instructions

To run the workload, use the following inputs, as suggested by the developers:

```
./easywave_sycl -grid /path/to/easywave_data/data/grid/e2Asean.grid -source /path/to/easywave_data/data/faults/BengkuluSept2007.flt -time 120
```
# SYCL specific environment variables

PVC-1T: Export `ZE_AFFINITY_MASK=0.0` to force 1-tile execution.

PVC-2T: Set `EnableImplicitScaling=1`.
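
A minimal sketch of how these variables might be set before launching the binary. `pvc_tile_env` is a hypothetical helper name; the variables and values are the ones listed above. Clearing the other mode's variable is an assumption, made so that switching modes in the same shell does not leave a stale setting behind.

```shell
#!/bin/sh
# Hypothetical helper: select 1-tile or 2-tile PVC execution.
pvc_tile_env() {
    case "$1" in
        1T)
            # Assumption: drop the 2T setting, then restrict to tile 0 of device 0.
            unset EnableImplicitScaling
            export ZE_AFFINITY_MASK=0.0
            ;;
        2T)
            # Assumption: drop the affinity mask so both tiles stay visible.
            unset ZE_AFFINITY_MASK
            export EnableImplicitScaling=1
            ;;
        *)
            echo "usage: pvc_tile_env 1T|2T" >&2
            return 1
            ;;
    esac
}

pvc_tile_env 1T
echo "ZE_AFFINITY_MASK=$ZE_AFFINITY_MASK"
```

After calling the helper, run `./easywave_sycl` from the same shell so that it inherits the exported variables.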

The profiled time includes memory transfers between host and device (in both directions) and kernel compute time. File I/O is not included.

# Validation

This benchmark has no built-in validation mechanism. To validate a run, compare its output against the `eWave.2D.*` files generated by the CUDA binary with the same input parameters.

To verify the output, use the supplied Python script ```compare.py``` from ```easywave/tools```.

This script requires python2.7.9. For example:

```python2.7.9 easywave/tools/compare.py /path/to/cuda/build/eWave.2D.XXXXX.ssh /path/to/dpcpp/build/eWave.2D.XXXXX.ssh```

Each ```eWave.2D.XXXXX.ssh``` file represents the wave at a particular time in seconds. The value of `XXXXX` is the timestamp, and it must be the same in both ```eWave.2D.XXXXX.ssh``` files being compared.

For example, to compare the values at timestamp 07200, run the following:

```
python2.7.9 easywave/tools/compare.py /path/to/cuda/build/eWave.2D.07200.ssh /path/to/dpcpp/build/eWave.2D.07200.ssh
Differences: 29399
Max difference: 0.000002
```
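
Because each timestamp is compared separately, it may be convenient to loop over all output files. The sketch below pairs files by name across two build directories. `compare_all` and the `COMPARE_CMD` variable are hypothetical names; the default command is the `compare.py` invocation shown above.

```shell
#!/bin/sh
# Assumption: command used to compare one pair of files; can be overridden.
COMPARE_CMD="${COMPARE_CMD:-python2.7.9 easywave/tools/compare.py}"

# Hypothetical helper: compare every eWave.2D.*.ssh file in the reference
# directory with the file of the same name in the test directory.
compare_all() {
    ref_dir="$1"    # e.g. /path/to/cuda/build
    test_dir="$2"   # e.g. /path/to/dpcpp/build
    for ref in "$ref_dir"/eWave.2D.*.ssh; do
        [ -e "$ref" ] || continue      # the glob matched nothing
        name=$(basename "$ref")
        if [ -e "$test_dir/$name" ]; then
            # Same file name means same timestamp, as required above.
            $COMPARE_CMD "$ref" "$test_dir/$name"
        else
            echo "missing in $test_dir: $name" >&2
        fi
    done
}
```

For example, `compare_all /path/to/cuda/build /path/to/dpcpp/build` would run one comparison per timestamp and report any file that has no counterpart.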
118 changes: 118 additions & 0 deletions easywave_vb/SYCL/CMakeLists.txt
@@ -0,0 +1,118 @@
###
### Modifications Copyright (C) 2023 Intel Corporation
###
### This Program is subject to the terms of the European Union Public License 1.2
###
### If a copy of the license was not distributed with this file, you can obtain one at
### https://joinup.ec.europa.eu/sites/default/files/custom-page/attachment/2020-03/EUPL-1.2%20EN.txt
###
### SPDX-License-Identifier: EUPL-1.2
###
###

cmake_minimum_required(VERSION 3.10)
project(easyWave_sycl LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 17) # SYCL code requires this
set(CMAKE_CXX_STANDARD_REQUIRED ON) # Fail at configure time if C++17 is not supported
set(CMAKE_CXX_EXTENSIONS OFF) # Use -std, not -gnu

option(ENABLE_KERNEL_PROFILING "Build using kernel profiling" OFF)
option(GPU_AOT "Build AOT for Intel GPU" OFF)
option(USE_INTEL_CPU "Build AOT for Intel CPU" OFF)
option(USE_NVIDIA_BACKEND "Build for NVIDIA backend" OFF)
option(USE_AMDHIP_BACKEND "Build for AMD HIP backend" OFF)
option(SHOW_GRID "Show intermediate grid size during propagation" OFF)
option(ENABLE_GPU_TIMINGS "Show GPU timings at end of execution" OFF)

set(SOURCES
    src/EasyWave.cpp
    src/ewCudaKernels.cpp
    src/ewGpuNode.cpp
    ${CMAKE_SOURCE_DIR}/../common/cOgrd.cpp
    ${CMAKE_SOURCE_DIR}/../common/cOkadaEarthquake.cpp
    ${CMAKE_SOURCE_DIR}/../common/cOkadaFault.cpp
    ${CMAKE_SOURCE_DIR}/../common/cSphere.cpp
    ${CMAKE_SOURCE_DIR}/../common/ewGrid.cpp
    ${CMAKE_SOURCE_DIR}/../common/ewOut2D.cpp
    ${CMAKE_SOURCE_DIR}/../common/ewParam.cpp
    ${CMAKE_SOURCE_DIR}/../common/ewPOIs.cpp
    ${CMAKE_SOURCE_DIR}/../common/ewReset.cpp
    ${CMAKE_SOURCE_DIR}/../common/ewSource.cpp
    ${CMAKE_SOURCE_DIR}/../common/ewStep.cpp
    ${CMAKE_SOURCE_DIR}/../common/okada.cpp
    ${CMAKE_SOURCE_DIR}/../common/utilits.cpp
    ${CMAKE_SOURCE_DIR}/../../infrastructure/FileHandler.cpp
    ${CMAKE_SOURCE_DIR}/../../infrastructure/SYCL.cpp
    ${CMAKE_SOURCE_DIR}/../../infrastructure/Timer.cpp
    ${CMAKE_SOURCE_DIR}/../../infrastructure/Utilities.cpp
)

include_directories(${CMAKE_SOURCE_DIR}/../common/ ${CMAKE_SOURCE_DIR}/src/ ${CMAKE_SOURCE_DIR}/../../infrastructure)

if(ENABLE_KERNEL_PROFILING)
    message(STATUS "Enabling kernel profiling")
    add_compile_options(-DENABLE_KERNEL_PROFILING)
endif()

if(SHOW_GRID)
    message(STATUS "Showing grid size during propagation")
    add_compile_options(-DSHOW_GRID)
endif()

if(ENABLE_GPU_TIMINGS)
    message(STATUS "GPU Timings will be displayed")
    add_compile_options(-DENABLE_GPU_TIMINGS)
endif()

# Use either default or user defined CXX flags
# -DCMAKE_CXX_FLAGS=" -blah -blah " overrides the default flags

set(USE_DEFAULT_FLAGS ON)
set(INTEL_GPU_CXX_FLAGS " -O2 -fsycl -ffast-math ")
set(NVIDIA_GPU_CXX_FLAGS " -O3 -fsycl -ffast-math ")
set(AMD_GPU_CXX_FLAGS " -O3 -fsycl -ffast-math ")

if("${CMAKE_CXX_FLAGS}" STREQUAL "")
    message(STATUS "Using DEFAULT compilation flags for the application")
    string(APPEND CMAKE_CXX_FLAGS "${INTEL_GPU_CXX_FLAGS}") # Default flags (Intel GPU); the NVIDIA/AMD branches below replace them
else()
    message(STATUS "OVERRIDING compilation flags")
    set(USE_DEFAULT_FLAGS OFF)
endif()

# Target selection: Intel GPU AOT, Intel CPU, NVIDIA, or AMD.
# If none of these options is set, only the flags above are used (JIT compilation).
if(GPU_AOT)
    if(("${GPU_AOT}" STREQUAL "pvc") OR ("${GPU_AOT}" STREQUAL "PVC"))
        message(STATUS "Enabling Intel GPU AOT compilation for ${GPU_AOT}")
        string(APPEND CMAKE_CXX_FLAGS " -fsycl-targets=spir64_gen -Xs \"-device 0x0bd5 -revision_id 0x2f\" ")
    else()
        message(STATUS "Using custom AOT compilation flag ${GPU_AOT}")
        string(APPEND CMAKE_CXX_FLAGS " ${GPU_AOT} ") # User should be aware of advanced AOT compilation flags
    endif()
elseif(USE_INTEL_CPU)
    message(STATUS "Compiling for Intel CPU")
    string(APPEND CMAKE_CXX_FLAGS " -ffast-math -mprefer-vector-width=512 -mfma -fsycl-targets=spir64_x86_64 \"-device avx512\" ")
elseif(USE_NVIDIA_BACKEND)
    message(STATUS "Enabling NVIDIA backend")
    if(USE_DEFAULT_FLAGS)
        set(CMAKE_CXX_FLAGS "${NVIDIA_GPU_CXX_FLAGS}") # Default flags for NV backend (-O3 replaces the -O2 set earlier)
    endif()
    string(APPEND CMAKE_CXX_FLAGS " -fsycl-targets=nvptx64-nvidia-cuda ")
    if(USE_SM)
        message(STATUS "Building for SM_${USE_SM} architecture")
        string(APPEND CMAKE_CXX_FLAGS " -Xsycl-target-backend --cuda-gpu-arch=sm_${USE_SM} ")
    endif()
elseif(USE_AMDHIP_BACKEND)
    message(STATUS "Enabling AMD HIP backend for ${USE_AMDHIP_BACKEND} AMD architecture")
    if(USE_DEFAULT_FLAGS)
        set(CMAKE_CXX_FLAGS "${AMD_GPU_CXX_FLAGS}") # Default flags for AMD backend
    endif()
    string(APPEND CMAKE_CXX_FLAGS " -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend --offload-arch=${USE_AMDHIP_BACKEND} ")
endif()

# Output the compiler flags that were constructed for visual inspection
message(STATUS "Compilation flags set to: ${CMAKE_CXX_FLAGS}")

add_executable(${PROJECT_NAME} ${SOURCES})
target_link_libraries(${PROJECT_NAME} sycl stdc++fs)