Commit f3c18ed

[CUDA] Exclude lean attention from linux build (#25203)
### Description

Exclude lean attention from the Linux build.

### Motivation and Context

Previously, lean attention was built on Linux but not on Windows. It is not used by Gen AI so far, so we disable it in the build to reduce binary size and build time.
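Because the option becomes a plain `option()` that defaults to `OFF`, a build that still wants the kernel has to opt in explicitly. One possible way, sketched here as an assumption rather than a documented onnxruntime workflow, is a CMake initial-cache script passed with `cmake -C`. The file name `enable_lean_attention.cmake` is hypothetical, and this commit does not confirm that the kernel builds on every platform, or without CUDA, once the option is turned on.

```cmake
# enable_lean_attention.cmake (hypothetical file name, not part of the repository)
#
# Pre-load the cache so that the later option() call in cmake/CMakeLists.txt
# finds an existing value and keeps it instead of applying its OFF default.
set(onnxruntime_USE_LEAN_ATTENTION ON CACHE BOOL
    "Build lean attention kernel for scaled dot product attention")
```

It would be used as `cmake -C enable_lean_attention.cmake <source-dir>`; passing `-Donnxruntime_USE_LEAN_ATTENTION=ON` on the command line has the same effect on the cache.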
1 parent 90aaaeb commit f3c18ed

1 file changed: +1 -1 lines changed


cmake/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -93,7 +93,7 @@ option(onnxruntime_BUILD_BENCHMARKS "Build ONNXRuntime micro-benchmarks" OFF)
 option(onnxruntime_USE_VSINPU "Build with VSINPU support" OFF)

 cmake_dependent_option(onnxruntime_USE_FLASH_ATTENTION "Build flash attention kernel for scaled dot product attention" ON "onnxruntime_USE_CUDA" OFF)
-cmake_dependent_option(onnxruntime_USE_LEAN_ATTENTION "Build lean attention kernel for scaled dot product attention" ON "onnxruntime_USE_CUDA; NOT WIN32" OFF)
+option(onnxruntime_USE_LEAN_ATTENTION "Build lean attention kernel for scaled dot product attention" OFF)
 option(onnxruntime_USE_MEMORY_EFFICIENT_ATTENTION "Build memory efficient attention kernel for scaled dot product attention" ON)

 option(onnxruntime_BUILD_FOR_NATIVE_MACHINE "Enable this option for turning on optimization specific to this machine" OFF)
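
The behavioral difference between the removed and added lines can be reproduced in a small standalone project. This is only a sketch: the `demo_*` option names are hypothetical stand-ins, not onnxruntime options, and the file is meant to be configured on its own rather than added to the repository.

```cmake
# CMakeLists.txt for a throwaway demo project (hypothetical, not part of onnxruntime).
cmake_minimum_required(VERSION 3.16)
project(lean_attention_option_demo NONE)

include(CMakeDependentOption)

# Stand-in for onnxruntime_USE_CUDA.
option(demo_USE_CUDA "Stand-in for onnxruntime_USE_CUDA" ON)

# Old form: offered to the user with default ON only when every condition
# holds (CUDA enabled and not targeting Windows); otherwise forced to OFF.
cmake_dependent_option(demo_LEAN_OLD "Old-style lean attention switch"
                       ON "demo_USE_CUDA;NOT WIN32" OFF)

# New form: a plain option that defaults to OFF on every platform
# and no longer depends on the CUDA setting.
option(demo_LEAN_NEW "New-style lean attention switch" OFF)

message(STATUS "old-style value: ${demo_LEAN_OLD}")
message(STATUS "new-style value: ${demo_LEAN_NEW}")
```

Configuring this in a fresh build directory on Linux should report the old-style switch as `ON` and the new-style switch as `OFF`, which mirrors the change in default that the commit description refers to.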
