This repository was archived by the owner on Aug 30, 2024. It is now read-only.
* fixed all UTs
* move sycl benchmark to benchmark project
* add q4 UT for sycl prologue_b
* sycl gemv case
* add UT case
* add to trans storage conversion
* add sycl context to model context. compile ne_layers with dpcpp
* add context sycl memory allocation
* inference with data exchange
* use backend instead of ne type
* new api to assign sycl buffer
* add backend parameter for new tensor
* add sycl int4 to graph compute
* fix sync
* compile without sycl
* sync with main
* fix tensor size bug
* refactor layer config
* sync ut
* support 2 layers of sycl
* sync main
* revert ISA detect for dpcpp
* compile without dpcpp
* fix avxvnni intrin code
* guard against a crash when the SYCL device is a CPU
* add device mul function
* fix the sync issue
* run model with all FFN layers on SYCL
* fix compile
* clang-format
* revert model config
* fix function return
* fix the kernel bug
* remove all grad tensors.
* fix some bugs.
* support llama shapes, add new UT case, update new api of dpcpp
* support all ffn layers
* add sync for CPU Device
* clang-format
* fix warning
* clang-format
* add back f32 model support
* fix typo, remove unused code
* bring more layers to SYCL
* add embedding support and use omp in sycl
* optimize gemv k iteration
* optimize rms_norm, add debug macro for no-mha forward.
* add mha ut
* prepare for SYCL MHA
* add SYCL rope
* all device f32 mha
* remove unused code
* fixed
* fixed
* refactor sycl context for multiple allocation
* support n_gpu_layer
* reuse scratch
* add new mha version
* new version of MHA
* lower malloc size
* compile without sycl
* run llama without sycl build
* clang-format
* fix clang-tidy
* fix py build
* fix warning
* use std header
* update math
* update math
* revert scratch without SYCL
* use cl for c_compiler
* compile on linux
* Revert "compile on linux"
This reverts commit 0ce1574.
* Revert "use cl for c_compiler"
This reverts commit 7f40ae9.
* fix memory leak, set lower extra memory size.
* revert embedding size on CPU
* clang-format
---------
Co-authored-by: luoyu-intel <[email protected]>