This repository was archived by the owner on Feb 5, 2024. It is now read-only.

Commit 7dcdbe9

Meeting notes and slides for 19 September 2023
1 parent e6fa907 commit 7dcdbe9

2 files changed, +105 -1 lines changed

language/README.rst

+105-1
@@ -23,10 +23,114 @@ Potential Topics
* Function pointers revisited
* oneDPL C++ standard library support

2023-09-19
=============

* Ruyman Reyes (Intel/Codeplay)
* Lukas Sommer (Codeplay Software Ltd)
* Benie (Codeplay Software Ltd)
* Hyesun Hong (Samsung SAIT)
* Julian Oppermann (Codeplay Software Ltd)
* Mehdi Goli (Codeplay Software Ltd)
* Lueck, Gregory (Intel)
* Jesus Labarta (BSC) (Guest)
* Brodman, James (Intel)
* Hanwoong Jung (Samsung SAIT)
* Brice Goglin (Guest)
* Plaska, Oskar (Contractor, Cognizant)
* Tom Deakin (Univ. of Bristol)
* Marcin (N/A)
* Victor Lomuller (Codeplay Software Ltd)
* Biagio COSENZA (Università degli Studi di Salerno)
* Voss, Michael J (Intel)
* Kukanov, Alexey (Intel)
* Richards, Alison L (Intel)
* Adam Kuźniar (Mobica)
* Slavova, Gergana S (Intel)
* bongjun kim (Samsung SAIT)
* Keryell, Ronan (XILINX LABS)
* Juan Fumero (University of Manchester)
* Gordon Brown (Codeplay Software Ltd)
* Tim (N/A)
* Kinsner, Michael (Intel)
* Petersen, Paul (Intel)
* Videau, Brice (ANL)
* Holmes, Daniel John (Intel)
* Frank Brill (Cadence)
* Mrozek, Michal (Intel)
* Reble, Pablo (Intel)
* Andrew Richards (Intel/Codeplay)
* Smith, Timmie (Intel)


SYCL Extension Proposal for PIM/PNM
--------------------------------------

Hyesun Hong,
`Slides <presentation/2023-09-19-HS-sycl-pim-extensions.pdf>`_

* PIM/PNM (processing-in-memory / processing-near-memory) technology enables computation directly in memory
* Avoids data movement, improving performance and reducing energy consumption
* PIM operates directly on memory banks by reading from and writing to rows and columns
* Aquabolt-XL is the first demonstrator
* Can be dropped in with any memory controller
* CXL-PNM is the CXL variant of PNM and can work with multiple PIM blocks

SYCL Extension for PIM/PNM

* Goals

  * Seamlessly integrate PIM/PNM operations into SYCL
  * Allow combining xGPU and PIM/PNM in one device kernel
  * Not tied to one specific hardware

* Design

  * Vector operations seem like a natural fit, but give no convergence guarantee and make the vector size explicit
  * Model as a special function unit
  * Aligns with the trend to model special functional units inside accelerators
  * Automatic mapping by the compiler is often not possible
  * joint_matrix
  * Group functions
  * Easy to use
  * Can easily be combined with device code
  * Give the necessary convergence guarantees
  * Recap of SYCL work-items, work-groups and group functions
  * Group functions must be encountered in converged control flow

* Extension

  * Extended group functions with an additional overload of joint_reduce and new joint_transform and joint_inner_product functions
  * Block size is a template parameter and the number of blocks is a runtime parameter, which allows the number of elements to process to be calculated

* Extension for PNM

  * Added new overloads of joint_exclusive_scan, joint_inclusive_scan and reduce_over_group
  * Standalone PNM has less opportunity for parallelism and is also limited by the memory controller
  * Therefore combine PNM and PIM: PNM generates commands for PIM blocks
  * Two modes
  * PIM mode: PIM blocks can operate independently; the number of blocks can be chosen
  * PNM mode: synchronized execution on multiple PIM blocks

* Mapping

  * Every PIM block is one work-item
  * A PNM with its attached PIM blocks forms one work-group

* Execution

  * Work-item operations map to PIM operations
  * Group functions map to PNM operations

* Example

  * Work-item execution maps to PIM
  * Group function maps to PNM

* Conclusion

  * Integrate support for PIM/PNM into SYCL
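
The block-size convention noted under "Extension" can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the API proposed in the slides: blocked_reduce is a hypothetical name, it runs as plain C++ rather than SYCL, and it assumes that exactly BlockSize * num_blocks elements are processed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model of a block-based reduction.
// NOT the proposed SYCL extension API: name and signature are assumptions.
// BlockSize is a compile-time promise by the caller; num_blocks is only
// known at run time. Together they give the number of elements to process.
template <std::size_t BlockSize, typename T, typename BinaryOp>
T blocked_reduce(const std::vector<T>& data, std::size_t num_blocks,
                 T init, BinaryOp op) {
    static_assert(BlockSize > 0, "BlockSize must be positive");
    const std::size_t num_elements = BlockSize * num_blocks;
    assert(data.size() >= num_elements);  // caller must supply enough data

    T result = init;
    for (std::size_t b = 0; b < num_blocks; ++b) {
        // The inner loop has a compile-time trip count, which is the kind
        // of information a device compiler could exploit.
        T partial = data[b * BlockSize];
        for (std::size_t i = 1; i < BlockSize; ++i) {
            partial = op(partial, data[b * BlockSize + i]);
        }
        result = op(result, partial);
    }
    return result;
}
```

With the values 1 to 8, a block size of 4, two blocks, an initial value of 0 and addition as the operator, the model processes 8 elements and returns 36.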

Q&A

* Are the proposed functions specific to PIM, or could they also be used with other hardware?

  * They can also be used with other hardware. The semantics are not PIM-specific but rather a translation of C++ to SYCL
  * They can also map nicely to other types of hardware, for example vector processors

* Why have the user explicitly specify a block size?

  * Not a hardware detail
  * Rather a promise by the user that data blocks will always be at least that big
  * The promise allows the device compiler to perform optimizations and loop efficiently inside the PIM unit

* Could the num_blocks runtime parameter be replaced by an iterator, with the range required to be divisible by the block size?

  * Yes, that is possible; it is mainly a design question
  * The current version might have additional implications regarding alignment
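
The iterator-based alternative raised in the last question could look roughly like the following host-side sketch. The helper name num_blocks_in_range is hypothetical and is not part of the proposal; the sketch only assumes that the block count would be derived from an iterator range whose length must be divisible by the block size.

```cpp
#include <cstddef>
#include <iterator>
#include <optional>

// Hypothetical helper, not part of the proposal: derive the number of
// blocks from an iterator range instead of passing it explicitly.
// Returns std::nullopt when the range length is not a multiple of BlockSize.
template <std::size_t BlockSize, typename Iterator>
std::optional<std::size_t> num_blocks_in_range(Iterator first, Iterator last) {
    static_assert(BlockSize > 0, "BlockSize must be positive");
    const auto n = static_cast<std::size_t>(std::distance(first, last));
    if (n % BlockSize != 0) {
        return std::nullopt;  // divisibility requirement violated
    }
    return n / BlockSize;
}
```

In this form the divisibility requirement becomes a precondition on the range rather than something implied by a separate runtime count.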

2023-06-05
==========

* Ruyman Reyes
* Rod Burns
* Cohn, Robert S
Binary file not shown.
