This repository was archived by the owner on Feb 5, 2024. It is now read-only.
Q: Why did you not use Buffers?

A: While it is technically possible, different Python classes would need to be created for every supported buffer data type, as the buffer and accessor type definitions require the type of the underlying elements. We can get around the issue by using “untyped” buffers, but that brings its own challenges, as partitioning of buffers can lead to loss of precision and incorrect results.
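A rough sketch of the typing problem described above (illustrative Python, not the actual numba-dpex classes): a typed buffer bakes the element type into its class, so every supported dtype needs its own class, while a single untyped buffer avoids the proliferation but must reinterpret raw bytes, which is where precision and correctness problems can creep in.

```python
import numpy as np

class Float32Buffer:
    """One wrapper class per element type: mirrors how typed buffer/accessor
    definitions bake the element type into the class itself."""
    dtype = np.float32
    def __init__(self, data):
        self.data = np.asarray(data, dtype=self.dtype)

class Int64Buffer:
    dtype = np.int64
    def __init__(self, data):
        self.data = np.asarray(data, dtype=self.dtype)

class UntypedBuffer:
    """A single 'untyped' wrapper over raw bytes: no per-dtype class needed,
    but every view must reinterpret the bytes and can silently go wrong."""
    def __init__(self, data):
        self.raw = np.asarray(data).tobytes()
    def view(self, dtype):
        return np.frombuffer(self.raw, dtype=dtype)

buf = UntypedBuffer(np.array([1.5, 2.5], dtype=np.float64))
print(buf.view(np.float64))  # correct: [1.5 2.5]
print(buf.view(np.float32))  # same 16 bytes reinterpreted: meaningless values
```

The typed classes are safe but must be multiplied across every dtype the compiler supports; the untyped class is one definition, but nothing stops a caller from viewing the data at the wrong type.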
Q: Using SPIR-V, and using SYCL as the API – is that easier for interoperability? Why not use OpenCL, or go straight down to Level Zero of oneAPI?

A: We envision a DPC++ program-manager-like layer in Numba that will allow us to go from the same high-level Python code to possibly different types of IRs (SPIR-V, NVPTX) and then build interoperability kernels that can be launched using a SYCL runtime. Targeting OpenCL or Level Zero directly restricts us to devices that support those APIs. The design may change later as the system evolves.
Q: You are using MLIR as well, but you have SPIR-V at the bottom? Using MLIR with SPIR-V at the bottom, at the code level?

A: The MLIR GPU and SPIR-V dialects offer greater flexibility to us than Numba’s current pipeline. We want to move away from using the llvm-spirv translator, and we hope that the GPU dialect grows to support other types of devices, not just GPUs.
Q: Codeplay has done work on MLIR. We would like to connect a SYCL dialect, and want to focus on the top half of the box (SPIR-V – GPU – Slide 12).

A: For the Python work we want to primarily focus on the Python-to-optimized-loops pipeline. If the community takes over the SPIR-V and GPU (and possibly a SYCL) dialects, our work on the Python compiler will greatly benefit.
Q: What does it mean to make Python code look more like SYCL?

A: Do it as a community effort – Anaconda may have responses – we will need to involve the NVIDIA engineers who work on Numba.

Q: A SYCL dialect in the future? Do we have a timeline for that?

A: A SYCL dialect doesn’t exist right now. I am not aware of any timeline, or of anyone working on it.

Q: Runtime – how much overhead is there from the Python layer?

A: For library calls through the oneMKL interface layer there is not much overhead – we observed better than 90% of native performance. For the compiler, we have also been evaluating the code we generate through numba-dpex – it achieves 75–80% of the execution time as compared to DPC++.
Metagraph
---------
Q: Graph Neural Net – is it flexible enough for a graph?

Q: Big fan of GraphBLAS – what is happening with that? With MLIR?

A: We had to reimplement a bunch of things that we will need to throw away. When sparse output was added, that unblocked it. Assuming regular math rules – they have an internal design that they are translating and upstreaming into MLIR. It will be possible to do this. The sparse compiler works with a semiring – https://dl.acm.org/doi/abs/10.1145/3485505

Can make sparse graph operations possible – can specify which element can be an identity – won’t take
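To make the semiring idea concrete (an illustrative sketch, not the MLIR sparse compiler or the GraphBLAS API): ordinary matrix multiply uses the (+, ×) semiring; swapping in (min, +) with an identity of infinity turns the same loop nest into shortest-path relaxation. This is why a compiler parameterized over a semiring and its identity element covers many graph algorithms with one code generator.

```python
INF = float("inf")

def semiring_matmul(A, B, add, mul, zero):
    """Dense matrix multiply parameterized by a semiring:
    'add' accumulates, 'mul' combines, 'zero' is the identity of 'add'."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[zero] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = zero
            for p in range(k):
                acc = add(acc, mul(A[i][p], B[p][j]))
            C[i][j] = acc
    return C

# Adjacency matrix of a small directed graph; INF means "no edge".
G = [[0,   3,   INF],
     [INF, 0,   1],
     [2,   INF, 0]]

# One (min, +) multiplication relaxes all paths of at most two edges.
D = semiring_matmul(G, G, add=min, mul=lambda x, y: x + y, zero=INF)
print(D[0][2])  # shortest 0->2 path with at most two edges: 3 + 1 = 4
```

The same `semiring_matmul` with `add=lambda x, y: x + y`, `mul=lambda x, y: x * y`, `zero=0` is ordinary matrix multiplication – only the semiring changed, not the loop structure.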
Q: Which plugins – should they be written in Python only, or in C++?

A: You need a thin layer of Python objects or wrappers to hand around – then a Python function wrapper. Whatever is happening in the lower layers can be C or C++ – you just need enough Python code to manipulate it from the Python interpreter.
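A minimal sketch of that “thin Python layer” pattern using ctypes (assuming a Unix-like system where the C math library can be found; the wrapped function is the standard C `cbrt`): the real work happens in compiled code, and Python holds just enough to call it from the interpreter.

```python
import ctypes
import ctypes.util

# Load the C math library; the heavy lifting lives in compiled C code.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.cbrt.restype = ctypes.c_double
libm.cbrt.argtypes = [ctypes.c_double]

def cbrt(x: float) -> float:
    """Thin Python wrapper: just enough code to hand a value to C and back."""
    return libm.cbrt(x)

print(cbrt(27.0))  # cube root computed in C, returned as a Python float
```

A plugin backend written in C or C++ would follow the same shape, just with its own shared library and richer argument types (arrays, handles) instead of a single double.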
Q: Is this part of an internal structure of a “type”? It is a capability, but you haven’t pushed on the type system.

A: The type system must be granular enough so that they know what the backend can handle for any layout.
Q: Is that a oneAPI backend for all devices? GraphBLAS on other architectures?

A: There is no catch-all solution for graphs (for all devices). We have a solution for people to plug in backends – but people have to implement them.
2021-11-10
==========
Agenda
------

Overview of oneAPI and SYCL: how all the pieces fit together – Andrew Richards, Codeplay – 5 min
Mapping AI software to SYCL and oneAPI: ONNX, Eigen, TensorFlow – Mehdi Goli, Codeplay – 20 min
Mapping SYCL to accelerator hardware, using RISC-V as an example – Alastair Murray, Codeplay – 20 min
Experience of using SYCL and oneAPI with National Labs – Gordon Brown, Codeplay – 15 min