-
Notifications
You must be signed in to change notification settings - Fork 25
[Lazy evaluation] Pytato Array Context with transformations #248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Lazy evaluation] Pytato Array Context with transformations #248
Conversation
|
|
||
|
|
||
| @for_each_kernel | ||
| def _single_grid_work_group_transform(kernel, cl_device): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@inducer said in a personal meeting that this logic should also handle reductions kernels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! A few quick drive-by thoughts below.
| kernel = lp.split_iname(kernel, iname, | ||
| ngroups * l_zero_size * l_one_size) | ||
| kernel = lp.split_iname(kernel, f"{iname}_inner", | ||
| l_zero_size, inner_tag="l.0") | ||
| kernel = lp.split_iname(kernel, f"{iname}_inner_outer", | ||
| l_one_size, inner_tag="l.1", | ||
| outer_tag="g.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this assume that each iname is only used by exactly one statement?
| if len(insn.within_inames) == 0: | ||
| continue | ||
|
|
||
| if len(insn.within_inames) == 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Which type of kernels are those?
- I would like it if this were more guided by metadata, along the lines of what the pyopencl actx does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could borrow the unifier from FusionContractorArrayContext, but that would add a dependency on #284 and inducer/pytato#224. If those go in before this PR, I will put that here.
| t_unit = _single_grid_work_group_transform(t_unit, self.queue.device) | ||
| t_unit = lp.set_options(t_unit, "insert_gbarriers") | ||
| t_unit = lp.linearize(lp.preprocess_kernel(t_unit)) | ||
| t_unit = _alias_global_temporaries(t_unit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My recollection of our discussion was that we'd do this without aliasing... am I remembering wrong? If not, what made you change your mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we wouldn't alias at the pytato's CodeGenMapper stage, but post-linearization grabbing hold of dead-temporaries would be trivial. I.e. I had thought we would alias the global temporaries not at the pytato-level but downstream as loopy transformation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
x-ref: inducer/pytato#122
|
When trying this on the mirgecom examples, I'm getting this type of error: |
|
@matthiasdiener: Sorry, the transformations in this branch depend on some PRs on the pytato/loopy/arraycontext end. I've edited the description to record that. |
32a7d59 to
bd964da
Compare
c166b2f to
0c55627
Compare
9fb8370 to
8c65687
Compare
824caee to
405bb8b
Compare
32b0a78 to
0dcc37c
Compare
0dcc37c to
6bfe6af
Compare
|
Closing this as inducer/arraycontext#216 provides a cleaner implementation. |
Draft because:
ReturnFromKernelat a point in the linearized kernel loopy#476Related/linked PRs: