Implement PageAllocation as a handle into a PagedAttentionCache, allowing publishing and releasing an allocation via handle rather than cache #608

Merged
9 commits merged into nod-ai:main from page-allocation-handle
Dec 2, 2024

@renxida renxida commented Nov 26, 2024

Deinitialization is not reliable yet. I will test it extensively and get deinit right once #600 is merged.

Closes #607
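
For readers unfamiliar with the handle pattern named in the title, here is a minimal sketch of the idea, not the code in this PR. Only the names `PageAllocation` and `PagedAttentionCache` come from the PR; the methods `acquire`, `publish`, and `release`, the fields, and the integer page ids are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


class PagedAttentionCache:
    """Toy page pool. A stand-in for illustration, not the real class."""

    def __init__(self, num_pages: int):
        # Pages are plain integer ids here (an assumption for the sketch).
        self.free_pages = list(range(num_pages))
        self.published = set()

    def acquire(self, count: int) -> "PageAllocation":
        # Hypothetical method: hand out `count` pages wrapped in a handle.
        if count > len(self.free_pages):
            raise RuntimeError("not enough free pages")
        pages = [self.free_pages.pop() for _ in range(count)]
        return PageAllocation(cache=self, pages=pages)


@dataclass
class PageAllocation:
    """Handle that owns a set of pages and knows which cache they came from."""

    cache: PagedAttentionCache
    pages: list
    released: bool = False

    def publish(self) -> None:
        # Hypothetical: mark this allocation's pages as published in the cache.
        if self.released:
            raise RuntimeError("cannot publish a released allocation")
        self.cache.published.update(self.pages)

    def release(self) -> None:
        # Hypothetical: return the pages to the cache. Safe to call twice.
        if self.released:
            return
        self.cache.published.difference_update(self.pages)
        self.cache.free_pages.extend(self.pages)
        self.released = True


cache = PagedAttentionCache(num_pages=4)
alloc = cache.acquire(3)
alloc.publish()   # the caller talks to the handle, not the cache
alloc.release()   # pages go back to the pool
```

The point of the design in this sketch is that the caller holds one object per allocation and never passes page lists back to the cache by hand, which makes a double release or a release of the wrong pages harder to write.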

@stbaione

A couple of small comments on form, but functionally this looks great. There are a fair number of changes in critical areas, so it would be good to include some unit tests before merging.

@renxida renxida force-pushed the page-allocation-handle branch from d5a8073 to 1a844ec Compare November 27, 2024 17:51
@renxida renxida requested a review from stbaione November 27, 2024 20:13
@renxida renxida force-pushed the page-allocation-handle branch from 4563776 to 2edccfd Compare November 27, 2024 22:56
@renxida renxida changed the title Implement PageAllocation as a handle into a PagedAttentionCache, allowing publishing and releasing an allocation Implement PageAllocation as a handle into a PagedAttentionCache, allowing publishing and releasing an allocation via handle rather than cache Nov 27, 2024
@renxida renxida force-pushed the page-allocation-handle branch from 2edccfd to e27b695 Compare November 28, 2024 01:23
@renxida renxida force-pushed the page-allocation-handle branch from e27b695 to 0966bbe Compare December 2, 2024 17:16

@stbaione stbaione left a comment

LGTM, really nice tests

@renxida renxida force-pushed the page-allocation-handle branch from 0966bbe to d0a77e2 Compare December 2, 2024 18:39
@renxida renxida force-pushed the page-allocation-handle branch from d0a77e2 to 7371ba8 Compare December 2, 2024 19:05
@renxida renxida merged commit 8cd3f85 into nod-ai:main Dec 2, 2024
17 of 19 checks passed
monorimet pushed a commit that referenced this pull request Dec 13, 2024
…wing publishing and releasing an allocation via handle rather than cache (#608)
Development

Successfully merging this pull request may close these issues.

Manage page allocations through a PageAllocation object, rather than straight in InferenceExecRequest
2 participants