Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement write instruction compilation #813

Draft
wants to merge 19 commits into
base: main
Choose a base branch
from
Draft

Conversation

reczkok
Copy link
Collaborator

@reczkok reczkok commented Feb 3, 2025

extends #808

closes #801

Copy link

codesandbox-ci bot commented Feb 3, 2025

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

@reczkok
Copy link
Collaborator Author

reczkok commented Feb 4, 2025

Looks promising 🧑‍🍳 Need to sort out the actual benchmarks as they don't seem to be accurate for the new method (this results are from a temporary test example)
Screenshot 2025-02-04 at 15 25 33

Here is the code (not exactly a benchmark but just for reference)

async () => {
  // warmup
  for (let i = 0; i < 1000; i++) {
    buffer.write({ x: 0, y: 0, z: 0, index: i });
  }
  root.unwrap(buffer);

  const t0 = performance.now();
  for (let i = 0; i < 100000; i++) {
    buffer.write({ x: 0, y: 0, z: 0, index: i });
  }
  const t1 = performance.now();

  console.log(`100000 writes took ${t1 - t0}ms`);

  const t2 = performance.now();
  for (let i = 0; i < 100000; i++) {
    const hostBuffer = new ArrayBuffer(16);
    const view = new DataView(hostBuffer);
    view.setFloat32(0, 0);
    view.setFloat32(4, 0);
    view.setFloat32(8, 0);
    view.setUint32(12, i);
    device.queue.writeBuffer(root.unwrap(buffer), 0, hostBuffer, 0, 16);
  }
  const t3 = performance.now();

  console.log(`100000 writes (manual ref) took ${t3 - t2}ms`);
},

return path.length === 0 ? 'value' : `value.${path.join('.')}`;
}

const compiledWriters = new Map<
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const compiledWriters = new Map<
const compiledWriters = new WeakMap<

@reczkok
Copy link
Collaborator Author

reczkok commented Feb 6, 2025

Turns out a lot of our overhead is the creation of buffers per write that triggers the garbage collector
Screenshot 2025-02-06 at 12 48 51
After changing from buffer per write to buffer per TgpuBuffer we are almost at vanilla performacne
Screenshot 2025-02-06 at 12 53 21

@deluksic
Copy link
Contributor

deluksic commented Feb 7, 2025

Permanently storing a buffer is likely a good approach as a default, but some apps might want to keep the buffer gpu-only to save on memory when the buffer doesn't change often. This becomes especially interesting when you start allowing partial writes, where the update buffer could be tiny compared to the full thing.

But this can be implemented later as an extension of the same API.

@reczkok
Copy link
Collaborator Author

reczkok commented Feb 7, 2025

At the moment it only creates the buffer when you fist write to a schema. Partial writes don't trigger this behaviour so as long as you only partial write the main buffer will never be created. Of course that is subject to change if we want to incorporate compiling to partial write which will be more difficult. There is also a major optimisation to be made for the partial write algorithm since it currently always creates an ArrayBuffer the size of the schema but it should be feasible to only make it as big as necessary without much overhead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize writing to buffers with struct schemas
3 participants