Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add add_nonblocking, capacity and is_full for streams #790

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

xavierm02
Copy link

Having a non-blocking add is useful when adding from an asynchronous callback from C code (e.g. an event handler or timeout in Lablgtk) because having the fiber yield would result in an Effect.Unhandled exception.

The simplest solution is to make the stream unbounded by creating it with the max int as capacity, but that has the disadvantage of letting the stream take up a potentially unbounded amount of memory.

Another solution, when using a single domain, is to use the is_full (or capacity) added by this commit, as one can do e.g. (if Stream.is_full stream then ignore (Stream.take stream)); Stream.add stream item.

In a context with multiple domain, this may not work as the lock is released between the take and the add, and another domain might add an element between these two operations. This is why add_nonblocking was also added: defining it from within the module allows to do both operations without releasing the lock in between.

I'm not too sure about Sync.put_nonblocking, and it might be reasonable to just raise an Invalid_argument exception when add_nonblocking is called on a stream of capacity 0.

@xavierm02 xavierm02 marked this pull request as draft December 13, 2024 16:04
@xavierm02
Copy link
Author

Hm. The Sync.put_nonblocking is probably wrong since it sets the cell value to Slot while it's a producer so it should set it to Item...

@xavierm02 xavierm02 force-pushed the stream-add-nonblocking branch from db9eade to e359a09 Compare December 13, 2024 17:04
@xavierm02
Copy link
Author

xavierm02 commented Dec 13, 2024

I removed Sync.put_nonblocking. I'm not sure how to handle the In_transition case.
(take_nonblocking sets the cell's value to a Slot that rejects any value to force the producer to retry sending its value in another cell, but in put_nonblocking, we're supposed to set it to an Item so we can't do that)

@xavierm02 xavierm02 marked this pull request as ready for review December 13, 2024 17:15
@talex5
Copy link
Collaborator

talex5 commented Dec 14, 2024

capacity and is_full look useful to me.

add_nonblocking seems a bit complicated, and I'm not sure it's that useful:

  • With lablgtk the consumer will typically be in the same domain as the callback, so the race there isn't a problem.
  • Even with multiple domains, if length < 100 then add t x is good enough to prevent it growing without bound. In fact, since lablgtk is the only producer it will always work perfectly. If there were multiple producers, it could go slightly over, but not by more than the number of producer domains.
  • I'm also not sure what kind of events could be dropped. For e.g. keypresses, I think you'd want to keep everything the user enters. For events where you just need to know you need to update (e.g. pointer motion events), you can use Eio.Condition instead.

@xavierm02
Copy link
Author

@talex5

Sorry for the late reply.

Having capacity and is_full would indeed be enough for my use case.

I nevertheless believe that add_nonblocking also belongs here, and I'll make one attempt below at convincing you by explaining my thought process. If this leaves you unconvinced, I'd be happy with just capacity and is_full being merged.


My main reason for adding add_nonblocking is that its absence, combined with the presence of add, take and take_nonblocking, makes it unclear whether the Stream module is complete, i.e. able to express all operations on the mathematical objects it attempts to model, or not. As you have noted, even if it were incomplete, the missing operations might not be relevant in practice because on can express operations with slightly weaker invariants that suffice for most use cases.

(Though I'm not sure how to express add_nonblocking ~drop_priority:Oldest t x, as (if length t >= capacity t then ignore (take t)); add t x looks like it could lead to problems where the freed slot could be stolen by another domain.)

However, I believe that libraries should aim at making the set of operations clearly complete, meaning that expressing missing operations and convincing oneself of their correctness should be easy, and not require knowing anything about the implementation. For example, if length < 100 then add t x only works as expected because add does not yield unless the stream is full. While any reasonable implementation of course satisfies this property, this is a priori not part of the specification of add (which I expect to be something of the shape "when add t x returns, x has been added to t, and ..."), and fun t x -> Fiber.yield (); add t x breaks the property while still fitting the specification.

For streams (of capacity >= 1), I'd expect a complete set of operations to necessarily contain some way of doing arbitrary operations while holding the lock:

type 'a t

module With_lock : sig
  type 'a t

  val create : int -> 'a t
  val length : 'a t -> int
  val add : 'a t -> 'a -> unit
  val take : 'a t -> 'a option
end

val use_lock : 'a t -> ('a With_lock.t -> 'b) -> 'b

Though complete, this interface of course has a few problems (e.g. it is slightly leaky, and allows breaking invariants by storing the With_lock.t in a reference).

Even an improved version of this complete interface felt a bit too big and opinionated of a change for a first contribution, so I decided to just add add_nonblocking to make the interface feel complete, even if not actually complete.

@xavierm02
Copy link
Author

One use of add_nonblocking ~drop_priority:Oldest could be some server receiving time-sensitive requests: you want to bound the memory a client can use on the server, and if the client is too spammy / the server experiences a slowdown, you want to keep the latest request because the oldest ones are probably no longer relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants