Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add tmp-dir #54

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

feat: Add tmp-dir #54

wants to merge 2 commits into from

Conversation

cdmurph32
Copy link

wit/environment.wit Outdated Show resolved Hide resolved
Use full name.

Co-authored-by: Lann <[email protected]>
@badeend
Copy link

badeend commented Feb 3, 2025

I completely understand the challenge you're facing and I can see why this change might seem like a straightforward improvement. That said, I do have some concerns about its broader implications.

This will prevent the panic from std::env::temp_dir().

The proposed function would effectively require all hosts to support the concept of "temporary directories," even if that doesn't align with their architecture. It also adds a new responsibility for hosts to manage file cleanup, which can be non-trivial—especially for non-POSIX environments like blob stores that don't natively support temp files. I see that the return type of the proposed function is an option, but what happens when Rust actually receives none? If it leads to another panic, then we haven’t truly solved the core issue.

As already mentioned in chat, I believe tools like WASI-Virt should be able to handle temporary files quite effectively for most scenarios. I understand that an in-memory solution isn't viable for you due to file size constraints, but that raises the question: how portable is your specific use-case really? IIUC, it already has some pretty hefty requirements on its host. Does it make sense for WASI to accommodate this particular use case, and does that align with WASI's broader portability goals?


Regarding Joel’s point in chat:

We already have initial-cwd. Seems reasonable to add a temp-dir function to the same interface.

A subtle distinction here is that initial-cwd doesn’t grant new capabilities to the guest or impose additional requirements on the underlying filesystem implementation.


Another, more ideological concern I have with the design as proposed is the implicit assumption of capabilities via file paths. There has been some prior discussion on this at:

The world has shifted a bit in terms of link-time vs. runtime authority since its creation, but I think many of @sunfishcode's points in the wasi-libc issue still hold up today. One idea I like from that thread is to make the guest responsible for creating a temp folder and adding that to its internal preopens list. This turns the security model from "implicitly assumed" to "actively requested".

@oovm
Copy link

oovm commented Feb 5, 2025

wasi seems to lack the ability to clean up before exit, which hinders the practice of implementing temporary files by yourself.

Ideally, when a soft-exit signal is received, the program logic should be stopped and custom cleanup should be performed.
This includes but is not limited to closing the connection, launching the end session, releasing occupied resources, cleaning up downloaded temporary files, etc.

But when a force-exit signal is received, the logic is forced to terminate, and the runtime reclaims invalid resources, which may not be as comprehensive as custom cleanup.

@cdmurph32
Copy link
Author

The way I see things, there are 3 different possibilities:

  1. Make no changes and force everyone who uses tempfile or its equivalent, including other languages, to make large changes to support WASI. While most libraries not export tempfile usage, they do use it in their tests. If you extend this philosophy to similar areas of concern, this essentially means that WASI is a niche ABI for very specific use cases and not for broad adoption. Here is an example: https://github.com/contentauth/c2pa-rs/pull/888/files
  2. Change tempfile and other similar libraries so that it attempts to create a tmp directory within the preopened directory.
  3. Make a change like this or somewhere else in the WASI ecosystem so that no change is necessary in tempfile. Perhaps an optional cap to setup a preopen to temp.

I'd strongly prefer 3, but I'd be happy with whatever is decided to that end. 1 is untenable.

@sunfishcode
Copy link
Member

I agree; it would be nice to have something like this, so that tempfile and mkstemp and other things can work out of the box. I also agree with @badeend, that we should think about environments that don't necessarily want to expose this.

We may want to include this in the cli world, but perhaps we could put it in its own interface, rather than putting it in environmentt, so that users that use custom worlds would have some flexibility, at least.

One question is, how visible are these temporary directories to other programs, or other Wasm instances? Some programs that use temporary directories do so specifically for the ability to obtain a filesystem path string that they can pass to other applications to open. But then we have to ask, how shared should these temporary directories be? And do they require guests to see them with the same path strings, implying a level of namespace sharing? If that needs to be configured, how would that work?

@dicej
Copy link
Contributor

dicej commented Feb 10, 2025

@badeend The way I think of initial-cwd is that it's a way to ask the host "Do you have an opinion on what the CWD should be? If so, tell me what it is; otherwise, return none and I'll use e.g. '/' as the default." Similarly, I would imagine tmp-dir as a way to ask the host "Do you have an opinion on what I should use as a temporary directory? If so, tell me; otherwise return none and I'll use e.g. /tmp (or the first preopen, or whatever) as the default." In both cases, the default might not correspond to any preopen, in which case the guest will have trouble when it tries to use that directory, but that's not a new hazard -- it's been there since WASI has had the concept of a current working directory.

A subtle distinction here is that initial-cwd doesn’t grant new capabilities to the guest or impose additional requirements on the underlying filesystem implementation.

I guess I'm not seeing what new capabilities the guest would get, nor what additional requirements there would be on the implementation. An implementation can always return none and the guest still only has access to whatever preopens the host has made available.

One idea I like from that thread is to make the guest responsible for creating a temp folder and adding that to its internal preopens list. This turns the security model from "implicitly assumed" to "actively requested".

That sounds cool, yes. Then std::env::temp_dir could make such a request lazily. I expect @cdmurph32 just wants something that makes std::env::temp_dir work, so if there's a better way to do that, great.

@programmerjake
Copy link

One idea I like from that thread is to make the guest responsible for creating a temp folder and adding that to its internal preopens list. This turns the security model from "implicitly assumed" to "actively requested".

That sounds cool, yes. Then std::env::temp_dir could make such a request lazily. I expect @cdmurph32 just wants something that makes std::env::temp_dir work, so if there's a better way to do that, great.

so, C would create the temp folder when you ran getenv("TMPDIR")?! that seems like a bad design since getenv should not have side-effects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants