How to create a validation dataset? #229
Comments
If your client ids are sufficiently randomized (as they are for the datasets under fedjax.datasets, where we appended a random id to the original client id when creating them), you can use FederatedData.slice to take either the head or the tail of the training dataset as a validation set. Please see the end of this section of our dataset tutorial https://fedjax.readthedocs.io/en/latest/notebooks/dataset_tutorial.html#id1 for details.
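As a rough illustration of the head/tail idea, here is a minimal sketch in plain Python that picks a boundary client id from a sorted list of ids. The ids and the helper name `split_client_ids` are made up for the example and are not part of FedJAX; the comment at the end assumes `FederatedData.slice` takes `start`/`stop` client ids as described in the tutorial.

```python
def split_client_ids(sorted_client_ids, num_validation):
    """Splits sorted client ids into a validation head and a training tail."""
    if not 0 < num_validation < len(sorted_client_ids):
        raise ValueError("num_validation must leave both parts non-empty")
    return sorted_client_ids[:num_validation], sorted_client_ids[num_validation:]


# Hypothetical client ids with a pseudo-random prefix, sorted as bytes.
client_ids = sorted(f"{i * 919 % 1000:04d}:client".encode() for i in range(20))

validation_ids, train_ids = split_client_ids(client_ids, num_validation=2)

# The first training id serves as the boundary between the two ranges.
boundary = train_ids[0]

# Assumption: with a FederatedData object `train_fd`, the two parts would then be
#   validation_fd = train_fd.slice(stop=boundary)
#   remaining_fd = train_fd.slice(start=boundary)
```

Because the ids are randomized, taking the head is effectively a random sample of clients. Note that this holds out whole clients rather than part of each client's data.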
Sorry, I misread your original question. A ClientDataset can be sliced using the [] operator (only slicing is supported):
This creates a new ClientDataset with the first 1/10 of the original.
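For the per-client split asked about in the issue, the same slicing idea can be sketched on the raw examples directly. This is a minimal NumPy sketch, assuming a client's examples are a dict of arrays that share the same leading dimension; the helper name `split_examples` and the feature names `x` and `y` are hypothetical.

```python
import numpy as np


def split_examples(examples, validation_divisor=10):
    """Splits a dict of arrays into (validation, train) along the first axis.

    The first 1/validation_divisor of the examples becomes the validation part.
    """
    sizes = {len(v) for v in examples.values()}
    if len(sizes) != 1:
        raise ValueError("all features must have the same number of examples")
    num_validation = sizes.pop() // validation_divisor
    validation = {k: v[:num_validation] for k, v in examples.items()}
    train = {k: v[num_validation:] for k, v in examples.items()}
    return validation, train


# Hypothetical client data: 50 examples with 2 features each, plus labels.
examples = {"x": np.arange(100).reshape(50, 2), "y": np.arange(50)}
validation, train = split_examples(examples)
```

With a ClientDataset `ds`, the equivalent would presumably be `ds[:n]` and `ds[n:]` with `n = len(ds) // 10`, since only slicing is supported. If a client's examples are stored in a meaningful order, shuffle them before splitting.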
Thanks for the suggestion! This is already a way to sort the problem out. But can I create a new
In most cases you can use This will require some basic knowledge of the existing preprocessing for your dataset, as well as of how examples (raw and preprocessed) are represented in FedJAX (see this part of the tutorial). The preprocessing of all prepackaged datasets is fairly simple, so looking at the corresponding py file under fedjax/datasets should give you a fairly good idea (example).
Hello!
I may need to split each client's train dataset into train and validation parts for grid search purposes (for example, tuning the stepsizes in a method). How can this be achieved in the framework?