Description
I have a custom dataset implementation that includes random augmentations for images and returns data in a specific format. The dataset is defined as follows:
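A simplified sketch of the class (the CSV column names and the prompt text are placeholders for illustration; the real implementation has more logic):

import pandas as pd
from PIL import Image
from torch.utils.data import Dataset

class OCRDataset(Dataset):
    """Reads (image_path, text) pairs from a CSV and applies random augmentations."""

    def __init__(self, csv_file, transform=None):
        self.data = pd.read_csv(csv_file)
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        row = self.data.iloc[idx]
        image = Image.open(row['image_path']).convert('RGB')
        if self.transform is not None:
            # The random augmentation is re-sampled on every access
            image = self.transform(image)
        # Return a sample in the messages format (as I understand ms-swift expects)
        return {
            'messages': [
                {'role': 'user', 'content': '<image>Recognize the text in the image.'},
                {'role': 'assistant', 'content': row['text']},
            ],
            'images': [image],
        }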
Since I am using a custom dataset with random augmentations, I cannot use the CLI (swift sft) for training.
Question
How can I modify the training pipeline to support distributed training (e.g., DDP via accelerate)?
The dataset returns data in a custom format (as shown above).
Current Setup
Here is the current training pipeline:
# Imports (module paths per ms-swift 3.x, as far as I can tell)
from torchvision import transforms

from swift.llm import EncodePreprocessor, get_model_tokenizer, get_template
from swift.trainers import Seq2SeqTrainer
from swift.tuners import Swift

# Define random augmentations for the images
transform = transforms.Compose([
    transforms.RandomRotation(10),
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Load the dataset
train_dataset = OCRDataset(csv_file='path_to_train.csv', transform=transform)
val_dataset = OCRDataset(csv_file='path_to_val.csv', transform=transform)

# Retrieve the model and template, and add a trainable LoRA module
model, tokenizer = get_model_tokenizer(model_id_or_path, ...)
template = get_template(model.model_meta.template, tokenizer, ...)
model = Swift.prepare_model(model, lora_config)

# Encode the text into tokens (runs once, up front)
train_dataset = EncodePreprocessor(template=template)(train_dataset, num_proc=num_proc)
val_dataset = EncodePreprocessor(template=template)(val_dataset, num_proc=num_proc)

# Train the model
trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    data_collator=template.data_collator,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    template=template,
)
trainer.train()
The dataset includes random augmentations, and I need DDP or another distributed training method to scale training across multiple GPUs/nodes.
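From what I can tell, Seq2SeqTrainer builds on transformers.Trainer, which sets up DDP on its own when the script is started through a launcher such as torchrun, so perhaps the training code itself needs few changes. A sketch of what I would try (script name and GPU count are placeholders):

import os

# Launched via a distributed launcher, e.g.:
#   torchrun --nproc_per_node=4 train.py
#
# As far as I understand, transformers.Trainer reads LOCAL_RANK / WORLD_SIZE
# from the environment the launcher sets, wraps the model in
# DistributedDataParallel, and shards batches across ranks, so
# trainer.train() above should run per process without further changes.

local_rank = int(os.environ.get('LOCAL_RANK', -1))
print(f'LOCAL_RANK={local_rank}')  # -1 means a plain single-process run

My main doubt is whether pre-encoding with EncodePreprocessor fixes each sample's augmentation once, rather than re-sampling it every epoch on every rank.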
Could you provide guidance or an example of adapting this pipeline for distributed training while keeping the custom dataset and random augmentations? I am also not sure how to apply custom augmentations when training via the CLI.
Any suggestions would be highly appreciated. Thanks!