
Releases: mcity/mcity_data_engine

v1.1.0: Depth Estimation, Segmentation, Class Mapping, Dataset SUNRGBD

18 Mar 18:54
8fa9266


Initial release of the Monocular Depth Estimation Module, adding support for depth estimation, segmentation, and class mapping.
To use this module, configure your workflow and dataset in the config:

SELECTED_WORKFLOW = ["auto_label_mask"]
SELECTED_DATASET = {"name": "SUNRGBD", "n_samples": None}

Each individual workflow can be configured in WORKFLOWS = {...}.
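The exact schema for each workflow is defined in the repository's config; as a purely illustrative sketch (all key names and model ids below are assumptions, not the engine's actual schema), an entry for this workflow might look like:

WORKFLOWS = {
    "auto_label_mask": {
        # Hypothetical keys for illustration only; consult the config in the
        # repository for the real schema
        "depth_estimation": {
            "model": "depth-anything/Depth-Anything-V2-Small-hf",  # assumed model id
        },
        "semantic_segmentation": {
            "model": "facebook/mask2former-swin-large-ade-semantic",  # assumed model id
        },
    },
}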

Visual Data

As mentioned in V1.0.0, the Mcity Data Engine supports visual data. Leveraging Voxel51, we convert any dataset into their dataset format and perform all operations on it. This way, the Data Engine can be utilized with any dataset as long as it can be converted into the V51 format.
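For reference, bringing a generic dataset into the V51 format follows FiftyOne's standard ingestion pattern; a minimal sketch (paths and the dataset name are placeholders):

import fiftyone as fo

# Create a persistent FiftyOne dataset and add one sample per image
dataset = fo.Dataset("my_dataset", persistent=True)
samples = [fo.Sample(filepath=path) for path in ["/data/images/0001.jpg"]]
dataset.add_samples(samples)
print(dataset)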

This release expands dataset support by integrating an additional dataset: SUNRGB-D.

The SUNRGB-D dataset contains 10,335 RGB-D images, each with a corresponding RGB image, depth image, and camera intrinsics. It aggregates images from the NYU Depth v2, Berkeley B3DO, and SUN3D datasets.

The integration of SUNRGB-D further strengthens the Mcity Data Engine’s ability to process high-quality depth estimation and segmentation data.
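Once ingested, depth maps render in Voxel51 as heatmaps; conceptually (the field name and paths are placeholders, not the engine's actual schema), a depth image is attached to a sample like this:

import numpy as np
import fiftyone as fo

sample = fo.Sample(filepath="/data/SUNRGBD/image.jpg")  # placeholder path
depth = np.load("/data/SUNRGBD/depth.npy")              # placeholder depth array

# Store the depth map as a FiftyOne Heatmap label so the app can render it
sample["gt_depth"] = fo.Heatmap(map=depth)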

Data Curation

Unlike the other datasets currently available in the Mcity Data Engine, SUNRGB-D requires a manual download.

To download the dataset, use the following commands:

curl -o sunrgbd.zip https://rgbd.cs.princeton.edu/data/SUNRGBD.zip
unzip sunrgbd.zip

Load the dataset into Voxel51 using:

python main.py

The dataset should then appear in your Voxel51 environment:

[Image: SUNRGB-D ground-truth depth heatmaps in the Voxel51 app]
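If the app does not open automatically, the dataset can also be inspected manually with FiftyOne's standard API (whether the dataset is registered under the name "SUNRGBD" is an assumption based on the config above):

import fiftyone as fo

# Load the ingested dataset and open it in the app
dataset = fo.load_dataset("SUNRGBD")
session = fo.launch_app(dataset)
session.wait()  # keep the script alive while the app is open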

Customize Heatmaps (Optional)

When working with depth maps, adjusting the color scheme and opacity can improve visualization.

To modify these settings:

  • Click the palette icon in the top-left corner of Voxel51.
  • Under "Color annotations by", select "value".
  • Scroll down to "Colorscale", select "name", and type "viridis" into the text box for optimal contrast.

[Image: heatmap color customization in the Voxel51 settings panel]

Intelligent Class Mapping

The class mapping workflow leverages zero-shot classification models from Hugging Face to align class labels between a source dataset and a more granular target dataset. This eases data preprocessing when working with multiple datasets, for example ahead of sequential or joint fine-tuning of pretrained models on tasks like object detection. The workflow serves two primary functions. First, it can be used with Voxel51 to visually assess how well various zero-shot classifiers perform on a given source-to-target mapping; in this mode only tags are added, and the labels remain unchanged. Once the best model is identified, the workflow can be configured to update the labels in the source dataset by setting the change_labels flag to True.

To use this workflow, configure the models, datasets, and the required label mapping in the config:

SELECTED_WORKFLOW = ["class_mapping"]
# Ensure SELECTED_DATASET matches the source dataset used for class mapping
SELECTED_DATASET = {"name": "fisheye8k", "n_samples": None}

"class_mapping": {
    # Get the source and target dataset names from datasets.yaml
    "dataset_source": "fisheye8k",
    "dataset_target": "mcity_fisheye_2000",

    # Set to True to change detection labels in the dataset;
    # set to False to only add tags without changing the dataset
    "change_labels": True,

    # Choose any number of models from hf_models_zeroshot_classification
    # by removing the comment in front of the model name
    "hf_models_zeroshot_classification": [
        "Salesforce/blip2-itm-vit-g",
        # "openai/clip-vit-large-patch14",
        # "google/siglip-so400m-patch14-384",
        # "google/siglip2-so400m-patch14-384",
        # "kakaobrain/align-base",
        # "BAAI/AltCLIP",
        # "CIDAS/clipseg-rd64-refined"
    ],
    "thresholds": {
        "confidence": 0.2
    },
    "candidate_labels": {
        # Target class (generalized class): source classes (specific categories)
        "car": ["car", "van"],
        "truck": ["truck", "pickup"],
        # One-to-one mapping
        "bike": ["motorbike/cycler"]
    }
}

[Image: class mapping workflow results in the Voxel51 app]
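To illustrate the underlying mechanism (a conceptual sketch, not the engine's implementation; the model id and threshold mirror the config above, and paths are placeholders), mapping a detection crop onto candidate labels with a Hugging Face zero-shot pipeline looks roughly like this:

from PIL import Image
from transformers import pipeline

# Zero-shot image classification over the source classes of one target class
classifier = pipeline(
    "zero-shot-image-classification",
    model="openai/clip-vit-large-patch14",
)

crop = Image.open("detection_crop.jpg")  # placeholder: a cropped detection
results = classifier(crop, candidate_labels=["car", "van"])
best = results[0]  # results are sorted by descending score

# Accept the mapping only when the classifier clears the confidence threshold
if best["score"] >= 0.2:
    print(f"map detection to target class 'car' (matched '{best['label']}')")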

Merged PRs

Full Changelog: https://github.com/mcity/mcity_data_engine/commits/v1.1.0

v1.0.0: AWS Integration, Data Curation, Model Training, Model Inference

28 Feb 15:52
58aaac4


Initial release of the Mcity Data Engine with a focus on the task of object detection for visual datasets. To use the Mcity Data Engine, simply select your workflows and the dataset in the config:

SELECTED_WORKFLOW = ["embedding_selection", "auto_labeling"]
SELECTED_DATASET = {"name": "fisheye8k", "n_samples": None}

Each individual workflow can be configured in WORKFLOWS = {...}. Start the Mcity Data Engine with python main.py.

Visual Data

The Mcity Data Engine supports visual data. Leveraging Voxel51, we convert any dataset into their dataset format and perform all operations on it. This way, the Data Engine can be utilized with any dataset as long as it can be converted into the V51 format.

Currently connected datasets:

[Image: overview of currently connected datasets]

Data Curation

To select samples of interest, the Data Engine initially provides three workflows:

Selection by Embedding Ensemble

[Image: selection by embedding ensemble]

Leveraging the Voxel51 Brain component, the Data Engine computes image embeddings based on an ensemble of models. These embeddings are then used to select both representative and unique samples.
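A minimal sketch of the corresponding Brain calls (the dataset name is an assumption, and the engine combines an ensemble of embedding models rather than the defaults used here):

import fiftyone as fo
import fiftyone.brain as fob

dataset = fo.load_dataset("fisheye8k")  # assumed dataset name

# Score how unique and how representative each sample is
fob.compute_uniqueness(dataset)
fob.compute_representativeness(dataset)

# Inspect the most unique samples first
session = fo.launch_app(dataset.sort_by("uniqueness", reverse=True))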

Selection by Language-Prompted Zero-Shot Ensemble

[Image: selection by language-prompted zero-shot ensemble]

Leveraging Zero-Shot Object Detection Models from Hugging Face, the Mcity Data Engine identifies images that include n instances of classes of interest. It combines an ensemble of models to reduce both false positives and false negatives.
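Conceptually, one ensemble member corresponds to a single Hugging Face zero-shot detection pipeline; a sketch (model id, classes, and thresholds are illustrative):

from transformers import pipeline

detector = pipeline(
    "zero-shot-object-detection",
    model="google/owlvit-base-patch32",
)

# Detect instances of the classes of interest in one frame
detections = detector("frame.jpg", candidate_labels=["pedestrian", "cyclist"])

# Select the frame if it contains at least n confident instances
n = 2
hits = [d for d in detections if d["score"] >= 0.3]
if len(hits) >= n:
    print("frame selected for curation")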

Selection by Anomaly Detection

[Image: selection by anomaly detection]

Leveraging anomaly detection models from Anomalib, the Data Engine detects frames that contain anomalies. This workflow requires a labeled dataset. During training, a known class is treated as an anomaly and excluded from the training dataset. During inference, samples with a high anomaly score can be selected for inspection.
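A minimal sketch with Anomalib's high-level API (assuming Anomalib v1.x; data paths and the model choice are illustrative, not the engine's configuration):

from anomalib.data import Folder
from anomalib.models import Padim
from anomalib.engine import Engine

# Train on normal frames only; the held-out class plays the anomaly
datamodule = Folder(
    name="mcity_frames",
    root="data/frames",
    normal_dir="normal",
    abnormal_dir="anomalous",
)
model = Padim()
engine = Engine()
engine.fit(model=model, datamodule=datamodule)

# At inference time, samples with high anomaly scores are flagged for review
predictions = engine.predict(model=model, datamodule=datamodule)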

Data Labeling

The Data Engine is connected to CVAT for manual labeling. Based on the results of the data curation workflows, samples can be filtered, manually inspected, and labeled. Given trained models, auto labeling can be performed through model inference.
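FiftyOne's CVAT integration makes this round trip a few calls; a sketch (dataset name, tag, and field names are illustrative):

import fiftyone as fo

dataset = fo.load_dataset("fisheye8k")  # assumed dataset name

# Send samples selected by a curation workflow to CVAT for manual labeling
view = dataset.match_tags("selected")
view.annotate("curation_round_1", label_field="ground_truth", backend="cvat")

# ...once labeling in CVAT is finished, merge the annotations back
dataset.load_annotations("curation_round_1")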

Model Training and Inference

The Mcity Data Engine currently supports three model sources for model training and inference:

Trained models are uploaded to Hugging Face and can be used for later inference. Custom models are wrapped in a container and can be used through Docker or Singularity.
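Uploading a trained model with the huggingface_hub client follows the standard pattern; a sketch (the repo id and local path are placeholders):

from huggingface_hub import HfApi

api = HfApi()

# Push the trained weights so they can be pulled again for inference
api.upload_folder(
    folder_path="runs/train/weights",  # placeholder: local training output
    repo_id="my-org/my-detector",      # placeholder repo id
    repo_type="model",
)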

Merged PRs

Full Changelog: https://github.com/mcity/mcity_data_engine/commits/v1.0.0

v0.1

06 Dec 19:30


v0.1 Pre-release

Release Notes

  • Parallelized multi-GPU workflow introduced for zero-shot object detection