Data labeling Specification
Recycling is vital for a sustainable and clean environment. We use 481.6 billion plastic bottles every year, and only about 9% of them are recycled. In this project, we’ll look at how computer vision can be used to identify different types of plastic, glass, and metal in household waste. The first step is to examine the dataset and label it.
- The dataset spans three main classes: glass, plastic, and metal
- Within each main class, subclasses should be defined by the type and color of the object
Note: plastics have 7 types: PET, HDPE, PVC, LDPE, PP, PS, and Others.
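One way to pin the class hierarchy down before labeling starts is to generate the label names from a small taxonomy table. The sketch below is only an illustration: the plastic types come from the note above, while the glass and metal types, all color names, and the `main/type/color` naming scheme are hypothetical placeholders to be replaced with the project's own choices.

```python
from itertools import product

# Plastic types are taken from the note above. The glass/metal types and
# every color list are hypothetical placeholders, not part of the spec.
PLASTIC_TYPES = ["PET", "HDPE", "PVC", "LDPE", "PP", "PS", "Others"]

TAXONOMY = {
    "plastic": {"types": PLASTIC_TYPES, "colors": ["clear", "white", "colored"]},
    "glass": {"types": ["bottle", "jar"], "colors": ["clear", "green", "brown"]},
    "metal": {"types": ["aluminum", "steel"], "colors": ["unpainted", "painted"]},
}


def build_labels(taxonomy):
    """Return one 'main/type/color' label string per subclass combination."""
    labels = []
    for main_class, spec in taxonomy.items():
        for type_name, color in product(spec["types"], spec["colors"]):
            labels.append(f"{main_class}/{type_name}/{color}")
    return labels


def parse_label(label):
    """Split a label string back into (main_class, type_name, color)."""
    main_class, type_name, color = label.split("/")
    return main_class, type_name, color


if __name__ == "__main__":
    labels = build_labels(TAXONOMY)
    print(len(labels), "labels, for example", labels[0])
```

Keeping the taxonomy in one table means the same list can feed both the annotation tool and the training code, so the two cannot drift apart.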
How to label data?
Before diving into the labeling system, we should take a look at the dataset and its requirements:
- The dataset must include deformed and broken objects as well as dirty ones.
- Deformed or broken objects should be photographed from several angles.
- Brightness and contrast should be normalized across images.
- Data augmentation is required to enlarge the dataset.
- All images should have the same size and resolution.
- Objects should be photographed against a dark background.
- To avoid confusing objects with a bright background, we can use one or more of the following methods:
  - Create a tight bounding box for every object
  - Omit the background
  - Mask the object
  - Apply edge detection
- The data starts out as images, but it should be converted to whatever format the chosen detection method requires.
- We should collect a balanced and diverse dataset for each category.
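Two of the requirements above, regulating brightness and contrast and drawing a tight bounding box around an object on a dark background, can be sketched in plain Python. This is a minimal illustration rather than the project's pipeline: it assumes a grayscale image stored as a list of rows with pixel values 0..255, and the function names and the threshold of 50 are hypothetical. A real pipeline would more likely use an image library.

```python
def adjust_brightness_contrast(image, contrast=1.0, brightness=0):
    """Apply new = contrast * old + brightness to each pixel, clipped to 0..255."""
    return [
        [min(255, max(0, round(contrast * p + brightness))) for p in row]
        for row in image
    ]


def tight_bounding_box(image, threshold=50):
    """Return (left, top, right, bottom), inclusive, of all pixels brighter
    than `threshold`, or None if no pixel exceeds it.

    Assumes the object is brighter than the dark background.
    """
    rows = [y for y, row in enumerate(image) if any(p > threshold for p in row)]
    cols = [x for row in image for x, p in enumerate(row) if p > threshold]
    if not rows:
        return None
    return min(cols), min(rows), max(cols), max(rows)


if __name__ == "__main__":
    # Tiny synthetic image: a bright object on a dark background.
    image = [
        [10, 12, 11, 10, 9, 10],
        [10, 200, 210, 12, 9, 10],
        [11, 190, 220, 205, 9, 10],
        [10, 12, 11, 10, 9, 10],
    ]
    print(tight_bounding_box(image))
    print(adjust_brightness_contrast([[0, 100, 200]], contrast=1.5, brightness=10))
```

A fixed threshold only works when the background is reliably darker than the object, which is one reason the dark-background requirement matters.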
There are two common methods for labeling:
- Manual
- Automatic
Manual labeling system
Here we use Label Studio (https://labelstud.io/), a flexible data annotation tool for labeling and exploring multiple types of data. It supports different types of labeling across many data formats.
Let's see how we can label our dataset:
- Install Label Studio (https://labelstud.io/guide/install.html)
- Start Label Studio with the label-studio command.
- Sign up with an email address and password that you create.
- Click to create a project and start labeling data.
- Click Data Import and upload the data files that you want to use.
- Click Labeling Setup, choose a template, and customize the label names for your use case.
- Click Save to save your project.
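For the Labeling Setup step, the template can also be written out as a labeling configuration instead of being assembled by clicking. The sketch below generates a bounding-box configuration for the three main classes using Label Studio's `View`, `Image`, `RectangleLabels`, and `Label` tags. The helper name `build_labeling_config` and the `image` and `label` names are illustrative assumptions, and the template actually chosen for this project may differ.

```python
import xml.etree.ElementTree as ET


def build_labeling_config(labels):
    """Build a Label Studio bounding-box labeling config as an XML string."""
    view = ET.Element("View")
    # "$image" refers to the "image" field of each imported task.
    ET.SubElement(view, "Image", name="image", value="$image")
    rect = ET.SubElement(view, "RectangleLabels", name="label", toName="image")
    for label in labels:
        ET.SubElement(rect, "Label", value=label)
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(view)
    return ET.tostring(view, encoding="unicode")


if __name__ == "__main__":
    # Main classes from this specification; subclasses could be added the same way.
    print(build_labeling_config(["glass", "plastic", "metal"]))
```

The printed XML can be pasted into the code view of the Labeling Setup screen.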
As the picture above shows, some of the icons are not available. This means we should first consider our project and its requirements, and then choose the appropriate labeling method.
Here we annotate a bottle with this labeling system:
To learn how to use this labeling system, see this guide: https://www.youtube.com/watch?v=UUP_omOSKuc
Automatic labeling
We prefer automatic labeling over the manual method above, because for large datasets manual labeling costs too much time and effort.