The purpose of data annotation
Machine learning is embedded in AI and allows machines to perform specific tasks through training. With annotated data, a machine can learn about almost anything. Machine learning techniques fall into four types: Supervised Learning, Unsupervised Learning, Semi-Supervised Learning, and Reinforcement Learning.
▸Supervised Learning: A supervised algorithm learns from a set of labeled data. It predicts the outcome of new data based on previously known labeled examples.
▸Unsupervised Learning: In unsupervised machine learning, training is based on unlabeled data; the algorithm does not know the outcome or label of the input data in advance.
▸Semi-Supervised Learning: The AI will learn from a dataset that is partly labeled. This is the combination of the two types above.
▸Reinforcement Learning: Reinforcement learning helps a system determine its behavior so as to maximize a reward. Currently, it is mainly applied to game playing, where the algorithm must decide the next move to achieve the highest score.
Although there are four types of techniques, the most frequently used are supervised and unsupervised learning. You can see how supervised and unsupervised learning work, according to Booz Allen Hamilton's description, in this picture:
What is labeled data?
Labeled data is a group of samples that have been tagged with one or more labels. Labeling typically takes a set of unlabeled data and augments each piece of it with informative tags. Labeled data helps a machine learning model "learn" the patterns in the input data and then make predictions on new datasets.
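As a minimal sketch (the field names here are hypothetical, not a standard format), labeled data can be thought of as raw samples paired with informative tags:

```python
# Unlabeled data: raw samples with no outcome attached.
unlabeled = ["photo_001.jpg", "photo_002.jpg"]

# Labeling augments each sample with one or more informative tags.
labeled = [
    {"sample": "photo_001.jpg", "labels": ["cat", "animal"]},
    {"sample": "photo_002.jpg", "labels": ["dog", "animal"]},
]

# A supervised model trains on these (sample, label) pairs in order to
# predict labels for new, unseen samples.
for item in labeled:
    print(item["sample"], "->", item["labels"])
```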
How does the data annotation process work?
Step 1: Data Collection
Data collection is the process of gathering and measuring information from countless different sources. To use the data we collect to develop practical artificial intelligence (AI) and machine learning solutions, it must be collected and stored in a way that makes sense for the business problem at hand.
There are several ways to find data. For classification problems, you can use the class names as keywords and crawl the Internet for matching images. You can also gather photos and videos from social networking sites, satellite images from Google, or freely collected data from public cameras and cars (Waymo, Tesla); you can even buy data from third parties (but pay attention to its accuracy). Common datasets are also available on free websites such as Common Objects in Context (COCO), ImageNet, and Google's Open Images.
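When collecting by class keywords, a common first step is to lay out one folder per class so crawled files can be sorted as they arrive. A minimal sketch (the class names and root folder are hypothetical):

```python
from pathlib import Path

# Hypothetical class keywords used while crawling.
classes = ["car", "traffic_light", "pedestrian"]

def build_dataset_dirs(root: str, class_names):
    """Create one subfolder per class so collected files can be sorted into them."""
    root_path = Path(root)
    for name in class_names:
        (root_path / name).mkdir(parents=True, exist_ok=True)
    # Return the resulting class folders, sorted for reproducibility.
    return sorted(p.name for p in root_path.iterdir() if p.is_dir())

print(build_dataset_dirs("dataset", classes))
```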
Some common data types are Image, Video, Text, Audio, and 3D sensor data.
- Image (photographs of people, objects, or animals, etc.)
Image is perhaps the most common data type in the field of data annotation. Since it deals with the most basic kind of visual data, it plays an important part in a wide range of applications, such as robotic vision, facial recognition, or any application that has to interpret images.
For the raw datasets gathered from multiple sources, it is vital that they be tagged with metadata containing identifiers, captions, or keywords.
The major fields that require enormous data annotation effort are healthcare applications (as in our case study of blood cell annotation) and autonomous vehicles (as in our case study of traffic light and sign annotation). With effective and accurate annotation of images, AI applications can work with minimal human intervention.
To train these solutions, metadata must be assigned to the images in the form of identifiers, captions, or keywords. From computer vision systems used by self-driving vehicles and machines that pick and sort produce, to healthcare applications that auto-identify medical conditions, many use cases require high volumes of annotated images. Image annotation increases precision and accuracy by effectively training these systems.
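To make this concrete, here is a simplified, COCO-style record pairing image metadata with one bounding-box annotation (a trimmed field subset; the real COCO schema carries more fields):

```python
import json

# Simplified COCO-style structure: image metadata plus one annotation.
record = {
    "images": [{"id": 1, "file_name": "street_001.jpg",
                "width": 1920, "height": 1080}],
    "annotations": [{"id": 10, "image_id": 1, "category_id": 3,
                     "bbox": [450, 220, 120, 260]}],  # [x, y, width, height]
    "categories": [{"id": 3, "name": "person"}],
}

# Annotation files like this are typically stored as JSON alongside the images.
print(json.dumps(record, indent=2)[:60])
```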
- Video (Recorded tape from CCTV or camera, usually divided into scenes)
Compared with images, video is a more complex form of data that demands greater effort to annotate correctly. To put it simply, a video consists of frames, each of which can be understood as a picture. A one-minute video can contain thousands of frames, so annotating it requires a substantial investment of time.
One outstanding feature of video annotation for Artificial Intelligence and Machine Learning models is that it offers great insight into how an object moves and in which direction. A video can also show whether an object is partially obstructed, whereas image annotation is limited in this respect.
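Because annotating every frame is rarely feasible, a common practice is to label every Nth keyframe and interpolate the rest. A small sketch of the arithmetic (the stride of 30 is an illustrative choice, not a standard):

```python
def frames_to_annotate(duration_s: float, fps: float, stride: int):
    """Return the frame indices to label when annotating every `stride`-th frame."""
    total = int(duration_s * fps)
    return list(range(0, total, stride))

# A one-minute clip at 30 fps has 1800 frames; labeling every 30th frame
# cuts the manual work to 60 keyframes (the rest can be interpolated).
keyframes = frames_to_annotate(60, 30, 30)
print(len(keyframes))  # → 60
```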
- Text: Documents of different types contain numbers and words, and they can be in multiple languages.
Algorithms use large amounts of annotated data to train AI models, which is part of a larger data labeling workflow. During the annotation process, a metadata tag is used to mark up characteristics of a dataset. With text annotation, that data includes tags that highlight criteria such as keywords, phrases, or sentences. In certain applications, text annotation can also include tagging various sentiments in text, such as “angry” or “sarcastic” to teach the machine how to recognize human intent or emotion behind words.
The annotated data, known as training data, is what the machine processes. The goal? Help the machine understand the natural language of humans. This procedure, combined with data pre-processing and annotation, is known as natural language processing, or NLP.
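A hypothetical text-annotation record might combine character-offset entity spans with a sentiment tag (the field names and labels below are illustrative, not a standard schema):

```python
text = "The delivery was late again."

annotation = {
    "text": text,
    "entities": [
        # (start, end, label): character offsets into `text`
        (4, 12, "EVENT"),
    ],
    "sentiment": "angry",  # teaches the model the emotion behind the words
}

start, end, label = annotation["entities"][0]
print(text[start:end], label)  # → delivery EVENT
```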
- Audio: Sound recordings from people of varied demographics.
As the market trends toward voice AI, LTS provides top-notch voice data annotation services, with annotators fluent in multiple languages.
All types of sounds recorded as audio files can be annotated with additional keynotes and suitable metadata. Our annotation team explores the audio features and annotates the corpus with intelligent audio information, carefully listening to each word in the audio so that the speech is recognized correctly.
The speech in an audio file contains words and sentences meant for listeners. Making such phrases recognizable to machines is possible using special data labeling techniques while annotating the audio. In NLP or NLU, speech recognition algorithms need linguistic audio annotation to recognize such recordings.
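Audio annotation is commonly expressed as time-aligned transcript segments. A minimal sketch (the segment fields and timings are hypothetical):

```python
# Hypothetical transcript annotation: time-aligned segments with speaker tags.
segments = [
    {"start": 0.0, "end": 2.4, "speaker": "A", "text": "Hello, how can I help?"},
    {"start": 2.4, "end": 4.1, "speaker": "B", "text": "I'd like to book a table."},
]

def total_speech_seconds(segs):
    """Sum the annotated speech duration across all segments."""
    return round(sum(s["end"] - s["start"] for s in segs), 2)

print(total_speech_seconds(segments))  # → 4.1
```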
- 3D Sensor data: 3D models generated by sensor devices.
No matter what, money is always a factor. Sensors capable of 3D capture vary greatly in build complexity and, accordingly, in price, ranging from hundreds to thousands of dollars. Choosing them over a standard camera setup is not cheap, especially since you usually need multiple units to guarantee a large enough field of view.
In many cases, the data gathered by 3D sensors is nowhere near as dense or high-resolution as that from conventional cameras. A standard LiDAR sensor discretizes the vertical space into lines (the number of lines varies), each with a few hundred detection points. This produces orders of magnitude fewer data points than a standard HD picture contains. Furthermore, the farther away an object is located, the fewer samples land on it, due to the conical spread of the laser beams. Thus, the difficulty of detecting objects increases sharply with their distance from the sensor.
Step 2: Identify the problem
Knowing what problem you are dealing with will help you to decide the techniques you should use with the input data. In computer vision, there are some tasks such as:
- Image classification: Collect and classify the input data by assigning a class label to an image.
- Object detection & localization: Detect and locate the presence of objects in an image and indicate their location with a bounding box, point, line, or polyline.
- Object instance / semantic segmentation: In semantic segmentation, you label each pixel with a class of objects (Car, Person, Dog, etc.) or non-objects (Water, Sky, Road, etc.). Polygon and masking tools can be used for semantic segmentation.
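The three tasks above differ mainly in the shape of the label attached to an image. A minimal sketch of each (hypothetical records, not a standard format):

```python
# Classification: one label for the whole image.
classification = {"image": "img_001.jpg", "label": "dog"}

# Detection & localization: a label plus a bounding box per object.
detection = {"image": "img_001.jpg",
             "objects": [{"label": "dog", "bbox": [34, 50, 120, 90]}]}  # x, y, w, h

# Semantic segmentation: a class id for every pixel. Here, a tiny 2x3 mask
# (0 = road, 1 = car, 2 = person) stands in for a full-resolution one.
segmentation_mask = [
    [0, 1, 1],
    [0, 2, 0],
]

print(len(segmentation_mask) * len(segmentation_mask[0]))  # → 6 labeled pixels
```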
Step 3: Data Annotation
After identifying the problem, you can proceed with data labeling accordingly. For a classification task, the labels are the keywords used while finding and crawling the data. For a segmentation task, there should be a label for each pixel of the image. After defining the labels, you need tools to perform image annotation (i.e., to attach labels and metadata to images). Popular tools include Comma Coloring, Annotorious, and LabelMe. You can refer to some common data annotation tools and their features in our infographic here.
However, this approach is manual and time-consuming. A faster alternative is to use algorithms like Polygon-RNN++ or Deep Extreme Cut. Polygon-RNN++ takes the object in the image as input and outputs polygon points surrounding the object to create segments, making labeling more convenient. Deep Extreme Cut works on a similar principle, but takes up to four extreme points on the object as its input.
It is also possible to label data with the "Transfer Learning" method, using models pre-trained on large-scale datasets such as ImageNet or Open Images. Since these pre-trained models have learned many features from millions of different images, their accuracy is fairly high. Based on such a model, you can find and label each object in the image. Note that the data the pre-trained model was trained on must be similar to the collected dataset for feature extraction or fine-tuning to work well.
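The idea can be sketched without any deep learning library: assume the feature vectors below were already produced by a frozen, pre-trained backbone (an assumption for this example), and only a tiny "head", here a nearest-centroid classifier, is fit on the newly labeled data:

```python
def fit_centroids(features, labels):
    """Average the (assumed pre-extracted) feature vectors per class."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        sums.setdefault(lab, [0.0] * len(vec))
        counts[lab] = counts.get(lab, 0) + 1
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict(centroids, vec):
    """Pick the class whose centroid is closest (squared Euclidean distance)."""
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(c, vec))
    return min(centroids, key=lambda lab: dist(centroids[lab]))

# Toy 2-D "embeddings" standing in for backbone outputs.
feats = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
labs = ["cat", "cat", "dog"]
centroids = fit_centroids(feats, labs)
print(predict(centroids, [0.85, 0.15]))  # → cat
```

In practice the backbone stays frozen and only such a lightweight head is trained, which is why transfer learning needs far less labeled data than training from scratch.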
Types of data annotation
Data Annotation is the process of labeling training datasets, which can consist of images, videos, or audio. Needless to say, AI annotation is of paramount importance to Machine Learning (ML), as ML algorithms need (quality) annotated data to learn from.
In our AI training projects, we use different types of annotation. Choosing what type(s) to use mainly depends on what kind of data and annotation tools you are working on.
- Bounding Box: As you can guess, the target object is framed by a rectangular box. Data labeled with bounding boxes is used across industries, most heavily in the autonomous vehicle, security, and e-Commerce industries.
- Polygon: When it comes to irregular shapes like human bodies, logos, or street signs, polygons are your choice for a more precise outcome. The boundaries drawn around objects give an exact idea of their shape and size, which helps the machine make better predictions.
- Polyline: Polylines usually serve to overcome a weakness of bounding boxes, which often contain unnecessary space. They are mainly used to annotate lanes in road images.
- 3D Cuboids: 3D cuboids are used to measure the volume of objects such as vehicles, buildings, or furniture.
- Segmentation: Segmentation is similar to polygons but more complex. While polygons just pick out some objects of interest, segmentation labels layers of similar objects until every pixel of the picture is covered, which leads to better detection results.
- Landmark: Landmark annotation comes in handy for facial and emotional recognition, human pose estimation, and body detection. Applications using landmark-labeled data can indicate the density of the target object within a specific scene.
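The types above boil down to different geometry payloads attached to an image. A hypothetical sketch of each (field names are illustrative, not any tool's format):

```python
# Hypothetical geometry payloads for each annotation type.
annotations = {
    "bounding_box": {"x": 10, "y": 20, "width": 50, "height": 30},
    "polygon":      {"points": [(12, 20), (60, 25), (55, 70), (15, 65)]},
    "polyline":     {"points": [(0, 40), (100, 42), (200, 45)]},  # e.g. a lane line
    "cuboid":       {"front": [10, 20, 50, 30], "back": [18, 14, 50, 30]},
    "landmark":     {"points": {"left_eye": (40, 30), "right_eye": (60, 30)}},
}

# A polygon needs at least three vertices to enclose a shape.
assert len(annotations["polygon"]["points"]) >= 3
print(sorted(annotations))
```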
Popular tools for data annotation
In machine learning, data processing and analysis are extremely important, so here are some tools for annotating data that make the job simpler.
PixelAnnotationTool – Data Annotation Tools
This tool is suitable for segmentation problems such as finding cars, roads, and cells in medicine to support diagnosis.
This tool uses the marker-based watershed algorithm from OpenCV. Anyone can download the binary from the release link and start using it.
You can change the colors in the config file in the source code, making the number of colors correspond to the regions you want to segment differently. Then you simply "dot" the desired regions with the mouse in the chosen color and press the "Enter" key.
Data Generator Tool
Text Recognition Data Generator is a tool used to generate synthetic text images.
With this tool, you can generate text in different fonts and colors for your text detection problem. You only need to put the cn.txt file in the dicts directory and the fonts in the cn directory, then run a command like the following:
python run.py -l cn -c 1000 -w 1 -t 6 -k 3 -rk -b 3 -bl 1 -rbl
To generate data that meets the requirements of your problem, you should study the documentation carefully.
LabelImg is also a data annotation tool, but it differs from PixelAnnotationTool in that LabelImg is used to mark the four surrounding corners of a box around each object. To install the tool, you can clone the GitHub repository or use pip.
- pip3 install pyqt5 lxml # Install qt and lxml by pip
- make qt5py3
- python3 labelImg.py
- python3 labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]
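By default, LabelImg saves each image's boxes as a Pascal VOC-style XML file. A trimmed sketch of that output (real files carry a few more fields, such as image size):

```xml
<annotation>
  <filename>street_001.jpg</filename>
  <object>
    <name>car</name>
    <bndbox>
      <xmin>450</xmin>
      <ymin>220</ymin>
      <xmax>570</xmax>
      <ymax>480</ymax>
    </bndbox>
  </object>
</annotation>
```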
Who can annotate data?
Data annotators are the people in charge of labeling the data. There are several ways to source them:
The data scientists and AI researchers on your team can label the data themselves. The advantages of this approach are easy management and a high accuracy rate. However, it is a waste of human resources, since data scientists will have to spend much time and effort on a manual, repetitive task.
You can hire a third party: a company that provides data annotation services. Although this option costs your team less time and effort, you need to ensure that the company commits to providing transparent and accurate data.
Online workforce resources
Alternatively, you can use online workforce resources like Amazon Mechanical Turk or Crowdflower. These platforms recruit online workers around the world to do data annotation. However, the accuracy and organization of the dataset are issues you need to consider when purchasing this service.
The data annotation guide described here is basic and straightforward. To build machine learning systems, besides the data scientists who set up the infrastructure and scale for complex machine learning tasks, you still need data annotators to label the input data. Lotus Quality Assurance provides professional data annotation services in different domains. With our quality review process, we are committed to delivering a high-quality and secure service. Contact us for further support!