As we learned in my previous blog post, object detection in images is a great practical example of supervised learning. Object detection is itself an example of computer vision – the broader field of having AI systems learn patterns and features in images and video to make predictions.
For example, given the following image, which ‘object’ categories are present?
In this case, a computer vision model has identified and located 5 separate objects: 1 person, 1 train, 1 railway, and 2 mountains. This model gained the ability to detect these categories by learning from data with the following components:
- A sufficiently large number of relevant images containing the categories in question (we will ignore bias for the moment). This means images of mountains, people, trains, combinations of these, etc. These are called the image files.
- A corresponding file for each image encapsulating the specific categories present, and their spatial locations within the image. These are called the label (or annotation) files.
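To make the label files concrete, here is a minimal sketch of how one might be parsed. It assumes a YOLO-style text format, where each line holds a category index followed by a normalized bounding box; your tooling may instead use COCO JSON or Pascal VOC XML, and the category names below are purely illustrative:

```python
# Hypothetical example: parsing a YOLO-style label file, where each line is
# "<class_id> <x_center> <y_center> <width> <height>", coordinates normalized
# to [0, 1]. The category list is an assumption for illustration.
CLASS_NAMES = ["person", "train", "railway", "mountain"]

def parse_label_line(line: str) -> dict:
    """Turn one label-file line into a dictionary describing a bounding box."""
    parts = line.split()
    class_id = int(parts[0])
    x, y, w, h = (float(p) for p in parts[1:5])
    return {
        "category": CLASS_NAMES[class_id],
        "x_center": x,
        "y_center": y,
        "width": w,
        "height": h,
    }

# One label file for the example image: 1 person, 1 train, 1 railway, 2 mountains.
label_file_contents = """\
0 0.48 0.55 0.10 0.30
1 0.52 0.70 0.60 0.25
2 0.50 0.85 0.90 0.10
3 0.20 0.30 0.35 0.40
3 0.80 0.28 0.30 0.35
"""

boxes = [parse_label_line(line) for line in label_file_contents.splitlines()]
print(len(boxes))            # 5 objects
print(boxes[0]["category"])  # person
```

The key point is that every image file has a matching label file, and together they tell the model both *what* is present and *where*.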
In this example, the computer vision model not only identified the specific category instances, but also located the spatial coordinates of each one.
The obvious question is, “How do we get this data?”
Today, most use cases involving custom categories require two teams: a team to collect the image files (following specific instructions), and a labeling team to produce the label files. The process of labeling images – drawing boxes around specific objects and assigning them a category – is currently labor-intensive. For the most part, it requires humans to manually go through each image, draw each bounding box, and assign each category.
There are many things that could go wrong during this process, and it’s important to keep in mind some common best practices and common mistakes. Why? Because the performance of machine learning models depends largely on the quality of the data they learn from. If the data contains mistakes, the models will perform accordingly.
Here are the top 3 mistakes when managing image labeling teams for your projects:
1) Vague Instructions
An image labeling team or workforce needs as much context and information as possible. We simply cannot assume they will know what should be done, no matter how simple the task may appear to be.
There are many questions that arise naturally, such as:
- How much of the object should we include within the bounding box?
- If an object is partially cut off, either by another object or by the edges of the image, should we draw a box around it? What should this cutoff threshold be, if any?
- Under what conditions should we skip an image altogether and not label it at all? If it’s too dark, too bright, or too blurry? If it’s a random image that slipped through that should not be part of the image set?
It is crucial to provide a clear set of detailed instructions, along with screenshots of correctly labeled images. Again, we cannot assume the job will get done to our satisfaction without them.
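Some of these rules can also be enforced mechanically, catching violations before a human reviewer ever sees them. Here is a minimal sketch, assuming the same YOLO-style normalized label format and a fixed category taxonomy (both assumptions, not part of any specific tool):

```python
# Sketch of an automated sanity check for label files.
# Assumed format per line: "<class_id> <x_center> <y_center> <w> <h>",
# with coordinates normalized to [0, 1].
NUM_CLASSES = 4  # size of your category taxonomy (illustrative assumption)

def validate_label_file(text: str) -> list[str]:
    """Return a list of human-readable problems found in one label file."""
    problems = []
    for i, line in enumerate(text.splitlines(), start=1):
        parts = line.split()
        if len(parts) != 5:
            problems.append(f"line {i}: expected 5 fields, got {len(parts)}")
            continue
        class_id = int(parts[0])
        coords = [float(p) for p in parts[1:]]
        if not 0 <= class_id < NUM_CLASSES:
            problems.append(f"line {i}: unknown class id {class_id}")
        if any(not 0.0 <= c <= 1.0 for c in coords):
            problems.append(f"line {i}: coordinate outside [0, 1]")
    return problems

# Second line uses a class id outside the taxonomy; third has a bad coordinate.
print(validate_label_file("0 0.5 0.5 0.1 0.2\n9 0.5 0.5 0.1 0.2\n1 1.5 0.5 0.1 0.2"))
```

Checks like these don’t replace clear written instructions – they can only catch structural mistakes, not a box drawn around the wrong object – but they cost almost nothing to run on every submitted label file.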
2) Not Providing Enough Examples
Edge cases are very common. A particular image or set of images may not be as straightforward to label as others. Often, the labeler will not be sure what to do and will make an educated guess. Sometimes this guess is right, and other times it’s not.
In addition to clear and detailed written instructions, it is necessary to provide as many screenshots of labeled images as we can. Aim to provide a handful of correctly labeled image screenshots, and a variety of correctly labeled edge cases. For these ‘special situation’ examples, provide additional written instructions and context.
This may seem like a lot of work just to support seemingly simple menial tasks, but these steps are absolutely critical if you want to avoid wasting time going back and forth between teams, fixing systematic errors, etc. All of this delays the delivery of your dataset to AI engineers, which in turn delays the training, testing, and deployment of machine learning models.
3) Not Assessing Performance Periodically
There have been many times when labeling mistakes went on for too long before being caught. This leads to a great deal of re-work and error fixing, which is both inefficient and expensive.
Add a review step to your image labeling pipeline. For example, every 100-200 labeled images, have a separate ‘QA’ team review them. This adds complexity and requires training the QA team as well, but in practice it catches mistakes early and provides feedback. This feedback loop is essential to having a smooth process with clearly defined expectations and outcomes.
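A simple way to implement this review step is to sample a fraction of each completed batch for QA rather than re-checking everything. The batch size and sample rate below are illustrative assumptions, not recommendations:

```python
import random

# Sketch of a periodic QA step: after every batch of labeled images,
# send a random sample to a second review team.
BATCH_SIZE = 150    # trigger a QA pass after every 150 labeled images (assumption)
SAMPLE_RATE = 0.2   # QA team re-checks 20% of each batch (assumption)

def pick_qa_sample(labeled_images: list[str], seed: int = 0) -> list[str]:
    """Randomly select images from a finished batch for QA review."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = max(1, int(len(labeled_images) * SAMPLE_RATE))
    return rng.sample(labeled_images, k)

batch = [f"img_{i:04d}.jpg" for i in range(BATCH_SIZE)]
qa_sample = pick_qa_sample(batch)
print(len(qa_sample))  # 30 images go to the QA team
```

Random sampling keeps the QA workload predictable; if the sampled error rate is high, you can always escalate to reviewing the full batch.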
From personal experience, and from working with enterprise clients, mistakes are very common even for ‘simple’ labeling tasks. They are usually more apparent at the beginning of a project or use case, and quality improves as you iterate on your strategy. It also depends on who the “labelers” are. If they are data scientists or ML engineers (try to avoid this if you can), the performance is noticeably higher because we know what we want, we know how computer vision models learn, and we understand the immense value of pristine data.
However, if most of the labeling is delegated to non-technical members of your company, or outsourced to third-party vendors, it is paramount that you pay close attention to detail and document everything as clearly and thoroughly as possible. It’s also extremely helpful to have data scientists or ML engineers sit down with labeling team members and go through several examples together. Constant communication is key to maximizing performance and minimizing time wasted fixing labels.
If you need help to accelerate your company’s machine learning efforts, or if you need help getting started with enterprise AI adoption, send me a LinkedIn message or email me at email@example.com and I will be happy to help you.
Subscribe to this blog to get the latest tactics and strategies to thrive in this new era of AI and machine learning.
Subscribe to my YouTube channel for business AI video tutorials and technical hands-on tutorials.
Client case studies and testimonials: https://gradientgroup.ai/enterprise-case-studies/
Follow me on LinkedIn for more content: linkedin.com/in/CarlosLaraAI