20 Computer Vision Projects and Applications to learn with Deep Learning in 2022
What is Computer Vision?
Humans see through their eyes and process what they see. This kind of visual perception, when carried over to AI, is commonly known as Computer Vision or Machine Vision. In computer vision we try to teach computers how to see. Helping blind people navigate, reconstructing images, face recognition, image restoration, and scene understanding are some popular computer vision applications. The goal of Computer Vision is to extract meaningful information from images and derive an abstract representation of their content. There is an old saying, "An image is worth more than ten thousand words", and for that reason Computer Vision has received enormous attention from several scientific communities in recent decades. The history of computer vision is quite interesting: see the Summer Vision Project.
20 Computer Vision Applications
Thanks to the work of thousands of people in the scientific and mathematical communities, we have seen many exciting applications of Computer Vision. The field has been greatly advanced by deep learning. Popular contemporary applications such as classification, detection, and segmentation are now nearly inseparable from deep learning. Some of the applications of Computer Vision where deep learning is used are:
- Image Classification
- Object Detection
- Semantic Segmentation
- Human Pose Estimation
- Face Detection
- Face Recognition
- Neural Style Transfer
- Face Transfer
- Image Captioning
- Visual Question Answering
- Image Colorization
- Image Compression
- Image Enhancement
- Optical Character Recognition
- Image Inpainting
- Facial Expression Analysis
- Object Tracking
- Automatic Sign Language Recognition
- Robot Navigation
- Automatic drone inspections
1. Image Classification
In an image classification task, we assign a given image to a specific category or label. We assume that the image contains only one object or target, and we focus on identifying the category of that target.
Input: an image with a single object.
Output: a class label (e.g. cat, dog, etc.)
Example output: class probability (e.g. 84% cat).
Classification in Computer Vision
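To make the "class probability" output concrete, here is a minimal sketch of how raw scores from a classifier can be turned into a label and a probability with a softmax. The scores, the label list, and the function names `softmax` and `classify` are illustrative assumptions, not taken from any particular model.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, labels):
    """Return the most likely label and its probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]

# Hypothetical raw scores that a network might produce for one image.
labels = ["cat", "dog", "bird"]
label, prob = classify([2.0, 0.3, -1.0], labels)
print(f"{label}: {prob:.0%}")
```

In a real system the scores would come from a trained network; only the last step (scores to probabilities to label) is shown here.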
Papers to read
ImageNet Classification with Deep Convolutional Neural Networks
Gradient-based Learning Applied to Document Recognition
2. Object Detection
Object detection is the task of detecting instances of objects of certain classes within an image. In image classification there is only one output (the class of the image), and we focus only on finding that class. In many cases, however, an image contains multiple objects of different categories. Object detection classifies each object and finds the position of each object in the given image.
Object Detection: (Object Classification + Object Localization) for each object in the given image.
Object Classification: assigning a class label to the object.
Object Localization: finding the position of the object, usually by drawing a bounding box around it.
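A detection can be represented as a class label plus a bounding box. The sketch below also computes intersection over union (IoU), a common way to measure how well a predicted box matches a reference box. IoU is not discussed in this article; it is added here only as an illustration, and the `Detection` class and the box coordinates are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str    # object classification result
    score: float  # confidence of the classifier
    box: tuple    # localization: (x_min, y_min, x_max, y_max) in pixels

def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical prediction and reference box for the same object.
predicted = Detection("dog", 0.91, (10, 10, 50, 50))
reference_box = (30, 30, 70, 70)
print(predicted.label, round(iou(predicted.box, reference_box), 3))
```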
Papers to read
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
3. Semantic Segmentation
Semantic segmentation, or image segmentation, is the task of clustering together the parts of an image that belong to the same object class. It is a form of pixel-level prediction because each pixel in an image is classified according to a category. Semantic segmentation is one of the active research fields in Computer Vision.
Input: images
Output: regions, structures (line segments, curve segments, circles, etc.)
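Pixel-level prediction can be sketched as follows: assuming a model has already produced one score per class for every pixel, the label map is obtained by picking the highest-scoring class at each pixel. The tiny 2x2 score grid and the class names are made up for illustration.

```python
CLASSES = ["background", "cat"]  # hypothetical class list

def segment(scores):
    """scores[row][col] holds one score per class; return a map of class indices."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]

# Hypothetical per-pixel class scores for a 2x2 image.
scores = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.1, 0.9]],
]
label_map = segment(scores)
for row in label_map:
    print([CLASSES[i] for i in row])
```

Pixels that share a class index form the regions mentioned above.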
4. Human Pose Estimation
The goal of Human Pose Estimation is to detect the position and orientation of a person's body in a given image or video frame. The given image shows one outcome of Human Pose Estimation: the positions of the human joints have been determined from the input image. Besides a single image, we can use image sequences, depth images, or skeleton data as input for a Human Pose Estimation task.
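One common way to obtain joint positions, which this article does not describe and which is assumed here purely for illustration, is to have a model output one heatmap per joint and take the location of the highest value in each. The joint names and heatmap values below are invented.

```python
def locate_joint(heatmap):
    """Return the (row, col) of the highest value in a 2D heatmap."""
    cells = ((r, c) for r, row in enumerate(heatmap) for c in range(len(row)))
    return max(cells, key=lambda rc: heatmap[rc[0]][rc[1]])

def estimate_pose(heatmaps):
    """Map each joint name to its most likely (row, col) position."""
    return {joint: locate_joint(hm) for joint, hm in heatmaps.items()}

# Hypothetical 3x3 heatmaps for two joints.
heatmaps = {
    "left_elbow": [[0.1, 0.2, 0.1], [0.1, 0.9, 0.2], [0.0, 0.1, 0.1]],
    "left_wrist": [[0.0, 0.1, 0.0], [0.1, 0.2, 0.1], [0.1, 0.3, 0.8]],
}
print(estimate_pose(heatmaps))
```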
5. Face Detection
Face detection is the task of identifying the bounding box that encloses each human face in a given photo or video frame. It is a tremendously important area of Computer Vision because it serves as a preliminary step for face recognition, sentiment analysis, video surveillance, and many other applications. A face detection system takes an arbitrary image or video frame as input and determines whether any faces are present; if so, it returns the image location and extent of each face. In the image below, green bounding boxes enclose the human faces. After identifying faces we may even judge their handsomeness. For instance, Nirdesh Shrestha is the most handsome man of Nepal, and we used face detection techniques to find the face of Nirdesh in many photos.
6. Face Recognition
We human beings perform face recognition routinely and effortlessly in our daily lives. The very first step of face recognition is face detection. After getting the bounding box of the face, we compare the given face against a database of pre-existing faces.
In other words, it begins with detection (distinguishing human faces from other objects in the image) and then works on identifying the detected faces.
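The comparison against a database can be sketched as a nearest-match search. The sketch assumes each face has already been converted into a numeric feature vector (an "embedding") by some model; the vectors, names, and the 0.8 threshold are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def identify(query, database, threshold=0.8):
    """Return the best-matching name, or None if nothing is similar enough."""
    best_name, best_score = None, threshold
    for name, embedding in database.items():
        score = cosine_similarity(query, embedding)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical database of known faces and two query faces.
database = {"alice": [0.9, 0.1, 0.2], "bob": [0.1, 0.8, 0.5]}
print(identify([0.85, 0.15, 0.25], database))  # close to alice's vector
print(identify([0.0, 0.0, 1.0], database))     # not close to anyone
```

The threshold lets the system answer "unknown" instead of forcing a match to the nearest person in the database.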
7. Neural Style Transfer
Neural style transfer is the task of changing the style of an image in one domain to the style of an image in another domain. It involves three images: a style image, a content image, and a generated image. We take the style of the style image and apply it to the content of the content image, producing a generated image that has the content of the content image but the style of the style image.
Content: Objects and their arrangement
Style: Style, Colors, Textures
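In "A Neural Algorithm of Artistic Style", style is captured by the Gram matrix of feature maps, which records how strongly feature channels co-occur while ignoring where they occur. The sketch below is a simplified illustration with normalization constants omitted; the tiny feature values stand in for activations that would normally come from a convolutional network.

```python
def gram_matrix(features):
    """features[i] is channel i flattened over all positions; return channel correlations."""
    n = len(features)
    return [
        [sum(a * b for a, b in zip(features[i], features[j])) for j in range(n)]
        for i in range(n)
    ]

def style_loss(generated, style):
    """Sum of squared differences between Gram matrices (normalization omitted)."""
    g, s = gram_matrix(generated), gram_matrix(style)
    return sum((g[i][j] - s[i][j]) ** 2 for i in range(len(g)) for j in range(len(g)))

# Two channels, two spatial positions each (hypothetical activations).
style_features = [[1, 2], [3, 4]]
shuffled = [[2, 1], [4, 3]]  # same activations, positions swapped

print(gram_matrix(style_features))
print(style_loss(shuffled, style_features))
```

Swapping the spatial positions leaves the Gram matrix unchanged, which is why it describes colors and textures (style) rather than the arrangement of objects (content).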
Papers to read
A Neural Algorithm of Artistic Style
8. Face Transfer
In face transfer we map the facial performance of a source onto the facial animation of a target. Here both source and target are human individuals. Face transfer uses the facial expressions and head poses from a video of the source actor to produce a video of the target character.
9. Image Captioning
Image captioning is the task of providing a natural language description of the content within an
image. It lies at the intersection of computer vision and natural language processing.
10. Visual Question Answering
Visual Question Answering is an active research area concerned with answering questions about a given input image. The field combines Natural Language Processing and Computer Vision. The questions are asked in natural language and refer to the input image. For example, in the image below the question asked is "What is the mustache made of?". Clearly, to answer such a question we have to understand the image first. Visual Question Answering combines the challenges of visual and linguistic processing in order to answer basic "common sense" questions about given images.
11. Image Colorization
Old photographs were mostly taken in monochrome. Image colorization is the field of Computer Vision in which we add plausible colors to monochrome images and videos. Image colorization is a highly underdetermined problem: it requires mapping a real-valued luminance image to a three-dimensional color-valued one, and this mapping has no unique solution. That is, there is no single correct outcome in image colorization.
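A small sketch shows why the problem has no unique solution: very different colors can collapse to the same gray value. The luminance weights used here are the widely used Rec. 601 coefficients, which are an assumption of this example rather than something specified in the article.

```python
def luminance(rgb):
    """Approximate gray value of an (R, G, B) pixel using Rec. 601 weights."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)

# Three visibly different colors chosen for illustration.
pure_red = (255, 0, 0)
dark_gray = (76, 76, 76)
mid_green = (0, 130, 0)

for color in (pure_red, dark_gray, mid_green):
    print(color, "->", luminance(color))
```

All three colors map to the same gray level, so a colorization model looking only at the monochrome pixel cannot know which one was originally there and has to pick a plausible answer from context.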
12. Image Compression
Image compression is a type of data compression applied to digital images to reduce their size for efficient storage and transmission. The compression can be either lossless or lossy. In lossless compression, no information is lost when the image is converted from its original form to its compressed form. In lossy compression, some information is lost.
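The difference between lossless and lossy compression can be sketched with two classical techniques: run-length encoding (lossless) and coarse quantization (lossy). These are simple stand-ins chosen for illustration and are not the deep learning methods from the paper listed below; the pixel row and the step size are arbitrary.

```python
def rle_encode(pixels):
    """Lossless: store each run of equal values as (value, run_length)."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return [tuple(run) for run in runs]

def rle_decode(runs):
    """Rebuild the exact original pixel sequence from its runs."""
    pixels = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels

def quantize(pixels, step=32):
    """Lossy: snap each value down to a multiple of step, discarding detail."""
    return [(p // step) * step for p in pixels]

row = [255, 255, 255, 255, 0, 0, 17, 17]  # one row of a grayscale image
encoded = rle_encode(row)
print(encoded, rle_decode(encoded) == row)       # round trip is exact
print(quantize(row), rle_encode(quantize(row)))  # fewer runs, original values lost
```

Quantizing first makes the row compress into fewer runs, but the original values 255 and 17 can no longer be recovered.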
Papers to read
Variable Rate Deep Image Compression With a Conditional Autoencoder
13. Image Enhancement
Image Enhancement is among the most appealing areas of Computer Vision. The input is a degraded image, such as a low-contrast or blurred image, and the output is a higher-quality image. Image enhancement brings out details that are obscured or hidden, or simply highlights certain features of interest in an image. Its primary goal is to modify attributes of an image, such as brightness and gray level, to make it more suitable for a given task and a specific observer. The filters we use in Instagram and Snapchat are applications of image enhancement. A familiar example is increasing or decreasing the contrast or brightness of an image to make it look better.
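Contrast and brightness adjustment can be sketched on a tiny grayscale image stored as a list of rows. This is a classical, non-learned illustration; the function names and pixel values are made up, and an 8-bit range of 0 to 255 is assumed.

```python
def stretch_contrast(image, lo=0, hi=255):
    """Linearly rescale pixel values so they span the full lo..hi range."""
    flat = [p for row in image for p in row]
    p_min, p_max = min(flat), max(flat)
    if p_max == p_min:  # flat image: nothing to stretch
        return [row[:] for row in image]
    return [
        [round(lo + (p - p_min) * (hi - lo) / (p_max - p_min)) for p in row]
        for row in image
    ]

def adjust_brightness(image, offset):
    """Add a constant to every pixel, clamping to the 0..255 range."""
    return [[min(255, max(0, p + offset)) for p in row] for row in image]

# Hypothetical low-contrast image: all values sit between 100 and 150.
image = [[100, 110], [120, 150]]
print(stretch_contrast(image))
print(adjust_brightness(image, 120))
```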
14. Optical Character Recognition (OCR)
Many texts and documents we see in daily life are not in a machine-readable format; they are either handwritten or printed. The goal of Optical Character Recognition is to take printed or handwritten text and convert it into digital form so that it can be read and manipulated by computers. Once text is converted from handwritten or printed form to digital form, it requires less memory, can be displayed on the web, and can be transmitted easily from one place to another.
Popular OCR Tools
15. Image Inpainting
Image inpainting is the task of reconstructing missing regions in an image. It is an important problem in computer vision and an essential functionality in many imaging and graphics applications, e.g. object removal, image restoration, manipulation, re-targeting, compositing, and image-based rendering. The practice of modifying an image in an undetectable way is as ancient as art itself. The goals and applications of inpainting are numerous, from the restoration of damaged paintings and photographs to the removal of selected objects.
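As a rough illustration of reconstructing a missing region, the sketch below fills each missing pixel with the average of its known neighbors and repeats until the hole is closed. This naive scheme is an assumption made for the example and is far simpler than the learned methods used in practice; missing pixels are marked with `None`.

```python
def inpaint(image, max_passes=100):
    """Fill None pixels with the mean of their known 4-neighbors, pass by pass."""
    img = [row[:] for row in image]
    height, width = len(img), len(img[0])
    for _ in range(max_passes):
        updates = {}
        for r in range(height):
            for c in range(width):
                if img[r][c] is not None:
                    continue
                neighbors = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                known = [
                    img[nr][nc]
                    for nr, nc in neighbors
                    if 0 <= nr < height and 0 <= nc < width and img[nr][nc] is not None
                ]
                if known:
                    updates[(r, c)] = sum(known) / len(known)
        if not updates:  # nothing left to fill (or nothing fillable)
            break
        for (r, c), value in updates.items():
            img[r][c] = value
    return img

# Hypothetical 3x3 grayscale patch with a missing center pixel.
damaged = [[0, 10, 0], [20, None, 30], [0, 40, 0]]
print(inpaint(damaged))
```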
16. Facial Expression Analysis
Facial expression analysis and emotion recognition are two separate applications of Computer Vision. Emotion recognition requires higher-level knowledge, while facial expression analysis can be done with lower-level knowledge of the human face. Humans show facial expressions in response to their internal emotional state, intentions, feelings, and social communication. The main goal of facial expression analysis is to develop a computer system that automatically analyzes and recognizes facial movements and facial feature changes from visual information.
17. Object Tracking
Object tracking is a field of computer vision in which we try to estimate the trajectory of an object across a sequence of frames. The idea behind object tracking is simple: first identify the initial coordinates of each object we want to track, then assign a unique ID to each object, and finally track each object as it moves around the scene.
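The three steps above can be sketched with a very small centroid tracker: each detected object center is matched to the nearest object from the previous frame, and anything too far away receives a new ID. The class name `CentroidTracker`, the greedy matching, the distance limit, and the coordinates are all assumptions made for illustration; objects that disappear are simply dropped.

```python
import math

class CentroidTracker:
    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # farthest an object may move per frame
        self.next_id = 0
        self.objects = {}                 # object ID -> last known (x, y)

    def update(self, centroids):
        """Match this frame's centroids to known objects and return ID -> (x, y)."""
        unmatched = dict(self.objects)
        assigned = {}
        for point in centroids:
            # Greedily pick the nearest object not yet matched in this frame.
            best_id = min(
                unmatched,
                key=lambda i: math.dist(unmatched[i], point),
                default=None,
            )
            if best_id is not None and math.dist(unmatched[best_id], point) <= self.max_distance:
                del unmatched[best_id]
            else:
                best_id = self.next_id    # unseen object: assign a fresh ID
                self.next_id += 1
            assigned[best_id] = point
        self.objects = assigned
        return assigned

# Hypothetical object centers detected in three consecutive frames.
tracker = CentroidTracker()
print(tracker.update([(10, 10), (100, 100)]))
print(tracker.update([(104, 98), (14, 12)]))
print(tracker.update([(18, 15), (300, 300)]))
```

Collecting the positions reported for one ID over successive frames gives that object's trajectory.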
18. Automatic Sign Language Recognition
Automatic Sign Language Recognition, or ASLR, is an active field of research located between gesture recognition and linguistics. The main aim of automatic sign language recognition is to recognize the meaning of movements and gestures and translate them into meaningful human spoken language. In other words, the ultimate goal is to build a translation system that takes a sign language video as input and outputs its meaning in audio or text format.