I've run Instant-NGP (a NeRF technique) that uses MLPs to generate 3D scenes from 2D images alone. It was even able to synthesize plausible 3D views of areas that were absent from the 2D images. How amazing! This could automate the work of creating 3D content in Unreal to some extent.
Research Project - Worked on image+depth fusion techniques with transformer networks, as well as the traditional YOLOv7, for segmentation on an indoor dataset used in an autonomous-wheelchair scenario.
Problem - Automate a motorized wheelchair by building an understanding of the surrounding scene.
Solution - Using various segmentation methods, analyze the pros and cons of their outputs based on the model architectures, and improve performance on the collected dataset.
Problem - Shortage of labeled image data.
Solution - Synthetically generate image data with generative AI models.
A CNN model built with Keras in Python that detects whether a person is happy based on their smile. Train accuracy is 99% while test accuracy is 97%.
Training set = 600 pictures, array shape (600, 64, 64, 3)
Test set = 150 pictures, array shape (150, 64, 64, 3)
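As a rough illustration of how a convolution layer turns one of these (64, 64, 3) pictures into a "happy" probability, here is a minimal single-layer forward pass in NumPy. The filter count, kernel size, and random weights are illustrative assumptions, not the actual Keras model:

```python
import numpy as np

def conv2d_valid(image, filters):
    """Naive 'valid' 2D convolution.

    image:   (H, W, C) input picture
    filters: (F, k, k, C) filter bank
    returns: (H-k+1, W-k+1, F) feature map
    """
    F, k, _, C = filters.shape
    H, W, _ = image.shape
    out = np.zeros((H - k + 1, W - k + 1, F))
    for f in range(F):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[i, j, f] = np.sum(image[i:i + k, j:j + k, :] * filters[f])
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))                 # one 64x64 RGB picture, as in the dataset
filters = rng.standard_normal((8, 3, 3, 3)) * 0.01   # 8 illustrative 3x3 filters
feat = conv2d_valid(img, filters)             # feature map of shape (62, 62, 8)
w = rng.standard_normal(feat.size) * 0.001    # toy classifier weights
p_happy = sigmoid(feat.ravel() @ w)           # binary "happy" probability in (0, 1)
```

The real model would stack several such layers (plus pooling and batch norm) and learn the filters by backpropagation; this only shows the shape flow.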
What's the difference between face verification and recognition?
Face verification is a simpler task that involves matching the input image with the claimed person's image. It is a 1:1 matching problem!
Face recognition, however, is a bit more complex: it matches a given input image against a database of images to identify who the individual is. It is a 1:K matching problem!
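Both tasks reduce to distance comparisons between face encodings. A minimal NumPy sketch using the 128-dimensional encodings described here; the distance threshold and the toy two-person database are illustrative assumptions:

```python
import numpy as np

def verify(encoding, claimed_encoding, threshold=0.7):
    """1:1 matching: does the input encoding match the claimed identity?"""
    dist = np.linalg.norm(encoding - claimed_encoding)
    return dist < threshold

def recognize(encoding, database, threshold=0.7):
    """1:K matching: find the closest identity in the database, if any."""
    best_name, best_dist = None, np.inf
    for name, db_enc in database.items():
        dist = np.linalg.norm(encoding - db_enc)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None

# toy 128-d encodings standing in for a real face-embedding model
rng = np.random.default_rng(0)
alice = rng.standard_normal(128)
bob = rng.standard_normal(128)
database = {"alice": alice, "bob": bob}

query = alice + 0.01 * rng.standard_normal(128)  # a new photo of "alice"
same_person = verify(query, alice)               # expected True
identity = recognize(query, database)            # expected "alice"
```

In practice both `encoding` and the database entries come from the same pre-trained embedding network, so the threshold is tuned on held-out pairs.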
Both face recognition and verification are applied here with a triplet loss function and a pre-trained model that maps face images into 128-dimensional encodings. NumPy, Keras, TensorFlow, pandas, and cv2 are used to build this model.
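The triplet loss can be sketched as follows; a minimal NumPy version over single encodings, with an illustrative margin alpha (the actual training uses batches in Keras/TensorFlow):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.2):
    """L = max(||a - p||^2 - ||a - n||^2 + alpha, 0)

    Pulls the anchor toward the positive (same person) and pushes it
    away from the negative (different person) by at least the margin.
    """
    pos_dist = np.sum((anchor - positive) ** 2)
    neg_dist = np.sum((anchor - negative) ** 2)
    return max(pos_dist - neg_dist + alpha, 0.0)

a = np.array([0.1, 0.2, 0.3])
p = np.array([0.1, 0.2, 0.35])   # same identity: close to the anchor
n = np.array([0.9, 0.8, 0.1])    # different identity: far from the anchor
loss = triplet_loss(a, p, n)     # zero here, since the negative is already far
```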
I absolutely LOVE this project as it generates ART with neural networks.
The Neural Style Transfer algorithm is used to merge one image's content with another image's style to output unique styled art. A pre-trained model is used for this task.
The content cost is computed with TensorFlow. The style cost is computed using the Style Matrix (Gram Matrix) and style weights. The total cost is a weighted sum of these two costs, and the Adam optimizer is used to minimize it.
Libraries used: matplotlib, numpy, pprint, scipy, tensorflow, PIL, etc.
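The Gram-matrix step can be sketched in NumPy. The layer shape and the normalization constant follow the usual NST formulation; the toy activations are illustrative, standing in for real VGG feature maps:

```python
import numpy as np

def gram_matrix(activations):
    """Style (Gram) matrix: correlations between feature channels.

    activations: (H, W, C) feature map from one layer
    returns:     (C, C) Gram matrix
    """
    H, W, C = activations.shape
    A = activations.reshape(H * W, C)   # unroll the spatial dimensions
    return A.T @ A

def style_layer_cost(style_act, generated_act):
    """Squared difference of Gram matrices, with the usual NST normalization."""
    H, W, C = style_act.shape
    GS = gram_matrix(style_act)
    GG = gram_matrix(generated_act)
    norm = 1.0 / (4 * (H * W) ** 2 * C ** 2)
    return norm * np.sum((GS - GG) ** 2)

rng = np.random.default_rng(0)
style = rng.random((4, 4, 3))            # toy style-image activations
generated = rng.random((4, 4, 3))        # toy generated-image activations
cost = style_layer_cost(style, generated)
```

The full style cost sums this per-layer cost over several layers, each scaled by its style weight.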
Here, we mainly perform object detection on a car-detection dataset and deal with bounding boxes.
Libraries used: keras, numpy, scipy, matplotlib, tensorflow, pandas, PIL
"You Only Look Once" (YOLO) performs object detection and can be applied to car detection. As the YOLO model is very computationally expensive to train, we have loaded pre-trained weights. It is a popular algorithm because it achieves high accuracy while also being able to run in real time. The algorithm "only looks once" at the image in the sense that it requires only one forward-propagation pass through the network to make predictions. After non-max suppression, it outputs the recognized objects together with their bounding boxes.
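Non-max suppression can be sketched as follows; a minimal NumPy version over [x1, y1, x2, y2] corner boxes, with an illustrative IoU threshold (the actual project uses the TensorFlow/Keras implementation):

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring box, drop boxes that overlap it, repeat."""
    order = np.argsort(scores)[::-1]     # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        order = np.array([i for i in rest
                          if iou(boxes[best], boxes[i]) <= iou_threshold],
                         dtype=int)
    return keep

boxes = np.array([[0, 0, 10, 10],     # two overlapping detections of one car
                  [1, 1, 11, 11],
                  [50, 50, 60, 60]])  # a separate car
scores = np.array([0.9, 0.8, 0.7])
kept = non_max_suppression(boxes, scores)   # indices of the surviving boxes
```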
The YOLO architecture is: IMAGE (m, 608, 608, 3) -> DEEP CNN -> ENCODING (m, 19, 19, 5, 85). The encoding is a 19x19 grid where each cell has 5 anchor boxes, and each box carries 85 numbers: (p_c, b_x, b_y, b_h, b_w) plus 80 class probabilities.
The input is a batch of m images, each of shape (608, 608, 3).
The output is a list of bounding boxes along with the recognized classes.
Anchor boxes are chosen by exploring the training data for reasonable height/width ratios that represent the different object classes.
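One common way to do this exploration (used by later YOLO versions; a hedged sketch, not necessarily what this project did) is to cluster the ground-truth box sizes. Here is a plain Euclidean k-means over (width, height) pairs; YOLOv2 actually uses an IoU-based distance, but the idea is the same:

```python
import numpy as np

def kmeans_anchors(box_sizes, k=2, iters=20):
    """Cluster (w, h) pairs of ground-truth boxes into k anchor shapes."""
    # spread the initial centers across the width range for a stable start
    order = np.argsort(box_sizes[:, 0])
    idx = order[np.linspace(0, len(box_sizes) - 1, k).astype(int)]
    centers = box_sizes[idx].astype(float)
    for _ in range(iters):
        # assign each box to its nearest anchor shape
        d = np.linalg.norm(box_sizes[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each anchor to the mean of its assigned boxes
        for j in range(k):
            if np.any(labels == j):
                centers[j] = box_sizes[labels == j].mean(axis=0)
    return centers

# toy (width, height) boxes: wide car-like boxes and tall pedestrian-like boxes
boxes = np.array([[4.0, 2.0], [4.2, 1.8], [3.8, 2.1],
                  [1.0, 3.0], [1.1, 3.2], [0.9, 2.9]])
anchors = kmeans_anchors(boxes, k=2)   # one wide anchor, one tall anchor
```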