Welcome to the Applied Singularity blog. Use this Contents post to browse through the full list of articles and Guided Learning Modules we have created or find specific topics of interest.
We all know that a photo taken with a normal smartphone is two-dimensional (2D), but can we convert it into a three-dimensional (3D) photo? The answer is 'YES'. The image below is a standard color photo taken with a smartphone and hence, as we discussed, contains only a 2D representation of the world. When we look at it, our brain is able to reconstruct the 3D information from it. Now, an AI can do the same, and even go all the way and create a 3D version of this photo.
In today's world, anybody can make a Deepfake by recording a voice sample. So let's understand what this new method can do through an example. Watch the short clip of a speech below, and pay attention to the fact that the louder voice is the English translator. Listen closely and you can hear the chancellor’s original voice in the background too. So what is the problem here? Honestly, there is no problem; this is just the way the speech was recorded.
In March 2020, a paper named Neural Radiance Fields (NeRF) appeared. With this technique, we could take a bunch of input photos, train a neural network to learn them, and then synthesize new, previously unseen views of not just the materials in the scene but the entire scene itself. In other words, we can learn and reproduce entire real-world scenes from only a few views by using neural networks. However, the method had some limitations, such as trouble with scenes under variable lighting conditions and with occlusions.
The research field of image translation with the aid of learning algorithms is advancing at a fast pace. For example, one earlier technique would look at a large number of animal faces and could interpolate between them, or in other words, blend one breed of dog into another. It could even transform dogs into cats, or further transform cats into cheetahs. The results were very good, but it would only work on the domains it was trained on, i.e., it could only translate to and from species it had taken the time to learn about.
The performance of deep learning models has improved significantly on several computer vision tasks, yet supervised learning models still rely on a large number of labeled images. We know how expensive it is to get high-quality annotations, and this motivates research in other directions, including Active Learning.
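Since this excerpt introduces Active Learning, here is a minimal pure-Python sketch of uncertainty sampling, one of the most common Active Learning query strategies. The model probabilities below are hypothetical outputs, made up purely for illustration.

```python
# Uncertainty sampling: from an unlabeled pool, pick the samples the
# current model is least confident about and send them for annotation.
def least_confident(probs, k):
    """probs: one list of per-class probabilities per unlabeled sample.
    Returns indices of the k samples with the lowest top-class probability."""
    confidence = [max(p) for p in probs]
    return sorted(range(len(probs)), key=lambda i: confidence[i])[:k]

# Hypothetical model outputs for 4 unlabeled images (3 classes each).
pool_probs = [
    [0.98, 0.01, 0.01],  # very confident -> not worth labeling
    [0.40, 0.35, 0.25],  # uncertain -> a good annotation candidate
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],  # most uncertain
]
print(least_confident(pool_probs, 2))  # -> [3, 1]
```

Labeling only the samples this selects, retraining, and repeating is the basic Active Learning loop that makes annotation budgets go further.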
In the last blog post about indoor air quality monitoring, we discussed the importance of monitoring indoor environments for air pollutants and detailed the most commonly occurring indoor air pollutants. This post sheds light on how we can use IoT to monitor indoor air quality in real time. The global indoor air quality monitoring market is expected to grow from USD 2.5 billion in 2015 to USD 4.6 billion by 2022. Many solutions for this problem are available in the market today, from vendors ranging from Honeywell to lesser-known startups.
Indoor air pollution is the lesser-known and less talked-about problem of our lives today. While we are adequately aware of the problem of air pollution outdoors, what escapes our attention is the harmful air we are exposed to in our indoor environments.
In the previous posts of the 'Clustering in ML Series', we learned about the Connectivity-Based Clustering methods in ML. In this part, we will discuss two more clustering methods: Distribution-Based Clustering and Fuzzy Clustering.
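As a taste of the fuzzy side of this part, here is a minimal sketch of the fuzzy c-means membership formula for a single point and fixed centroids. The 1-D point and centroids are made up for illustration, and a full algorithm would also re-estimate the centroids from the memberships.

```python
# Fuzzy clustering assigns each point a degree of membership in every
# cluster rather than one hard label. Below: the standard fuzzy c-means
# membership formula, with the fuzzifier m = 2 and fixed 1-D centroids.
def memberships(point, centroids, m=2.0):
    d = [abs(point - c) for c in centroids]
    if 0.0 in d:  # point sits exactly on a centroid
        return [1.0 if di == 0.0 else 0.0 for di in d]
    u = []
    for i in range(len(centroids)):
        s = sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(len(centroids)))
        u.append(1.0 / s)  # memberships across clusters sum to 1
    return u

u = memberships(2.0, centroids=[0.0, 10.0])
print([round(x, 3) for x in u])  # -> [0.941, 0.059]
```

The point is much closer to the first centroid, so it gets a high membership there, but it still keeps a small, non-zero membership in the second cluster, which is exactly what distinguishes fuzzy from hard clustering.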
In the previous posts of the 'Clustering in ML Series', we learned about the Density-Based Clustering methods in ML. In this part, we will discuss another clustering method, Connectivity-Based Clustering, also known as Hierarchical Clustering.
In the previous post, we learned about k-means, which is easy to understand and implement in practice. That algorithm has no notion of outliers, so all points are assigned to a cluster even if they do not belong in any. In the domain of anomaly detection, this causes problems, as anomalous points will be assigned to the same cluster as “normal” data points.
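The outlier problem described here is easy to demonstrate: the k-means assignment step maps every point to its nearest centroid, no matter how far away it is. The 1-D points and centroids below are made up for illustration.

```python
# k-means has no notion of "outlier": every point gets assigned to its
# nearest centroid, even a point that belongs to no cluster at all.
def assign(points, centroids):
    return [min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
            for p in points]

centroids = [0.0, 10.0]
points = [0.1, 0.2, 9.8, 10.1, 500.0]  # 500.0 is an obvious anomaly
print(assign(points, centroids))  # -> [0, 0, 1, 1, 1]
```

The anomalous point 500.0 is quietly absorbed into cluster 1, which is why density-based methods that can label points as noise are often preferred for anomaly detection.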
In the previous post of the 'Clustering in ML Series', we looked at what clustering is and some of the different clustering methods in ML. In this part, we will discuss the very first type of clustering method, known as Centroid-based clustering.
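The flagship centroid-based algorithm, k-means (Lloyd's algorithm), fits in a few lines of Python for the 1-D case. The toy data and starting centroids below are made up for illustration.

```python
# Lloyd's k-means: alternate between (1) assigning each point to the
# nearest centroid and (2) moving each centroid to the mean of its points.
def kmeans_1d(points, centroids, iters=10):
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: abs(p - centroids[c]))
            groups[nearest].append(p)
        centroids = [sum(g) / len(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

print(kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], centroids=[0.0, 5.0]))
# -> [2.0, 11.0]
```

Even from poorly placed starting centroids, the two alternating steps quickly pull the centroids to the centers of the two obvious groups.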
Clustering is the task of dividing given data points into groups such that data points within a group are similar to one another and dissimilar to data points in other groups.
At present, the world uses many environmental chemicals to improve economic profit in sectors such as agriculture, aviation, pharmaceuticals, mining, and polymer manufacturing, which results in a severe negative impact on human health and the environment. Over the last two decades, people have often been exposed to chemicals leading to many disorders such as cancer, neurodegeneration, and congenital diseases.
The Baby Spinach-based Minimal Modified Sensor (BSMS) is used for the detection of small molecules. It serves as a platform for detecting an array of biomolecules: small molecules, nucleic acids, peptides, proteins, etc. The compact nature of the BSMS sensor provides the flexibility to introduce many different changes in the design. BSMS is a highly sensitive sensor, detecting targets present in small amounts, even in the nanomolar (nM) range.
In our last blog post, we went through the Faster R-CNN architecture for Object Detection, which remains one of the State-of-the-Art architectures to date! Faster R-CNN has a very low inference time of just ~0.2 s per image (5 fps), a huge improvement over the ~45-50 s per image of R-CNN. So far, we have followed the evolution of R-CNN into Fast R-CNN and Faster R-CNN in terms of simplifying the architecture, reducing training and inference times, and increasing the mAP (Mean Average Precision). This article takes a step further, from Object Detection to Instance Segmentation. Instance Segmentation is the identification of the boundaries of detected objects at the pixel level. It goes a step beyond Semantic Segmentation, which groups similar entities under a common mask to differentiate them from other objects; instance segmentation labels each object of the same class as a separate instance.
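Because instance segmentation is evaluated at the pixel level, predicted masks are typically scored against ground truth with Intersection over Union (IoU), the same quantity that underlies mAP. A minimal sketch with made-up 3x3 binary masks:

```python
# Mask IoU: overlapping pixels divided by all pixels covered by either mask.
def mask_iou(a, b):
    """a, b: equally sized binary masks given as lists of 0/1 rows."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0

gt   = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
print(mask_iou(gt, pred))  # 2 shared pixels / 6 covered pixels = 0.333...
```

A predicted instance usually counts as a true positive only when its mask IoU with a ground-truth instance of the same class exceeds a threshold such as 0.5.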
In our previous articles, we understood a few limitations of R-CNN and how SPP-net & Fast R-CNN solved those issues to a great extent, leading to an enormous decrease in inference time to ~2 s per test image, an improvement over the ~45-50 s of R-CNN. But even after such a speedup, there are still some flaws, as well as enhancements to be made, before deploying it on a real-time 30 fps or 60 fps video feed. As we know from our previous blog, Fast R-CNN & SPP-net still require multi-stage training and involve the Selective Search Algorithm for generating regions. This is a huge bottleneck for the entire system because the Selective Search Algorithm takes plenty of time to generate ~2000 region proposals. This problem was solved in Faster R-CNN, the widely used State-of-the-Art version in the R-CNN family of Object Detectors. We have seen the evolution of architectures in the R-CNN family, where the main improvements were computational efficiency, accuracy, and reduction of test time per image. Let's dive into Faster R-CNN now!
Besides the architectures covered in our recent blog posts on R-CNN and Fast R-CNN, there was one more famous architecture for Image Classification, Object Detection & Localization. It was the first runner-up in Object Detection, second runner-up in Image Classification, and fifth place in the Localization task at ILSVRC 2014! This feat makes it one of the major architectures to study on the subject of Object Detection and Image Classification. The architecture is SPPnet, the Spatial Pyramid Pooling network. In this article, we shall delve into SPPnet from an Object Detection perspective only.
In the previous post, we had an in-depth overview of Region-based Convolutional Neural Networks (R-CNN), one of the fundamental architectures in the Two-Stage Object Detection pipeline approach. During the ramp-up of the Deep Learning era around 2012, when AlexNet was published, the approach to the Object Detection problem changed from hand-built features like Haar features and Histograms of Oriented Gradients to neural-network-based approaches, mainly CNN-based architectures. Over time, the problem has been solved via a 2-Stage approach, where the first stage generates Region Proposals and the second stage classifies each proposed region.
In our last post, we had a quick overview of Object Detection and the various approaches and methods used to tackle this problem in Computer Vision. Now, it's time to dive deep into the popular methods of building a State-of-the-Art Object Detector. In particular, we shall focus on one of the earliest methods, the Region-Based Convolutional Neural Network family of Object Detectors. The name R-CNN comes from the fact that the Object Detector was built by modifying a CNN architecture, restructuring it or adding auxiliary networks or layers, albeit without achieving state-of-the-art performance at first. R-CNN is the best way to start in the Object Detection space.
Object Detection is one of the most sought-after sub-disciplines of Computer Vision. The fact that it is extensively utilized in major real-world applications makes it extremely important. As humans, we have an innate cognitive intelligence, trained daily, to recognize and understand what we see through our eyes. Object detection is one of the advanced methods by which a computer tries to match this power to perceive and understand its surroundings, the primary steps being Image Classification and Localization. Each object has its own set of varying characteristics that are challenging for a Deep Learning model or architecture, and it is a different ball game altogether to build an efficient and accurate Object Detector. Let's take a quick tour of the extensions and key concepts under Computer Vision before diving deep into Object Detection.
The beginning of the downfall for Batch Normalization? DeepMind released a new family of state-of-the-art networks for Image Classification that has surpassed the previous best - EfficientNet - by quite a margin. The main idea behind the new architecture is the use of Normalizer Free Neural Nets or NF-Nets to train networks instead of batch … Continue reading NF-Nets: Normalizer Free Nets
Dynamic Sky Replacement and Harmonization in Videos. Through the power of neural network-based learning algorithms, it is possible today to perform video-to-video translation. For instance, in goes a daytime video, and out comes a nighttime version of the same footage. This work ‘Castle in the Sky’ proposes a vision-based method for video sky … Continue reading Castle in the Sky
Style transfer is an interesting problem in machine learning research where we have two input images, one for content and another for style, and the output is our content image re-imagined with this new style. The content can be a photo straight from our camera, and the style can be a painting, which leads to … Continue reading Interactive Video Stylization Using Few-Shot Patch-Based Training
In the previous posts, we discussed Bagging and Boosting ensemble learning in ML and how they are useful. We also discussed the algorithms based on them, i.e., AdaBoost and Gradient Boosting. In this part, we will discuss another ensemble learning technique known as Stacking. We also discuss a bit about Blending (another … Continue reading Ensemble Learning in ML – Part 3: Stacking
In the last part, we discussed what ensemble learning in ML is and how it is useful. We also discussed one ensemble learning technique - Bagging - and the algorithms based on it, i.e., the Bagging meta-estimator and Random Forest. In this part, we will discuss another ensemble learning technique, which is known as Boosting, … Continue reading Ensemble Learning in ML – Part 2: Boosting
Let’s understand ensemble learning with an example. Suppose you have a startup idea and you want to know whether it is good enough to move ahead with. You want to take preliminary feedback on it before committing money and your precious time. So you may ask one of your … Continue reading Ensemble Learning in ML – Part 1: Bagging
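The advisor analogy maps directly onto Bagging: train many weak models on bootstrap resamples of the data and aggregate their votes. The sketch below uses a hypothetical fit_stump learner (a mean-threshold rule) and toy 1-D data purely for illustration.

```python
import random

# Bagging in miniature: bootstrap resamples -> many weak models -> majority vote.
def bootstrap(data, rng):
    """Sample len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def majority_vote(votes):
    return max(set(votes), key=votes.count)

def fit_stump(sample):
    """Hypothetical weak learner: threshold at the mean of the resampled values."""
    t = sum(v for v, _ in sample) / len(sample)
    return lambda x: 1 if x >= t else 0

rng = random.Random(0)
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1), (1.0, 1)]  # (value, label)
stumps = [fit_stump(bootstrap(data, rng)) for _ in range(25)]
print(majority_vote([s(0.95) for s in stumps]))  # ensemble prediction for x = 0.95
```

Each stump alone is crude, but because every stump sees a different resample, their errors differ, and the majority vote is more stable than any single model, which is the core idea the post develops.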
Membrane lipids play diverse roles in cellular function. On the membrane, they act as structural elements, serve as a pool of secondary messengers and act as a platform for membrane proteins. Phosphatidylinositol (PI) plays an important role in signal transduction and membrane trafficking. The PI on the plasma membrane is phosphorylated to P1 and P2 … Continue reading Workflow and Implications of membrane lipids
Insulin resistance (IR) is a major clinical and pathological condition that occurs due to an inappropriate cellular response to the hormone insulin and its abnormal secretion in the body. The decrease in insulin sensitivity leads to the progression of many metabolic disorders such as auto-immune diseases, type-1 diabetes mellitus (T1DM), obesity, atherosclerosis, cardiovascular diseases, etc. In some cases … Continue reading Discovery of novel molecular pathways linked to Insulin Resistance
Today, a variety of techniques exist that can take an image that contains humans and perform pose estimation on it. This gives us interesting skeletons that show us the current posture of the subjects shown in the given images. Having a skeleton opens up the possibility for many cool applications, for instance, it’s great for … Continue reading PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization
I recently came across the paper "CLEVRER" ("CoLlision Events for Video REpresentation and Reasoning", by Kexin Yi, Chuang Gan, Yunzhu Li, Pushmeet Kohli, Jiajun Wu, Antonio Torralba, and Joshua B. Tenenbaum). It intrigued me, so I wanted to share some thoughts about it. With the advancements in NN-based learning algorithms, many of us are wondering … Continue reading CLEVRER: CoLlision Events for Video REpresentation and Reasoning