3D-Convolutions and its Applications
Compared to 2D ConvNet, 3D ConvNet has the ability to model temporal information better owing to 3D convolution and 3D pooling operations. In
3D ConvNets, convolution and pooling operations are performed spatio-temporally while in 2D ConvNets they are done only spatially.
This ability to analyze a series of frames or images in context has led to the use of 3D CNNs as tools for action recognition and evaluation of medical imaging.
Human Action Recognition
action recognition is the process of analyzing the position of objects in a sequence of 2D images, like a video, and classifying it in the context of the surrounding frames to either interpret or predict object movement. Action recognition is being used in the development of assistive technologies, like smart homes, automation of surveillance or security systems, and virtual reality applications, such as creating decentralized meeting spaces.
There are plenty of sub-domains under this topic of action recognition such as,
- Object recognition in a video scene.
- Action similarity labeling between two video clips.
- Fraud detection in a shopping complex and video surveillance.
- Video captioning of any scene, eg: sports commentary, labeling of live video, etc.
Similar to the way CNNs are being used to evaluate video, they can be used to analyze medical imaging, such as CT scans or MRI, for purposes of detection, diagnosis, and development of patient-specific devices. Currently, medical imaging is done by capturing slices of the depth of the tissue to be evaluated but because the body is made of 3D structures that move, all of the images must be viewed in context to be useful. By combining these static images with volume or spatial context, processes such as identification of cancerous cells, evaluation of arterial health, and structural mapping of brain tissue can be initially processed by a 3D CNN, reducing the time needed for human evaluation and allowing faster patient care.
In medical imaging, there are several different areas to explore which can be attributed by 3D convolutions and it’s practice:
- Detect Deficient Coverage in Colonoscopy Screening. (refer above image)
- 3D Convolutional Neural Networks for Alzheimer’s Disease Classification.
In the end, I would say that convolutions are coming back as we are well equipped with better computational capabilities like GPUs, CPU cluster, cloud proper mixture with several handcrafted techniques and benchmark models/architectures we can solve many existing problems with a better understanding of the mathematical aspect of 3D Convolutions.
please keep exploring more about deep learning and it’s applications.