Object detection and classification are fundamental tasks in computer vision, used in applications like facial recognition, autonomous driving, medical imaging, and defense systems.
Object Detection: Identifies objects in images, providing bounding boxes.
Classification: Assigns labels to detected objects.
Modern Deep Learning Approaches
Modern approaches leverage neural networks for improved accuracy and scalability.
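Single-Stage (Single-Shot) vs Two-Stage Detection: Single-stage detectors such as YOLO predict boxes and classes in one pass, while two-stage detectors such as the R-CNN family first propose regions and then classify them.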
YOLO (You Only Look Once): Ideal for high-speed detection, though less effective for smaller objects.
Region-Based CNNs (R-CNN): High accuracy but slower. Faster R-CNN introduces a region proposal network for better speed.
Mask R-CNN: Adds segmentation capabilities, enhancing shape analysis at a higher computational cost.
Object Detection in Side-Scan Sonar (SSS)
SSS images present unique challenges due to noise, low resolution, and texture variability.
Challenges: High noise levels, shape and texture similarities, and limited annotated data.
Comparison of Methods for SSS
Real-Time Performance: YOLO excels in speed, making it suitable for real-time detection, while Faster R-CNN is more accurate in complex scenes.
Noise Adaptability: Domain adaptation and fine-tuning are essential for SSS, since they address noise and data scarcity.
Evaluation Metrics
Classification Metrics:
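Accuracy: Proportion of correct predictions.
Precision & Recall: Precision is the fraction of predicted positives that are correct (it penalizes false positives); recall is the fraction of actual positives that are found (it penalizes false negatives).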
F1 Score: Harmonic mean of precision and recall.
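The classification metrics above can be sketched from the four counts of a binary confusion matrix. This is a minimal illustration, not the evaluation code used in the experiments; the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, chosen only for illustration.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
print(metrics)
```

The zero-denominator guards return 0.0 when a class is never predicted or never present, which is one common convention among several.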
Object Detection Metrics:
IoU (Intersection over Union): Measures overlap between predicted and true bounding boxes.
mAP (Mean Average Precision): Average precision (area under the precision-recall curve) averaged across classes and, in some benchmarks, across IoU thresholds.
Detection Speed: Frames per second (FPS), crucial for real-time tasks.
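As a small illustration of the IoU metric listed above, the sketch below computes the overlap of two axis-aligned boxes. The (x_min, y_min, x_max, y_max) box format and the example coordinates are assumptions made for this example, not the annotation format of the dataset.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth boxes.
predicted = (10, 10, 50, 50)
ground_truth = (30, 30, 70, 70)
overlap = iou(predicted, ground_truth)
print(round(overlap, 3))
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; 0.5 is a common choice.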
The dataset contains 1170 side-scan sonar images collected using a 900–1800 kHz Marine Sonic dual-frequency side-scan sonar of a Teledyne Marine Gavia Autonomous Underwater Vehicle (AUV).
All images were carefully analyzed and annotated with the image coordinates of the Bounding Box (BB) of each detected object. Objects are divided into two classes: NOn-Mine-like BOttom Objects (NOMBO) and MIne-Like COntacts (MILCO).
Dataset
Date    Images    MILCO    NOMBO
2010    345       22       12
2015    120       238      175
2017    93        28       2
2018    564       95       46
2021    48        49       0
Table 1. Summary of the dataset.
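The per-year counts in Table 1 can be totaled to cross-check the dataset size and to see the class balance. The sketch below only sums the values listed in the table; the variable names are illustrative.

```python
# Rows copied from Table 1: (year, images, MILCO, NOMBO).
rows = [
    (2010, 345, 22, 12),
    (2015, 120, 238, 175),
    (2017, 93, 28, 2),
    (2018, 564, 95, 46),
    (2021, 48, 49, 0),
]

total_images = sum(r[1] for r in rows)
total_milco = sum(r[2] for r in rows)
total_nombo = sum(r[3] for r in rows)
# Share of annotated objects that belong to the MILCO class.
milco_share = total_milco / (total_milco + total_nombo)

print(total_images, total_milco, total_nombo, round(milco_share, 3))
```

The image total is 1170, matching the dataset description, with 432 MILCO and 235 NOMBO annotations, so roughly two thirds of the annotated objects are mine-like contacts.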
Example images from the dataset: non-mine image and mine image.
Preliminary Results
Initial experiments on sonar datasets reveal key insights:
Dataset Details: 825 sonar images split into 80% training and 20% testing.
Classification Results:
Before Training: Accuracy of 0.27, indicating poor performance.
After Training: Improved accuracy (0.91) with a ResNet-based classifier.
Detection Challenges: Pretrained YOLOv8 struggled with noisy SSS data, emphasizing the need for domain adaptation and fine-tuning.
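The 80%/20% split mentioned above can be sketched as follows. This is a minimal illustration with a seeded shuffle, assuming images are identified by simple string IDs; the function name, the ID format, and the seed are hypothetical and not taken from the experiments.

```python
import random


def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle items reproducibly and split them into train and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(round(len(items) * train_fraction))
    return items[:n_train], items[n_train:]


# Hypothetical image identifiers standing in for the 825 sonar images.
image_ids = [f"img_{i:04d}" for i in range(825)]
train_ids, test_ids = train_test_split(image_ids)
print(len(train_ids), len(test_ids))
```

With 825 images this yields 660 training and 165 test images. Fixing the seed keeps the split reproducible between runs.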
Future Work
Future efforts aim to address current limitations:
Enhanced Datasets: Include diverse, annotated sonar images to improve training.
Synthetic Data Generation: Augment datasets with realistic synthetic images.
Advanced Models: Explore fine-tuned YOLO and R-CNN variants for SSS data.
These steps will enhance model reliability and accuracy in real-world applications.