



AAAI2020:
Main concerns:
- Generation of bounding boxes from contour is not clear
- No info about neighboring ROIs in linking module
- FrRCNN+FPN+RA is better than PlotNet in most classes
CVPR2020:
Changes: AAAI paper + detailed explanation on CV method
Main concerns:
- Need more ablation study and discussion about PlotNet
- Are there any other methods for OD over plots?
- Limited contribution, novelty is incremental
ECCV2020:
Changes: CVPR paper + AN-ROI layer (info about neighbouring ROIs) + Custom Loss + Extensive ablation study with IoU based regression losses
Main concerns:
- Contribution and ablation study of (i) the CV method, (ii) the custom loss, and (iii) the linking module are missing
- Description of the CV-based method was very vague
- Why are we evaluating existing OD models on plots?
Our Current State:
- We have an ablation study of the linking module
- We have a comparison with the latest IoU-based loss functions (CIoU, DIoU, GIoU)
- We can write more about the thinking process that led us to our CV-based method
- Custom loss does not give significant improvement (our main concern)
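For reference when comparing against the IoU-based losses named above, here is a minimal sketch of IoU, GIoU and DIoU for axis-aligned boxes, written from the standard published definitions. The `(x1, y1, x2, y2)` box layout and the function names are assumptions for illustration, not code from our pipeline. CIoU, which adds an aspect-ratio term to DIoU, is left out to keep the sketch short.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def _area(b: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def _inter_union(a: Box, b: Box) -> Tuple[float, float]:
    """Intersection and union areas of two boxes."""
    inter = _area((max(a[0], b[0]), max(a[1], b[1]),
                   min(a[2], b[2]), min(a[3], b[3])))
    return inter, _area(a) + _area(b) - inter


def _enclosing(a: Box, b: Box) -> Box:
    """Smallest axis-aligned box containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def iou(a: Box, b: Box) -> float:
    inter, union = _inter_union(a, b)
    return inter / union if union > 0 else 0.0


def giou(a: Box, b: Box) -> float:
    """GIoU = IoU - (|C| - |A u B|) / |C|, where C is the enclosing box."""
    _, union = _inter_union(a, b)
    c_area = _area(_enclosing(a, b))
    if c_area == 0:
        return 0.0
    return iou(a, b) - (c_area - union) / c_area


def diou(a: Box, b: Box) -> float:
    """DIoU = IoU - rho^2 / c^2 (squared centre distance over squared diagonal of C)."""
    c = _enclosing(a, b)
    diag2 = (c[2] - c[0]) ** 2 + (c[3] - c[1]) ** 2
    if diag2 == 0:
        return 0.0
    rho2 = (((a[0] + a[2]) - (b[0] + b[2])) / 2) ** 2 \
         + (((a[1] + a[3]) - (b[1] + b[3])) / 2) ** 2
    return iou(a, b) - rho2 / diag2


def giou_loss(a: Box, b: Box) -> float:
    return 1.0 - giou(a, b)


def diou_loss(a: Box, b: Box) -> float:
    return 1.0 - diou(a, b)
```

Unlike plain IoU, both variants stay informative for non-overlapping boxes: GIoU goes negative as the enclosing box grows, and DIoU penalises centre distance.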
| | PlotNet | FrRCNN+FPN+RA | MaskRCNN | RetinaNet | Yolo-V3 |
|---|---|---|---|---|---|
| L1 loss | | | | | |
| -log IoU | | | | | |
| 1 - IoU | | | | | |
| Custom Loss | | | | | |
| GIoU | | | | | |
| L1 loss - log IoU | | | | | |
| L1 loss + 1 - IoU | | | | | |
| L1 loss + Custom loss | | | | | |
| L1 loss + GIoU | | | | | |
- Old linking module (v2) + loss
- understand mAP properly
- try different loss functions (log IOU, etc.)
- mAP with an average pool
- visualize why our ground-truth mAP@0.9 is not 100%
- check whether beta1 and beta2 are set so that both losses are in the same range
- Custom: -(1+IOU)**gamma * log(IOU)
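The custom loss in the last item can be sketched directly from its formula. This is a minimal illustration operating on a scalar IoU value; the `eps` clamp, the function names, and the weighted combination with `beta1` and `beta2` are assumptions added here for numerical safety and for the range check in the item before, not our training code.

```python
import math


def neg_log_iou(iou: float, eps: float = 1e-7) -> float:
    """-log(IoU), with IoU clamped to [eps, 1] to avoid log(0)."""
    return -math.log(min(max(iou, eps), 1.0))


def one_minus_iou(iou: float) -> float:
    """1 - IoU."""
    return 1.0 - iou


def custom_loss(iou: float, gamma: float = 2.0, eps: float = 1e-7) -> float:
    """Custom: -(1 + IoU)**gamma * log(IoU), as written in the notes."""
    iou = min(max(iou, eps), 1.0)
    return -((1.0 + iou) ** gamma) * math.log(iou)


def combined_loss(l1: float, iou: float, beta1: float = 1.0,
                  beta2: float = 1.0, gamma: float = 2.0) -> float:
    """Hypothetical weighting: beta1 * L1 + beta2 * custom loss."""
    return beta1 * l1 + beta2 * custom_loss(iou, gamma)
```

With `gamma = 0` the custom loss reduces to `-log(IoU)`. Evaluating `custom_loss` at a few IoU values next to typical L1 magnitudes is a quick way to see whether the two terms are on a comparable scale before choosing `beta1` and `beta2`.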

FrRCNN_FPN_RA (per-class AP at IoU 0.9)
| Loss | mAP@0.9 | mAP@0.75 | mAP@0.5 | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Smooth L1 | 77.22% | 94.58% | 97.76% | 85.54% | 27.86% | 93.68% | - | 96.30% | 0.22% | 99.09% | 96.04% | 99.46% | 96.80% |
| GIOU / Log IOU (label unclear) | 81.01% | 95.34% | 98.11% | 90.95% | 43.43% | 99.12% | - | 98.00% | 0.24% | 99.87% | 98.44% | 99.68% | 99.36% |
| GIOU / Log IOU (label unclear) | 83.68% | 97.16% | 98.20% | 94.31% | 60.18% | 99.82% | - | 99.49% | 0.19% | 99.99% | 99.61% | 99.67% | 99.88% |
| 1-IOU | 83.40% | 97.05% | 98.20% | 94.13% | 57.97% | 99.80% | - | 99.39% | 0.19% | 99.99% | 99.60% | 99.67% | 99.88% |
| Custom Loss (gamma=2) | 81.42% | 95.69% | 98.14% | 91.77% | 44.68% | 99.44% | - | 98.87% | 0.24% | 99.83% | 98.63% | 99.66% | 99.65% |
| SL1 + - log IOU | 81.84% | 95.95% | 98.27% | 92.18% | 47.45% | 99.39% | - | 98.71% | 0.21% | 99.90% | 99.26% | 99.74% | 99.69% |
| SL1 + GIOU | 81.72% | 96.02% | 98.23% | 91.31% | 47.05% | 99.38% | - | 98.96% | 0.21% | 99.92% | 99.17% | 99.77% | 99.72% |
| SL1 + 1-IOU | 81.93% | 96.07% | 98.24% | 91.47% | 48.64% | 99.44% | - | 98.86% | 0.21% | 99.90% | 99.22% | 99.92% | 99.68% |
| SL1 + custom loss | 81.95% | 95.94% | 98.20% | 91.23% | 48.90% | 99.57% | - | 98.95% | 0.24% | 99.94% | 99.29% | 99.67% | 99.72% |
Analysis of different object detection models (PlotQA)
| Model | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FrRCNN_FPN_RA | 0.5 | 97.76% | 95.84% | 84.28% | 99.99% | - | 99.95% | 100.00% | 99.99% | 99.92% | 99.90% | 99.99% |
| FrRCNN_FPN_RA | 0.75 | 94.58% | 94.30% | 78.59% | 99.96% | - | 99.94% | 78.83% | 99.99% | 99.74% | 99.90% | 99.97% |
| FrRCNN_FPN_RA | 0.9 | 77.22% | 85.54% | 27.86% | 93.68% | - | 96.30% | 0.22% | 99.09% | 96.04% | 99.46% | 96.80% |
| PlotNet v2 | 0.5 | 98.11% | 95.33% | 91.60% | 98.96% | - | 99.27% | 99.76% | 99.02% | 99.77% | 99.69% | 99.57% |
| PlotNet v2 | 0.75 | 97.21% | 95.04% | 86.46% | 98.64% | - | 99.08% | 99.73% | 97.30% | 99.59% | 99.63% | 99.39% |
| PlotNet v2 | 0.9 | 83.89% | 91.02% | 31.69% | 97.08% | - | 81.57% | 99.36% | 96.06% | 85.33% | 82.00% | 90.95% |
| PlotNet v0 | 0.5 | 98.33% | 95.52% | 91.24% | 99.79% | - | 99.52% | 99.97% | 99.68% | 99.83% | 99.62% | 99.78% |
| PlotNet v0 | 0.75 | 98.14% | 95.52% | 91.24% | 99.66% | - | 99.52% | 99.87% | 98.27% | 99.83% | 99.62% | 99.77% |
| PlotNet v0 | 0.9 | 97.99% | 95.52% | 91.24% | 99.45% | - | 99.52% | 99.56% | 97.44% | 99.83% | 99.62% | 99.77% |
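The mAP figures on these slides appear to be the unweighted mean of the per-class APs over the classes that have a value (the `line` column is `-` for these models). That reading is inferred from the numbers, not stated on the slide. A minimal sketch that reproduces the FrRCNN_FPN_RA value at IoU 0.9 under that assumption (the function name is illustrative):

```python
from typing import Dict, Optional


def mean_ap(per_class_ap: Dict[str, Optional[float]]) -> float:
    """Unweighted mean of per-class AP, skipping classes with no reported AP."""
    values = [ap for ap in per_class_ap.values() if ap is not None]
    if not values:
        raise ValueError("no per-class AP values to average")
    return sum(values) / len(values)


# FrRCNN_FPN_RA per-class AP (%) at IoU 0.9, copied from the slide.
frrcnn_fpn_ra_iou_09 = {
    "bar": 85.54, "dotline": 27.86, "legend-label": 93.68, "line": None,
    "preview": 96.30, "title": 0.22, "xlabel": 99.09, "xticklabel": 96.04,
    "ylabel": 99.46, "yticklabel": 96.80,
}

print(round(mean_ap(frrcnn_fpn_ra_iou_09), 2))  # 77.22, matching the reported mAP
```

The same check shows how much a single class can move the headline number: the `title` AP of 0.22% alone pulls the FrRCNN_FPN_RA mAP at IoU 0.9 down by roughly 11 points.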
#FLOPs: 283.85G
#Param: 43.81M
#FLOPs: 100.927G
#Param: 131.944M
| Model | Loss | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PlotNet v12 | Only Smooth L1 (CVPR Paper) | 0.5 | 98.24% | 95.40% | 91.06% | 99.29% | - | 99.51% | 99.96% | 99.56% | 99.83% | 99.75% | 99.77% |
| PlotNet v12 | Only Smooth L1 (CVPR Paper) | 0.75 | 97.74% | 95.24% | 88.94% | 99.01% | - | 99.42% | 99.83% | 98.06% | 99.79% | 99.75% | 99.64% |
| PlotNet v12 | Only Smooth L1 (CVPR Paper) | 0.9 | 91.94% | 92.16% | 61.18% | 98.38% | - | 93.46% | 99.44% | 97.21% | 94.21% | 95.45% | 96.02% |
| PlotNet v30 | Only Custom Loss (gamma = 0.85) | 0.5 | 98.28% | 95.43% | 91.09% | 99.54% | - | 99.50% | 99.98% | 99.59% | 99.82% | 99.83% | 99.78% |
| PlotNet v30 | Only Custom Loss (gamma = 0.85) | 0.75 | 97.36% | 95.27% | 85.59% | 99.23% | - | 99.48% | 99.89% | 97.91% | 99.77% | 99.80% | 99.31% |
| PlotNet v30 | Only Custom Loss (gamma = 0.85) | 0.9 | 82.20% | 92.07% | 56.89% | 92.50% | - | 93.94% | 99.55% | 73.97% | 88.80% | 83.75% | 58.36% |
| PlotNet v31 | L1 + 0.6*Custom Loss (gamma = 0.85) | 0.5 | 98.20% | 95.40% | 90.98% | 99.33% | - | 99.49% | 99.57% | 99.58% | 99.81% | 99.87% | 99.78% |
| PlotNet v31 | L1 + 0.6*Custom Loss (gamma = 0.85) | 0.75 | 97.79% | 95.24% | 89.67% | 99.11% | - | 99.44% | 99.25% | 98.09% | 99.77% | 99.87% | 99.63% |
| PlotNet v31 | L1 + 0.6*Custom Loss (gamma = 0.85) | 0.9 | 93.15% | 92.93% | 67.94% | 98.24% | - | 95.29% | 98.38% | 97.16% | 94.55% | 97.82% | 96.06% |
| PlotNet v32 | Only Custom Loss (gamma = 2.0) | 0.5 | 98.31% | 95.45% | 91.51% | 99.54% | - | 99.45% | 99.95% | 99.63% | 99.82% | 99.68% | 99.78% |
| PlotNet v32 | Only Custom Loss (gamma = 2.0) | 0.75 | 97.30% | 95.18% | 84.77% | 99.32% | - | 99.44% | 99.84% | 98.07% | 99.76% | 99.68% | 99.68% |
| PlotNet v32 | Only Custom Loss (gamma = 2.0) | 0.9 | 90.88% | 91.88% | 61.44% | 96.44% | - | 95.58% | 99.52% | 97.19% | 90.64% | 97.55% | 87.66% |
| PlotNet v36 | L1 + 0.6*Custom Loss (gamma = 2.0) | 0.5 | 97.59% | 95.37% | 91.22% | 98.69% | - | 99.52% | 98.35% | 95.90% | 99.81% | 99.65% | 99.76% |
| PlotNet v36 | L1 + 0.6*Custom Loss (gamma = 2.0) | 0.75 | 96.66% | 95.19% | 86.04% | 98.32% | - | 99.49% | 97.42% | 94.41% | 99.77% | 99.60% | 99.67% |
| PlotNet v36 | L1 + 0.6*Custom Loss (gamma = 2.0) | 0.9 | 91.24% | 92.68% | 61.05% | 97.02% | - | 96.05% | 94.93% | 93.62% | 93.29% | 97.13% | 95.37% |
Analysis of different object detection models (PlotQA)
| Model | Loss | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FrRCNN_FPN_RA | | 0.5 | 97.76% | 95.84% | 84.28% | 99.99% | - | 99.95% | 100.00% | 99.99% | 99.92% | 99.90% | 99.99% |
| FrRCNN_FPN_RA | | 0.75 | 94.58% | 94.30% | 78.59% | 99.96% | - | 99.94% | 78.83% | 99.99% | 99.74% | 99.90% | 99.97% |
| FrRCNN_FPN_RA | | 0.9 | 77.22% | 85.54% | 27.86% | 93.68% | - | 96.30% | 0.22% | 99.09% | 96.04% | 99.46% | 96.80% |
| PlotNet v12 | Only Smooth L1 | 0.5 | 98.24% | 95.40% | 91.06% | 99.29% | - | 99.51% | 99.96% | 99.56% | 99.83% | 99.75% | 99.77% |
| PlotNet v12 | Only Smooth L1 | 0.75 | 97.74% | 95.24% | 88.94% | 99.01% | - | 99.42% | 99.83% | 98.06% | 99.79% | 99.75% | 99.64% |
| PlotNet v12 | Only Smooth L1 | 0.9 | 91.94% | 92.16% | 61.18% | 98.38% | - | 93.46% | 99.44% | 97.21% | 94.21% | 95.45% | 96.02% |
| PlotNet v28 | Only log IOU | 0.5 | 98.26% | 95.41% | 91.02% | 99.42% | - | 99.49% | 99.97% | 99.56% | 99.82% | 99.91% | 99.77% |
| PlotNet v28 | Only log IOU | 0.75 | 96.97% | 95.24% | 82.69% | 99.08% | - | 99.46% | 99.85% | 97.94% | 99.77% | 99.90% | 98.82% |
| PlotNet v28 | Only log IOU | 0.9 | 80.34% | 91.83% | 45.78% | 91.48% | - | 94.15% | 98.95% | 74.24% | 87.19% | 89.34% | 50.11% |
| PlotNet v29 | Only 1 - IOU | 0.5 | 98.20% | 95.41% | 90.61% | 99.35% | - | 99.49% | 99.95% | 99.67% | 99.80% | 99.71% | 99.78% |
| PlotNet v29 | Only 1 - IOU | 0.75 | 96.38% | 95.22% | 78.38% | 99.02% | - | 99.44% | 99.71% | 98.10% | 99.71% | 99.64% | 98.21% |
| PlotNet v29 | Only 1 - IOU | 0.9 | 80.87% | 91.79% | 41.86% | 93.74% | - | 94.64% | 98.29% | 83.11% | 85.69% | 89.32% | 49.36% |
| PlotNet v30 | Only Custom Loss (gamma = 0.85) | 0.5 | 98.28% | 95.43% | 91.09% | 99.54% | - | 99.50% | 99.98% | 99.59% | 99.82% | 99.83% | 99.78% |
| PlotNet v30 | Only Custom Loss (gamma = 0.85) | 0.75 | 97.36% | 95.27% | 85.59% | 99.23% | - | 99.48% | 99.89% | 97.91% | 99.77% | 99.80% | 99.31% |
| PlotNet v30 | Only Custom Loss (gamma = 0.85) | 0.9 | 82.20% | 92.07% | 56.89% | 92.50% | - | 93.94% | 99.55% | 73.97% | 88.80% | 83.75% | 58.36% |
| PlotNet v32 | Only Custom Loss (gamma = 2.0) | 0.5 | 98.31% | 95.45% | 91.51% | 99.54% | - | 99.45% | 99.95% | 99.63% | 99.82% | 99.68% | 99.78% |
| PlotNet v32 | Only Custom Loss (gamma = 2.0) | 0.75 | 97.30% | 95.18% | 84.77% | 99.32% | - | 99.44% | 99.84% | 98.07% | 99.76% | 99.68% | 99.68% |
| PlotNet v32 | Only Custom Loss (gamma = 2.0) | 0.9 | 90.88% | 91.88% | 61.44% | 96.44% | - | 95.58% | 99.52% | 97.19% | 90.64% | 97.55% | 87.66% |
| PlotNet v50 | Only GIOU | 0.5 | | | | | | | | | | | |
| PlotNet v50 | Only GIOU | 0.75 | | | | | | | | | | | |
| PlotNet v50 | Only GIOU | 0.9 | | | | | | | | | | | |
Analysis of different object detection models (PlotQA)
without AN-ROI layer
| Model | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PlotNet v24 | 0.5 | 98.13% | 95.32% | 91.38% | 98.21% | - | 99.41% | 99.88% | 99.50% | 99.81% | 99.89% | 99.75% |
| PlotNet v24 | 0.75 | 97.03% | 95.18% | 85.48% | 98.83% | - | 99.26% | 99.76% | 97.64% | 99.59% | 99.62% | 97.92% |
| PlotNet v24 | 0.9 | 81.46% | 90.69% | 45.15% | 96.30% | - | 86.56% | 99.17% | 95.72% | 83.11% | 85.37% | 51.12% |
| PlotNet v25 | 0.5 | 98.10% | 95.25% | 91.23% | 99.13% | - | 98.80% | 99.87% | 99.44% | 99.80% | 99.91% | 99.44% |
| PlotNet v25 | 0.75 | 96.92% | 95.23% | 86.66% | 98.77% | - | 98.37% | 99.17% | 97.62% | 99.50% | 99.71% | 97.25% |
| PlotNet v25 | 0.9 | 80.20% | 90.89% | 45.56% | 91.78% | - | 86.33% | 99.40% | 89.91% | 79.53% | 88.12% | 50.27% |
| PlotNet v26 | 0.5 | 98.33% | 95.38% | 91.61% | 99.21% | - | 99.73% | 99.91% | 99.55% | 99.87% | 99.93% | 99.78% |
| PlotNet v26 | 0.75 | 97.30% | 95.22% | 87.02% | 98.95% | - | 99.30% | 99.78% | 97.88% | 99.63% | 99.91% | 97.94% |
| PlotNet v26 | 0.9 | 82.18% | 91.83% | 45.95% | 94.36% | - | 89.83% | 99.27% | 95.99% | 83.71% | 87.84% | 51.24% |
| PlotNet v40 | 0.5 | 97.91% | 95.42% | 90.73% | 99.14% | - | 97.83% | 99.96% | 99.24% | 99.75% | 99.35% | 99.76% |
| PlotNet v40 | 0.75 | 97.14% | 95.14% | 86.69% | 98.87% | - | 97.75% | 99.83% | 97.76% | 99.69% | 99.26% | 99.26% |
| PlotNet v40 | 0.9 | 84.96% | 91.65% | 48.10% | 96.51% | - | 86.53% | 99.39% | 96.29% | 86.78% | 87.19% | 72.21% |
| PlotNet v41 | 0.5 | 97.82% | 95.39% | 90.39% | 98.74% | - | 98.24% | 99.92% | 99.42% | 99.77% | 99.62% | 98.91% |
| PlotNet v41 | 0.75 | 96.92% | 95.07% | 86.32% | 98.32% | - | 97.16% | 99.80% | 97.77% | 99.69% | 99.62% | 98.48% |
| PlotNet v41 | 0.9 | 84.59% | 91.32% | 51.92% | 96.77% | - | 85.84% | 99.28% | 96.43% | 89.75% | 78.33% | 71.63% |
| PlotNet v39 | 0.5 | 98.12% | 95.54% | 91.11% | 99.26% | - | 98.25% | 99.97% | 99.46% | 99.79% | 99.83% | 99.91% |
| PlotNet v39 | 0.75 | 97.32% | 95.17% | 87.37% | 98.91% | - | 97.92% | 99.86% | 97.78% | 99.73% | 99.81% | 99.31% |
| PlotNet v39 | 0.9 | 86.51% | 91.94% | 51.98% | 97.49% | - | 90.01% | 99.59% | 96.68% | 89.78% | 87.25% | 73.92% |

Losses listed on the slide: log IoU, 1 - IoU, Custom Loss (gamma = 0.85), SL1 + log IoU, SL1 + 1 - IoU, SL1 + Custom
Analysis of different object detection models (PlotQA)
| Model | Loss | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PlotNet v30 | Only Custom Loss (gamma = 0.85) | 0.5 | 98.28% | 95.43% | 91.09% | 99.54% | - | 99.50% | 99.98% | 99.59% | 99.82% | 99.83% | 99.78% |
| PlotNet v30 | Only Custom Loss (gamma = 0.85) | 0.75 | 97.36% | 95.27% | 85.59% | 99.23% | - | 99.48% | 99.89% | 97.91% | 99.77% | 99.80% | 99.31% |
| PlotNet v30 | Only Custom Loss (gamma = 0.85) | 0.9 | 82.20% | 92.07% | 56.89% | 92.50% | - | 93.94% | 99.55% | 73.97% | 88.80% | 83.75% | 58.36% |
| PlotNet v32 | Only Custom Loss (gamma = 2.0) | 0.5 | 98.31% | 95.45% | 91.51% | 99.54% | - | 99.45% | 99.95% | 99.63% | 99.82% | 99.68% | 99.78% |
| PlotNet v32 | Only Custom Loss (gamma = 2.0) | 0.75 | 97.30% | 95.18% | 84.77% | 99.32% | - | 99.44% | 99.84% | 98.07% | 99.76% | 99.68% | 99.68% |
| PlotNet v32 | Only Custom Loss (gamma = 2.0) | 0.9 | 90.88% | 91.88% | 61.44% | 96.44% | - | 95.58% | 99.52% | 97.19% | 90.64% | 97.55% | 87.66% |
| PlotNet v34 | Only Custom Loss (gamma = 3.0) | 0.5 | 98.27% | 95.42% | 91.37% | 99.42% | - | 99.42% | 99.93% | 99.50% | 99.81% | 99.79% | 99.76% |
| PlotNet v34 | Only Custom Loss (gamma = 3.0) | 0.75 | 97.33% | 95.22% | 85.95% | 99.15% | - | 99.39% | 99.82% | 97.59% | 99.75% | 99.78% | 99.36% |
| PlotNet v34 | Only Custom Loss (gamma = 3.0) | 0.9 | 87.55% | 92.27% | 65.13% | 93.99% | - | 95.82% | 99.26% | 91.31% | 89.00% | 95.39% | 65.82% |
| PlotNet v35 | Only Custom Loss (gamma = 4.0) | 0.5 | 97.07% | 95.47% | 91.59% | 99.25% | - | 88.60% | 99.82% | 99.62% | 99.70% | 99.79% | 99.77% |
| PlotNet v35 | Only Custom Loss (gamma = 4.0) | 0.75 | 95.94% | 95.29% | 83.75% | 99.07% | - | 88.56% | 99.72% | 98.03% | 99.66% | 99.73% | 99.68% |
| PlotNet v35 | Only Custom Loss (gamma = 4.0) | 0.9 | 90.14% | 92.50% | 59.76% | 97.03% | - | 86.39% | 99.31% | 96.74% | 89.70% | 96.94% | 92.89% |
| PlotNet v44 | Only Custom Loss (gamma = 1.0) | 0.5 | 98.13% | 95.14% | 90.11% | 99.50% | - | 99.51% | 99.82% | 99.58% | 99.82% | 99.95% | 99.77% |
| PlotNet v44 | Only Custom Loss (gamma = 1.0) | 0.75 | 97.27% | 94.98% | 85.00% | 99.26% | - | 99.47% | 99.55% | 98.01% | 99.73% | 99.94% | 99.45% |
| PlotNet v44 | Only Custom Loss (gamma = 1.0) | 0.9 | 89.49% | 92.03% | 60.37% | 95.23% | - | 95.11% | 98.85% | 97.02% | 88.88% | 96.50% | 81.45% |
| PlotNet v45 | Only Custom Loss (gamma = 1.25) | 0.5 | 98.22% | 95.31% | 91.25% | 99.33% | - | 99.51% | 99.88% | 99.54% | 99.81% | 99.67% | 99.72% |
| PlotNet v45 | Only Custom Loss (gamma = 1.25) | 0.75 | 96.50% | 95.14% | 79.49% | 99.08% | - | 99.50% | 99.76% | 97.65% | 99.73% | 99.60% | 98.59% |
| PlotNet v45 | Only Custom Loss (gamma = 1.25) | 0.9 | 84.63% | 92.14% | 55.59% | 93.69% | - | 95.57% | 99.43% | 90.24% | 84.96% | 91.33% | 58.70% |
| PlotNet v46 | Only Custom Loss (gamma = 1.5) | 0.5 | 97.83% | 95.43% | 91.50% | 99.22% | - | 99.35% | 97.92% | 97.95% | 99.81% | 99.58% | 99.73% |
| PlotNet v46 | Only Custom Loss (gamma = 1.5) | 0.75 | 95.94% | 95.27% | 78.47% | 98.08% | - | 99.35% | 96.95% | 96.31% | 99.75% | 99.56% | 99.02% |
| PlotNet v46 | Only Custom Loss (gamma = 1.5) | 0.9 | 84.49% | 92.06% | 51.74% | 94.84% | - | 94.98% | 94.84% | 93.31% | 89.53% | 93.79% | 55.31% |
| PlotNet v47 | Only Custom Loss (gamma = 1.75) | 0.5 | 98.22% | 95.22% | 91.12% | 99.41% | - | 99.51% | 99.95% | 99.53% | 99.82% | 99.67% | 99.74% |
| PlotNet v47 | Only Custom Loss (gamma = 1.75) | 0.75 | 96.98% | 95.01% | 82.51% | 98.96% | - | 99.46% | 99.85% | 98.00% | 99.76% | 99.66% | 99.59% |
| PlotNet v47 | Only Custom Loss (gamma = 1.75) | 0.9 | 90.14% | 91.99% | 56.92% | 95.70% | - | 95.67% | 99.49% | 96.43% | 92.12% | 95.34% | 87.63% |
Analysis of different object detection models (PlotQA)
| Model | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PlotNet v42 | 0.5 | 97.82% | 95.37% | 91.08% | 99.28% | - | 99.53% | 96.88% | 98.77% | 99.79% | 99.90% | 99.75% |
| PlotNet v42 | 0.75 | 96.78% | 95.24% | 85.56% | 99.04% | - | 99.36% | 95.66% | 97.03% | 99.76% | 99.90% | 99.42% |
| PlotNet v42 | 0.9 | 88.44% | 92.94% | 59.12% | 94.87% | - | 95.58% | 92.46% | 94.46% | 94.29% | 95.36% | 76.85% |
| PlotNet v43 | 0.5 | 98.32% | 95.38% | 91.19% | 99.60% | - | 99.52% | 99.97% | 99.66% | 99.82% | 99.93% | 99.77% |
| PlotNet v43 | 0.75 | 97.93% | 95.21% | 89.84% | 99.42% | - | 99.51% | 99.84% | 98.13% | 99.80% | 99.92% | 99.69% |
| PlotNet v43 | 0.9 | 93.39% | 92.80% | 70.11% | 98.47% | - | 96.33% | 99.27% | 97.31% | 94.12% | 97.66% | 94.42% |
| PlotNet v48 | 0.5 | 98.18% | 95.44% | 91.25% | 99.60% | - | 99.53% | 99.97% | 98.39% | 99.82% | 99.88% | 99.77% |
| PlotNet v48 | 0.75 | 97.70% | 95.23% | 89.30% | 99.35% | - | 99.49% | 99.80% | 96.80% | 99.79% | 99.86% | 99.63% |
| PlotNet v48 | 0.9 | 92.19% | 92.78% | 68.26% | 97.75% | - | 95.90% | 99.04% | 93.64% | 92.97% | 96.24% | 93.12% |
| PlotNet v51 | 0.5 | | | | | | | | | | | |
| PlotNet v51 | 0.75 | | | | | | | | | | | |
| PlotNet v51 | 0.9 | | | | | | | | | | | |

Losses listed on the slide: SL1 + log IOU, SL1 + Custom Loss (gamma = 2.0), SL1 + 1- IOU, SL1 + Only GIOU
Analysis of different object detection models (PlotQA)
| Model | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FrRCNN_FPN_RA | 0.5 | 97.76% | 95.84% | 84.28% | 99.99% | - | 99.95% | 100.00% | 99.99% | 99.92% | 99.90% | 99.99% |
| FrRCNN_FPN_RA | 0.75 | 94.58% | 94.30% | 78.59% | 99.96% | - | 99.94% | 78.83% | 99.99% | 99.74% | 99.90% | 99.97% |
| FrRCNN_FPN_RA | 0.9 | 77.22% | 85.54% | 27.86% | 93.68% | - | 96.30% | 0.22% | 99.09% | 96.04% | 99.46% | 96.80% |
| PlotNet v2 | 0.5 | 98.11% | 95.33% | 91.60% | 98.96% | - | 99.27% | 99.76% | 99.02% | 99.77% | 99.69% | 99.57% |
| PlotNet v2 | 0.75 | 97.21% | 95.04% | 86.46% | 98.64% | - | 99.08% | 99.73% | 97.30% | 99.59% | 99.63% | 99.39% |
| PlotNet v2 | 0.9 | 83.89% | 91.02% | 31.69% | 97.08% | - | 81.57% | 99.36% | 96.06% | 85.33% | 82.00% | 90.95% |
| PlotNet v3 | 0.5 | 98.22% | 95.35% | 90.97% | 99.42% | - | 99.46% | 99.91% | 99.61% | 99.81% | 99.72% | 99.77% |
| PlotNet v3 | 0.75 | 97.67% | 95.06% | 88.81% | 99.14% | - | 99.14% | 99.79% | 98.08% | 99.75% | 99.65% | 99.65% |
| PlotNet v3 | 0.9 | 89.49% | 91.35% | 52.85% | 98.06% | - | 88.65% | 99.24% | 97.00% | 90.32% | 93.36% | 94.55% |
| PlotNet v4 | 0.5 | 98.19% | 95.39% | 91.39% | 99.19% | - | 99.53% | 99.65% | 99.14% | 99.82% | 99.81% | 99.77% |
| PlotNet v4 | 0.75 | 97.63% | 95.10% | 89.25% | 99.01% | - | 99.41% | 99.14% | 97.52% | 99.77% | 99.81% | 99.70% |
| PlotNet v4 | 0.9 | 90.93% | 91.61% | 55.72% | 98.21% | - | 91.46% | 97.33% | 96.75% | 94.36% | 97.24% | 95.67% |
| PlotNet v0 | 0.5 | 98.33% | 95.52% | 91.24% | 99.79% | - | 99.52% | 99.97% | 99.68% | 99.83% | 99.62% | 99.78% |
| PlotNet v0 | 0.75 | 98.14% | 95.52% | 91.24% | 99.66% | - | 99.52% | 99.87% | 98.27% | 99.83% | 99.62% | 99.77% |
| PlotNet v0 | 0.9 | 97.99% | 95.52% | 91.24% | 99.45% | - | 99.52% | 99.56% | 97.44% | 99.83% | 99.62% | 99.77% |
#FLOPs: 283.85G
#Param: 43.81M
#FLOPs: 100.927G
#Param: 131.944M
#FLOPs: 100.927G
#Param: 131.944M
#FLOPs: 100.927G
#Param: 131.944M
| Model | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PlotNet v12 | 0.5 | 98.24% | 95.40% | 91.06% | 99.29% | - | 99.51% | 99.96% | 99.56% | 99.83% | 99.75% | 99.77% |
| PlotNet v12 | 0.75 | 97.74% | 95.24% | 88.94% | 99.01% | - | 99.42% | 99.83% | 98.06% | 99.79% | 99.75% | 99.64% |
| PlotNet v12 | 0.9 | 91.94% | 92.16% | 61.18% | 98.38% | - | 93.46% | 99.44% | 97.21% | 94.21% | 95.45% | 96.02% |
| PlotNet v14 | 0.5 | 98.11% | 95.36% | 91.12% | 98.77% | - | 99.14% | 99.93% | 99.40% | 99.80% | 99.77% | 99.68% |
| PlotNet v14 | 0.75 | 97.53% | 95.15% | 88.71% | 98.37% | - | 99.09% | 99.79% | 97.75% | 99.73% | 99.75% | 99.43% |
| PlotNet v14 | 0.9 | 86.25% | 90.05% | 40.38% | 96.55% | - | 88.16% | 99.19% | 96.42% | 89.49% | 86.98% | 89.03% |
| PlotNet v15 | 0.5 | 97.90% | 95.41% | 89.75% | 98.99% | - | 99.12% | 99.86% | 98.64% | 99.78% | 99.86% | 99.73% |
| PlotNet v15 | 0.75 | 97.12% | 95.21% | 85.61% | 98.69% | - | 98.84% | 99.74% | 96.98% | 99.71% | 99.79% | 99.53% |
| PlotNet v15 | 0.9 | 85.68% | 91.47% | 35.11% | 97.37% | - | 85.78% | 99.22% | 96.01% | 88.77% | 87.35% | 90.02% |
Analysis of different object detection models (PlotQA)
| Model | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FrRCNN_FPN_RA | 0.5 | 97.76% | 95.84% | 84.28% | 99.99% | - | 99.95% | 100.00% | 99.99% | 99.92% | 99.90% | 99.99% |
| FrRCNN_FPN_RA | 0.75 | 94.58% | 94.30% | 78.59% | 99.96% | - | 99.94% | 78.83% | 99.99% | 99.74% | 99.90% | 99.97% |
| FrRCNN_FPN_RA | 0.9 | 77.22% | 85.54% | 27.86% | 93.68% | - | 96.30% | 0.22% | 99.09% | 96.04% | 99.46% | 96.80% |
| PlotNet v8 | 0.5 | 98.16% | 95.42% | 91.12% | 98.98% | - | 99.34% | 99.93% | 99.13% | 99.82% | 99.91% | 99.76% |
| PlotNet v8 | 0.75 | 97.49% | 95.11% | 87.89% | 98.59% | - | 99.17% | 99.80% | 97.55% | 99.77% | 99.91% | 99.63% |
| PlotNet v8 | 0.9 | 90.17% | 90.87% | 52.84% | 97.68% | - | 90.27% | 99.37% | 96.55% | 92.97% | 96.34% | 94.67% |
| PlotNet v7 | 0.5 | 98.23% | 95.40% | 90.91% | 99.31% | - | 99.48% | 99.93% | 99.53% | 99.81% | 99.96% | 99.77% |
| PlotNet v7 | 0.75 | 97.60% | 95.17% | 87.85% | 99.05% | - | 99.29% | 99.81% | 97.87% | 99.75% | 99.96% | 99.64% |
| PlotNet v7 | 0.9 | 89.72% | 91.62% | 48.07% | 98.08% | - | 89.12% | 99.29% | 96.97% | 93.00% | 96.63% | 94.72% |
| PlotNet v1 | 0.5 | 97.57% | 95.32% | 90.65% | 97.83% | - | 98.31% | 99.66% | 98.19% | 99.78% | 99.02% | 99.40% |
| PlotNet v1 | 0.75 | 96.74% | 94.97% | 87.13% | 97.42% | - | 97.87% | 99.11% | 96.38% | 99.68% | 99.02% | 99.04% |
| PlotNet v1 | 0.9 | 80.78% | 89.46% | 37.63% | 93.64% | - | 78.12% | 95.22% | 94.06% | 88.10% | 66.95% | 83.86% |
| PlotNet v5 | 0.5 | 97.62% | 95.32% | 86.32% | 99.37% | - | 99.46% | 99.95% | 98.93% | 99.80% | 99.70% | 99.76% |
| PlotNet v5 | 0.75 | 94.03% | 94.79% | 58.12% | 99.06% | - | 98.57% | 99.83% | 97.16% | 99.64% | 99.64% | 99.43% |
| PlotNet v5 | 0.9 | 78.71% | 87.96% | 6.30% | 95.11% | - | 78.23% | 99.28% | 94.62% | 80.52% | 84.85% | 81.51% |
- Table F1
- Template wise answer distribution Table
- Human Accuracy
- Comparison Table (FQA, DVQA, PlotQA) like Table 1 in Kushal's paper
- Number of questions with a particular question length (X=Q-length, Y=Number of Qs)->histogram
- Average Q-length, Min Q-length and Max Q-length
- Number of unique Qs compare with other datasets
- Number of Qs answered by varying vocab. size
| | IOU@0.75 | IOU@0.9 | IOU@0.5 |
|---|---|---|---|
| mAP | | | |
| Table F1-score | | | |

| | Structural | Data Retrieval | Reasoning |
|---|---|---|---|
| Yes/No | | | |
| Fixed vocab. | | | |
| Open Vocab. | | | |

PlotQA Answer Distribution

| Model | Binary | Fixed Vocab. | OOV |
|---|---|---|---|
| SAN (6.08%) | 80.20% | 19.80% | 0.00% |
| VOES (18.46%) | 0.00% | 13.43% | 20.60% |
PlotQA data distribution
| Dataset Split | Images | QA-pairs (old) | QA-pairs (new) |
|---|---|---|---|
| Train | 157,070 | 5,733,893 | 20,249,479 |
| Validation | 33,650 | 1,228,468 | 4,360,648 |
| Test | 33,657 | 1,228,313 | 4,342,514 |
| Total | 224,377 | 8,190,674 | 28,952,641 |
ANSWERS
Old Distribution
New Distribution
PlotQA data distribution on TEST split
| Yes/No | Fixed Vocab | OOV |
|---|---|---|
| 23.46% | 76.53% | 0.00% |
| Yes/No | Fixed Vocab | OOV |
|---|---|---|
| 27.46% | 46.1% | 26.4% |
| Datasets | DVQA |
|---|
| Methods\Templates |
| Distribution |
| PlotQA |
|---|
| Yes/No | Fixed Vocab | OOV |
|---|---|---|
| 3.86 | 15.38 | 80.76 |
| Datasets | PlotQA (Top-1k) |
|---|
| Methods\Templates |
| Distribution |
| PlotQA (Top-5k) | PlotQA (Top-10k) |
|---|
| Yes/No | Fixed Vocab | OOV |
|---|---|---|
| 3.86 | 26.90 | 69.24 |
| Yes/No | Fixed Vocab | OOV |
|---|---|---|
| 3.86 | 33.65 | 62.49 |
TEMPLATES
| Structure | Data | Reasoning |
|---|---|---|
| 13.48% | 31.93% | 54.59% |
| Structure | Data | Reasoning |
|---|---|---|
| 30.37% | 23.97% | 45.66% |
| Datasets | DVQA |
|---|
| Methods\Templates |
| Distribution |
| PlotQA |
|---|
Old Distribution
New Distribution
| Structure | Data | Reasoning |
|---|---|---|
| 13.48% | 31.93% | 54.59% |
| Structure | Data | Reasoning |
|---|---|---|
| 4.30% | 13.73% | 81.97% |
| Datasets | DVQA |
|---|
| Methods\Templates |
| Distribution |
| PlotQA |
|---|
PlotQA data distribution on TEST split
| Structure | Data | Reasoning |
|---|---|---|
| 82.13% | 15.02% | 14% |
| 17.8% | 84.98% | 85.91% |
| 0.00% | 0.00% | 0.00% |
| Yes/No |
|---|
| Fixed Vocab |
| OOV |
| Structure | Data | Reasoning |
|---|---|---|
| 37.59% | 20.85% | 24.18% |
| 62.4% | 56.3% | 29.89% |
| 0.00% | 22.84% | 45.92% |
| Datasets | DVQA | DIP |
|---|
| Answer \ Template |
| Structure | Data | Reasoning |
|---|---|---|
| 82.13% | 15.02% | 14% |
| 17.8% | 84.98% | 85.91% |
| 0.00% | 0.00% | 0.00% |
| Yes/No |
|---|
| Fixed Vocab |
| OOV |
| Structure | Data | Reasoning |
|---|---|---|
| 37.64% | 5.14% | 1.88% |
| 62.36% | 16.01% | 12.80% |
| 0.00% | 78.85% | 85.32% |
| Datasets | DVQA | DIP |
|---|
| Answer \ Template |
Old Distribution
New Distribution
PlotQA data distribution on TEST split
TEMPLATES & ANSWERS
*keeping top-1K answers in fixed vocab
Accuracy of different models (in %)
| Structure | Data | Reasoning |
|---|---|---|
| 94.01 | 95.35 | 66.02 |
| 77.30 | 32.06 | 29.27 |
| NA | 0.90 | 3.40 |
| 0.00 | 0.00 | 0.00 |
| 42.29 | 27.61 | 25.48 |
| NA | 32.00 | 15.44 |
| 94.01 | 95.35 | 66.02 |
| 81.66 | 40.60 | 35.74 |
| NA | 32.06 | 17.10 |
| Yes/No |
|---|
| Fixed Vocab |
| OOV |
| Yes/No |
| Fixed Vocab |
| OOV |
| Yes/No |
| Fixed Vocab |
| OOV |
| SAN |
|---|
| VOES |
|---|
| MaskRCNN | FrRCNN-FPN-RA |
|---|
| Answer \ Template |
| SAN- VOES |
|---|
Zooming in the accuracy of different models
* All accuracies are calculated with 5% threshold
| Structure | Data | Reasoning |
|---|---|---|
| 94.01 | 95.35 | 66.02 |
| 77.30 | 32.06 | 29.27 |
| NA | 0.90 | 3.40 |
| 0.00 | 0.00 | 0.00 |
| 42.72 | 29.71 | 31.19 |
| NA | 40.66 | 26.05 |
| 94.01 | 95.35 | 66.02 |
| 77.09 | 39.84 | 38.75 |
| NA | 40.66 | 26.05 |
| Model | MaskRCNN | FrRCNN-FPN-RA |
|---|---|---|
| SAN | 46.54 | 46.54 |
| VOES | 20.22 | 24.08 |
| SAN-VOES | 53.96 | 55.75 |
Overall Model Accuracy
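The accuracies above are computed with a 5% threshold. A minimal sketch of such a check, assuming the threshold is a relative tolerance applied to numeric answers and that non-numeric answers are compared by case-insensitive exact match; both the function names and these matching rules are assumptions for illustration, not the evaluation script itself.

```python
from typing import Sequence


def is_correct(pred: str, truth: str, rel_tol: float = 0.05) -> bool:
    """Numeric answers match within rel_tol of the ground truth; others must match exactly."""
    try:
        p, t = float(pred), float(truth)
    except ValueError:
        # At least one side is not a number: fall back to string comparison.
        return pred.strip().lower() == truth.strip().lower()
    if t == 0:
        return p == 0
    return abs(p - t) <= rel_tol * abs(t)


def accuracy(preds: Sequence[str], truths: Sequence[str], rel_tol: float = 0.05) -> float:
    """Accuracy in percent over paired predictions and ground-truth answers."""
    if len(preds) != len(truths):
        raise ValueError("preds and truths must have the same length")
    if not truths:
        return 0.0
    hits = sum(is_correct(p, t, rel_tol) for p, t in zip(preds, truths))
    return 100.0 * hits / len(truths)
```

Under these assumptions a prediction of 104 against a ground truth of 100 counts as correct, while 106 does not.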
Accuracy of different models (in %)
| Structure | Data | Reasoning |
|---|---|---|
| 91.12 | 97.32 | 62.75 |
| 66.85 | 30.76 | 16.03 |
| NA | 0.00 | 0.00 |
| 0.00 | 0.00 | 0.00 |
| 42.12 | 16.07 | 7.24 |
| NA | 57.39 | 14.95 |
| 91.12 | 97.32 | 62.75 |
| 66.86 | 22.64 | 7.95 |
| NA | 57.39 | 14.95 |
| Yes/No |
|---|
| Fixed Vocab |
| OOV |
| Yes/No |
| Fixed Vocab |
| OOV |
| Yes/No |
| Fixed Vocab |
| OOV |
| SAN |
|---|
| VOES |
|---|
| ~FrRCNN-FPN-RA (new dataset) | FrRCNN-FPN-RA (old dataset) |
|---|
| Answer \ Template |
| SAN- VOES |
|---|
Zooming in the accuracy of different models
* All accuracies are calculated with 5% threshold
| Structure | Data | Reasoning |
|---|---|---|
| 94.01 | 95.35 | 66.02 |
| 77.30 | 32.06 | 29.27 |
| NA | 0.90 | 3.40 |
| 0.00 | 0.00 | 0.00 |
| 42.72 | 29.71 | 31.19 |
| NA | 40.66 | 26.05 |
| 94.01 | 95.35 | 66.02 |
| 77.09 | 39.84 | 38.75 |
| NA | 40.66 | 26.05 |
| Model | FrRCNN-FPN-RA (new dataset) | FrRCNN-FPN-RA (old dataset) |
|---|---|---|
| SAN | 7.76 | 46.54 |
| VOES* | 18.46 (approx) | 24.08 |
| SAN-VOES | 22.52 | 55.75 |
Overall Model Accuracy (in %)

Look, Read, Reason and Answer




Q: Where did the last 1st place finish occur?
A: Thailand
IOU v/s mAP for different object detection models
[Plot; legend: FRCNN_SS, FrRCNN_RP, SSD, RetinaNet, FRCNN_FPN_SS, FrRCNN_RA, FrRCNN_FPN_RA, YoloV3]

mAP v/s Table F1 score for different object detection models
Column groups: Proposal Method (SS, RPN, Grid); Pooling Method (ROI-Pool, ROI-Align); Feature Extraction (CNN, FPN); Class Imbalance (OHEM, FL)
| Model | SS | RPN | Grid | ROI-Pool | ROI-Align | CNN | FPN | OHEM | FL | mAP |
|---|---|---|---|---|---|---|---|---|---|---|
| FRCNN | | | | | | R-50 | | | | 19.47% |
| FRCNN_FPN | | | | | | R-50 | | | | 58.58% |
| FrRCNN_RP | | | | | | R-50 | | | | 3.81% |
| FrRCNN_RA | | | | | | R-50 | | | | 51.42% |
| FrRCNN_FPN_RA | | | | | | R-50 | | | | 77.22% |
| Mask-RCNN | | | | | | R-50 | | | | 33.86% |
| YOLO-V3 | | | | N.A | N.A | Darknet | | | | 7.56% |
| RetinaNet | | | | N.A | N.A | R-50 | | | | 30.07% |
| SSD | | | | N.A | N.A | I-Net | | | | 1.45% |
| Model | CV | ROI-Align | CNN | FPN | Linking | AvgPool | mAP |
|---|---|---|---|---|---|---|---|
| PlotQA_R10 | | | R-10 | | | | 80.68% |
| PlotQA_R10 | | | R-10 | | | A0 | 68.18% |
| PlotQA_R10 | | | R-10 | | | A1 | 72.87% |
| PlotQA_R10 | | | R-10 | | | A2 | 79.80% |
| PlotQA_R10 | | | R-10 | | | A3 | 80.42% |
| PlotQA_R22_FPN | | | R-22 | | | | 83.89% |
| PlotQA_R22_FPN | | | R-22 | | | A0 | 71.14% |
| PlotQA_R22_FPN | | | R-22 | | | A1 | 73.23% |
| PlotQA_R22_FPN | | | R-22 | | | A2 | 74.51% |
| PlotQA_R22_FPN | | | R-22 | | | A3 | 81.61% |
| PlotQA_R50 | | | R-50 | | | | 59.28% |
| PlotQA_R50_FPN | | | R-50 | | | | 74.08% |
*We are not doing anything for class-Imbalance
| | IoU@0.5 | IoU@0.75 | IoU@0.9 |
|---|---|---|---|
| FRCNN | | | |
| FRCNN_FPN | | | |
| FrRCNN_RP | | | |
| FrRCNN_RA | | | |
| FrRCNN_FPN_RA | | | |
| Mask-RCNN | | | |
| YOLO-V3 | | | |
| RetinaNet | | | |
| SSD | | | |
Table Accuracy
Experiments
R50
R-
clipped
with FPN
without FPN
AvgPool
~AvgPool
~AvgPool
mAP
#FLOPs
#Params
Time
Table-F1
mAP
#FLOPs
#Params
Time
Table-F1
74.08%
119.38G
56.72M
59.28%
113.02G
233.2M
83.89%
96.14G
131.82M
80.68%
35.09G
130.44M
0.026s
0.032s
-
-
-
-
-
-
-
-
-
-
0.055s
-
-
0.040s
AvgPool
17.65G
42.36M
0.040s
61.28G
43.74M
0.055s
80.42%
51.42%
1.29T
35.9M
77.22%
283.85G
43.81M
0.374s
Fr-
RCNN
AvgPool
(i) mAP is reported at 0.9 IOU
(ii) Table F1-score is calculated using 5% relaxation for numeric values
| | Selective Search | RPN+FPN | CV | CV+FPN |
|---|---|---|---|---|
| #Proposals | ~2000 | ~225120 | ~90 | ~90*#Levels |
82.92%
0.57
0.70
0.72
Table Accuracy at 5% relaxation for numeric values
R-50_FPN
R-22_FPN
R-10
FrRCNN_FPN_RA
R-50
| IOU | Precision | Recall | F1-score |
|---|---|---|---|
| 0.5 | | | |
| 0.75 | | | |
| 0.9 | | | |

| IOU | Precision | Recall | F1-score |
|---|---|---|---|
| 0.5 | | | |
| 0.75 | | | |
| 0.9 | | | |
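The Precision / Recall / F1-score grids above are still empty. A minimal sketch of how each row could be filled, assuming that at a given IoU threshold we already have counts of matched predictions (true positives), unmatched predictions (false positives) and unmatched ground-truth boxes (false negatives); the function name and the zero-division convention are assumptions for illustration.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from match counts; undefined ratios are reported as 0."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Running it once per threshold (0.5, 0.75, 0.9) with the corresponding counts gives the three rows of each table.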
PlotQA
Experiments
R50
R-
clipped
with FPN
without FPN
AvgPool
AvgPool-A0
~AvgPool
~AvgPool
mAP
#FLOPs
#Params
Time
Table-F1
mAP
#FLOPs
#Params
Time
Table-F1
74.08%
119.38G
56.72M
59.28%
113.02G
233.2M
83.89%
96.14G
131.82M
80.68%
35.09G
130.44M
0.026s
0.032s
54.62G
26.943M
0.024s
-
-
-
-
-
-
-
-
-
-
0.025s
71.14%
-
-
0.016s
14.33G
25.57M
0.016s
68.18%
AvgPool-A1
AvgPool-A2
AvgPool-A3
16.19G
35.02M
0.016s
17.65G
42.36M
0.016s
14.85G
28.21M
0.016s
55.66G
29.583M
0.025s
58.37G
36.40M
0.025s
61.28G
43.74M
0.025s
73.23%
74.51%
72.87%
79.80%
80.42%
51.42%
1.29T
35.9M
77.22%
283.85G
43.81M
0.374s
Fr-
RCNN
AvgPool
(i) mAP is reported at 0.9 IOU
(ii) Table F1-score is calculated using 5% relaxation for numeric values
| | Selective Search | RPN+FPN | CV | CV+FPN |
|---|---|---|---|---|
| #Proposals | ~2000 | ~225120 | ~90 | ~90*#Levels |
82.92%
0.57
0.70
0.72
Analysis of different object detection models (PlotQA)
| Model | IOU | mAP | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R-50_FPN | 0.5 | 97.67% | 95.28% | 90.55% | 98.11% | - | 98.64% | 99.69% | 97.49% | 99.70% | 99.82% | 99.77% |
| R-50_FPN | 0.75 | 94.09% | 94.98% | 63.22% | 97.5% | - | 98.17% | 99.61% | 95.92% | 99.46% | 99.76% | 98.22% |
| R-50_FPN | 0.9 | 74.08% | 90.77% | 5.12% | 95.58% | - | 80.72% | 99.16% | 94.79% | 76.83% | 65.56% | 58.17% |
| R-22_FPN | 0.5 | 98.11% | 95.33% | 91.60% | 98.96% | - | 99.27% | 99.76% | 99.02% | 99.77% | 99.69% | 99.57% |
| R-22_FPN | 0.75 | 97.21% | 95.04% | 86.46% | 98.64% | - | 99.08% | 99.73% | 97.30% | 99.59% | 99.63% | 99.39% |
| R-22_FPN | 0.9 | 83.89% | 91.02% | 31.69% | 97.08% | - | 81.57% | 99.36% | 96.06% | 85.33% | 82.00% | 90.95% |
| R-10 | 0.5 | 97.57% | 95.32% | 90.65% | 97.83% | - | 98.31% | 99.66% | 98.19% | 99.78% | 99.02% | 99.40% |
| R-10 | 0.75 | 96.74% | 94.97% | 87.13% | 97.42% | - | 97.87% | 99.11% | 96.38% | 99.68% | 99.02% | 99.04% |
| R-10 | 0.9 | 80.78% | 89.46% | 37.63% | 93.64% | - | 78.12% | 95.22% | 94.06% | 88.10% | 66.95% | 83.86% |
| R-50 | 0.5 | 97.67% | 93.97% | 89.42% | 98.80% | - | 98.84% | 99.85% | 99.38% | 99.47% | 99.54% | 99.76% |
| R-50 | 0.75 | 93.91% | 93.66% | 71.68% | 97.85% | - | 94.72% | 99.80% | 97.46% | 98.58% | 97.46% | 93.97% |
| R-50 | 0.9 | 59.28% | 87.64% | 15.72% | 74.57% | - | 41.87% | 98.92% | 81.60% | 54.21% | 43.35% | 35.67% |
| R-22_FPN (avgpool) | 0.5 | 97.88% | 95.37% | 91.18% | 99.09% | - | 97.67% | 99.80% | 98.96% | 99.73% | 99.41% | 99.70% |
| R-22_FPN (avgpool) | 0.75 | 96.87% | 95.14% | 85.08% | 98.81% | - | 97.46% | 99.71% | 97.22% | 99.66% | 99.33% | 99.43% |
| R-22_FPN (avgpool) | 0.9 | 82.92% | 91.13% | 32.03% | 97.26% | - | 85.15% | 99.32% | 95.43% | 87.42% | 83.50% | 75.02% |
| R-10 (avgpool) | 0.5 | 97.58% | 95.23% | 91.22% | 98.04% | - | 99.34% | 99.73% | 95.39% | 99.76% | 99.78% | 99.70% |
| R-10 (avgpool) | 0.75 | 96.63% | 94.89% | 87.25% | 97.56% | - | 98.79% | 99.58% | 92.86% | 99.65% | 99.75% | 99.29% |
| R-10 (avgpool) | 0.9 | 80.42% | 89.61% | 36.31% | 92.40% | - | 82.23% | 98.18% | 88.27% | 85.66% | 71.14% | 79.96% |
| FrRCNN_FPN_RA | 0.5 | 97.76% | 95.84% | 84.28% | 99.99% | - | 99.95% | 100.00% | 99.99% | 99.92% | 99.90% | 99.99% |
| FrRCNN_FPN_RA | 0.75 | 94.58% | 94.30% | 78.59% | 99.96% | - | 99.94% | 78.83% | 99.99% | 99.74% | 99.90% | 99.97% |
| FrRCNN_FPN_RA | 0.9 | 77.22% | 85.54% | 27.86% | 93.68% | - | 96.30% | 0.22% | 99.09% | 96.04% | 99.46% | 96.80% |
| FRCNN_FPN_SS | 0.5 | 71.21% | 92.05% | 79.30% | 89.55% | 45.42% | 68.00% | 18.51% | 80.07% | 93.07% | 56.32% | 89.76% |
| FRCNN_FPN_SS | 0.75 | 68.49% | 91.59% | 77.72% | 89.15% | 35.24% | 67.96% | 7.89% | 77.59% | 92.93% | 55.06% | 89.74% |
| FRCNN_FPN_SS | 0.9 | 58.58% | 87.59% | 42.77% | 79.05% | 20.46% | 66.39% | 0.22% | 69.78% | 88.29% | 46.63% | 84.60% |
| FRCNN_CV | 0.5 | 87.02% | 92.45% | 93.56% | 99.92% | 0.00% | 98.05% | 88.88% | 98.76% | 99.67% | 99.15% | 99.71% |
| FRCNN_CV | 0.75 | 76.46% | 91.23% | 78.60% | 97.87% | 0.00% | 97.46% | 29.13% | 87.96% | 98.37% | 84.87% | 99.13% |
| FRCNN_CV | 0.9 | 36.26% | 55.73% | 0.73% | 34.91% | 0.00% | 62.27% | 0.87% | 49.02% | 63.99% | 23.60% | 71.44% |
| Mask-RCNN | 0.5 | 93.72% | 91.24% | 95.05% | 99.77% | 51.83% | 99.87% | 99.91% | 99.94% | 99.75% | 99.97% | 99.84% |
| Mask-RCNN | 0.75 | 82.45% | 81.99% | 79.55% | 99.34% | 30.93% | 95.70% | 49.36% | 97.75% | 95.59% | 99.42% | 94.85% |
| Mask-RCNN | 0.9 | 33.86% | 47.54% | 4.96% | 50.83% | 5.83% | 32.43% | 0.33% | 46.2% | 33.72% | 80.53% | 36.31% |
| YOLO-V3 | 0.5 | 94.71% | 87.95% | 94.13% | 99.95% | 66.25% | 99.94% | 99.58% | 99.99% | 99.89% | 99.77% | 99.64% |
| YOLO-V3 | 0.75 | 72.08% | 76.92% | 71.67% | 92.13% | 50.17% | 94.44% | 17.10% | 90.59% | 84.17% | 80.90% | 62.68% |
| YOLO-V3 | 0.9 | 7.56% | 15.51% | 10.90% | 7.15% | 6.54% | 11.70% | 0.02% | 4.39% | 8.08% | 9.59% | 1.70% |
| SSD | 0.5 | 80.18% | 61.09% | 74.27% | 71.15% | 47.41% | 78.61% | 99.80% | 99.95% | 93.46% | 97.35% | 78.72% |
| SSD | 0.75 | 37.89% | 24.92% | 18.21% | 44.27% | 22.71% | 30.29% | 45.05% | 75.62% | 34.81% | 66.07% | 16.93% |
| SSD | 0.9 | 1.45% | 1.39% | 0.04% | 2.18% | 1.25% | 0.39% | 0.04% | 3.39% | 0.44% | 5.14% | 0.20% |
| Retina-Net | 0.5 | 46.24% | 27.76% | 0.00% | 91.43% | 28.62% | 99.42% | 0.00% | 66.77% | 67.41% | 38.00% | 43.05% |
| Retina-Net | 0.75 | 44.79% | 24.37% | 0.00% | 88.62% | 22.49% | 99.03% | 0.00% | 66.20% | 66.77% | 37.60% | 42.83% |
| Retina-Net | 0.9 | 30.07% | 14.67% | 0.00% | 65.33% | 7.43% | 64.38% | 0.00% | 57.42% | 99.73% | 31.77% | 29.97% |
| mAP |
|---|
Analysis of different object detection models
Our Model
| 0.5 |
| 0.75 |
| 0.9 |
FrRCNN_FPN_RA
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | 95.84% | 84.28% | 99.99% | - | 99.95% | 100.00% | 99.99% | 99.92% | 99.90% | 99.99% | 97.76% |
| 0.75 | 94.30% | 78.59% | 99.96% | - | 99.94% | 78.83% | 99.99% | 99.74% | 99.90% | 99.97% | 94.58% |
| 0.9 | 85.54% | 27.86% | 93.68% | - | 96.30% | 0.22% | 99.09% | 96.04% | 99.46% | 96.80% | 77.22% |
Analysis of different object detection models (DVQA)
FrRCNN_FPN_RA
| IoU | bar | dotline | legend-label | line | heading | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | 98.28% | - | 99.99% | - | 99.95% | 99.99% | 100% | 99.83% | 100% | 99.82% | 99.73% |
| 0.75 | 95.80% | - | 99.92% | - | 99.89% | 99.99% | 100% | 99.66% | 100% | 99.72% | 99.37% |
| 0.9 | 82.10% | - | 96.52% | - | 98.17% | 99.24% | 98.79% | 96.00% | 98.51% | 98.33% | 95.96% |




Analysis of different object detection models
| Model ID | Backbone CNN | Proposal Method | Additional information | Object Detection Model |
|---|---|---|---|---|
| FRCNN_CV | ResNet50 | CV method | 2-stage detector, RoiAlign | Fast RCNN |
| FRCNN_SS | ResNet50 | Selective Search | 2-stage detector, RoiAlign | Fast RCNN |
| FRCNN_FPN_SS | ResNet50-FPN | Selective Search | 2-stage detector | Fast RCNN |
| FrRCNN_RA | ResNet50 | RPN | 2-stage detector, RoiAlign | Faster RCNN |
| FrRCNN_FPN_RA | ResNet50-FPN | RPN | 2-stage detector, RoiAlign | Faster RCNN |
| FrRCNN_RP | ResNet50 | RPN | 2-stage detector, RoiPool | Faster RCNN |
| RetinaNet | ResNet50-FPN | fixed anchor | 1-stage detector | RetinaNet |
| MaskRCNN | ResNet50 | RPN+segmentation | 2-stage detector | Mask RCNN |
| SSD | InceptionNet | fixed anchor | 1-stage detector | SSD |
| YoloV3 | DarkNet-106 layers | k-means+anchor | 1-stage detector | YoloV3 |
| Backbone | Segmentation Procedure | Model | Accuracy |
|---|---|---|---|
| SegNet | | | |
| PixelLink | | | |
Model descriptions and Model_IDs
Analysis of different object detection models
Results are shown on the Test split at an IoU threshold of 0.5.
| Class | FRCNN_CV | FRCNN_SS | FRCNN_FPN_SS | FrRCNN_RP | FrRCNN_RA | FrRCNN_FPN_RA | Mask-RCNN | SSD | YoloV3 | RetinaNet ( FL ) | RetinaNet ( CE ) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bar | 92.45% | 90.71% | 92.05% | 86.72% | 93.05% | 95.84% | 91.24% | 61.09% | 87.95% | 27.76% | 0.00% |
| dotline | 93.57% | 82.4% | 79.30% | 73.28% | 95.46% | 96.30% | 95.05% | 74.27% | 94.13% | 0.00% | 0.00% |
| legend-label | 99.92% | 90.23% | 89.55% | 97.68% | 99.99% | 99.99% | 99.77% | 71.15% | 99.95% | 91.43% | 0.00% |
| line | 0.00% | 37.00% | 45.42% | 56.64% | 62.83% | 72.25% | 51.83% | 47.41% | 66.25% | 28.62% | 0.00% |
| preview | 98.05% | 74.78% | 68.00% | 92.69% | 99.99% | 99.95% | 99.87% | 78.61% | 99.94% | 99.42% | 0.00% |
| title | 88.88% | 17.59% | 18.51% | 83.15% | 99.99% | 100% | 99.91% | 99.80% | 99.58% | 0.00% | 0.00% |
| xlabel | 98.76% | 78.59% | 80.07% | 95.99% | 99.99% | 99.99% | 99.94% | 99.95% | 99.99% | 66.77% | 0.00% |
| xticklabel | 99.67% | 93.22% | 93.07% | 93.28% | 99.88% | 99.92% | 99.75% | 93.46% | 99.89% | 67.41% | 0.00% |
| ylabel | 99.15% | 59.12% | 56.32% | 97.86% | 99.90% | 99.90% | 99.97% | 97.35% | 99.77% | 38.00% | 0.00% |
| yticklabel | 99.71% | 91.54% | 89.76% | 84.08% | 99.87% | 99.99% | 99.84% | 78.72% | 99.64% | 43.05% | 0.00% |
| mAP | 87.02% | 71.52% | 71.21% | 86.14% | 95.09% | 96.43% | 93.72% | 80.18% | 94.71% | 46.24% | 0.00% |
Trained at IOU@0.1
FRCNN_CV
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 92.72% | 94.69% | 99.95% | 0.03% | 98.16% | 91.00% | 98.86% | 99.71% | 99.29% | 99.78% | 87.42% |
| 0.5 | 92.45% | 93.56% | 99.92% | 0.00% | 98.05% | 88.88% | 98.76% | 99.67% | 99.15% | 99.71% | 87.02% |
| 0.75 | 91.23% | 78.60% | 97.87% | 0.00% | 97.46% | 29.13% | 87.96% | 98.37% | 84.87% | 99.13% | 76.46% |
| 0.9 | 55.73% | 0.73% | 34.91% | 0.00% | 62.27% | 0.87% | 49.02% | 63.99% | 23.60% | 71.44% | 36.26% |
| 1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
FRCNN_SS
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 91.61% | 83.75% | 90.83% | 55.78% | 82.81% | 22.55% | 82.54% | 93.25% | 69.75% | 91.61% | 76.45% |
| 0.5 | 90.71% | 82.40% | 90.23% | 37.00% | 74.78% | 17.59% | 78.59% | 93.22% | 59.12% | 91.54% | 71.52% |
| 0.75 | 85.20% | 58.28% | 82.33% | 21.39% | 65.27% | 2.21% | 71.95% | 90.04% | 47.9% | 88.38% | 61.29% |
| 0.9 | 53.38% | 0.22% | 12.59% | 3.13% | 14.06% | 0.03% | 42.13% | 25.49% | 11.68% | 31.98% | 19.47% |
| 1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
FRCNN_FPN_SS
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 92.47% | 79.33% | 89.59% | 59.53% | 68.07% | 19.25% | 80.70% | 93.08% | 56.53% | 89.78% | 72.83% |
| 0.5 | 92.05% | 79.30% | 89.55% | 45.42% | 68.00% | 18.51% | 80.07% | 93.07% | 56.32% | 89.76% | 71.21% |
| 0.75 | 91.59% | 77.72% | 89.15% | 35.24% | 67.96% | 7.89% | 77.59% | 92.93% | 55.06% | 89.74% | 68.49% |
| 0.9 | 87.59% | 42.77% | 79.05% | 20.46% | 66.39% | 0.22% | 69.78% | 88.29% | 46.63% | 84.60% | 58.58% |
| 1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
FrRCNN_RP
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 88.96% | 76.48% | 97.77% | 67.77% | 92.98% | 88.16% | 96.23% | 93.81% | 98.1% | 84.58% | 88.48% |
| 0.5 | 86.72% | 73.28% | 97.68% | 56.64% | 92.69% | 83.15% | 95.99% | 93.28% | 97.86% | 84.08% | 86.14% |
| 0.75 | 59.06% | 36.74% | 73.85% | 32.13% | 38.80% | 4.51% | 60.18% | 59.10% | 74.26% | 50.41% | 48.90% |
| 0.9 | 6.92% | 1.68% | 7.78% | 1.10% | 1.45% | 0.00% | 4.35% | 6.10% | 3.57% | 5.18% | 3.81% |
| 1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
FrRCNN_RA
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 95.13% | 96.22% | 99.99% | 75.76% | 99.99% | 99.99% | 99.99% | 99.92% | 99.90% | 99.92% | 96.68% |
| 0.5 | 93.05% | 95.46% | 99.99% | 62.83% | 99.99% | 99.99% | 99.99% | 99.88% | 99.90% | 99.87% | 95.09% |
| 0.75 | 86.92% | 87.32% | 99.77% | 44.55% | 99.59% | 55.02% | 99.90% | 98.08% | 99.84% | 97.27% | 86.83% |
| 0.9 | 63.89% | 12.82% | 70.95% | 16.76% | 60.61% | 0.18% | 83.88% | 60.76% | 93.47% | 50.87% | 51.42% |
| 1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
FrRCNN_FPN_RA
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 96.35% | 96.51% | 99.99% | 82.72% | 99.95% | 100.00% | 100.00% | 99.94% | 99.90% | 99.99% | 97.53% |
| 0.5 | 95.84% | 96.30% | 99.99% | 72.25% | 99.95% | 100.00% | 99.99% | 99.92% | 99.90% | 99.99% | 96.43% |
| 0.75 | 94.30% | 95.14% | 99.96% | 62.04% | 99.94% | 78.83% | 99.99% | 99.74% | 99.90% | 99.97% | 92.98% |
| 0.9 | 85.54% | 18.07% | 93.68% | 37.65% | 96.30% | 0.22% | 99.09% | 96.04% | 99.46% | 96.80% | 72.29% |
| 1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
RetinaNet
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 32.69% | 0.00% | 93.24% | 31.13% | 99.42% | 0.00% | 69.45% | 67.46% | 38.05% | 43.12% | 47.46% |
| 0.5 | 27.76% | 0.00% | 91.43% | 28.62% | 99.42% | 0.00% | 66.77% | 67.41% | 38.00% | 43.05% | 46.24% |
| 0.75 | 24.37% | 0.00% | 88.62% | 22.49% | 99.03% | 0.00% | 66.20% | 66.77% | 37.60% | 42.83% | 44.79% |
| 0.9 | 14.67% | 0.00% | 65.33% | 7.43% | 64.38% | 0.00% | 57.42% | 29.73% | 31.77% | 29.97% | 30.07% |
| 1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
SSD
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 84.75% | 90.12% | 78.79% | 64.84% | 86.09% | 99.94% | 99.99% | 99.76% | 99.75% | 97.29% | 90.13% |
| 0.5 | 61.09% | 74.27% | 71.15% | 47.41% | 78.61% | 99.80% | 99.95% | 93.46% | 97.35% | 78.72% | 80.18% |
| 0.75 | 24.92% | 18.21% | 44.27% | 22.71% | 30.29% | 45.05% | 75.62% | 34.81% | 66.07% | 16.93% | 37.89% |
| 0.9 | 1.39% | 0.04% | 2.18% | 1.25% | 0.39% | 0.04% | 3.39% | 0.44% | 5.14% | 0.20% | 1.45% |
| 1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
MaskRCNN
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 93.76% | 96.32% | 99.77% | 65.57% | 99.91% | 99.94% | 99.95% | 99.84% | 99.97% | 99.91% | 95.50% |
| 0.5 | 91.24% | 95.05% | 99.77% | 51.83% | 99.87% | 99.91% | 99.94% | 99.75% | 99.97% | 99.84% | 93.72% |
| 0.75 | 81.99% | 79.55% | 99.34% | 30.93% | 95.70% | 49.36% | 97.75% | 95.59% | 99.42% | 94.85% | 82.45% |
| 0.9 | 47.54% | 4.96% | 50.83% | 5.83% | 32.43% | 0.33% | 46.2% | 33.72% | 80.53% | 36.31% | 33.86% |
| 1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
YoloV3
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 93.04% | 94.39% | 99.96% | 69.57% | 99.97% | 96.61% | 99.99% | 99.94% | 99.80% | 99.99% | 95.63% |
| 0.5 | 87.95% | 94.13% | 99.95% | 66.25% | 99.94% | 99.58% | 99.99% | 99.89% | 99.77% | 99.64% | 94.71% |
| 0.75 | 76.92% | 71.67% | 92.13% | 50.17% | 94.44% | 17.10% | 90.59% | 84.17% | 80.90% | 62.68% | 72.08% |
| 0.9 | 15.51% | 10.90% | 7.15% | 6.54% | 11.70% | 0.02% | 4.39% | 8.08% | 9.59% | 1.70% | 7.56% |
| 1.0 | 0.00% | 0.92% | 0.00% | 0.00% | 0.40% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.10% |
Demystifying mAP
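if IOU(pred, gt) >= threshold:
    if pred_label == gt_label:
        TP += 1  # correct detection
    else:
        FN += 1  # overlapping box with the wrong label: the ground truth goes undetected
else:
    FP += 1  # not enough overlap with the ground truth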
TP, FP, FN
- True Positive (TP): A correct detection. Detection with IOU ≥ threshold
- False Positive (FP): A wrong detection. Detection with IOU < threshold
- False Negative (FN): A ground truth not detected
- True Negative (TN): Does not apply. It would represent a correctly rejected detection. In object detection there are many possible bounding boxes that should not be detected within an image, so TN would count every possible box that was correctly not detected. That is why the metrics do not use it.
Source: https://blog.objectivity.co.uk/comparing-object-detection-models/
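The definitions above can be sketched as a small counting routine. This is a minimal sketch under stated assumptions, not the evaluation code used for the tables: it handles one class in one image, processes detections in descending confidence order, and lets each ground-truth box be matched at most once (a common convention). The names `iou` and `count_detections` are hypothetical.

```python
def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes using continuous coordinates."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def count_detections(preds, gts, threshold=0.5):
    """preds: list of (confidence, box) for one class; gts: list of boxes.

    Returns (TP, FP, FN) at the given IoU threshold.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = fp = 0
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j is not None and best_iou >= threshold:
            matched.add(best_j)
            tp += 1
        else:
            fp += 1
    fn = len(gts) - len(matched)  # ground truths nobody detected
    return tp, fp, fn


# Made-up example: one good detection, one duplicate, one stray box.
gts = [[0, 0, 10, 10], [20, 20, 30, 30]]
preds = [(0.9, [0, 0, 10, 9]), (0.8, [0, 0, 10, 10]), (0.7, [50, 50, 60, 60])]
tp, fp, fn = count_detections(preds, gts, threshold=0.5)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn)  # 1 2 1
```

Precision is TP / (TP + FP) and recall is TP / (TP + FN); sweeping the confidence cut-off produces the precision-recall curve that AP summarises.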
Demystifying mAP

There are 7 images with 15 ground-truth objects represented by the green bounding boxes and 24 detected objects represented by the red bounding boxes. Each detected object has a confidence level and is identified by a letter (A, B,..., Y).





Precision-Recall Curve

11-point interpolation




Interpolated Precision
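A minimal sketch of 11-point interpolation, assuming the usual definition: at each recall level r in {0, 0.1, ..., 1.0}, take the maximum precision observed at any recall greater than or equal to r, then average the 11 values. The function name and the toy precision-recall points are made up for illustration and are not the 7-image example from the slide.

```python
def ap_11_point(recalls, precisions):
    """11-point interpolated AP from paired recall/precision measurements."""
    total = 0.0
    for i in range(11):
        r = i / 10
        # Interpolated precision: best precision at any recall >= r.
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
        total += max(candidates) if candidates else 0.0
    return total / 11


# Toy curve: precision/recall after each of four ranked detections.
toy_recalls = [0.2, 0.4, 0.4, 0.6]
toy_precisions = [1.0, 1.0, 0.67, 0.75]
toy_ap = ap_11_point(toy_recalls, toy_precisions)
print(round(toy_ap, 4))
```

For the toy curve the interpolated precision is 1.0 for r up to 0.4, 0.75 for r = 0.5 and 0.6, and 0 beyond the highest recall reached, so the AP is 6.5 / 11.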
| Class | Precision | Recall |
|---|---|---|
| bar | 0.8 | 0.89 |
| title | 0.04 | 0.05 |
| xlabel | 0.98 | 0.99 |
| ylabel | 0.99 | 1.0 |
| xticklabel | 0.97 | 0.97 |
| yticklabel | 0.98 | 0.98 |
| legend-label | 0.95 | 0.97 |
| preview | 0.97 | 0.97 |
| dot-line | 0.42 | 0.41 |
| line | 0.26 | 0.49 |
FrRCNN_FPN_RA
| Class | Precision | Recall |
|---|---|---|
| bar | 0.94 | 0.91 |
| title | 0.56 | 0.63 |
| xlabel | 0.82 | 0.88 |
| ylabel | 0.65 | 0.68 |
| xticklabel | 0.42 | 0.42 |
| yticklabel | 0.36 | 0.23 |
| legend-label | 0.87 | 0.84 |
| preview | 0.82 | 0.79 |
| dot-line | 0.38 | 0.24 |
| line | 0.35 | 0.26 |
PlotQA_R22_FPN
Sensitivity of IOU
IOU@0.9
yticklabel:
gt = [26.0, 551.0, 52.44, 564]
pred = [27.03, 550.9, 51.4, 564.8]
IOU = 87.26%
bar:
gt = [62, 542.16, 1043, 574]
pred = [54.44, 542.16, 1064.44, 574.13]
IOU = 96.74%
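The two examples can be checked with a short IoU routine. This is a sketch under one assumption: widths and heights use the inclusive-pixel (+1) convention, chosen here because it comes out close to the quoted 87.26% and 96.74%. The function name is hypothetical.

```python
def iou_inclusive(a, b):
    """IoU of [x1, y1, x2, y2] boxes with the inclusive-pixel (+1) convention."""
    def area(box):
        return (box[2] - box[0] + 1) * (box[3] - box[1] + 1)

    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (area(a) + area(b) - inter)


# Small text box: the prediction is off by about one pixel per side.
ytick_gt = [26.0, 551.0, 52.44, 564]
ytick_pred = [27.03, 550.9, 51.4, 564.8]
# Wide bar: the prediction is off by up to about 21 pixels horizontally.
bar_gt = [62, 542.16, 1043, 574]
bar_pred = [54.44, 542.16, 1064.44, 574.13]

ytick_iou = iou_inclusive(ytick_gt, ytick_pred)  # about 0.873
bar_iou = iou_inclusive(bar_gt, bar_pred)        # about 0.967
print(ytick_iou >= 0.9, bar_iou >= 0.9)
```

The small tick label fails the 0.9 threshold despite a roughly one-pixel error, while the wide bar passes with a much larger absolute error, which is the sensitivity the slide points out.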
Our Proposed Model Output
Panels: Original Image, CV Proposals, Regression Targets, Model Output, After Postprocessing
Errors made by our model






Pre-processing
Ground-truth class and target offsets for RoIs
Finding neighbours and assigning links between them
Pre-processing



Observations
- 2-stage detectors work better than 1-stage detectors.
- The existing proposal methods cause many unnecessary computations.
- The anchors are independent of the image.
- The way in which the target classes are assigned to the anchors can confuse the network.
- The regression targets (offsets) of the anchors are huge.
- In NMS, discard the proposals based on both localization and classification score rather than the classification score alone.
- IOU is not a good metric.
- mAP seems to be a misleading metric.
Observations 8
Observations








Model 1
Model 2
| Model | # Proposals | # Foreground | # Background | # Neutral |
|---|---|---|---|---|
| FrRCNN_FPN_RA | 225120 | 1838 | 222315 | 967 |
| RetinaNet (FL) | 225120 | 1914 | 224867 | 1661 |
For RetinaNet, class ids are directly assigned to the proposals
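The foreground / background / neutral split in the table can be illustrated with a sketch of IoU-based anchor labelling. The thresholds used for the table are not stated; the 0.7 and 0.3 defaults below are the values commonly used for RPN training and are an assumption here, as are the function names.

```python
from collections import Counter


def label_anchor(max_iou, fg_thresh=0.7, bg_thresh=0.3):
    """Label one anchor from its best IoU with any ground-truth box."""
    if max_iou >= fg_thresh:
        return "foreground"
    if max_iou < bg_thresh:
        return "background"
    return "neutral"  # ignored when computing the training loss


def count_labels(max_ious, fg_thresh=0.7, bg_thresh=0.3):
    """Count how many anchors fall into each category."""
    return Counter(label_anchor(v, fg_thresh, bg_thresh) for v in max_ious)


# Made-up best-IoU values for five anchors.
counts = count_labels([0.9, 0.5, 0.1, 0.75, 0.0])
print(dict(counts))
```

Because anchors are laid out on a fixed grid regardless of image content, almost all of them end up as background, which matches the heavy imbalance in the table.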

Observations 1 to 4
Observations

Foreground Proposals for FrRCNN_FPN_RA
Foreground Proposals for RetinaNet_FL


Background Proposals for RetinaNet_FL
Background Proposals for FrRCNN_FPN_RA


Observation 5: The regression targets (offsets) of the anchors are significantly large
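To see why the offsets grow large, here is a sketch of the standard anchor-to-box parameterisation used by the R-CNN family (centre offsets normalised by anchor size, log-ratios for width and height). The anchor below is hypothetical; the ground-truth box is the yticklabel example used elsewhere in these notes.

```python
import math


def to_center(box):
    """Convert [x1, y1, x2, y2] to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + w / 2, y1 + h / 2, w, h


def regression_targets(anchor, gt):
    """Return (tx, ty, tw, th) that map the anchor onto the ground-truth box."""
    xa, ya, wa, ha = to_center(anchor)
    x, y, w, h = to_center(gt)
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))


anchor = [0, 520, 128, 648]            # hypothetical 128 x 128 anchor
ytick_gt = [26.0, 551.0, 52.44, 564]   # small, thin tick label
tx, ty, tw, th = regression_targets(anchor, ytick_gt)
print(round(tx, 3), round(ty, 3), round(tw, 3), round(th, 3))
```

For a thin text box inside a square anchor the log-scale terms come out around -1.6 and -2.3, far from the near-zero targets a well-fitting proposal would give.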


Misalignment between classification confidence and localization accuracy
Non-monotonic localization in iterative bounding box regression
Observation 6: In NMS, discard proposals based on both the localization and the classification score rather than on the classification score alone
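One way to act on this observation is to rank boxes in NMS by a combined score. The sketch below multiplies the classification score by a localization confidence; both the product rule and the `loc_score` field (for example a predicted IoU) are assumptions made for illustration, not the method used in these experiments.

```python
def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def nms_combined(dets, iou_thresh=0.5):
    """Greedy NMS ranked by cls_score * loc_score instead of cls_score alone."""
    ranked = sorted(dets, key=lambda d: d["cls_score"] * d["loc_score"], reverse=True)
    kept = []
    for det in ranked:
        # Keep a box only if it does not overlap a better-ranked kept box.
        if all(iou(det["box"], k["box"]) <= iou_thresh for k in kept):
            kept.append(det)
    return kept


# Made-up pair: A is more confident about the class, B is better localized.
dets = [
    {"name": "A", "box": [0, 0, 10, 10], "cls_score": 0.95, "loc_score": 0.6},
    {"name": "B", "box": [1, 0, 11, 10], "cls_score": 0.85, "loc_score": 0.9},
]
kept_names = [d["name"] for d in nms_combined(dets)]
print(kept_names)  # ['B']
```

Classification-only NMS would keep A and suppress the better-localized B; the combined ranking reverses that choice.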

FrRCNN_FPN_RA: model output at IOU=0.1, 0.5, 0.75, 0.9
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 96.35% | 96.51% | 99.99% | 82.72% | 99.95% | 100.00% | 100.00% | 99.94% | 99.90% | 99.99% | 97.53% |
| 0.5 | 95.84% | 96.30% | 99.99% | 72.25% | 99.95% | 100.00% | 99.99% | 99.92% | 99.90% | 99.99% | 96.43% |
| 0.75 | 94.30% | 95.14% | 99.96% | 62.04% | 99.94% | 78.83% | 99.99% | 99.74% | 99.90% | 99.97% | 92.98% |
| 0.9 | 85.54% | 18.07% | 93.68% | 37.65% | 96.30% | 0.22% | 99.09% | 96.04% | 99.46% | 96.80% | 72.29% |












YoloV3: model output at IOU=0.1, 0.5, 0.75, 0.9
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 93.04% | 94.39% | 99.96% | 69.57% | 99.97% | 96.61% | 99.99% | 99.94% | 99.80% | 99.99% | 95.63% |
| 0.5 | 87.95% | 94.13% | 99.95% | 66.25% | 99.94% | 99.58% | 99.99% | 99.89% | 99.77% | 99.64% | 94.71% |
| 0.75 | 76.92% | 71.67% | 92.13% | 50.17% | 94.44% | 17.10% | 90.59% | 84.17% | 80.90% | 62.68% | 72.08% |
| 0.9 | 15.51% | 10.90% | 7.15% | 6.54% | 11.70% | 0.02% | 4.39% | 8.08% | 9.59% | 1.70% | 7.56% |















MaskRCNN: model output at IOU=0.1, 0.5, 0.75, 0.9
| IoU | bar | dotline | legend-label | line | preview | title | xlabel | xticklabel | ylabel | yticklabel | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 93.76% | 96.32% | 99.77% | 65.57% | 99.91% | 99.94% | 99.95% | 99.84% | 99.97% | 99.91% | 95.50% |
| 0.5 | 91.24% | 95.05% | 99.77% | 51.83% | 99.87% | 99.91% | 99.94% | 99.75% | 99.97% | 99.84% | 93.72% |
| 0.75 | 81.99% | 79.55% | 99.34% | 30.93% | 95.70% | 49.36% | 97.75% | 95.59% | 99.42% | 94.85% | 82.45% |
| 0.9 | 47.54% | 4.96% | 50.83% | 5.83% | 32.43% | 0.33% | 46.2% | 33.72% | 80.53% | 36.31% | 33.86% |















Pre-processing
Adding CV-mask as 4th channel in the input image
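A minimal sketch of what "adding the CV-mask as a 4th channel" could look like. It assumes the image is an H x W grid of RGB pixels and the mask is an H x W grid of 0/1 values. Nested lists stand in for arrays here (a real pipeline would use an array library), and how the CV-mask itself is computed is not covered. `add_mask_channel` is a hypothetical helper name.

```python
def add_mask_channel(image, mask):
    """Append the mask value of each pixel as a 4th channel (RGB -> RGB+mask)."""
    if len(image) != len(mask):
        raise ValueError("image and mask must have the same height")
    out = []
    for img_row, mask_row in zip(image, mask):
        if len(img_row) != len(mask_row):
            raise ValueError("image and mask must have the same width")
        # Each output pixel is [R, G, B, mask].
        out.append([list(pixel) + [m] for pixel, m in zip(img_row, mask_row)])
    return out


# Tiny 2x2 example: the mask marks the top-left pixel only.
rgb = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (255, 255, 255)]]
cv_mask = [[1, 0],
           [0, 0]]
rgbm = add_mask_channel(rgb, cv_mask)
```

The detector's first convolution would then need four input channels instead of three.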






Converting Line to Dotline using CV methods





[Figure panels: Input Image, Detect xtickmarks, Harris Corner Detector, Markers, Gt-image]
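To make the Harris Corner Detector step concrete, here is a simplified sketch of the Harris response R = det(M) - k * trace(M)^2. It is not the pipeline used for these slides: it assumes central-difference gradients and a plain 3x3 box window (common implementations use Sobel gradients and a Gaussian window), and the value k = 0.04 and the relative threshold are illustrative choices.

```python
def harris_response(img, k=0.04):
    """Harris corner response for a 2D grayscale image given as a list of lists."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels keep a zero gradient.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Structure tensor entries summed over a 3x3 window.
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp


def find_corners(resp, rel_threshold=0.9):
    """Return (row, col) positions whose response is close to the global peak."""
    peak = max(max(row) for row in resp)
    if peak <= 0:
        return []
    return [(y, x) for y, row in enumerate(resp)
            for x, r in enumerate(row) if r >= rel_threshold * peak]


# Synthetic 12x12 image with a bright square covering rows/cols 3..8.
image = [[1.0 if 3 <= y <= 8 and 3 <= x <= 8 else 0.0 for x in range(12)]
         for y in range(12)]
response = harris_response(image)
corners = find_corners(response)
```

On this synthetic square the response is positive at the four corners, negative along the edges, and zero in flat regions, which is the behaviour the marker detection relies on.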





Corner Detection Errors
[Figure panels: Input Image, Detect xtickmarks, Harris Corner Detector, Markers, Gt-image]
Ground-truth errors




mAP Error Analysis









Region Proposal Analysis (CV based proposals)
[Figure panels: CV Proposals (raw); CV Proposals (refined, non-zero IoU with parent); CV Proposals (IoU > 5% with parent)]
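The two refinement settings named in the panels (non-zero IoU with the parent, and IoU > 5% with the parent) can be sketched as a simple filter. This assumes axis-aligned boxes in `(x1, y1, x2, y2)` form and takes "parent" to mean a single enclosing reference box. Both are assumptions, and `refine_proposals` is a hypothetical name.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def refine_proposals(proposals, parent, min_iou=0.0):
    """Keep proposals whose IoU with the parent box is strictly above min_iou."""
    return [p for p in proposals if iou(p, parent) > min_iou]


parent_box = (0, 0, 100, 100)
raw = [
    (10, 10, 50, 50),      # well inside the parent, IoU = 0.16
    (95, 95, 200, 200),    # barely touches the parent, IoU is about 0.001
    (150, 150, 160, 160),  # no overlap at all
]
non_zero = refine_proposals(raw, parent_box)                 # non-zero IoU
above_5pct = refine_proposals(raw, parent_box, min_iou=0.05)  # IoU > 5%
```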






(A) The x-value of the visual element could be wrong for the following reasons:
(a) the tick-label associated with the bar is wrong (VED error)
(b) OCR error in the tick-label
(B) The y-value of the visual element could be wrong for the following reasons:
(a) the pixel-to-value mapping is wrong, i.e., the scale is wrong:
- the ticks used to compute the scale are wrong because of an OCR error
- the ticks used to compute the scale are not consecutive ticks (VED error)
(b) the height of the data element is wrong (VED error)
(C) The z-value of the visual element could be wrong for the following reasons:
(a) the label associated with the bar is wrong because:
- the colour of the bar is not associated with the right preview
- the mapping between preview and legend-label is not correct (VED error)
- OCR error in the legend-label text
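The y-value failure mode caused by a wrong scale can be illustrated with a small sketch. It assumes a linear axis and that the scale is derived from two detected y-ticks, each given as a (pixel position, OCR-read value) pair. The function name and the numbers are made up for illustration.

```python
def pixel_to_value(pixel_y, tick_a, tick_b):
    """Map a pixel row to a data value by linear interpolation between two ticks.

    Each tick is (pixel_y, value). Pixel rows grow downwards, values grow upwards.
    """
    (pa, va), (pb, vb) = tick_a, tick_b
    if pa == pb:
        raise ValueError("ticks must lie on different pixel rows")
    return va + (pixel_y - pa) * (vb - va) / (pb - pa)


bar_top = 250  # pixel row of the top of a bar (from VED)

# Correct OCR: ticks "0" at row 400 and "10" at row 300.
correct = pixel_to_value(bar_top, (400, 0.0), (300, 10.0))

# OCR misreads "10" as "70": same boxes, wrong scale, wrong y-value.
wrong = pixel_to_value(bar_top, (400, 0.0), (300, 70.0))

print(correct, wrong)  # 15.0 105.0
```

A single misread tick-label therefore corrupts every y-value extracted from the plot, even when all bounding boxes are correct.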
TODO
- Word Level OCR accuracy
- Generate tables using detections at different IOUs
Optical Character Recognition (OCR)
| Character Level Accuracy | Word Level Accuracy | Sentence Level Accuracy |
|---|---|---|
| 99.49% | 98.45% | 98.29% |
| 99.59% | 98.39% | 96.91% |
| 99.80% | 99.21% | 99.15% |
| 99.82% | 99.29% | 99.23% |
| 94.31% | 82.87% | 54.90% |
| 99.94% | 99.77% | 99.70% |
| 98.82% | 96.33% | 91.36% |
| 97.95% | 94.60% | 96.58% |
Tesseract
Attention-OCR
Oracle Bounding Boxes
| Textual Elements | Character Level Accuracy | Word Level Accuracy | Sentence Level Accuracy |
|---|---|---|---|
| xlabel | 99.94% | 88.73% | 78.64% |
| ylabel | 98.43% | 87.17% | 79.58% |
| yticklabel | 93.38% | 86.92% | 86.24% |
| xticklabel | 94.8% | 91.32% | 90.95% |
| title | 99.31% | 97.92% | 82.58% |
| legend-label | 98.53% | 95.29% | 93.2% |
| Overall (avg.) | 97.39% | 91.22% | 85.20% |
| (weighted avg.) | | | |
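The slides do not define the three OCR accuracy levels, so the sketch below uses one plausible set of definitions, all of them assumptions: character-level accuracy as 1 minus the total edit distance over the total ground-truth length, word-level accuracy as the fraction of ground-truth words matched position by position, and sentence-level accuracy as the fraction of exact string matches.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def ocr_accuracies(pairs):
    """Return (char, word, sentence) accuracy for (ground_truth, prediction) pairs."""
    char_total = char_errors = 0
    word_total = word_correct = 0
    sent_correct = 0
    for truth, pred in pairs:
        char_total += len(truth)
        char_errors += min(edit_distance(truth, pred), len(truth))
        t_words, p_words = truth.split(), pred.split()
        word_total += len(t_words)
        word_correct += sum(t == p for t, p in zip(t_words, p_words))
        sent_correct += truth == pred
    return (1 - char_errors / char_total,
            word_correct / word_total,
            sent_correct / len(pairs))


samples = [("Year of survey", "Year of survey"),
           ("Total cost", "Total c0st")]
char_acc, word_acc, sent_acc = ocr_accuracies(samples)
```

With these definitions a single wrong character costs little at the character level but fails the whole string at the sentence level, which fits the pattern of large drops for long elements such as titles.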
Optical Character Recognition (OCR)
| Character Level Accuracy | Word Level Accuracy | Sentence Level Accuracy |
|---|---|---|
| 94.61% | 82.02% | 81.44% |
| 95.85% | 83.27% | 76.60% |
| 85.20% | 62.90% | 61.0% |
| 87.79% | 62.33% | 60.08% |
| 83.55% | 48.23% | 10.0% |
| 97.88% | 92.33% | 90.42% |
| 90.81% | 71.84% | 63.25% |
| 88.85% | 65.97% | 64.02% |
Tesseract
Attention-OCR
Bounding Boxes after VED (MaskRCNN)
| Textual Elements | Character Level Accuracy | Word Level Accuracy | Sentence Level Accuracy |
|---|---|---|---|
| xlabel | 95.5% | 81.29% | 67.72% |
| ylabel | 97.07% | 82.8% | 67.74% |
| yticklabel | 88.07% | 78.08% | 77.13% |
| xticklabel | 91.38% | 83.81% | 83.7% |
| title | 94.6% | 85.24% | 34.13% |
| legend-label | 91.99% | 78.26% | 79.71% |
| Overall (avg.) | 93.10% | 81.58% | 68.35% |
| (weighted avg.) | | | |
SEMPRE
Operation-wise accuracy of SEMPRE:
- ADDITION: 66.77%
- DIFFERENCE: 9.5%
- MAX: 73.8%
- MIN: 80.3%
- RATIO: 0.68%
- AVG: 67.33%
- MEDIAN: 6.6%
- COUNT: 41.85%
- YES/NO: 0.00%
- LOOKUP: 95.10%
GT-GT-GT-SEMPRE
(32.55%)
VOES analysis
| VED | OCR | SIE | TQA |
|---|---|---|---|
| 100% | 100% | 100% | 32.55% |
| 100% | 100% | 52.20% | 26.24% |
| 100% | 100% | 79.38% | 29.24% |
| 100% | 97.06% | 23.78% | |
| 94.21% | 93.10% | 20.22% | |
(SEMPRE)
(SIE)
(Tesseract OCR)
(VED)
On OOV answer types only, SEMPRE gives 21.41%
bar: 82.81%
dot-line: 80.38%
line: 65.54%
(SIE after modifying rules)
Some interesting failure cases
SIE stage
TEST/2175.png
WACV 2020
By Nitesh Methani