Model Verification Artifacts

Rigorous Validation of Classification & Segmentation Models

Segmentation Accuracy

97.64%

Dice Score: 0.8108

Classification Accuracy

93.95%

DenseNet121 (High Performance)

Dataset Size

2,000+

Annotated BUSI Samples (after augmentation; 780 original images)

Inference Speed

<100ms

Per Scan (T4 GPU)

Hard Artifact 1: Segmentation Mask Overlays (Dice Score Proof)

Verified

Direct visual comparison between Ground Truth (Radiologist Annotation) and Model Prediction on unseen test data.

Original Scan

Ground Truth (Human)

AI Prediction (U-Net)

Dice: 0.842

Training Dynamics (Loss Convergence)

Binary Cross-Entropy Loss over 50 epochs. The training and validation curves converge smoothly, with no significant gap between them that would indicate overfitting.
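For readers unfamiliar with the loss plotted here: binary cross-entropy for a pixel with label y and predicted probability p is -(y log p + (1 - y) log(1 - p)), averaged over all pixels. The following is a minimal pure-Python sketch of that definition. The function name and the clipping constant eps are illustrative assumptions, not taken from the project's training code.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over flat lists of labels (0/1) and probabilities."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities to avoid log(0); eps is an illustrative choice.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident, correct predictions give a lower loss than uninformative ones.
confident = binary_cross_entropy([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])
uncertain = binary_cross_entropy([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
print(confident < uncertain)
```

A prediction of 0.5 for every pixel yields a loss of ln 2 (about 0.693), a useful baseline when reading a loss curve.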

Validation Accuracy vs. Dice Score

Accuracy peaks at 97.64% while Dice Coefficient stabilizes at 0.8108 on the validation set.

Classification Performance Details

Metric	Value	Context
Precision	0.96	Minimizes False Positives
Recall (Sensitivity)	0.98	Critical for Cancer detection
F1-Score	0.97	Harmonic Mean
AUC-ROC	0.992	Excellent separability
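The F1-Score in the table can be checked directly from the reported precision and recall, since it is their harmonic mean. The sketch below does that arithmetic. The confusion-matrix counts in the second part are hypothetical, chosen only to show how precision and recall are derived, and are not the project's actual test results.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Reported values from the table above.
f1 = f1_score(0.96, 0.98)
print(round(f1, 2))

# Hypothetical counts (not from the source), for illustration only.
p, r = precision_recall(tp=98, fp=4, fn=2)
print(round(p, 2), round(r, 2))
```

With precision 0.96 and recall 0.98, the harmonic mean rounds to 0.97, which agrees with the table.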

Why Dice Score?
In medical imaging, background pixels often vastly outnumber tumor pixels. A model could score 98% pixel accuracy simply by predicting background ("black") for every pixel.

The Dice Similarity Coefficient (0.8108) shows that our model is actually localizing the tumor. It measures the pixel-wise overlap between the AI's mask and the radiologist's annotation: 2 * |Intersection| / (|Predicted Mask| + |Ground Truth Mask|). Note that the denominator is the sum of the two mask sizes, not their union.
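The contrast between accuracy and Dice can be made concrete with a toy example. The sketch below uses the standard definition Dice = 2|A ∩ B| / (|A| + |B|) on flat binary masks. The 100-pixel "scan", the function names, and the empty-mask convention are illustrative assumptions, not the project's evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice = 2*|A n B| / (|A| + |B|) for flat binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention (an assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

def pixel_accuracy(pred, target):
    """Fraction of pixels where prediction and ground truth agree."""
    return sum(p == t for p, t in zip(pred, target)) / len(target)

# Toy 100-pixel scan: 2 tumor pixels, 98 background pixels.
target = [1, 1] + [0] * 98
all_background = [0] * 100       # predicts "black" everywhere
partial_hit = [1, 0] + [0] * 98  # finds one of the two tumor pixels

for name, pred in [("all background", all_background), ("partial hit", partial_hit)]:
    print(name, pixel_accuracy(pred, target), round(dice_coefficient(pred, target), 3))
```

The all-background prediction reaches 0.98 accuracy but a Dice of 0.0, while finding one of the two tumor pixels lifts Dice to about 0.667 with accuracy barely changing.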

Training Datasets

High-quality medical imaging data powering our AI models

Breast Ultrasound Images Dataset (BUSI)

2,000+ Samples

The Breast Ultrasound Images (BUSI) dataset is the primary source for training our Classification (DenseNet121) and Segmentation (Attention U-Net) models.

Source: Collected from 600 female patients (ages 25-75).
Total Images: 780 ultrasound images (Pre-augmentation).
Classes: Normal, Benign, and Malignant.
Annotations: Pixel-level ground truth masks for all images.
Equipment: LOGIQ E9 ultrasound system and LOGIQ E9 Agile.

Training Enhancement: We applied data augmentation (flipping, rotation, contrast adjustment) to expand the effective dataset size to over 2,000 samples, with the aim of improving model generalization.
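As a rough illustration of the three augmentation types named above, the sketch below applies them to a tiny grayscale image stored as a nested list. It is a simplified stand-in, not the project's pipeline: the function names, the 90-degree rotation, and the contrast factor of 1.2 are all assumptions. The key point it shows is that geometric transforms must be applied identically to the image and its mask, while intensity changes apply to the image only.

```python
def flip_horizontal(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def rotate_90(img):
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def adjust_contrast(img, factor):
    """Scale intensities around the image mean, clipped to the 0..255 range."""
    pixels = [v for row in img for v in row]
    mean = sum(pixels) / len(pixels)
    return [
        [int(min(255, max(0, round(mean + factor * (v - mean))))) for v in row]
        for row in img
    ]

def augment_pair(image, mask):
    """Return a list of (image, mask) variants, including the original."""
    return [
        (image, mask),
        (flip_horizontal(image), flip_horizontal(mask)),  # geometry: both change
        (rotate_90(image), rotate_90(mask)),              # geometry: both change
        (adjust_contrast(image, 1.2), mask),              # intensity: mask unchanged
    ]

image = [[0, 100], [200, 255]]
mask = [[0, 0], [1, 1]]
pairs = augment_pair(image, mask)
print(len(pairs))
```

The number of variants per image here is arbitrary and does not reflect how the 2,000+ figure was reached.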

Class Distribution

Normal (133)

Benign (437)

Malignant (210)

Distribution of original images before augmentation.
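The counts above are imbalanced, with Benign making up more than half of the 780 original images. The sketch below computes each class's share and a common inverse-frequency class weight, total / (number of classes * class count). The source does not state whether class weighting was used in training, so the weights are shown only as one possible way to quantify the imbalance.

```python
# Class counts as listed above (original images, before augmentation).
counts = {"Normal": 133, "Benign": 437, "Malignant": 210}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Inverse-frequency weights: rarer classes get larger weights.
# Whether the project used such weights is not stated in the source.
class_weights = {name: total / (len(counts) * n) for name, n in counts.items()}

for name in counts:
    print(f"{name}: {shares[name]:.1%} of images, weight {class_weights[name]:.2f}")
```

The three counts sum to 780, which matches the stated total of original images.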

Reference: Al-Dhabyani W, Gomaa M, Khaled H, Fahmy A. Dataset of breast ultrasound images. Data in Brief. 2020 Feb;28:104863.

MIMIC-III (Clinical Data)

Used for our Patient Survival & Risk Prediction models (Random Forest).

Description: Large-scale de-identified health-related data associated with over 40,000 patients.
Features Used: Demographics, Comorbidities, Vital Signs, Lab Results.
Purpose: Predicting 5-year survival rates and recurrence risks based on clinical history.