Atypical Mitoses

Charles-Antoine Collins-Fekete
October 28, 2025

Introducing OMG-Octo Atypical: A New Dataset for Atypical Mitosis Classification

We're excited to announce the release of OMG-Octo Atypical, a significant expansion of our original OMG-Octo database that now incorporates atypical mitotic figures. This new resource addresses a critical need in cancer prognostication and represents our continued commitment to advancing automated mitosis detection in computational pathology.

Why Atypical Mitoses Matter

Quantifying mitotic activity is fundamental to grading numerous cancer types, including breast cancer, sarcomas, neuroendocrine tumors, and melanoma. While our previous work focused on detecting conventional mitotic figures, distinguishing typical from atypical mitoses provides additional prognostic value that pathologists rely on for cancer grading decisions.

The Challenge We're Addressing

Manual identification of mitotic figures remains laborious and subjective, with high variability between observers. Differentiating atypical from typical mitotic figures is harder still: the distinction carries important clinical implications but requires specialized expertise and consistent criteria.

Our Approach: Data-Driven Development

Following the "Bitter Lesson" principle that emphasizes data scale over algorithmic novelty, we've focused on creating comprehensive, high-quality datasets. The OMG-Octo Atypical database builds upon our existing foundation by adding carefully annotated atypical mitotic figures to enable robust machine learning model development.

Dataset Composition

For our atypical mitosis classification work, we combined multiple data sources:

  • OMG-Octo Atypical (our new in-house dataset)
  • AMi-Br dataset
  • MIDOG 2025 Atypical Training Set
  • LUNG-MITO dataset
  • GBM-TCGA dataset

Together, these resources comprise 17,664 typical mitotic figures and 7,973 atypical mitotic figures, providing the scale and diversity needed for effective model training.
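The counts above imply a class imbalance of roughly 2.2 typical figures per atypical one. The following is a minimal sketch of the arithmetic, together with inverse-frequency class weights as one common way to compensate during training. The weighting scheme and the function name `class_summary` are illustrative assumptions; the post does not state how, or whether, the imbalance was handled.

```python
# Counts reported in the post for the combined training data.
N_TYPICAL = 17_664
N_ATYPICAL = 7_973


def class_summary(n_typical: int, n_atypical: int) -> dict:
    """Summarise class balance and compute inverse-frequency weights.

    The weights follow the common rule total / (n_classes * class_count).
    This is an assumed scheme for illustration, not the authors' stated method.
    """
    total = n_typical + n_atypical
    return {
        "total": total,
        "atypical_fraction": n_atypical / total,
        "typical_per_atypical": n_typical / n_atypical,
        "weight_typical": total / (2 * n_typical),
        "weight_atypical": total / (2 * n_atypical),
    }


summary = class_summary(N_TYPICAL, N_ATYPICAL)
for key, value in summary.items():
    print(f"{key}: {value:.4f}" if isinstance(value, float) else f"{key}: {value}")
```

Under this scheme the minority (atypical) class receives the larger weight, so errors on atypical figures count more in a weighted loss.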

Technical Implementation

Our classification approach leverages modern deep learning architectures:

Model Architecture

We evaluated multiple state-of-the-art architectures, including ConvNeXt, EfficientNet variants, and UNI (a vision transformer-based foundation model for pathology). Our best results came from a ConvNeXt-Tiny architecture trained from scratch, suggesting that for this specific task, domain-specific training on appropriately sized models may outperform foundation model approaches.

Data Augmentation Strategy

To enhance model robustness across different tissue preparation methods and scanning protocols, we implemented:

  • Random horizontal and vertical flipping
  • RandAugment with carefully tuned parameters
  • Histology-specific color augmentation for H&E-stained images, including stain deconvolution and concentration perturbation
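The stain deconvolution and concentration perturbation step can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the widely used Ruifrok-Johnston reference stain vectors, a Beer-Lambert optical density transform, and a random per-stain scale and shift. The function name `hed_jitter` and the `sigma` and `bias` ranges are hypothetical; the post does not give the actual parameters.

```python
import numpy as np

# Ruifrok-Johnston reference stain vectors in optical-density space
# (rows: hematoxylin, eosin, residual/DAB). Assumed for illustration.
RGB_FROM_HED = np.array([
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
])
HED_FROM_RGB = np.linalg.inv(RGB_FROM_HED)


def hed_jitter(image, rng, sigma=0.05, bias=0.02):
    """Randomly perturb stain concentrations of a uint8 RGB image (H, W, 3)."""
    # Beer-Lambert: optical density = -log(I / I0); the +1 avoids log(0).
    od = -np.log((image.astype(np.float64) + 1.0) / 256.0)
    # Deconvolution: express each pixel's optical density as stain concentrations.
    conc = od @ HED_FROM_RGB
    # Perturbation: independent random scale and shift for each stain channel.
    alpha = rng.uniform(1.0 - sigma, 1.0 + sigma, size=3)
    beta = rng.uniform(-bias, bias, size=3)
    conc = conc * alpha + beta
    # Recompose optical density and map back to RGB intensities.
    od_new = conc @ RGB_FROM_HED
    rgb = 256.0 * np.exp(-od_new) - 1.0
    return np.clip(rgb, 0, 255).round().astype(np.uint8)


rng = np.random.default_rng(seed=0)
# Random pixels stand in for a real H&E patch here.
patch = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = hed_jitter(patch, rng)
```

Perturbing in stain space rather than RGB space keeps the changes within the range of plausible staining variation, which is the motivation for histology-specific color augmentation.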

Ensemble Methods

Our final predictions combine:

  • Test-Time Augmentation (TTA) across multiple image transformations
  • Ensemble voting across the five best-performing models
  • Optimal threshold selection using Youden's J statistic to maximize balanced accuracy
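The three steps above can be sketched in plain Python. Youden's J is defined as sensitivity + specificity − 1, and balanced accuracy equals (J + 1) / 2, so the threshold that maximizes J also maximizes balanced accuracy. The sketch makes several assumptions not confirmed by the post: the ensemble uses soft voting (averaging probabilities) rather than hard voting, the test-time views are the four flip combinations, and each model is any callable returning an atypical probability. All function names are hypothetical.

```python
from statistics import mean


def hflip(img):
    """Flip an image (list of rows) left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Flip an image (list of rows) top to bottom."""
    return img[::-1]


# Assumed TTA views: identity, horizontal flip, vertical flip, and both.
TTA_VIEWS = [lambda im: im, hflip, vflip, lambda im: vflip(hflip(im))]


def ensemble_tta_score(models, img):
    """Mean atypical probability over every (model, view) pair (soft voting)."""
    return mean(model(view(img)) for model in models for view in TTA_VIEWS)


def youden_threshold(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    labels: 1 = atypical, 0 = typical. A sample is predicted atypical
    when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Toy validation scores and labels, for illustration only.
val_scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.8, 0.9]
val_labels = [0, 0, 1, 0, 1, 1, 1]
threshold, j_stat = youden_threshold(val_scores, val_labels)
balanced_accuracy = (j_stat + 1.0) / 2.0
```

In practice the threshold would be selected on held-out validation scores produced by `ensemble_tta_score`, then fixed before evaluating on the test set.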

Performance Results

On the MIDOG++ test set, our approach achieved a balanced accuracy of 0.9107 for atypical mitotic cell classification. This result suggests that atypical mitotic figures have distinguishable features that deep learning models can learn to identify, a promising finding for clinical translation.

Real-World Impact

The high accuracy we've achieved suggests that automated systems can effectively assist pathologists in identifying atypical mitoses, potentially:

  • Reducing inter-observer variability in cancer grading
  • Accelerating diagnostic workflows
  • Improving consistency in prognostic assessments
  • Supporting pathologists in handling increasing caseloads

Looking Forward

The OMG-Octo Atypical database is now publicly available, continuing our commitment to open science and collaborative advancement in computational pathology. We believe that by sharing these resources, we can accelerate progress across the field and ultimately improve patient outcomes.

This work was developed as part of our submission to the MIDOG 2025 challenge, which targets the problem of domain generalization: ensuring that algorithms work reliably across laboratories with varying staining protocols, scanning equipment, and tissue preparation methods.

Get Involved

We encourage researchers, pathologists, and data scientists to explore the OMG-Octo Atypical database and contribute to advancing mitotic figure detection and classification. Together, we can build more robust, generalizable tools that bring real value to clinical practice.


For technical details, dataset access, and implementation code, visit our resources page or contact us at c.fekete@ucl.ac.uk

Team: Zhuoyan Shen, Maria Hawkins, Esther Bär, Konstantin Bräutigam, and Charles-Antoine Collins-Fekete

Charles-Antoine Collins-Fekete

Dr. Collins-Fekete is a UKRI Future Leaders Fellow at UCL, leading research in AI for cancer diagnosis with a focus on digital pathology. He has established a team of post-doctoral researchers, published over 30 peer-reviewed papers, and secured more than £3 million in funding. As founder of the Octopath spin-out and co-founder of the UCL Cancer Collaboratorium, he drives the translation of cutting-edge science into impactful solutions for cancer care.