Pathology Ai
Latest revision as of 01:55, 25 April 2026
How to read this page: This article maps the topic from beginner to expert across six levels — Remembering, Understanding, Applying, Analyzing, Evaluating, and Creating. Scan the headings to see the full scope, then read from wherever your knowledge starts to feel uncertain.
Computational pathology applies deep learning to the analysis of digitized tissue slides — whole-slide images (WSI) captured by digital pathology scanners. Pathology is the gold standard for cancer diagnosis, but it is labor-intensive, subjective, and facing a global workforce shortage. AI can analyze WSIs to classify cancer grade, predict molecular biomarkers, identify cell types, and predict patient survival — performing tasks that would take a pathologist hours in seconds. With FDA-cleared AI tools entering clinical pathology laboratories, the field is transitioning from research to real-world impact.
Remembering
- Whole Slide Image (WSI) — A digitized pathology slide; gigapixel images (~100,000 × 100,000 pixels) scanned at 20–40× magnification.
- H&E staining — Hematoxylin and Eosin; the standard pathology stain coloring nuclei blue and cytoplasm pink.
- IHC (Immunohistochemistry) — Staining technique detecting specific proteins; used for biomarker testing (HER2, PD-L1, ER/PR).
- Tumor grading — Assessing tumor aggressiveness from histological features; e.g., Gleason score (prostate), Bloom-Richardson (breast).
- Multiple Instance Learning (MIL) — A weakly-supervised framework handling gigapixel WSI by treating each slide as a bag of smaller patches.
- Patch-based classification — Dividing WSI into tiles (e.g., 256×256 pixels) and classifying each; used for training with slide-level labels.
- CLAM (Clustering-constrained Attention Multiple Instance Learning) — A widely used MIL framework for WSI classification.
- Attention mechanism (pathology) — Identifies which patches are most diagnostically relevant within a slide.
- PathAI — A commercial computational pathology company with FDA-cleared tools; founded by Andrew Beck.
- Paige — First FDA-authorized AI for prostate cancer pathology; detects cancer in prostate biopsies.
- Foundation models (pathology) — CONCH, UNI, Phikon — vision transformers pre-trained on millions of pathology images; strong feature extractors.
- Pan-cancer classification — Predicting tumor type directly from histology across multiple cancer types.
- Biomarker prediction from morphology — Predicting molecular alterations (MSI, BRCA mutation, TMB) from H&E histology without molecular testing.
- Cell segmentation (pathology) — Detecting and classifying individual cells (tumor, immune, stromal) within tissue; HoverNet, StarDist, CellViT.
Understanding
Pathology AI faces a unique challenge: slides are gigapixel-scale images far too large for direct processing by neural networks (a 40× WSI can be 100,000 × 100,000 pixels = 10 billion pixels). Two dominant strategies address this:
Patch-based approaches: Extract thousands of smaller patches (256×256 or 512×512 pixels) from each slide. Train a CNN or ViT on each patch individually. Aggregate patch-level predictions to a slide-level diagnosis. This works but requires patch-level annotations, which are expensive and often unavailable.
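The tiling arithmetic can be sketched in a few lines of NumPy. `tile_image` is an illustrative helper, not part of any library; real pipelines read patches lazily from the WSI file via OpenSlide rather than materializing the full image in memory:

```python
import numpy as np

def tile_image(img: np.ndarray, patch: int = 256) -> np.ndarray:
    """Split an H x W x C image into non-overlapping patch x patch tiles,
    discarding edge remainders (a common simplification)."""
    h, w = img.shape[:2]
    tiles = [
        img[y:y + patch, x:x + patch]
        for y in range(0, h - patch + 1, patch)
        for x in range(0, w - patch + 1, patch)
    ]
    return np.stack(tiles)

# Toy stand-in for a small WSI region; a real 40x slide is far larger.
region = np.zeros((1024, 768, 3), dtype=np.uint8)
tiles = tile_image(region)
print(tiles.shape)  # (12, 256, 256, 3): 4 rows x 3 columns of tiles

# Scale of the real problem: a 100,000 x 100,000 px WSI yields
# (100000 // 256) ** 2 = 390 ** 2 = 152,100 patches per slide.
```

This is why slide-level aggregation matters: each slide becomes a bag of ~150K patch vectors rather than one trainable image.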
Multiple Instance Learning (MIL): The dominant approach for slide-level labels. Each slide is a "bag" of patches. The bag label (e.g., cancer present) is known, but which patches contain cancer is unknown. MIL aggregates patch features using attention or pooling to produce a slide-level prediction. CLAM's attention mechanism additionally identifies which patches are driving the prediction — providing weak localization.
Pathology foundation models: Pre-trained on millions of pathology patches using self-supervised learning (DINO, MAE, DINOv2), models like UNI, CONCH, and Prov-GigaPath learn rich histological feature representations. These serve as feature extractors for downstream tasks with minimal labeled data — a major advance for data-scarce pathology problems.
Biomarker prediction from morphology: Neural networks trained on paired (WSI, molecular test result) data can predict molecular biomarkers from histology alone. TCGA-trained models predict microsatellite instability (MSI), BRAF mutation, HER2 amplification, and survival from H&E slides without any molecular testing. These predictions are not yet clinical-grade but suggest deep morphological correlates of molecular biology.
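The slide-level setup can be sketched with synthetic data standing in for real (WSI feature, molecular label) pairs: mean-pool patch embeddings into one vector per slide, then fit an off-the-shelf classifier. The dimensions, `make_slide` helper, and toy signal strength are all illustrative assumptions, not the published TCGA pipelines:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in: 200 slides, each a bag of patch embeddings (dim 64).
# Real pipelines would use UNI/CONCH features paired with MSI test results.
def make_slide(label):
    patches = rng.normal(0.3 * label, 1.0, size=(rng.integers(50, 200), 64))
    return patches.mean(axis=0)  # mean-pool patches into one slide vector

labels = rng.integers(0, 2, size=200)
X = np.stack([make_slide(y) for y in labels])

clf = LogisticRegression(max_iter=1000).fit(X[:150], labels[:150])
auc = roc_auc_score(labels[150:], clf.predict_proba(X[150:])[:, 1])
print(f"held-out AUC: {auc:.2f}")  # well above chance on this toy signal
```

Mean pooling is the simplest aggregator; attention-based MIL (below) replaces it with a learned weighting.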
FDA-cleared pathology AI: Paige Prostate is the first FDA-authorized AI for prostate cancer detection. PathAI and other companies have cleared tools for various cancer types. Regulatory scrutiny is high: prospective clinical validation, algorithmic bias testing, and reader studies are required.
Applying
WSI classification with CLAM (MIL):

<syntaxhighlight lang="python">
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attn_Net_Gated(nn.Module):
    """Gated attention network for MIL aggregation."""
    def __init__(self, L=1024, D=256, dropout=0.25):
        super().__init__()
        self.attention_a = nn.Sequential(nn.Linear(L, D), nn.Tanh(), nn.Dropout(dropout))
        self.attention_b = nn.Sequential(nn.Linear(L, D), nn.Sigmoid(), nn.Dropout(dropout))
        self.attention_c = nn.Linear(D, 1)

    def forward(self, x):
        a = self.attention_a(x)
        b = self.attention_b(x)
        A = self.attention_c(a * b)  # Gated attention scores
        return A, x  # (N, 1), (N, L)

class CLAM_SB(nn.Module):
    """CLAM single-branch for binary WSI classification."""
    def __init__(self, feature_dim=1024, n_classes=2, dropout=0.25):
        super().__init__()
        self.attention_net = Attn_Net_Gated(L=feature_dim, D=256, dropout=dropout)
        self.classifiers = nn.Linear(feature_dim, n_classes)
        self.instance_classifier = nn.Linear(feature_dim, 2)  # For instance-level clustering

    def forward(self, h):
        # h: (N, feature_dim) — patch embeddings from a pre-trained feature extractor
        A, h = self.attention_net(h)
        A = F.softmax(A, dim=0).transpose(0, 1)  # Softmax over patches: (1, N)
        M = torch.mm(A, h)                       # Weighted aggregation: (1, feature_dim)
        logits = self.classifiers(M)             # Slide-level prediction
        Y_hat = torch.argmax(logits, dim=1)
        Y_prob = F.softmax(logits, dim=1)
        return logits, Y_prob, Y_hat, A  # A contains attention scores for visualization

# Feature extraction pipeline:
#   1. Segment tissue from background (Otsu thresholding)
#   2. Extract non-overlapping 256x256 patches at 20x magnification
#   3. Extract features using a pathology foundation model (UNI, CONCH, ResNet50-ImageNet)
#   4. Feed patch features to CLAM for WSI-level prediction

# Using UNI (ViT pre-trained on ~100K pathology slides)
import timm
uni = timm.create_model("hf_hub:MahmoodLab/uni", pretrained=True)
</syntaxhighlight>
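Step 1 of the pipeline comment above (tissue segmentation) can be illustrated with a from-scratch Otsu threshold on a toy thumbnail. `otsu_threshold` is a self-contained sketch; production code would typically apply `skimage.filters.threshold_otsu` to a downsampled slide thumbnail:

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Otsu's method on a uint8 grayscale image: pick the threshold that
    maximizes between-class variance of the intensity histogram."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                # class-0 probability mass
    mu = np.cumsum(prob * np.arange(256))  # class-0 cumulative mean
    mu_t = mu[-1]                          # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    return int(np.nanargmax(sigma_b))

# Toy thumbnail: bright background (~230) with a darker tissue blob (~120).
rng = np.random.default_rng(0)
thumb = np.clip(rng.normal(230, 5, (128, 128)), 0, 255).astype(np.uint8)
thumb[40:90, 30:100] = np.clip(rng.normal(120, 10, (50, 70)), 0, 255).astype(np.uint8)

t = otsu_threshold(thumb)
tissue_mask = thumb < t  # tissue is darker than background on H&E thumbnails
print(t, tissue_mask.mean())
```

Patches are then extracted only where the mask is positive, which typically discards the majority of a slide's area.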
- Computational pathology tools
- WSI viewing → QuPath (open-source), Aperio ImageScope, SlideViewer
- MIL frameworks → CLAM (GitHub), TransMIL, DTFD-MIL
- Foundation models → UNI, CONCH (Mahmood Lab), Prov-GigaPath (Microsoft/Providence)
- Cell segmentation → HoverNet, StarDist, CellViT, CellPose
- Commercial AI → Paige, PathAI, Aiforia, Ibex Medical Analytics
Analyzing

Pathology AI Clinical Applications:

| Application | AI Performance | Clinical Status |
|---|---|---|
| Prostate cancer detection | AUC 0.97 (Paige) | FDA authorized |
| Breast cancer mitosis counting | Expert-level | CE marked (several) |
| Colorectal cancer grading | High | Research → clinical |
| MSI prediction from H&E | AUC ~0.85 | Research |
| Cell type quantification | High (specialized tools) | Used in trials |
| Survival prediction | C-index 0.65-0.75 | Research |
Failure modes: Scanner variability — staining protocols and scanner calibration differ; models overfit to specific scanner characteristics. Stain normalization needed but can introduce artifacts. Tumor heterogeneity — sampling bias in biopsies; AI sees only a portion of the actual tumor. Interobserver variability — ground truth labels from pathologists have significant disagreement rates. Whole-slide processing bottleneck — gigapixel images require significant compute infrastructure.
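The stain-normalization issue can be made concrete with a simplified Reinhard-style sketch: match each channel's mean and standard deviation to a reference tile. This is an illustrative assumption — Reinhard proper works in LAB color space (e.g. via `skimage.color.rgb2lab`), and Macenko-style methods operate on estimated stain vectors — and `reinhard_normalize` is not a library function:

```python
import numpy as np

def reinhard_normalize(src: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Simplified Reinhard-style normalization: shift each channel of `src`
    to match the per-channel mean/std of a reference tile."""
    src = src.astype(float)
    ref = ref.astype(float)
    out = np.empty_like(src)
    for c in range(3):
        s_mu, s_sd = src[..., c].mean(), src[..., c].std() + 1e-8
        r_mu, r_sd = ref[..., c].mean(), ref[..., c].std() + 1e-8
        out[..., c] = (src[..., c] - s_mu) / s_sd * r_sd + r_mu
    return np.clip(out, 0, 255).astype(np.uint8)

# Two synthetic tiles with different color statistics (scanner/stain shift).
rng = np.random.default_rng(0)
ref = np.clip(rng.normal(180, 30, (64, 64, 3)), 0, 255).astype(np.uint8)
src = np.clip(rng.normal(140, 50, (64, 64, 3)), 0, 255).astype(np.uint8)
norm = reinhard_normalize(src, ref)
# After normalization, the source tile's channel statistics approach the reference's.
```

The clipping step at the end is one place such methods introduce artifacts: saturated pixels lose information that downstream models may have relied on.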
Evaluating
Pathology AI evaluation:
- AUC per clinical task: detection, grading, biomarker prediction.
- Concordance with molecular tests: for biomarker prediction models, compare to IHC/sequencing gold standards.
- Reader study: pathologists with and without AI assistance; measure diagnostic accuracy, time, confidence.
- Multi-site validation: test on slides from different labs, scanners, preparation protocols.
- Attention visualization: inspect which tissue regions drive predictions — should match known diagnostic criteria.
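Reported AUCs should come with confidence intervals; a common recipe is a percentile bootstrap over slides (resampling by patient is stricter and usually preferred). The function name and toy scores below are illustrative:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, seed=0):
    """Percentile bootstrap 95% CI for AUC: resample cases with
    replacement and take the 2.5th/97.5th percentiles of the AUCs."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:
            continue  # a valid resample needs both classes
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.percentile(aucs, [2.5, 97.5])

# Toy scores: positives shifted upward relative to negatives.
rng = np.random.default_rng(1)
y = np.array([0] * 100 + [1] * 100)
s = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(1.5, 1.0, 100)])
lo, hi = bootstrap_auc_ci(y, s)
print(f"AUC 95% CI: [{lo:.2f}, {hi:.2f}]")
```

Wide intervals on small external test sets are a frequent reason multi-site validation results look weaker than internal ones.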
Creating
Building a pathology AI pipeline:
- Data: collect WSIs with slide-level labels (diagnosis, grade, biomarker status) from pathology archive.
- Preprocessing: tissue segmentation, patch extraction at 256×256 / 20×, feature extraction with UNI or CONCH.
- MIL training: CLAM with 5-fold cross-validation; attention-based pooling.
- Interpretability: generate attention heatmaps overlaid on WSI; pathologist verification.
- Bias audit: evaluate performance across patient demographics.
- Clinical validation: prospective reader study at target institution.
- Regulatory: work with regulatory consultant on FDA 510(k) or De Novo pathway.
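One leakage pitfall in the cross-validation step: multiple slides often come from the same patient, so folds must split by patient rather than by slide. A minimal sketch with scikit-learn's `GroupKFold` (the slide table here is synthetic):

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical slide table: 40 slides drawn from 15 patients. Slide-level
# random splits would put the same patient in both train and test folds.
rng = np.random.default_rng(0)
n_slides = 40
patient_ids = rng.integers(0, 15, size=n_slides)  # patient of each slide
X = rng.normal(size=(n_slides, 8))                # stand-in slide features
y = rng.integers(0, 2, size=n_slides)             # stand-in slide labels

gkf = GroupKFold(n_splits=5)
for fold, (tr, te) in enumerate(gkf.split(X, y, groups=patient_ids)):
    overlap = set(patient_ids[tr]) & set(patient_ids[te])
    assert not overlap  # no patient appears on both sides of the split
    print(f"fold {fold}: {len(tr)} train / {len(te)} test slides")
```

The same grouping logic applies to the held-out test set and to any multi-site validation split.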