---
license: mit
library_name: pytorch
tags:
- Medical Vision-Language Pre-Training
- BenchX
---
# MGCA-ViT Checkpoint Model Card
A retrained MGCA-ViT model for benchmarking medical vision-language pre-training methods within the BenchX framework.
## Model Details
- **Model Type**: MGCA-ViT
- **Architecture**: ViT-Base image encoder and BioClinicalBERT text encoder
- **Original Papers**: [Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning](https://arxiv.org/abs/2210.06044)
- **Benchmark Paper**: [BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays](https://arxiv.org/abs/2410.21969)
- **Benchmark Framework**: https://github.com/yangzhou12/BenchX
## Intended Use
- **Primary Use Cases**:
- Benchmarking performance for Medical Image Classification
- Benchmarking performance for Medical Image Segmentation
- Benchmarking performance for Medical Report Generation
## Pre-Training Data
- **Dataset**:
- Data source(s): MIMIC-CXR
- Types of medical images: Frontal chest X-rays
- Text data type: Associated radiology reports
## Prerequisites
Please follow the [instructions](https://github.com/yangzhou12/BenchX/blob/release/README.md#installation) to install BenchX.
## Training & Evaluation
### 1. Classification
To fine-tune MGCA-ViT for classification, run this command:
```shell
python bin/train.py config/classification/<dataset_name>/mgca_vit.yml
```
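For example, with a concrete dataset substituted in (the dataset name below is a placeholder; check `config/classification/` in the BenchX repo for the configs actually provided), the command is built like this:

```shell
# Hypothetical dataset folder name -- replace with an actual config directory
# under config/classification/ in your BenchX checkout.
DATASET_NAME="nih_chest_xray"
CMD="python bin/train.py config/classification/${DATASET_NAME}/mgca_vit.yml"
echo "${CMD}"
```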
### 2. Segmentation
To fine-tune MGCA-ViT for segmentation, run this command:
```shell
python mmsegmentation/tools/train.py config/benchmark/<dataset_name>/mgca_vit.yml
```
### 3. Report Generation
To fine-tune MGCA-ViT for report generation, run this command:
```shell
python bin/train.py config/report_generation/<dataset_name>/mgca_vit.yml
```
### 4. Evaluation
To evaluate fine-tuned MGCA-ViT models, run:
```shell
# For classification and report generation
python bin/test.py config/<task_name>/<dataset_name>/mgca_vit.yml validator.splits=[test] ckpt_dir=<path_to_checkpoint>
# For segmentation
python mmsegmentation/tools/my_test.py mmsegmentation/config/<dataset_name>/mgca_vit.yml <path_to_checkpoint>
```
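As a concrete illustration of the evaluation command (the task name, dataset name, and checkpoint directory below are placeholders, not paths shipped with this card; substitute the config and checkpoint you actually trained):

```shell
# Hypothetical values -- replace with a real config path and the directory
# where your fine-tuned checkpoint was saved.
TASK_NAME="classification"
DATASET_NAME="nih_chest_xray"
CKPT_DIR="outputs/${TASK_NAME}/${DATASET_NAME}/mgca_vit"
CMD="python bin/test.py config/${TASK_NAME}/${DATASET_NAME}/mgca_vit.yml validator.splits=[test] ckpt_dir=${CKPT_DIR}"
echo "${CMD}"
```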
## Citations
```bibtex
@article{wang2022multi,
title={Multi-granularity cross-modal alignment for generalized medical visual representation learning},
author={Wang, Fuying and Zhou, Yuyin and Wang, Shujun and Vardhanabhuti, Varut and Yu, Lequan},
journal={Advances in Neural Information Processing Systems},
volume={35},
pages={33536--33549},
year={2022}
}
```
```bibtex
@inproceedings{zhou2024benchx,
title={BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays},
author={Zhou, Yang and Faith, Tan Li Hui and Xu, Yanyu and Leng, Sicong and Xu, Xinxing and Liu, Yong and Goh, Rick Siow Mong},
booktitle={Advances in Neural Information Processing Systems},
year={2024}
}
```