We collect existing papers on radiology report generation published in prominent conferences and journals.



  • A Systematic Review of Deep Learning-based Research on Radiology Report Generation (arXiv'2311) [paper]
  • A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data (arXiv'2405) [paper]
  • Automated Radiology Report Generation: A Review of Recent Advances (IEEE Reviews in Biomedical Engineering'24) [paper]
  • From Vision to Text: A Comprehensive Review of Natural Image Captioning in Medical Diagnosis and Radiology Report Generation (Medical Image Analysis)[paper]
  • Automatic Medical Report Generation: Methods and Applications (arXiv'2408) [paper]


  • MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs (MIMIC-CXR) [paper][data].
  • Preparing a collection of radiology examinations for distribution and retrieval (IU X-ray) [paper][data].
  • Learning Visual-Semantic Embeddings for Reporting Abnormal Findings on Chest X-rays (MIMIC-ABN) [paper][code]
  • An efficient but effective writer: Diffusion-based semi-autoregressive transformer for automated radiology report generation (XRG-COVID-19) [paper][data].
  • HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction (HistGen WSI) [paper][data].
  • CheXpert Plus: Hundreds of Thousands of Aligned Radiology Texts, Images and Patients (CheXpert Plus) [paper] [data]
  • CXR-PRO: MIMIC-CXR with Prior References Omitted (CXR-PRO) [data]
  • MS-CXR: Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing (MS-CXR) [data]
  • EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images (EHRXQA)[paper][code][data]
  • MIMIC-Ext-MIMIC-CXR-VQA: A Complex, Diverse, And Large-Scale Visual Question Answering Dataset for Chest X-ray Images (MIMIC-Ext-MIMIC-CXR-VQA)[code][data]
  • MS-CXR-T: Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing (MS-CXR-T)[data]
  • CAD-Chest: Comprehensive Annotation of Diseases based on MIMIC-CXR Radiology Report (CAD-Chest)[data][paper][code]
  • VinDr-CXR: An open dataset of chest X-rays with radiologist annotations (VinDr-CXR)[data]
  • Chest ImaGenome Dataset (ImaGenome) [data]
  • Interpretable medical image Visual Question Answering via multi-modal relationship graph learning (Medical-CXR-VQA) [MedIA'24][code]
  • ReXPref-Prior: A MIMIC-CXR Preference Dataset for Reducing Hallucinated Prior Exams in Radiology Report Generation (ReXPref-Prior)[data]
  • An open chest X-ray dataset with benchmarks for automatic radiology report generation in French (CASIA-CXR) [Neurocomputing'24] [data][paper]
  • PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology (WSI-VQA)[arXiv'2401][paper][data]


  • FineRadScore: A Radiology Report Line-by-Line Evaluation Technique Generating Corrections with Severity Scores (arXiv'2405) [paper][code]
  • FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation (EMNLP'23) [paper][code]
  • DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation (ACL'24) [paper][code]
  • RaTEScore: A Metric for Radiology Report Generation [paper][code]
  • GREEN: Generative Radiology Report Evaluation and Error Notation [paper][code]
  • When Radiology Report Generation Meets Knowledge Graph (MIRQI) [paper][code]
  • Evaluating progress in automatic chest X-ray radiology report generation (RadCliQ)[paper][code]
  • Evaluating GPT-4 on Impressions Generation in Radiology Reports (Radiology)[paper]
  • ReXamine-Global: A Framework for Uncovering Inconsistencies in Radiology Report Generation Metrics (arXiv'2408)[paper]
  • MRScore: Evaluating Medical Report with LLM-Based Reward System (MICCAI'24) [paper]

Foundation Models for Medicine

  • CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation (arXiv'2401) [paper][code]
  • XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models (ACLW'24)[paper][code]
  • Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training (ICML'24) [paper][code]
  • A generalist vision-language foundation model for diverse biomedical tasks (Nature Medicine'24)[paper][code]
  • ECAMP: Entity-centered Context-aware Medical Vision Language Pre-training (arXiv'2311)[paper][code]
  • CXR-CLIP: Toward Large Scale Chest X-ray Language-Image Pre-training (MICCAI'23)[paper][code]
  • GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-efficient Medical Image Recognition (ICCV'21)[paper][code]
  • CXR-LLaVA: a multimodal large language model for interpreting chest X-ray images (arXiv'2310)[paper][code]
  • LLaVA-OneVision: Easy Visual Task Transfer (arXiv'2408)[paper][code]
  • Advancing Medical Radiograph Representation Learning: A Hybrid Pre-training Paradigm with Multilevel Semantic Granularity (arXiv'2410) [paper]
  • MedImageInsight: An Open-Source Embedding Model for General Domain Medical Imaging (arXiv'2410)[paper]




  • Automatic Radiology Reports Generation via Memory Alignment Network [paper] [code]
  • PromptMRG: Diagnosis-Driven Prompts for Medical Report Generation [paper][code]
  • Bootstrapping Large Language Models for Radiology Report Generation [paper] [code]


  • Instance-level Expert Knowledge and Aggregate Discriminative Attention for Radiology Report Generation [paper] [code]
  • AHIVE: Anatomy-aware Hierarchical Vision Encoding for Interactive Radiology Report Retrieval [paper] [[code]]
  • InVERGe: Intelligent Visual Encoder for Bridging Modalities in Report Generation (Workshop) [paper][code]


  • DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation [paper][code]
  • SICAR at RRG2024: GPU Poor’s Guide to Radiology Report Generation [paper]
  • BiCAL: Bi-directional Contrastive Active Learning for Clinical Report Generation [paper]
  • CID at RRG24: Attempting in a Conditionally Initiated Decoding of Radiology Report Generation with Clinical Entities [paper]
  • RadGraph-XL: A Large-Scale Expert-Annotated Dataset for Entity and Relation Extraction from Radiology Reports [paper]
  • MLeVLM: Improve Multi-level Progressive Capabilities based on Multimodal Large Language Model for Medical Visual Question Answering [paper][code]
  • Fine-Grained Image-Text Alignment in Medical Imaging Enables Explainable Cyclic Image-Report Generation [paper]


  • LLM-CXR: Instruction-Finetuned LLM for CXR Image Understanding and Generation [paper][code]


  • Medical Report Generation via Multimodal Spatio-Temporal Fusion [paper]
  • Diffusion Networks with Task-Specific Noise Control for Radiology Report Generation [paper]
  • Divide and Conquer: Isolating Normal-Abnormal Attributes in Knowledge Graph-Enhanced Radiology Report Generation [paper]
  • In-context Learning for Zero-shot Medical Report Generation [paper]


  • HERGen: Elevating Radiology Report Generation with Longitudinal Data [paper] [code]
  • Contrastive Learning with Counterfactual Explanations for Radiology Report Generation [paper]



  • Textual Inversion and Self-supervised Refinement for Radiology Report Generation [paper] [[code]]
  • Structural Entities Extraction and Patient Indications Incorporation for Chest X-ray Report Generation [paper] [code]
  • CT2Rep: Automated Radiology Report Generation for 3D Medical Imaging [paper] [code]
  • WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole Slide Images [paper][code]
  • Multivariate Cooperative Game for Image-Report Pairs: Hierarchical Semantic Alignment for Medical Report Generation [paper]
  • MRScore: Evaluating Medical Report with LLM-Based Reward System [paper]
  • Energy-Based Controllable Radiology Report Generation with Medical Knowledge [paper]



  • Complex Organ Mask Guided Radiology Report Generation [paper][code]


  • From Vision to Text: A Comprehensive Review of Natural Image Captioning in Medical Diagnosis and Radiology Report Generation [paper]
  • Enhancing the vision–language foundation model with key semantic knowledge-emphasized report refinement [paper]


  • Multi-grained Radiology Report Generation with Sentence-level Image-language Contrastive Learning [paper] [[code]]
  • SGT++: Improved Scene Graph-Guided Transformer for Surgical Report Generation [paper][[code]]
  • PhraseAug: An Augmented Medical Report Generation Model with Phrasebook [paper] [[code]]
  • Token-Mixer: Bind Image and Text in One Embedding Space for Medical Image Reporting [paper] [code]
  • An Organ-aware Diagnosis Framework for Radiology Report Generation [paper]
  • Attribute Prototype-guided Iterative Scene Graph for Explainable Radiology Report Generation [paper]
  • A New Benchmark: Clinical Uncertainty and Severity Aware Labeled Chest X-Ray Images with Multi-Relationship Graph Learning [paper]


  • Semi-Supervised Medical Report Generation via Graph-Guided Hybrid Feature Consistency [paper] [[code]]
  • Multi-Level Objective Alignment Transformer for Fine-Grained Oral Panoramic X-Ray Report Generation [paper] [[code]]


  • CAMANet: Class Activation Map Guided Attention Network for Radiology Report Generation [paper] [code]
  • TSGET: Two-Stage Global Enhanced Transformer for Automatic Radiology Report Generation [paper] [code]

Expert Systems with Applications'24

  • CheXReport: A transformer-based architecture to generate chest X-ray reports suggestions [paper][code]


  • Improving radiology report generation with multi-grained abnormality prediction [paper]
  • An open chest X-ray dataset with benchmarks for automatic radiology report generation in French [paper][data]
  • Trust it or not: Confidence-guided automatic radiology report generation [paper]

Academic Radiology'24

  • Practical Evaluation of ChatGPT Performance for Radiology Report Generation [paper]

IEEE Transactions on Emerging Topics in Computational Intelligence'24

  • End-to-End Clustering Enhanced Contrastive Learning for Radiology Reports Generation [paper]

arXiv papers'24

  • Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation [paper] [code]
  • FITA: Fine-grained Image-Text Aligner for Radiology Report Generation [paper] [[code]]
  • GREEN: Generative Radiology Report Evaluation and Error Notation [paper] [[code]]
  • CheXpert Plus: Hundreds of Thousands of Aligned Radiology Texts, Images and Patients [paper] [code]
  • Topicwise Separable Sentence Retrieval for Medical Report Generation [paper] [[code]]
  • Dia-LLaMA: Towards Large Language Model-driven CT Report Generation [paper] [[code]]
  • ICON: Improving Inter-Report Consistency of Radiology Report Generation via Lesion-aware Mix-up Augmentation [paper] [code]
  • MAIRA-2: Grounded Radiology Report Generation [paper][[code]]
  • Benchmarking and Boosting Radiology Report Generation for 3D High-Resolution Medical Images [paper]
  • The Impact of Auxiliary Patient Data on Automated Chest X-Ray Report Generation and How to Incorporate It [paper][code]
  • Improving Expert Radiology Report Summarization by Prompting Large Language Models with a Layperson Summary [paper]
  • Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation [paper]
  • X-ray Made Simple: Radiology Report Generation and Evaluation with Layman's Terms [paper]
  • Multi-modal vision-language model for generalizable annotation-free pathology localization and clinical diagnosis [paper][code]
  • R2GenCSR: Retrieving Context Samples for Large Language Model based X-ray Medical Report Generation [paper][code]
  • Direct Preference Optimization for Suppressing Hallucinated Prior Exams in Radiology Report Generation [paper]
  • M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation [paper]
  • Medical Report Generation Is A Multi-label Classification Problem [paper]
  • KARGEN: Knowledge-enhanced Automated Radiology Report Generation Using Large Language Models [paper]
  • Democratizing MLLMs in Healthcare: TinyLLaVA-Med for Efficient Healthcare Diagnostics in Resource-Constrained Settings [paper]
  • SLaVA-CXR: Small Language and Vision Assistant for Chest X-ray Report Automation [paper]
  • Expert-level vision-language foundation model for real-world radiology and comprehensive evaluation [paper]
  • CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset [paper][code]
  • 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [paper]



  • Advancing radiograph representation learning with masked record modeling [paper][code]



  • KiUT: Knowledge-injected U-Transformer for Radiology Report Generation [paper] [[code]]
  • METransformer: Radiology report generation by transformer with multiple learnable expert tokens [paper][[code]]
  • Dynamic Graph Enhanced Contrastive Learning for Chest X-Ray Report Generation [paper] [code]
  • Interactive and Explainable Region-guided Radiology Report Generation [paper][code]


  • ORGAN: Observation-Guided Radiology Report Generation via Tree Reasoning [paper] [code]


  • RECAP: Towards Precise Radiology Report Generation via Dynamic Disease Progression Reasoning [paper] [code]
  • Normal-Abnormal Decoupling Memory for Medical Report Generation [paper] [code]
  • Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting [paper] [[code]]
  • PhenotypeCLIP: Phenotype-based Contrastive Learning for Medical Imaging Report Generation [paper]


  • Utilizing Longitudinal Chest X-Rays and Reports to Pre-Fill Radiology Reports [paper] [code]



  • Pragmatic Radiology Report Generation [paper] [code]


  • Improving Radiology Report Generation with D2-Net: When Diffusion Meets Discriminator [paper] [[code]]


  • Radiology report generation with a learned knowledge base and multi-modal alignment [paper] [code]


  • Attributed Abnormality Graph Embedding for Clinically Accurate X-Ray Report Generation [paper][[code]]


  • Evaluating progress in automatic chest X-ray radiology report generation[paper][code]


  • From Observation to Concept: A Flexible Multi-view Paradigm for Medical Report Generation [paper] [[code]]
  • Joint Embedding of Deep Visual and Semantic Features for Medical Image Report Generation [paper] [[code]]


  • R2GenGPT: Radiology Report Generation with Frozen LLMs [paper][code]

arXiv papers'23

  • MAIRA-1: A specialised large multimodal model for radiology report generation [paper] [[code]]
  • Longitudinal Data and a Semantic Similarity Reward for Chest X-Ray Report Generation [paper][code]



  • Clinical-BERT: Vision-Language Pre-training for Radiograph Diagnosis and Reports Generation [paper] [[code]]


  • Reinforced Cross-modal Alignment for Radiology Report Generation [paper] [code]


  • A Medical Semantic-Assisted Transformer for Radiographic Report Generation [paper] [code]
  • CheXRelNet: An Anatomy-Aware Model for Tracking Longitudinal Relationships Between Chest X-Rays [paper][code]

Nature Machine Intelligence'22

  • Generalized radiograph representation learning via cross-supervision between images and free-text radiology reports [paper][code]


  • Knowledge matters: Chest radiology report generation with general and specific knowledge [paper] [code]


  • Automated Radiographic Report Generation Purely on Transformer: A Multicriteria Supervised Approach [paper] [[code]]

arXiv papers'22



  • Cross-modal Memory Networks for Radiology Report Generation [paper] [code]


  • Progressive Transformer-Based Generation of Radiology Reports [paper] [code]


  • Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation [paper] [code]



  • When Radiology Report Generation Meets Knowledge Graph [paper] [code]


  • Generating Radiology Reports via Memory-driven Transformer [paper] [code]

Other Resources

  • Learning to Exploit Temporal Structure for Biomedical Vision–Language Processing (CVPR'23) [paper][code]


Feel free to contact me if you find any interesting papers missing.



