PMC-Patients

A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems

About

PMC-Patients defines two tasks for benchmarking Retrieval-based Clinical Decision Support (ReCDS) systems: Patient-to-Article Retrieval (PAR) and Patient-to-Patient Retrieval (PPR). Given a query patient, PAR retrieves relevant articles from PubMed, and PPR retrieves similar patients from PMC-Patients.
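
To make the task setup concrete, here is a minimal sketch of a lexical retriever in the spirit of the BM25 baseline: it takes a query patient summary and ranks a candidate pool by BM25 score. The whitespace tokenizer, the parameter values `k1=1.2` and `b=0.75`, and the toy candidate IDs and texts are assumptions made for illustration; they are not taken from the dataset or from the benchmark's actual baseline implementation.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a simplification for illustration)."""
    return text.lower().split()


def bm25_rank(query, corpus, k1=1.2, b=0.75):
    """Rank candidate IDs by descending BM25 score for the query.

    corpus: dict mapping a candidate ID to its text (hypothetical format).
    """
    docs = {cid: tokenize(text) for cid, text in corpus.items()}
    n_docs = len(docs)
    avg_len = sum(len(toks) for toks in docs.values()) / n_docs

    # Document frequency: number of candidates containing each term.
    df = Counter()
    for toks in docs.values():
        df.update(set(toks))

    query_terms = set(tokenize(query))
    scores = {}
    for cid, toks in docs.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores[cid] = score
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)


# Toy candidate pool; IDs and texts are invented for this example.
candidates = {
    "p1": "fever cough and shortness of breath after travel",
    "p2": "fracture of the left femur after a fall",
    "p3": "persistent cough with fever and chest pain",
}
ranking = bm25_rank("elderly patient with fever and cough", candidates)
print(ranking)  # ['p3', 'p1', 'p2']
```

In PPR the candidate pool would be other patient summaries, and in PAR it would be PubMed articles; the ranking interface is the same in both cases.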

For more details about PMC-Patients, please refer to our paper (arXiv:2202.13876).

Dataset & Submission

PMC-Patients contains 167k patient summaries collected from PubMed Central, annotated with 3.1M relevant articles and 293k similar patients defined by PubMed citation relationships.

Please visit our GitHub repository to download the dataset and submit your model.

Patient-to-Article Retrieval (PAR) Leaderboard

| Rank | Date | Model | Institution | Reference | MRR (%) | P@10 (%) | nDCG@10 (%) | R@1k (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | June 25, 2023 | DPR (SciMult-MHAExpert) | University of Illinois at Urbana-Champaign | (Zhang et al. 2023) | 64.44 | 22.12 | 28.62 | 69.09 |
| 2 | Apr 5, 2023 | DPR (PubMedBERT) | Tsinghua University | (Zhao et al. 2022) | 42.96 | 16.08 | 19.51 | 63.40 |
| 3 | Apr 5, 2023 | DPR (SPECTER) | Tsinghua University | (Zhao et al. 2022) | 46.41 | 15.59 | 19.70 | 57.98 |
| 4 | Apr 5, 2023 | DPR (BioLinkBERT) | Tsinghua University | (Zhao et al. 2022) | 40.89 | 15.33 | 18.47 | 62.44 |
| 5 | Apr 5, 2023 | BM25 | Tsinghua University | (Zhao et al. 2022) | 48.22 | 9.97 | 15.28 | 30.64 |
| 6 | Dec 16, 2021 | Contriever | Meta AI Research | (Izacard et al. 2021) | 15.03 | 3.41 | 4.62 | 16.74 |
| 7 | Aug 14, 2019 | Sentence-BERT | UKP-TUDA | (Reimers et al. 2019) | 10.58 | 2.71 | 3.53 | 13.52 |

Patient-to-Patient Retrieval (PPR) Leaderboard

| Rank | Date | Model | Institution | Reference | MRR (%) | P@10 (%) | nDCG@10 (%) | R@1k (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | June 25, 2023 | DPR (SciMult-MHAExpert) | University of Illinois at Urbana-Champaign | (Zhang et al. 2023) | 25.35 | 6.65 | 22.39 | 83.78 |
| 2 | Apr 5, 2023 | BM25 | Tsinghua University | (Zhao et al. 2022) | 22.86 | 4.67 | 18.29 | 69.66 |
| 3 | Apr 5, 2023 | DPR (BioLinkBERT) | Tsinghua University | (Zhao et al. 2022) | 21.20 | 5.59 | 18.06 | 80.49 |
| 4 | Apr 5, 2023 | DPR (PubMedBERT) | Tsinghua University | (Zhao et al. 2022) | 19.37 | 5.05 | 16.30 | 79.35 |
| 5 | Apr 5, 2023 | DPR (SPECTER) | Tsinghua University | (Zhao et al. 2022) | 15.08 | 3.79 | 12.27 | 73.01 |
| 6 | Dec 16, 2021 | Contriever | Meta AI Research | (Izacard et al. 2021) | 10.50 | 2.24 | 8.01 | 52.64 |
| 7 | Aug 14, 2019 | Sentence-BERT | UKP-TUDA | (Reimers et al. 2019) | 5.28 | 1.17 | 3.88 | 37.55 |
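
For readers unfamiliar with the four leaderboard metrics, the sketch below computes MRR, P@10, nDCG@10, and R@1k from ranked lists using their standard textbook definitions. It assumes binary relevance, rankings without duplicate IDs, and a hypothetical input format (`run` maps each query ID to a ranked list of candidate IDs, `qrels` maps each query ID to a set of relevant IDs). The official evaluation code may differ in details such as graded relevance for nDCG, so treat this as an illustration rather than the benchmark's scorer.

```python
import math


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, cid in enumerate(ranked, start=1):
        if cid in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(cid in relevant for cid in ranked[:k]) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    return sum(cid in relevant for cid in ranked[:k]) / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """nDCG with binary gains: DCG of the ranking over DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, cid in enumerate(ranked[:k], start=1)
        if cid in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def evaluate(run, qrels, k=10, recall_k=1000):
    """Average each metric over all queries in qrels and report percentages."""
    totals = {"MRR": 0.0, f"P@{k}": 0.0, f"nDCG@{k}": 0.0, f"R@{recall_k}": 0.0}
    for qid, relevant in qrels.items():
        ranked = run.get(qid, [])  # a missing query scores 0 on every metric
        totals["MRR"] += reciprocal_rank(ranked, relevant)
        totals[f"P@{k}"] += precision_at_k(ranked, relevant, k)
        totals[f"nDCG@{k}"] += ndcg_at_k(ranked, relevant, k)
        totals[f"R@{recall_k}"] += recall_at_k(ranked, relevant, recall_k)
    return {name: 100.0 * total / len(qrels) for name, total in totals.items()}


# Toy example with invented IDs: one query, two relevant candidates.
run = {"q1": ["a", "b", "c", "d"]}
qrels = {"q1": {"b", "d"}}
scores = evaluate(run, qrels)
```

Here the first relevant item sits at rank 2, so MRR is 50%; two of the nominal ten slots are relevant, so P@10 is 20%; and both relevant items are retrieved, so R@1k is 100%.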

Citation

If you use PMC-Patients in your research, please cite our paper:

@misc{zhao2022pmcpatients,
  title={PMC-Patients: A Large-scale Dataset of Patient Notes and Relations Extracted from Case Reports in PubMed Central}, 
  author={Zhengyun Zhao and Qiao Jin and Sheng Yu},
  year={2022},
  eprint={2202.13876},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}