RACE Reading Comprehension Dataset

RACE is a large-scale ReAding Comprehension dataset collected from English Examinations designed for middle school and high school students.

Report your results: If you have new results, please send Qizhe (qizhex@cs.cmu.edu) or Guokun (guokun@cs.cmu.edu) an email with the link to your paper!

Leaderboard

Model | Report Time | Institute | RACE | RACE-M | RACE-H
Human Ceiling Performance | Apr 15, 2017 | CMU | 94.5 | 95.4 | 94.2
Amazon Mechanical Turker | Apr 15, 2017 | CMU | 73.3 | 85.1 | 69.4
ALBERT-SingleChoice + transfer learning (ensemble) | Nov 06, 2020 | Tencent Cloud Xiaowei & Tencent Cloud TI-ONE | 91.4 | 93.6 | 90.5
Megatron-BERT (ensemble) | Mar 13, 2020 | NVIDIA Research | 90.9 | 93.1 | 90.0
ALBERT-SingleChoice + transfer learning | Nov 06, 2020 | Tencent Cloud Xiaowei & Tencent Cloud TI-ONE | 90.7 | 92.8 | 89.8
ALBERT + DUMA (ensemble) | Mar 18, 2020 | SJTU & Huawei Noah’s Ark Lab | 89.8 | 92.6 | 88.7
Megatron-BERT | Mar 13, 2020 | NVIDIA Research | 89.5 | 91.8 | 88.6
ALBERT (ensemble) | Sep 26, 2019 | Google Research & TTIC | 89.4 | 91.2 | 88.6
UnifiedQA | May 02, 2020 | AI2 & UW | 89.4 | - | -
ALBERT + DUMA | Feb 08, 2020 | SJTU & Huawei Noah’s Ark Lab | 88.0 | 90.9 | 86.7
T5* | May 02, 2020 | Google | 87.1 | - | -
ALBERT | Sep 26, 2019 | Google Research & TTIC | 86.5 | 89.0 | 85.5
RoBERTa + MMM | Oct 01, 2019 | MIT & Amazon Alexa AI | 85.0 | 89.1 | 83.3
DCMN+ (ensemble) | Aug 30, 2019 | SJTU & CloudWalk | 84.1 | 88.5 | 82.3
RoBERTa | Jul 26, 2019 | Facebook AI | 83.2 | 86.5 | 81.8
XLNet + DCMN+ | Aug 30, 2019 | SJTU & CloudWalk | 82.8 | 86.5 | 81.3
BORT | Oct 20, 2020 | Amazon Alexa | 82.2 | 85.9 | 80.7
XLNet | Jun 19, 2019 | Google Brain & CMU | 81.75 | 85.45 | 80.21
BERT + DCMN+ | Aug 30, 2019 | SJTU & CloudWalk | 75.8 | 79.3 | 74.4
GenNet | Mar 03, 2020 | AIT, Pune | 75.4 | 77.3 | 79.6
Dual Co-Matching Network (DCMN) (ensemble) | Mar 16, 2019 | SJTU & CloudWalk | 74.1 | 79.5 | 71.8
Option Comparison Network (OCN) (ensemble) | Mar 07, 2019 | Pattern Recognition Center, WeChat AI, Tencent Inc | 73.5 | 78.4 | 71.5
Dual Co-Matching Network (DCMN) | Mar 16, 2019 | SJTU & CloudWalk | 72.3 | 77.6 | 70.1
BERT_LARGE | Feb 03, 2019 | Tencent AI Lab | 72.0 | 76.6 | 70.1
Option Comparison Network (OCN) | Mar 07, 2019 | Pattern Recognition Center, WeChat AI, Tencent Inc | 71.7 | 76.7 | 69.6
BERT_LARGE | Jan 23, 2019 | River Valley High School, Singapore | 67.9 | 75.6 | 64.7
Reading Strategies Model (ensemble) | Oct 31, 2018 | Tencent AI Lab & Cornell | 66.7 | 72.0 | 64.5
BERT_BASE | Jan 23, 2019 | River Valley High School, Singapore | 65.0 | 71.7 | 62.3
Reading Strategies Model | Oct 31, 2018 | Tencent AI Lab & Cornell | 63.8 | 69.2 | 61.5
GPT | Jun 11, 2018 | OpenAI | 59.0 | 62.9 | 57.4
Convolutional Spatial Attention (ensemble) | Nov 21, 2018 | Joint Laboratory of HIT and iFLYTEK Research | 55.0 | 56.8 | 54.8
BiAttention (MRU) (ensemble) | Mar 24, 2018 | Nanyang Technological University & Institute for Infocomm Research | 53.3 | 60.2 | 50.3
Dynamic Fusion Networks (ensemble) | Nov 14, 2017 | MSR & CMU | 51.2 | 55.6 | 49.4
Convolutional Spatial Attention | Nov 21, 2018 | Joint Laboratory of HIT and iFLYTEK Research | 50.9 | 52.2 | 50.3
BiAttention (MRU) | Mar 24, 2018 | Nanyang Technological University & Institute for Infocomm Research | 50.4 | 57.7 | 47.4
Hierarchical Co-Matching | Jun 11, 2018 | Singapore Management University & IBM Research | 50.4 | 55.8 | 48.2
Dynamic Fusion Networks | Nov 14, 2017 | MSR & CMU | 47.4 | 51.5 | 45.7
GAR + ElimiNet (ensemble) | Jul 13, 2018 | IIT Madras | 47.2 | 47.4 | 47.4
Hierarchical Attention Flow | Feb 02, 2018 | Microsoft Research Asia & Harbin Institute of Technology | 46.0 | 45.0 | 46.4
ElimiNet | Jul 13, 2018 | IIT Madras | 44.5 | 44.4 | 44.5
Gated Attention Reader* | Apr 15, 2017 | CMU | 44.1 | 43.7 | 44.2
Stanford Attentive Reader* | Apr 15, 2017 | CMU | 43.3 | 44.2 | 43.0
Sliding Window* | Apr 15, 2017 | CMU | 32.2 | 37.3 | 30.4

*: The link points to the paper that tests the method on RACE.


Test your model on RACE

Why is RACE more challenging and interesting?

Useful resources to get you started

Data
Baseline code
Dataset paper