RACE Reading Comprehension Dataset

The RACE dataset is a large-scale ReAding Comprehension dataset collected from English Examinations created for middle school and high school students.

Report your results: If you have new results, please send Qizhe (qizhex@cs.cmu.edu) or Guokun (guokun@cs.cmu.edu) an email with the link to your paper!

Leaderboard

Model Report Time Institute RACE RACE-M RACE-H
Human Ceiling Performance Apr. 2017 CMU 94.5 95.4 94.2
Amazon Mechanical Turker Apr. 2017 CMU 73.3 85.1 69.4
ALBERT (ensemble) Sept. 26th 2019 Google Research & TTIC 89.4 91.2 88.6
ALBERT Sept. 26th 2019 Google Research & TTIC 86.5 89.0 85.5
RoBERTa + MMM Oct. 1st 2019 MIT & Amazon Alexa AI 85.0 89.1 83.3
DCMN+ (ensemble) Sept. 2019 SJTU & CloudWalk 84.1 88.5 82.3
RoBERTa July 2019 Facebook AI 83.2 86.5 81.8
XLNet + DCMN+ August 2019 SJTU & CloudWalk 82.8 86.5 81.3
XLNet June 2019 Google Brain & CMU 81.75 85.45 80.21
BERT + DCMN+ August 2019 SJTU & CloudWalk 75.8 79.3 74.4
Dual Co-Matching Network (DCMN) (ensemble) Mar. 2019 SJTU & CloudWalk 74.1 79.5 71.8
Option Comparison Network (OCN) (ensemble) Mar. 2019 Pattern Recognition Center, WeChat AI, Tencent Inc 73.5 78.4 71.5
Dual Co-Matching Network (DCMN) Mar. 2019 SJTU & CloudWalk 72.3 77.6 70.1
BERT_LARGE Feb. 2019 Tencent AI Lab 72.0 76.6 70.1
Option Comparison Network (OCN) Mar. 2019 Pattern Recognition Center, WeChat AI, Tencent Inc 71.7 76.7 69.6
BERT_LARGE Jan. 2019 River Valley High School, Singapore 67.9 75.6 64.7
Reading Strategies Model (ensemble) Oct. 2018 Tencent AI Lab & Cornell 66.7 72.0 64.5
BERT_BASE Jan. 2019 River Valley High School, Singapore 65.0 71.7 62.3
Reading Strategies Model Oct. 2018 Tencent AI Lab & Cornell 63.8 69.2 61.5
GPT June 2018 OpenAI 59.0 62.9 57.4
Convolutional Spatial Attention (ensemble) Nov. 2018 Joint Laboratory of HIT and iFLYTEK Research 55.0 56.8 54.8
BiAttention (MRU) (ensemble) Mar. 2018 Nanyang Technological University & Institute for Infocomm Research 53.3 60.2 50.3
Dynamic Fusion Networks (ensemble) Nov. 2017 MSR & CMU 51.2 55.6 49.4
Convolutional Spatial Attention Nov. 2018 Joint Laboratory of HIT and iFLYTEK Research 50.9 52.2 50.3
BiAttention (MRU) Mar. 2018 Nanyang Technological University & Institute for Infocomm Research 50.4 57.7 47.4
Hierarchical Co-Matching June 2018 Singapore Management University & IBM Research 50.4 55.8 48.2
Dynamic Fusion Networks Nov. 2017 MSR & CMU 47.4 51.5 45.7
ElimiNet (ensemble) Oct. 2017 IIT Madras 46.5 N/A N/A
Hierarchical Attention Flow Feb. 2018 Microsoft Research Asia & Harbin Institute of Technology 46.0 45.0 46.4
Gated Attention Reader* (ensemble) Oct. 2017 CMU 45.9 N/A N/A
ElimiNet Oct. 2017 IIT Madras 44.5 N/A N/A
Gated Attention Reader* Apr. 2017 CMU 44.1 43.7 44.2
Stanford Attentive Reader* Apr. 2017 CMU 43.3 44.2 43.0
Sliding Window* Apr. 2017 CMU 32.2 37.3 30.4

* The link points not to the model paper but to the paper that tests the corresponding model on RACE.
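
The three score columns in the leaderboard (RACE, RACE-M, RACE-H) appear to be related: the overall RACE accuracy looks like a question-weighted average of the middle school (RACE-M) and high school (RACE-H) accuracies. The sketch below illustrates that relationship. The test-set sizes (1,436 and 3,498 questions) are an assumption not stated on this page, and the function name `overall_accuracy` is hypothetical; check the dataset paper for the exact split sizes before relying on them.

```python
# Assumed test-set sizes (number of questions); these are NOT stated on this page.
RACE_M_TEST_QUESTIONS = 1436  # middle school subset (assumption)
RACE_H_TEST_QUESTIONS = 3498  # high school subset (assumption)


def overall_accuracy(acc_m: float, acc_h: float,
                     n_m: int = RACE_M_TEST_QUESTIONS,
                     n_h: int = RACE_H_TEST_QUESTIONS) -> float:
    """Return the question-weighted average of two subset accuracies (in percent)."""
    return (acc_m * n_m + acc_h * n_h) / (n_m + n_h)


if __name__ == "__main__":
    # ALBERT (single model) row from the leaderboard: RACE-M 89.0, RACE-H 85.5.
    print(round(overall_accuracy(89.0, 85.5), 1))
```

Under these assumed sizes, the ALBERT single-model row gives a value that rounds to the reported 86.5. Other rows may differ slightly in the last digit because the published subset scores are themselves rounded.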


Test your model on RACE

Why is RACE more challenging and interesting?

Useful resources to get you started

Data
Baseline code
Dataset paper