BLM-Roll-BakeE
Description
BLM-Roll-BakeE is an English Blackbird Language Matrices (BLM) dataset designed to test models' ability to learn and generalize rule-like patterns in verb argument structure from minimal data. It focuses on two verb classes: roll-class (causative/inchoative alternation, Theme as subject or object) and bake-class (unspecified object alternation, Agent subject, object optional).
Each instance is a Blackbird Language Matrix: a multiple-choice puzzle where seven context sentences encode a pattern, and the task is to select the correct continuation. Sentences are template-generated based on established verb classifications.
BLM-Roll-BakeE uses analogical paradigms with minimal contextual cues for semantic roles and contrastive distractors based on a principled error taxonomy. Each distractor violates one constraint (semantic roles, syntax, or analogical consistency) while remaining grammatical in isolation. This design enables detailed error analysis of rule acquisition.
The dataset is partitioned along two dimensions: Verb Alternation Class (Levin, 1993), roll vs. bake verbs, and Lexical Variation Type, Type I (minimal variation) vs. Type II (moderate variation).
Each verb class has 3,000 Type I instances and 15,000 Type II instances, with a 90–10 train/test split.
Type I has minimal lexical variation: the same verb lemma with minimal noun variation, which preserves structural and lexical patterns. Type II has moderate lexical variation: different verb lemmas with feature-compatible nouns, which requires generalization.
The four CSV files correspond to these partitions:
- blm_roll_type_i_3000.csv (roll, Type I)
- blm_roll_type_ii_15000.csv (roll, Type II)
- blm_bake_type_i_3000.csv (bake, Type I)
- blm_bake_type_ii_15000.csv (bake, Type II)
The split column in each file indicates the train/test partition.
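As a minimal sketch of how the four files and their train/test partition might be handled, the snippet below maps each (verb class, variation type) pair to its file name and groups rows by the split column. The column name `split` and the helper names `BLM_FILES`, `partition_by_split`, and `load_partitions` are assumptions made for illustration; check the CSV header for the actual column name and values before relying on them.

```python
import csv
from collections import defaultdict

# File names as listed in this description, keyed by
# (verb class, lexical variation type).
BLM_FILES = {
    ("roll", "I"): "blm_roll_type_i_3000.csv",
    ("roll", "II"): "blm_roll_type_ii_15000.csv",
    ("bake", "I"): "blm_bake_type_i_3000.csv",
    ("bake", "II"): "blm_bake_type_ii_15000.csv",
}


def partition_by_split(rows, split_column="split"):
    """Group row dicts by the (lowercased) value of the split column.

    The column name "split" is an assumption; no particular set of
    partition labels is assumed.
    """
    parts = defaultdict(list)
    for row in rows:
        parts[row[split_column].strip().lower()].append(row)
    return dict(parts)


def load_partitions(path, split_column="split"):
    """Read one CSV file and return its rows grouped by partition."""
    with open(path, newline="", encoding="utf-8") as handle:
        return partition_by_split(csv.DictReader(handle), split_column)
```

For example, `load_partitions(BLM_FILES[("roll", "I")])` would return a dictionary whose keys are whatever partition labels the file actually uses.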
For each combination (roll/bake × Type I/II), a CSV file contains all generated instances, enriched with:
- context sentences (`Sent_1..Sent_7`),
- corresponding templates,
- candidate answers (`Answer_1..Answer_7`) with canonical ordering,
- error labels and Boolean correctness flags,
- and a split column marking whether the instance belongs to the training or test subset used in the paper.
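The column layout above can be turned into a puzzle instance roughly as follows. This is a sketch, assuming the columns are literally named `Sent_1` through `Sent_7` and `Answer_1` through `Answer_7`; the helper `extract_instance` is hypothetical. The names of the template, error-label, and correctness-flag columns are not given here, so the sketch leaves them untouched.

```python
def extract_instance(row, n_context=7, n_answers=7):
    """Return (context sentences, candidate answers) from one CSV row.

    `row` is a dict such as those produced by csv.DictReader.
    Column names Sent_i and Answer_i are assumed from the description.
    """
    context = [row[f"Sent_{i}"] for i in range(1, n_context + 1)]
    candidates = [row[f"Answer_{i}"] for i in range(1, n_answers + 1)]
    return context, candidates


# Placeholder row for illustration only; not real dataset content.
example_row = {f"Sent_{i}": f"context {i}" for i in range(1, 8)}
example_row.update({f"Answer_{i}": f"candidate {i}" for i in range(1, 8)})

context, candidates = extract_instance(example_row)
```

Keeping the context and candidates as ordered lists preserves the canonical ordering of the answers, which matters if the correct candidate is later identified by position.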
Reference
If you use this dataset, please cite the following publication:
Jiang, C., & Merlo, P. (2025). Analogical Structure, Minimal Contextual Cues and Contrastive Distractors: Input Design for Sample-Efficient Linguistic Rule Induction. arXiv preprint arXiv:2511.10441.