# MAGPIE corpus

This is the **MAGPIE corpus**, a large sense-annotated corpus of potentially idiomatic expressions (PIEs), based on the British National Corpus (BNC). Potentially idiomatic expressions are like idiomatic expressions, but the term also covers literal uses of idiomatic expressions, such as 'I leave work *at the end of the day*.' for the idiom 'at the end of the day'. The corpus contains 56,622 instances, covering 1,756 different idiom types, all of which have crowdsourced meaning labels. For details, see our [LREC paper](http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.35.pdf).