end2end_neural_el


  • This step requires the entity vectors and the word embeddings to exist. Entity vectors (the equivalent of word embeddings for entities) are an essential part of our system. You can create your own entity vectors by following the instructions in the [next chapter](#gerbil-evaluation), or you can use the provided pretrained ones. We have pretrained 502661 entity vectors. Specifically, we have trained entity vectors for all the candidate entities from all possible spans of aida-TestA, aida-TestB, aida-Training 1, ACE2004, AQUAINT, MSNBC, Clueweb, DBpediaSpotlight, Derczynski, ERD2014, GERDAQ-Dev, GERDAQ-Test, GERDAQ-TrainingA, GERDAQ-TrainingB, KORE50, Microposts2016-Dev, Microposts2016-Test, Microposts2016-Train, N3-RSS-500, N3-Reuters-128, OKE 2015 task1, OKE 2016 task1, and the entity relatedness dataset of (Ceccarelli et al., 2013). In more detail, we treat every possible span of a document as a candidate span and query our p(e|m) dictionary for the candidate entities of that span, keeping only the top 30 for each candidate span.
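
The candidate collection described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `candidate_entities` and `entities_needing_vectors`, the whitespace-joined mention key, and the in-memory dictionary layout of `p_e_m` (mention string mapped to `{entity: probability}`) are all assumptions made for the example. Only the "all possible spans" enumeration and the top-30 cutoff come from the text.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def candidate_entities(
    tokens: List[str],
    p_e_m: Dict[str, Dict[str, float]],
    top_k: int = 30,
) -> Dict[Span, List[Tuple[str, float]]]:
    """Enumerate every span and keep its top_k entities by p(e|m).

    Hypothetical helper: assumes p_e_m maps a mention string to a
    dict of {entity: probability}.
    """
    candidates: Dict[Span, List[Tuple[str, float]]] = {}
    n = len(tokens)
    for start in range(n):
        for end in range(start + 1, n + 1):
            # Assumption: mentions are keyed by space-joined tokens.
            mention = " ".join(tokens[start:end])
            entity_probs = p_e_m.get(mention)
            if not entity_probs:
                continue  # span has no entry in the dictionary
            ranked = sorted(
                entity_probs.items(), key=lambda kv: kv[1], reverse=True
            )
            candidates[(start, end)] = ranked[:top_k]
    return candidates


def entities_needing_vectors(
    candidates: Dict[Span, List[Tuple[str, float]]],
) -> Set[str]:
    """Union of all candidate entities, i.e. those that need a vector."""
    return {entity for ranked in candidates.values() for entity, _ in ranked}
```

Running `candidate_entities` over every document of a dataset and taking the union returned by `entities_needing_vectors` would give the set of entities for which vectors have to be available. Enumerating all spans is quadratic in document length, so a practical implementation would likely cap the span length; the text does not state such a cap, so none is applied here.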