end2end_neural_el
-----------------
This step requires the entity vectors and the word embeddings to exist.
An essential part of our system is the entity vectors (the equivalent of word embeddings for entities).
You can create your own entity vectors by following the instructions in the [next chapter](#gerbil-evaluation), or you can use
the pretrained ones we provide.
We provide 502,661 pretrained entity vectors.
Specifically,
we trained entity vectors for all the candidate entities from all possible spans of
aida-TestA, aida-TestB, aida-Training 1, ACE2004, AQUAINT, MSNBC, Clueweb, DBpediaSpotlight, Derczynski,
ERD2014, GERDAQ-Dev, GERDAQ-Test, GERDAQ-TrainingA, GERDAQ-TrainingB, KORE50,
Microposts2016-Dev, Microposts2016-Test, Microposts2016-Train, N3-RSS-500, N3-Reuters-128,
OKE 2015 task1, OKE 2016 task1, and the entity relatedness dataset of (Ceccarelli et al., 2013).
In more detail,
this is done by considering every possible span
of the document as a candidate mention and querying our p(e|m) dictionary for the candidate entities
of that span (keeping only the top 30 candidates for each span).
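The candidate-generation step above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the tiny `p_e_m` dictionary, the function name, and the `max_span_len` parameter are all hypothetical stand-ins (the real p(e|m) dictionary is built from large-scale mention-entity count statistics).

```python
# Hypothetical toy p(e|m) dictionary: lowercased mention -> list of
# (entity, prior probability) pairs. The real one is far larger.
p_e_m = {
    "obama": [("Barack_Obama", 0.90), ("Obama,_Fukui", 0.05)],
    "barack obama": [("Barack_Obama", 0.99)],
}

def candidate_entities(tokens, max_span_len=5, top_k=30):
    """Enumerate all token spans up to max_span_len words and, for each
    span found in the p(e|m) dictionary, keep its top_k candidate
    entities ranked by prior probability."""
    candidates = {}
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_span_len, len(tokens)) + 1):
            mention = " ".join(tokens[start:end]).lower()
            if mention in p_e_m:
                ranked = sorted(p_e_m[mention], key=lambda x: -x[1])
                candidates[(start, end)] = ranked[:top_k]
    return candidates

spans = candidate_entities("president barack obama spoke".split())
for (start, end), ents in sorted(spans.items()):
    print((start, end), ents)
```

The union of all entities appearing in some span's top-30 list is the set for which entity vectors are trained.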