As a contribution to the scientific community working in the field of entity annotation, we developed a framework to compare text annotators: systems that, given a text document, aim to find the entities the text is about, identified as Wikipedia pages. The BAT-Framework, written in Java, comes with a formal framework that defines a set of problems, the way systems can be compared to each other, and a set of measures that, extending classic IR measures, fairly and fully compare the features of entity annotators. The formal framework, which must be understood in order to use the benchmark framework, is presented in this paper published at WWW’13.
- Compares any entity-annotation system in a fair and complete way.
- Provides an implementation for all defined measures and match relations.
- Easily extensible with new systems, new datasets, and new similarity measures.
- Performs extensive testing on any Entity Annotator and any dataset.
- Performs runtime testing.
- Generates gnuplot charts and LaTeX tables summarizing test results.
- Completely open source, distributed under the GPLv3 license.
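
To give a rough idea of what "measures extending classic IR measures" through match relations can look like, here is a minimal Java sketch. It is not the BAT-Framework API: the class `MatchMeasuresSketch`, the `Annotation` type, and the `STRONG` and `WEAK` relations are hypothetical names introduced only for illustration, and the definitions of the two relations are simplified assumptions (the formal definitions are in the paper). The convention of returning 1.0 for an empty list is likewise an assumption.

```java
import java.util.List;
import java.util.function.BiPredicate;

public class MatchMeasuresSketch {

    /** Hypothetical annotation: a mention span linked to a Wikipedia page id. */
    static final class Annotation {
        final int position;    // offset of the mention in the text
        final int length;      // length of the mention
        final int wikipediaId; // id of the Wikipedia page (the entity)

        Annotation(int position, int length, int wikipediaId) {
            this.position = position;
            this.length = length;
            this.wikipediaId = wikipediaId;
        }

        int end() {
            return position + length;
        }
    }

    /** Assumed "strong" relation: identical span and identical entity. */
    static final BiPredicate<Annotation, Annotation> STRONG =
        (a, b) -> a.position == b.position
               && a.length == b.length
               && a.wikipediaId == b.wikipediaId;

    /** Assumed "weak" relation: overlapping spans and identical entity. */
    static final BiPredicate<Annotation, Annotation> WEAK =
        (a, b) -> a.wikipediaId == b.wikipediaId
               && a.position < b.end()
               && b.position < a.end();

    /** Counts the items of 'from' that match at least one item of 'against'. */
    static long countMatched(List<Annotation> from, List<Annotation> against,
                             BiPredicate<Annotation, Annotation> match) {
        return from.stream()
                   .filter(a -> against.stream().anyMatch(b -> match.test(a, b)))
                   .count();
    }

    /** Fraction of system annotations that match a gold annotation. */
    static double precision(List<Annotation> output, List<Annotation> gold,
                            BiPredicate<Annotation, Annotation> match) {
        // Assumption: an empty output is given precision 1.0.
        return output.isEmpty() ? 1.0
             : (double) countMatched(output, gold, match) / output.size();
    }

    /** Fraction of gold annotations that are matched by a system annotation. */
    static double recall(List<Annotation> output, List<Annotation> gold,
                         BiPredicate<Annotation, Annotation> match) {
        // Assumption: an empty gold standard is given recall 1.0.
        return gold.isEmpty() ? 1.0
             : (double) countMatched(gold, output, match) / gold.size();
    }

    /** Harmonic mean of precision and recall (0 when both are 0). */
    static double f1(double p, double r) {
        return (p + r == 0.0) ? 0.0 : 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        List<Annotation> gold = List.of(
            new Annotation(0, 5, 100),
            new Annotation(10, 6, 200));
        List<Annotation> output = List.of(
            new Annotation(0, 5, 100),   // exact match
            new Annotation(11, 5, 200),  // right entity, overlapping span only
            new Annotation(20, 3, 300)); // spurious annotation

        System.out.printf("strong: P=%.2f R=%.2f%n",
            precision(output, gold, STRONG), recall(output, gold, STRONG));
        System.out.printf("weak:   P=%.2f R=%.2f%n",
            precision(output, gold, WEAK), recall(output, gold, WEAK));
    }
}
```

In this toy example the second system annotation counts as correct only under the weak relation, so the same output scores differently depending on the match relation plugged into the measure. That is the kind of parametrization the feature list refers to.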