The classical bag-of-words paradigm represents an input text merely as the set of words that compose it. Since this paradigm completely ignores the meaning of the words, as well as the synonymy, polysemy, and similarity between them, the performance of downstream applications can be dramatically penalized.
To overcome these limitations, A³ Lab developed a suite of algorithmic and artificial intelligence software tools for the efficient and effective semantic annotation (also referred to in the literature as entity linking) of natural language text with Wikipedia entities (pages). A number of results have shown that this annotation is extremely powerful: not only does it provide a deeper contextualization of the input text, but it also enables machines to effectively understand the natural language text as a small piece of the whole of human knowledge.
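To make the idea of semantic annotation concrete, the following is a minimal sketch of what an annotation looks like: a span of the input text (the mention) paired with a Wikipedia page title (the entity). It is not the method used by the A³ Lab tools. The `Annotation` type, the `annotate` function, and the hand-written `TOY_DICTIONARY` are assumptions made for illustration, and the sketch does plain dictionary lookup without any of the disambiguation that a real entity linker performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int    # character offset where the mention begins
    end: int      # character offset just past the mention
    mention: str  # surface form as it appears in the text
    entity: str   # Wikipedia page title the mention is linked to


# Hypothetical toy dictionary mapping lowercase surface forms to
# Wikipedia page titles. A real annotator derives its candidates from
# Wikipedia itself and must choose among ambiguous senses.
TOY_DICTIONARY = {
    "maradona": "Diego_Maradona",
    "argentina": "Argentina",
    "world cup": "FIFA_World_Cup",
}


def annotate(text, dictionary=TOY_DICTIONARY):
    """Return every dictionary match in `text`, ordered by position.

    Assumes ASCII-like text, where lowercasing preserves string length.
    """
    lowered = text.lower()
    annotations = []
    for surface, entity in dictionary.items():
        start = lowered.find(surface)
        while start != -1:
            end = start + len(surface)
            annotations.append(Annotation(start, end, text[start:end], entity))
            start = lowered.find(surface, end)
    return sorted(annotations, key=lambda a: a.start)


if __name__ == "__main__":
    for a in annotate("Maradona won the World Cup with Argentina"):
        print(a.mention, "->", a.entity)
```

The sketch deliberately leaves out the hard part of the task: a mention such as "Maradona" or "World Cup" can refer to several Wikipedia pages, and choosing the right one depends on the surrounding context.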
Another key to the success of A³ Lab is the introduction of a new representation that enhances the bag-of-words representation with a graph of entities (concepts) derived from the semantic annotation of the input text. Thanks to this representation, machines can exploit the interconnections present in the underlying Knowledge Graph to infer information that is not explicitly stated in the input text and to enrich the text with it.
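As a rough illustration of such an enriched representation, the sketch below combines a bag of words with a small graph over the entities annotated in a text, and collects neighbouring entities that the text never mentions. It is a toy under stated assumptions, not the A³ Lab implementation: `TOY_KG` is a hand-written stand-in for a real Knowledge Graph, `build_representation` is a hypothetical helper, and the list of annotated entities is supplied by hand.

```python
import re
from collections import Counter

# Hypothetical toy knowledge graph: entity -> set of related entities.
TOY_KG = {
    "Diego_Maradona": {"Argentina", "FIFA_World_Cup", "S.S.C._Napoli"},
    "Argentina": {"Diego_Maradona", "Buenos_Aires"},
    "FIFA_World_Cup": {"Diego_Maradona", "FIFA"},
}


def build_representation(text, entities, kg=TOY_KG):
    """Combine a bag of words with a graph over the annotated entities."""
    # Classical bag of words: lowercase word counts.
    bag = Counter(re.findall(r"\w+", text.lower()))

    annotated = set(entities)

    # Undirected edges between annotated entities that the graph relates.
    edges = set()
    for entity in annotated:
        for neighbour in kg.get(entity, ()):
            if neighbour in annotated:
                edges.add(tuple(sorted((entity, neighbour))))

    # Entities reachable in one hop but absent from the text:
    # information that is not explicitly stated in the input.
    inferred = set()
    for entity in annotated:
        inferred |= set(kg.get(entity, ()))
    inferred -= annotated

    return {
        "bag_of_words": bag,
        "entities": annotated,
        "edges": edges,
        "inferred": inferred,
    }


if __name__ == "__main__":
    text = "Maradona won the World Cup with Argentina"
    linked = ["Diego_Maradona", "FIFA_World_Cup", "Argentina"]
    representation = build_representation(text, linked)
    print(sorted(representation["edges"]))
    print(sorted(representation["inferred"]))
```

In this toy example the text never mentions Naples or Buenos Aires, yet both appear among the inferred entities because the graph links them to entities that were annotated.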
The A³ Lab software suite is publicly available here and, since its official launch in 2015, it has already served more than 3 million textual queries.
Software & Datasets
Marco Cornolti, Paolo Ferragina, Massimiliano Ciaramita, Stefan Rüd, Hinrich Schütze: SMAPH: A Piggyback Approach for Entity-Linking in Web Queries. ACM Trans. Inf. Syst. 37(1): 13:1-13:42 (2019)
Marco Ponza, Paolo Ferragina, Soumen Chakrabarti: A Two-Stage Framework for Computing Entity Relatedness in Wikipedia. CIKM 2017: 1867-1876
Ricardo Usbeck, Michael Röder, Axel-Cyrille Ngonga Ngomo, Ciro Baron, Andreas Both, Martin Brümmer, Diego Ceccarelli, Marco Cornolti, Didier Cherix, Bernd Eickmann, Paolo Ferragina, Christiane Lemke, Andrea Moro, Roberto Navigli, Francesco Piccinno, Giuseppe Rizzo, Harald Sack, René Speck, Raphaël Troncy, Jörg Waitelonis, Lars Wesemann: GERBIL: General Entity Annotator Benchmarking Framework. WWW 2015: 1133-1143
Francesco Piccinno, Paolo Ferragina: From TagME to WAT: a new entity annotator. ERD@SIGIR 2014: 55-62
Marco Cornolti, Paolo Ferragina, Massimiliano Ciaramita: A framework for benchmarking entity-annotation systems. WWW 2013: 249-260
Paolo Ferragina, Ugo Scaiella: Fast and Accurate Annotation of Short Texts with Wikipedia Pages. IEEE Software 29(1): 70-75 (2012)
Paolo Ferragina, Ugo Scaiella: TagMe: on-the-fly annotation of short text fragments (by Wikipedia entities). CIKM 2010: 1625-1628