• BAT-Framework: a framework to fairly and fully benchmark text annotators (systems that, given a text document, aim to find the topics the text is about, identified as Wikipedia pages), running in-depth tests that focus on many aspects of a text annotator (currently in alpha stage). [software]
  • TAGME: an annotator system that efficiently and effectively identifies meaningful phrases in a short fragment of text and links them to pertinent Wikipedia pages. This is the first step toward a semantic clustering algorithm for TagMySearch. [paper - software]
  • TagMySearch: a web-snippet clustering meta-search engine developed as part of an Italian MIUR-FIRB Project (currently under development).
  • BC-ZIP: an LZ77 compressor that trades decompression time for compressed space, or vice versa, in a principled way. [paper - software]
  • SnakeT: the previous web-snippet clustering meta-search engine, developed by our group in 2005. [paper]
  • SmallText: a 100% pure Java library designed to efficiently access huge collections of textual files and labeled graphs that are stored in compressed form and at various levels of granularity. [software]
  • Pizza&Chili: a site built in collaboration with Gonzalo Navarro’s group that offers publicly available implementations of compressed (self-)indexes. Each implementation follows a common API that is intended to let any programmer plug the provided compressed indexes into their own software and experiment with their functionality and efficiency. The site also offers a collection of texts and tools for experimenting with and validating compressed indexes. [paper - software]
  • BioPromptBox: a search engine for biological datasets that classifies results to support the user in browsing them and refining the query. [paper - software]