Technical Details
Get – analyse – represent & disseminate
To carry out these three phases, we will use eXist-db, a native XML database and application server. Data will be stored and indexed in TEI-XML. Adopting a native XML database and the TEI-XML format from the outset, although less straightforward than other options, will allow us to extend the collected metadata to the full text at any time without a platform migration.
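To illustrate why starting from TEI-XML avoids a later migration, the following minimal sketch adds a transcription to a metadata-only record without touching its header. It uses only the Python standard library, not eXist-db itself. The sample record, its contents, and the helper name add_full_text are hypothetical, and schema validation is not attempted (a schema-valid TEI file also needs a text element from the start).

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)

# Hypothetical metadata-only record: only the TEI header is filled in.
RECORD = f"""<TEI xmlns="{TEI_NS}">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter</title></titleStmt>
      <publicationStmt><p>Unpublished sample</p></publicationStmt>
      <sourceDesc><p>Hypothetical archival item</p></sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""


def add_full_text(tei_root, paragraphs):
    """Append <text><body> with one <p> per paragraph; the header is left as-is."""
    text = ET.SubElement(tei_root, f"{{{TEI_NS}}}text")
    body = ET.SubElement(text, f"{{{TEI_NS}}}body")
    for para in paragraphs:
        ET.SubElement(body, f"{{{TEI_NS}}}p").text = para
    return tei_root


root = ET.fromstring(RECORD)
title_before = root.findtext(f".//{{{TEI_NS}}}title")

# Later stage: the same document is extended with the transcription.
add_full_text(root, ["First transcribed paragraph.", "Second transcribed paragraph."])

title_after = root.findtext(f".//{{{TEI_NS}}}title")
body_paragraphs = [
    p.text for p in root.findall(f"{{{TEI_NS}}}text/{{{TEI_NS}}}body/{{{TEI_NS}}}p")
]
print(ET.tostring(root, encoding="unicode"))
```

The metadata stays where it was and the full text lives beside it in the same file, so queries written against the header should keep working once transcriptions are added.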
For the data collection phase (Get), we will use the ARACNE framework, open source software developed by the Centro di Ateneo delle Biblioteche (CAB) of the University of Naples Federico II for managing and publishing archival document collections in TEI-XML format in eXist-db.
For the subsequent phases (analysis, and representation and dissemination), we will use several open source tools, some embedded in eXist-db and others easily pluggable into it, such as Apache Lucene and Elasticsearch, to aggregate the data.
Apache Jena will be used to publish the results in machine-readable form, to create linked data through the eXist-db API, and to project the data into other formats (e.g. JSON, XML-DC). For human-readable results, we will use ARACNE, which can generate a website for a published archival collection.
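As a rough illustration of what projecting the data into other formats could look like, the sketch below reads a few header fields from a TEI record and emits them as JSON and as simple Dublin Core XML. It does not use Apache Jena or the eXist-db API; it is plain standard-library Python meant only to show the idea. The sample record, the field selection, the unnamespaced metadata wrapper element, and the function names are all assumptions, as is the reading of "XML-DC" as Dublin Core serialised in XML.

```python
import json
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
DC_NS = "http://purl.org/dc/elements/1.1/"
NS = {"tei": TEI_NS}
ET.register_namespace("dc", DC_NS)

# Hypothetical TEI header used only for this example.
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample letter</title>
        <author>Unknown correspondent</author>
      </titleStmt>
      <publicationStmt><date>1850</date></publicationStmt>
      <sourceDesc><p>Hypothetical archival item</p></sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""


def extract_metadata(tei_xml):
    """Pull a small, assumed set of fields out of the TEI header."""
    root = ET.fromstring(tei_xml)
    return {
        "title": root.findtext(".//tei:titleStmt/tei:title", default="", namespaces=NS),
        "creator": root.findtext(".//tei:titleStmt/tei:author", default="", namespaces=NS),
        "date": root.findtext(".//tei:publicationStmt/tei:date", default="", namespaces=NS),
    }


def to_json(meta):
    """Serialise the metadata dictionary as a JSON string."""
    return json.dumps(meta, ensure_ascii=False, sort_keys=True)


def to_dublin_core(meta):
    """Serialise the metadata as dc:* elements inside a plain wrapper element."""
    wrapper = ET.Element("metadata")
    for key, value in meta.items():
        if value:  # skip fields that were missing in the source record
            ET.SubElement(wrapper, f"{{{DC_NS}}}{key}").text = value
    return ET.tostring(wrapper, encoding="unicode")


meta = extract_metadata(SAMPLE)
print(to_json(meta))
print(to_dublin_core(meta))
```

Keeping extraction separate from serialisation means a further target format would need only one more small function over the same dictionary.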