Clotho: Network Analysis and Distant Reading on Perseus Latin Corpus

Thibault Clérice (King’s College London)

Digital Classicist London & Institute of Classical Studies seminar 2014

Friday July 18th at 16:30, in Room G34, Senate House, Malet Street, London WC1E 7HU

Video recording of seminar (MP4)

Audio recording of seminar (MP3)

Presentation (PDF)

Clotho was born from a simple need: doing distant reading on a Latin corpus. Why should we do distant reading? There are many reasons, including word sense induction, sentiment analysis and network analysis, which were the ones we were looking for.

Though we think sentiment analysis stricto sensu is not possible in Latin, we aim to provide a way to approach it quantitatively through context and network analysis. This does not replace conventional scholarship, where an expert reader parses a text and pronounces on the nature of its sentiment. Rather, we think that the proper nature of the Latin exemplum and auctoritas is liable to support interpretation by polarizing words according to their strict meaning(s).
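To give a concrete idea of what context and network analysis can mean here, the following is a minimal sketch and not Clotho's actual code. It assumes the corpus has already been split into lemmatised sentences, and it links two lemmas whenever they occur within a small window of each other. The function names, the window size and the toy sample are all hypothetical.

```python
from collections import Counter


def cooccurrence_edges(sentences, window=3):
    """Count unordered pairs of distinct tokens found within `window` tokens of each other."""
    edges = Counter()
    for tokens in sentences:
        for i, left in enumerate(tokens):
            # Look only at the next `window` tokens to the right of the current one.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    edges[tuple(sorted((left, right)))] += 1
    return edges


def neighbours(edges, word):
    """Return the words linked to `word` with their edge weights, strongest first."""
    linked = Counter()
    for (a, b), weight in edges.items():
        if a == word:
            linked[b] += weight
        elif b == word:
            linked[a] += weight
    return linked.most_common()


# Toy, pre-lemmatised sample (hypothetical, not drawn from the Perseus data).
sample = [
    ["amor", "puella", "lascivus"],
    ["amor", "puella", "pulcher"],
    ["bellum", "miles", "gladius"],
]
edges = cooccurrence_edges(sample, window=2)
```

Calling `neighbours(edges, "amor")` then lists the lemmas that share a context with "amor". A weighted edge list of this kind is the sort of result that could be exported for annotation and cleaning by a team or a crowd.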

The Clotho project is divided into two tools. The first is a piece of Python software, which enables the distant reading of the corpus and the export of its results. The other is a PHP platform which enables a team or a crowd to annotate and clean the results. Both are relatively easy to install and fully open source, so new functionalities of wide application can be added.

For this project, there have been two outputs:

  • Cicero’s Network, which was realised with an earlier version of Clotho’s Python and Web interface
  • Lasciva Roma, a project which will be launched on the 25th of March 2013

As we seek to present the tool itself for the Cerch Spring Seminar series, we think we should focus on the results of the crowdsourcing and on how this kind of tool should evolve to succeed in both usability and usage.

On this matter, we would like to discuss the sustainability of this kind of project. Being divided into two parts that rely on different platforms, languages and dependencies is one part of the problem, though we think the major issue with Clotho is its dependency on a specific data format issued by the Perseus Hopper.


