OpenITI / KITAB Corpus and Text Reuse Data

The Open Islamicate Texts Initiative (OpenITI) is a multi-institutional effort to develop the digital infrastructure for the study of Islamicate cultures. It is led by researchers at the Aga Khan University’s Institute for the Study of Muslim Civilisations in London, the Roshan Institute for Persian Studies at the University of Maryland, College Park, and Universität Hamburg.

Since its founding in 2016, OpenITI's work has focused on the tasks necessary to build digital capacity in Islamicate studies, including improving Arabic-script optical character recognition (OCR) and handwritten text recognition (HTR), developing robust Arabic-script standards for OCR and HTR output and text encoding, and creating platforms for the collaborative creation of Islamicate text corpora and digital editions.

OpenITI's secondary focus comes out of our OCR and HTR work: we want to create a machine-actionable and standards-compliant scholarly corpus of Islamicate texts, covering an ever-increasing number of Persian, Arabic, Ottoman Turkish, and Urdu works. We will make these works available in a variety of formats (plaintext, OpenITI mARkdown, TEI XML) and enrich them with as much verified metadata as possible. Please see the OpenITI corpus project page for more information.

https://openiti.org/about.html

The KITAB project is developing methods that detect how authors copied from previous works. Arabic authors frequently made use of past works, cutting them into pieces and reconstituting them to address their own outlooks and concerns. We are working to discover relationships between these texts and also the profoundly intertextual circulatory systems in which they sit.

https://kitab-project.org/about/
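
To make the idea of text reuse detection concrete, here is a minimal sketch that compares two passages by the word n-grams they share. It is an illustration only and is not the KITAB project's actual method or pipeline; the function names (`word_ngrams`, `reuse_score`), the whitespace tokenization, and the n-gram length are all assumptions chosen for simplicity. Real Arabic-script text would also need normalization (for example of diacritics and letter variants), which this sketch omits.

```python
from typing import Set, Tuple


def word_ngrams(text: str, n: int = 5) -> Set[Tuple[str, ...]]:
    """Return the set of word n-grams in `text` (whitespace tokenization)."""
    tokens = text.split()
    # If the text has fewer than n tokens, the range is empty and so is the set.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def reuse_score(earlier: str, later: str, n: int = 5) -> float:
    """Share of n-grams that the two texts have in common.

    The count of shared n-grams is divided by the n-gram count of the
    shorter text, so a short passage copied whole into a long work
    still scores highly. Returns 0.0 if either text is too short.
    """
    grams_earlier = word_ngrams(earlier, n)
    grams_later = word_ngrams(later, n)
    if not grams_earlier or not grams_later:
        return 0.0
    shared = grams_earlier & grams_later
    return len(shared) / min(len(grams_earlier), len(grams_later))


if __name__ == "__main__":
    # Hypothetical toy passages; letters stand in for words.
    earlier_text = "a b c d e f g"
    later_text = "x y a b c d e z"
    print(reuse_score(earlier_text, later_text, n=3))
```

A score near 1 suggests that most of the shorter passage reappears verbatim in the other text, while a score near 0 suggests little direct overlap. Exact n-gram matching misses paraphrase and spelling variation, so it should be read as a rough first signal rather than evidence of copying.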