Tensions of Standardization and Variation in the Encoding of Ancient Scripts in Unicode

Anshuman Pandey (Michigan)

Digital Classicist London seminar 2018

Friday July 20th at 16:30, in room 234, Senate House, Malet Street, London WC1E 7HU

Livecast on the Digital Classicist London YouTube channel.

The Unicode Standard has transformed scholarship in global classics by providing technologies to represent texts in the languages of the world using their original writing systems. More than 100 scripts have been encoded in Unicode, and there are active efforts to include additional scripts and characters. Encoding a script in Unicode requires determining the distinctiveness and boundaries of the ‘script’, identifying representative forms of characters and stylistic variants, and unifying scribal and regional styles. For most scripts the process is relatively straightforward. For some scripts, however, there is a tension between ‘standardization’ and ‘customization’. From one angle, a given script may be considered a regional variant, or a ‘customization’, of a normative writing system. From another, it may be a distinctive script, a ‘standard’ form in its own right that deserves a separate encoding in Unicode. This talk will address aspects of this tension by illustrating issues related to the Unicode representation of scripts such as Phrygian and Sidetic, with regard to epichoric Greek; Proto-Sinaitic and Proto-Canaanite, in relation to Egyptian Hieroglyphs; and Elymaic and Khwarezmian, as derivations of Imperial Aramaic. The encoding of such customizations of standard scripts will encourage discussion about the requirements for handling global writing systems in Unicode and for representing linguistic data in digital classics and the broader computational humanities.