Closing the Gap in Non-Latin-Script Data: Pragmatic Approaches for Increasing Awareness
https://zenodo.org/records/10698504
Multilingual and multi-script research in the Digital Humanities (DH) is still located at the margins of the field. Most projects are either focused on Latin-script materials or need to rely on Latin-script-based solutions. As a result, DH researchers often have to build individual workarounds to mitigate the lack of support for non-Latin scripts (NLS). There is, furthermore, a severe lack of visibility and topic-related networks, and often no proper support group, for DH practitioners working with NLS materials. Thus, Multilingual Digital Humanities (MLDH) topics remain marginalized, to the point that even projects active in this area have no comprehensive overview of it.
The purpose of the project “Closing the Gap in Non-Latin Script Data” ( https://m-l-d-h.github.io/Closing-The-Gap-In-Non-Latin-Script-Data/ ) is to intervene here to improve the visibility of research relating to non-Latin scripts in the Digital Humanities, particularly in Germany, and to enhance collaboration among projects dealing with Arabic and similar languages. This serves three goals. First, we want to provide an overview of the state of the field in Germany and thereby draw further attention to projects on the margins of German DH. Second, we seek to promote communication and exchange among researchers, which will facilitate the necessary advancements in the field. Third, on a political level, we aim to challenge the hegemony of English (and other European languages) in DH. The dominance of the mono-cultural view and the need to address it critically has been articulated by many researchers (see, e.g., Fiormonte 2012); however, lack of visibility, difficulty in accessing sources, and inadequacy of research tools are still the daily reality in research on non-European languages—particularly those written in non-Latin scripts.
The central goal of our effort is thus to raise awareness of the field and to give an overview of projects, initiatives, infrastructures, methods, and workflows that are being implemented in Multilingual DH in Germany. We also have the opportunity to leverage the data and expertise that we acquire during the course of this project to establish a set of best practices and guidelines for researchers engaged in the field. Finally, we hope to serve as an exemplar of the practices that we advocate, with a special emphasis on the principles of open and FAIR science. As the core team of “Closing the Gap” is based at the Seminar for Semitic and Arabic Studies at the Freie Universität Berlin ( https://www.geschkult.fu-berlin.de/e/semiarab/arabistik/ ), we started with a focus mainly on projects working with Arabic, but we have expanded the data model to include many more languages.
As of November 2023, our database covers 159 projects and initiatives around the world, with a geographic focus on Germany. These projects, in turn, work with a total of 112 different languages—most of them written in non-Latin scripts. “Closing the Gap” is integrated in a network of institutions and initiatives, such as the Multilingual DH Lab of the Ada Lovelace Center for Digital Humanities at the Freie Universität Berlin ( https://www.ada.fu-berlin.de/ ), the Department of Digital Scholarship Services at the State and University Library of Hamburg ( https://www.sub.uni-hamburg.de/service/digitale-forschungsdienste.html ), and the Multilingual DH working group (AG) within the DHd ( https://m-l-d-h.github.io/DHd-AG/ ). Our project therefore serves as a “micro-hub” for NLS-related Multilingual DH.
Many of the projects that we have catalogued are also embedded in the disciplinary background of the so-called “Kleine Fächer” ( https://www.kleinefaecher.de/ ), which often involve NLS languages and suffer from a lack of recognition in the larger academic landscape. “Closing the Gap” aims to address this issue by providing visibility through our database, as well as by showing the interconnectedness of projects in Germany and across Europe.
When we started developing workflows for our project, we aimed for pragmatic solutions that allowed us to commit to Open Science and FAIR principles from the very beginning, without needing to rely on an institutional infrastructure that was still unable to meet our needs. It was also important for us to act efficiently, given a limited period of funding. While one can debate the risks of conducting academic work on corporate-owned platforms, we chose to build our database in a public GitHub repository ( https://github.com/M-L-D-H/Closing-The-Gap-In-Non-Latin-Script-Data ), and to use GitHub Pages to host a web frontend that gives users an easy way of exploring the data.
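As a rough illustration of what a file-based project database in a repository makes possible, the following minimal sketch summarizes a handful of project records. The record structure, the field names ("title", "languages", "country"), and the sample entries are hypothetical and are not taken from the project's actual data model.

```python
import json
from collections import Counter

# Hypothetical records: field names and entries are illustrative only
# and do not reflect the project's actual data model.
RAW = """
[
  {"title": "Example Project A", "languages": ["Arabic", "Persian"], "country": "Germany"},
  {"title": "Example Project B", "languages": ["Arabic", "Hebrew"], "country": "Germany"},
  {"title": "Example Project C", "languages": ["Japanese"], "country": "France"}
]
"""


def summarize(records):
    """Return the project count, the distinct-language count, and per-language usage."""
    language_counts = Counter(
        language for record in records for language in record["languages"]
    )
    return {
        "projects": len(records),
        "languages": len(language_counts),
        "by_language": dict(language_counts),
    }


records = json.loads(RAW)
summary = summarize(records)
print(summary["projects"], "projects,", summary["languages"], "languages")
```

Because such records are plain text files under version control, a summary of this kind can be regenerated whenever an entry is added or corrected.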
Our main work consists of three pillars:
Since we strive to keep our work as transparent as possible, we decided to discuss all major project-related matters openly via the Issues function on GitHub. Anyone outside the project can therefore follow and understand our decisions. Better still, interested users can participate in these discussions, suggest new features, or ask questions to be answered by the core team or other collaborators. This way, we also engage in new ways of dealing with problems and failures, especially regarding the applicability of our stack to non-Latin-script textual data. We hope to raise awareness that a research process is not only a group effort, but that it is often non-linear as well.
In accordance with the conference theme of “DH Quo Vadis,” we want to present our ideas and workflows at DHd 2024 and to assess preliminary results of the project at the end of its first funding period. Furthermore, with the second phase of funding recently confirmed, this is an opportune time for us to share our vision of the future and the issues that we hope to address in the next two years, such as better data visualization, how to expand the multilingual use of TaDiRAH, or the potential implementation of a knowledge graph based on our data.
Promoting awareness of the issues facing Multilingual DH will be an essential part of securing the future of the Digital Humanities as a broad and interdisciplinary field par excellence. We intend to discuss the advantages and downsides of our approach—both on the technical side, with regard to our stack and the use of GitHub as a database host, and in terms of cultivating a real-life network of researchers who can assist one another.
Bibliography
- Asef, Esther, and Cosima Wagner. 2018. “Workshop-Bericht: ‘Nicht-lateinische Schriften in multilingualen Umgebungen: Forschungsdaten und Digital Humanities in den Regionalstudien.’” DHd Blog. https://dhd-blog.org/?p=10669 (accessed July 17, 2023).
- BMBF. n.d. Kleine Fächer – Große Potenziale – BMBF. Bundesministerium für Bildung und Forschung – BMBF. https://www.bmbf.de/bmbf/de/forschung/geistes-und-sozialwissenschaften/kleine-faecher/kleine-faecher_node.html (accessed July 17, 2023).
- Fiormonte, Domenico. 2017. “Digital Humanities and the Geopolitics of Knowledge.” Digital Studies / Le Champ Numérique 7 (1). https://doi.org/10.16995/dscn.274.
- ———. 2021. “Taxation Against Overrepresentation? The Consequences of Monolingualism for Digital Humanities.” In Alternative Historiographies of the Digital Humanities, 333–76. Earth: punctum books. https://doi.org/10.53288/0274.1.00.
- Fiormonte, Domenico, Sukanta Chaudhuri, and Paola Ricaurte, eds. 2022. “Introduction.” In Global Debates in the Digital Humanities, ix–xxxiii. Minneapolis: University of Minnesota Press.
- Ghorbaninejad, Masoud, Nathan P. Gibson, and David Joseph Wrisley. 2023. “Right-to-Left (RTL) Text: Digital Humanists Plus Half a Billion Users.” In Debates in the Digital Humanities 2023. Minneapolis: University of Minnesota Press.
- Gil, Alex, and Élika Ortega. 2016. “Global Outlooks in Digital Humanities: Multilingual Practices and Minimal Computing.” In Doing Digital Humanities. London: Routledge.
- Grallert, Till, Xenia Monika Kudela, Eliese-Sophia Lincke, Colinda Lindermann, Jana-Katharina Mende, Jonas Müller-Laackman, and Larissa Schmid. 2023. Umgang mit Multilingualität im DACH und DHd Verband (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.7957187.
- Ortega, Élika. 2014. “Multilingualism in DH.” Disrupting the Digital Humanities. https://web.archive.org/web/20210424073656/https://www.disruptingdh.com/multilingualism-in-dh/ (accessed July 17, 2023).
- Spence, Paul. 2021. Disrupting Digital Monolingualism: A report on multilingualism in digital theory and practice. London: Language Acts and Worldmaking. https://doi.org/10.5281/zenodo.5743283.