Re-Experiencing History: A Platform for the Re-Enactment of Historical Events with Multimodal Large Language Models
https://zenodo.org/records/14943096
Project Description
In 1982, seven people in the Chicago area died after consuming cyanide-laced capsules (cf. Bergmann, 2000). The source of the tampering was initially unknown, leading to nationwide panic. Investigators were stumped until they conducted re-enactments of the purchase and consumption of the contaminated capsules. These re-enactments enabled them to trace the tainted products to specific store shelves and to identify the exact locations where the tampering had occurred.
Like criminologists, historians have to reconstruct past moments or situations. Unlike criminologists, however, historians often cannot re-enact events, because their subjects of investigation cannot easily be reproduced. The advent of AI in the form of powerful multimodal large language models (LLMs) is a game changer here: historians can now prompt image generation models (mostly based on Stable Diffusion; Rombach et al., 2022) to re-create scenes from primary sources and research findings. This enables them to visualise past events quickly, thereby enhancing their understanding of historical moments. The project Re-Experiencing History aims to create a platform that exploits such models to support users in re-enacting historical events.
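For illustration, the following is a minimal sketch of how such a scene can be generated with a Stable Diffusion checkpoint via the Hugging Face diffusers library; the checkpoint identifier and the prompt are our illustrative choices, not the project's actual configuration.

```python
# Minimal sketch: generating a historical scene with Stable Diffusion
# through the Hugging Face diffusers library. The checkpoint and the
# prompt are illustrative placeholders, not the project's setup.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint, for illustration
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = (
    "A Roman triumphal procession on the Via Sacra, the victorious "
    "general riding a four-horse chariot, crowds lining the street"
)
image = pipe(prompt, num_inference_steps=30).images[0]
image.save("roman_triumph.png")
```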
Our research includes assessing existing multimodal LLMs with regard to historical accuracy and prompt-to-image alignment (Xu et al., 2023), manipulating generated images with further prompts, and fine-tuning models to improve historical accuracy. This interdisciplinary approach, in which computational linguists work with historians, is crucial, as Hutson, Huffman, and Ratican (2024) highlighted in their work on resurrecting Mary Sibley (1800–1878) through her diaries.
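Prompt-to-image alignment can be quantified automatically with ImageReward (Xu et al., 2023). The sketch below assumes the publicly released image-reward package and placeholder image files; it illustrates the scoring step only, not the project's full evaluation pipeline.

```python
# Sketch: scoring prompt-to-image alignment with ImageReward
# (Xu et al., 2023). Assumes `pip install image-reward`; the prompt
# and image paths are placeholders.
import ImageReward as RM

model = RM.load("ImageReward-v1.0")

prompt = "Priests of the Lupercalia running through the streets of Rome"
images = ["lupercalia_1.png", "lupercalia_2.png"]

# Higher scores indicate images that human raters would judge as
# better matches for the prompt.
rewards = model.score(prompt, images)
for path, reward in zip(images, rewards):
    print(f"{path}: {reward:.3f}")
```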
In a prototype setting, we focus on two scenarios from antiquity: the Roman triumph and the Lupercalia festival. We assess image generation capabilities, initially focusing on DALL-E 3. Based on the literature (15 sources on the triumph, 5 on the Lupercalia), we crafted 100 prompts and generated six images per prompt, creating 600 images in total. These images and prompts form a basis for further research. Figure 1 displays two images generated with the same prompt.
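As a sketch of how such a corpus can be assembled programmatically, the following uses the OpenAI Python client; because the DALL-E 3 endpoint returns a single image per request, six images per prompt are obtained by looping. The prompts and parameters shown are placeholders, not the project's recorded settings.

```python
# Sketch: building a prompt-image corpus with DALL-E 3 via the OpenAI
# Python client. The dall-e-3 endpoint accepts only n=1, so we loop
# to obtain six images per prompt.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

prompts = [
    "A Roman general celebrating a triumph, riding a quadriga",
    "Luperci running through republican Rome during the Lupercalia",
    # ... one entry per crafted prompt
]

corpus = []
for prompt in prompts:
    for _ in range(6):  # six images per prompt
        result = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
            n=1,
            size="1024x1024",
        )
        corpus.append({"prompt": prompt, "url": result.data[0].url})
```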
This project contributes to the field of digital humanities and explores the potential of AI in historical research and education. By bridging the gap between past events and modern technology, we aim to create a more immersive and accessible approach to studying history, potentially redefining how we interact with and understand our collective past.
Bibliography
- Algül, Aydın. 2018. The Roman Triumph: Participation, Historiography and Remembrance. (accessed 21 July 2024).
- Bergmann, Joy. 2000. A Bitter Pill. In Chicago Reader, (accessed 21 July 2024).
- Dray, William Herbert. 1995. History as Re-Enactment: R. G. Collingwood's Idea of History. Oxford: Oxford University Press.
- Hutson, James, Paul Huffman and Jeremiah Ratican. 2024. Digital Resurrection of Historical Figures: A Case Study on Mary Sibley through Customized ChatGPT. In Faculty Scholarship, 590, (accessed 21 July 2024).
- Otani, Mayu, Riku Togashi, Yu Sawai, Ryosuke Ishigami, Yuta Nakashima, Esa Rahtu, Janne Heikkilä and Shin’ichi Satoh. 2023. Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 14277–14286, (accessed 21 July 2024).
- Rombach, Robin, Andreas Blattmann, Dominik Lorenz, Patrick Esser and Björn Ommer. 2022. High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10684–10695, (accessed 21 July 2024).
- Rosenzweig, Roy and David Thelen. 1998. The Presence of the Past: Popular Uses of History in American Life. New York: Columbia University Press.
- Xu, Jiazheng, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang and Yuxiao Dong. 2023. ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, & S. Levine (eds.), Advances in Neural Information Processing Systems, 36, 15903–15935, (accessed 21 July 2024).