5 May 2025
License: http://creativecommons.org/licenses/by-nc-sa/ (Open Access)
Jean Mélou et al., « Deep-learning based Detection and Segmentation in Archaeology », HAL SHS (Sciences de l’Homme et de la Société), ID : 10670/1.5d7234...
1 Introduction

Modern archaeology benefits from a convergence between traditional excavation methods and technological advances, particularly those stemming from computer vision. Among these technologies, image segmentation plays a central role. It involves dividing an image into multiple meaningful regions in order to isolate specific elements, such as artifacts, architectural structures, or traces of ancient carvings. This task is crucial for extracting, analyzing, and interpreting visual information from archaeological surveys.

However, segmentation in the archaeological context poses specific challenges. Images often come from complex scenes in which objects of interest may be partially buried, degraded, or blended into their surroundings. Varied textures, shadows, and overlapping elements make it difficult to identify shapes and contours accurately. Despite these challenges, modern approaches, particularly those based on deep learning, have significantly improved segmentation capabilities. In this work, we apply two state-of-the-art segmentation methods to two archaeological problems: on the one hand, the detection and segmentation of petroglyphs; on the other hand, the segmentation of mosaics into tesserae.

2 Methods and Materials

Petroglyph Detection with YOLOv9

Petroglyphs serve as immutable spatio-temporal markers that hold crucial information about the history of local settlements. The study of these archaeological sites (Danielyan 2020) often requires cataloging all the petroglyphs present. This process traditionally involves photographing the rocks of interest and subsequently analyzing the images, manually detecting and outlining each petroglyph, a labor-intensive task. Using YOLOv9, we propose to automate this demanding step.

YOLO (You Only Look Once) is a family of real-time object detection models renowned for their speed and efficiency. Designed to localize and classify objects in an image in a single pass, these networks have evolved over successive versions to deliver increasingly impressive performance. YOLOv9 (Wang, Yeh, and Mark Liao 2025), the latest iteration in the series, introduces significant improvements: it leverages advances in network architecture, optimization, and data processing to enhance accuracy while maintaining exceptional speed, and it incorporates optimized modules such as advanced attention mechanisms, adaptive anchoring strategies, and better balancing for detecting objects of various sizes.

We trained YOLOv9 on annotated images provided by archaeologists. Because of the limited amount of available data, data augmentation was essential. In addition, the images originated from only a few sites, which influences factors such as rock color. Training was automatically halted after 379 epochs, demonstrating YOLOv9's capability to adapt to this specialized archaeological dataset.
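For illustration, a comparable fine-tuning run can be set up with the Ultralytics implementation of YOLOv9. This is a minimal sketch, not the configuration used in this work: the dataset descriptor (petroglyphs.yaml), the pretrained checkpoint, and the augmentation and early-stopping settings shown here are assumptions.

    # Sketch of fine-tuning a pretrained YOLOv9 detector on annotated petroglyph
    # photographs (hypothetical file names; not the authors' actual setup).
    from ultralytics import YOLO

    model = YOLO("yolov9c.pt")  # COCO-pretrained weights as a starting point

    model.train(
        data="petroglyphs.yaml",  # hypothetical dataset file: train/val paths + class names
        epochs=1000,              # upper bound; early stopping ends training sooner
        patience=100,             # stop once validation performance stops improving
        imgsz=640,
        # Augmentation to compensate for few sites and site-dependent rock color:
        hsv_h=0.015, hsv_s=0.7, hsv_v=0.4,
        degrees=10.0, scale=0.5, fliplr=0.5, mosaic=1.0,
    )

    # Detect petroglyphs on a new photograph of a rock panel.
    for result in model.predict("rock_panel.jpg", conf=0.25):
        print(result.boxes.xyxy)  # bounding boxes of detected petroglyphs

The patience setting mirrors the automatic stop reported above (training halted after 379 epochs), although the exact stopping criterion used for this dataset is not specified here.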
Segmenting Mosaics with Segment Anything

Another aspect of our work focuses on the segmentation of the tesserae that compose mosaics. These tesserae exhibit varying shapes and sizes, are often irregular, and are separated by mortar arranged in a non-uniform manner. Furthermore, the tesserae typically have muted colors and low contrast, making it difficult to distinguish them from the mortar. Automatic segmentation of tesserae thus represents a significant challenge.

The goal of this study is to develop a segmentation method specifically tailored to mosaics, concentrating on extracting tesserae as distinct entities. Such an approach would allow archaeologists to analyze the tesserae directly, facilitating their digitization. The detected tesserae would form the basis for deeper analysis, aiding in the interpretation and use of the extracted information.

While machine learning approaches such as Segment Anything (Kirillov et al. 2023) outperform traditional methods, they are not without limitations. When applied to the full image of a mosaic, Segment Anything tends to detect broader shapes, such as figures or decorative elements of the mosaic, rather than individual tesserae. To counter this, we apply Segment Anything to smaller, localized sections of the image devoid of identifiable forms, which yields more precise masks. This approach, however, requires post-processing: issues such as duplicate masks and overlaps can arise. To address these, a statistical analysis of tesserae sizes and a mask-selection algorithm are used to eliminate undesirable duplicates and ensure segmentation accuracy.

3 Results and Discussion

For petroglyphs, our algorithm delivers satisfactory results, particularly when petroglyphs from the same site are included in the training data. However, due to the vast variability of the features involved (e.g., petroglyph shapes, rock types), some elements may go undetected. To address this, an executable application has been developed that allows archaeologists to manually refine the results produced by the network. This tool will be released as open-source software. Figure 1 illustrates an example of petroglyph segmentation before and after archaeologist intervention.

Similarly, the tesserae segmentation will also be integrated into an open-source application. This tool will support various types of input images, such as images enhanced through gradient emphasis or captured under different lighting conditions for the same scene. The application has already been used to conduct a statistical study of the tesserae of Saint-Romain-en-Gal (France), analyzing their size, color, and roughness. This study made it possible to group the tesserae based on these characteristics, marking the beginning of an investigation into the materials used in their construction.
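To make the tile-based Segment Anything pipeline described in Section 2 concrete, the following is a minimal sketch using the publicly released segment-anything package. The tile size, the size filter, and the overlap threshold are placeholder assumptions; the selection algorithm actually used in this work is not reproduced here.

    # Sketch: tile-wise automatic mask generation with Segment Anything, followed
    # by a size filter and overlap-based de-duplication (illustrative values only).
    import cv2
    import numpy as np
    from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

    sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
    generator = SamAutomaticMaskGenerator(sam)

    image = cv2.cvtColor(cv2.imread("mosaic.jpg"), cv2.COLOR_BGR2RGB)
    H, W = image.shape[:2]
    TILE = 512  # assumed tile size, small enough to exclude whole decorative figures

    candidates = []  # each entry: full-image boolean mask and its pixel area
    for y in range(0, H, TILE):
        for x in range(0, W, TILE):
            tile = image[y:y + TILE, x:x + TILE]
            for m in generator.generate(tile):
                full = np.zeros((H, W), dtype=bool)
                full[y:y + tile.shape[0], x:x + tile.shape[1]] = m["segmentation"]
                candidates.append({"seg": full, "area": int(m["area"])})

    # Size filter: discard masks whose area is far from the typical tessera area.
    areas = np.array([c["area"] for c in candidates])
    median = np.median(areas)
    mad = np.median(np.abs(areas - median))
    candidates = [c for c in candidates if abs(c["area"] - median) <= 5 * (mad + 1)]

    def iou(a, b):
        inter = np.logical_and(a, b).sum()
        return inter / (a.sum() + b.sum() - inter + 1e-9)

    # Greedy de-duplication: keep a mask only if it does not overlap a kept one.
    kept = []
    for c in sorted(candidates, key=lambda c: c["area"]):
        if all(iou(c["seg"], k["seg"]) < 0.5 for k in kept):
            kept.append(c)

    print(f"{len(kept)} tessera masks kept out of {len(candidates)} candidates")

Sorting candidates by increasing area before de-duplication favors individual tesserae over larger merged regions, in the spirit of the selection step described in Section 2, though the actual criteria used in this work may differ.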