For decades, the meaning behind ancient finger-drawn lines etched into limestone cave walls remained elusive. Known as finger flutings, these grooves—some over 60,000 years old—are among the earliest known expressions of symbolic behavior by humans and Neanderthals. But beyond their aesthetic or ritual significance, archaeologists have long asked a simpler, more personal question: Who made them?
A team of researchers from Griffith University in Australia has taken a significant step toward answering that question. Using deep learning models trained on flutings made in both physical and virtual settings, the team was able to predict the sex of the person who made each marking, with accuracy reaching 84% on the physical samples.
Their findings, recently published in Scientific Reports, suggest that these early artistic gestures may hold more than cultural meaning: they may contain traces of individual identity. By combining neural networks with experimental archaeology, the study points to a possible shift in how researchers identify the makers of cave art.

But while the results are promising, the researchers are quick to stress the limits of their approach. The dataset is small and drawn entirely from modern volunteers, and the models do not yet generalize reliably to samples they have not seen. Still, the open-source methodology they've released could set a new standard for transparency in archaeological research.
Reconstructing prehistoric gestures with machine learning
The Griffith team, led by digital archaeologist Dr. Andrea Jalandoni, collected data from 96 adult volunteers who were asked to create finger flutings in two distinct environments. In one, participants dragged their fingers through a specially formulated clay designed to mimic moonmilk cave surfaces. In the other, they used hand-tracked gestures in a virtual reality environment powered by the Meta Quest 3 headset.
Each participant created nine flutings per setting. These were photographed or digitally captured and then fed into two convolutional neural networks: ResNet-18 and EfficientNet-V2-S, both widely used for image classification tasks. The models were trained to classify the flutings by the sex of the participant—information that was self-reported in binary categories.
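To make the scale of the experiment concrete, the short Python sketch below enumerates the samples implied by the figures above (96 participants, nine flutings each, two settings) and holds out whole participants for evaluation. The participant-level split is an assumption added for illustration, not a detail reported here; the function names and the 20% hold-out fraction are likewise hypothetical.

```python
import random

N_PARTICIPANTS = 96        # from the study description
FLUTINGS_PER_SETTING = 9   # from the study description
SETTINGS = ("tactile", "vr")


def build_samples():
    """Enumerate one record per fluting: (participant_id, setting, index)."""
    return [
        (pid, setting, i)
        for pid in range(N_PARTICIPANTS)
        for setting in SETTINGS
        for i in range(FLUTINGS_PER_SETTING)
    ]


def split_by_participant(samples, test_fraction=0.2, seed=0):
    """Hold out whole participants so no person appears in both sets.

    Assumption: this grouping is illustrative, not the authors' protocol.
    """
    ids = sorted({pid for pid, _, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    test_ids = set(ids[:n_test])
    train = [s for s in samples if s[0] not in test_ids]
    test = [s for s in samples if s[0] in test_ids]
    return train, test


samples = build_samples()
train, test = split_by_participant(samples)
print(len(samples), len(train), len(test))
```

Splitting by person rather than by image matters whenever one individual contributes several samples: otherwise a model could score well simply by recognizing a participant's style from the training set.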


In the tactile experiment, one model achieved up to 84% accuracy in classification. By contrast, the VR-based data produced inconsistent results, likely due to the lack of physical resistance and sensory feedback, which affected how participants moved their hands in virtual space.
The dataset and codebase have been released as open source. The research design and preliminary results are detailed in the team's paper in Scientific Reports, a Nature Portfolio journal, with methodological context drawn from a separate 2024 peer-reviewed review of the limitations of traditional biometric analyses in cave art studies.
Breaking from flawed historical methods
For years, archaeologists attempting to analyze finger flutings relied on ratios between finger lengths or groove widths to infer the sex or age of their creators. One common approach was the 2D:4D digit ratio, the length of the index finger relative to the ring finger, used to distinguish male from female hands. But such techniques, often applied to distorted, eroded surfaces, have been widely criticized for their methodological weaknesses.
“The older methods were speculative at best,” said Dr. Gervase Tuxworth, a computer scientist and co-author on the study. “Our goal was to move toward something quantifiable and testable.”
The new machine learning pipeline differs in one crucial way: it analyzes the final markings, not hand anatomy. Rather than measuring fingers or extrapolating body features, the models were trained directly on the visual patterns in the flutings. This makes the system less dependent on anatomical assumptions and potentially more adaptable to the variations of ancient markings, though only further testing will confirm that.
Importantly, the researchers chose not to keep their system behind a paywall or proprietary license. “We’re inviting others to test, replicate, or challenge these results,” said Dr. Robert Haubt, co-author and information scientist at Griffith. “This needs to be collaborative science.”
Accuracy, limitations, and what comes next
While the system performed well on modern tactile data, researchers noted clear signs of overfitting, particularly in the virtual data. The models showed strong performance on training images but less consistency on unseen samples. This means that while the algorithm may be recognizing patterns, those patterns may be specific to the experimental setup rather than generalizable to ancient cave markings.
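One simple way to see the kind of overfitting described above is to compare accuracy on training images with accuracy on held-out images. The Python sketch below does this with made-up labels; the toy data and the helper names `accuracy` and `generalization_gap` are illustrative assumptions, not values or code from the study.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def generalization_gap(train_acc, heldout_acc):
    """Training accuracy minus held-out accuracy; a large gap hints at overfitting."""
    return train_acc - heldout_acc


# Hypothetical toy labels, not data from the study.
train_true = ["F", "M", "F", "M", "F", "M", "F", "M", "F", "M"]
train_pred = ["F", "M", "F", "M", "F", "M", "F", "M", "F", "F"]  # 1 error
heldout_true = ["F", "M", "F", "M", "F", "M", "F", "M", "F", "M"]
heldout_pred = ["F", "M", "M", "M", "F", "F", "F", "M", "M", "M"]  # 3 errors

train_acc = accuracy(train_pred, train_true)
heldout_acc = accuracy(heldout_pred, heldout_true)
gap = generalization_gap(train_acc, heldout_acc)
print(f"train={train_acc:.2f} held-out={heldout_acc:.2f} gap={gap:.2f}")
```

A gap near zero suggests the model has learned something that transfers to new images; a wide gap suggests it has memorized quirks of the training set.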


Another issue is the dataset itself: all participants were adults from a modern population, skewed toward Australian university attendees. The sex classifications were binary and self-reported, and because no children took part, no insights could yet be drawn about children's roles in Paleolithic art, a question that has been central to past research on cave markings.
Yet the study still represents a foundational proof of concept. “This isn’t a final tool,” Jalandoni said. “It’s a starting point. What matters is that we’re listening to people who lived tens of thousands of years ago—and finding new ways to understand what they left behind.”
