Gadget

Signpost: 7,000 Africans show why AI needs humans

Artificial intelligence is often described as a triumph of machines over human limitation. Yet one of the most revealing AI stories to emerge from Africa tells of a different victory: people deciding that their voices should not be excluded from the future.

More than 7,000 individuals across several African countries volunteered to record their speech for a new open dataset released by Google and a consortium of African research institutions. The dataset, called WAXAL, contains over 1,250 hours of transcribed speech across 21 languages. Those numbers tell a human tale of participation that runs against much of how modern AI is built.

Most large AI systems learn from data gathered incidentally. Text is scraped from the web and images are pulled from public repositories. For WAXAL, African universities and community organisations worked with volunteers who contributed their voices so that speech technologies could exist in languages that have long been absent from digital systems.

Prof Isaac Wiafe of the University of Ghana describes it as a collective decision about belonging.

“Over 7,000 volunteers joined us because they wanted their voices and languages to belong in the digital future,” he said. “Today, that collective effort has sparked an ecosystem of innovation in fields like health, education, and agriculture. This proves that when the data exists, possibility expands everywhere.”

Speech technology has become one of the main ways people interact with digital services. Voice assistants, transcription tools, automated call centres, and educational platforms increasingly assume that speech is the most natural interface. When systems fail to recognise a language or accent, exclusion is inevitable.

Despite Africa being home to more than 2,000 languages, only a small fraction is supported by mainstream speech technologies. The obstacle has often been blamed on a shortage of data. That is true in a narrow sense, but it overlooks the human layer of the problem. Data does not exist unless people are willing and able to produce it under conditions they trust.

WAXAL was developed over three years with funding from Google, but the work of collecting speech data was led by African institutions, including Makerere University in Uganda, the University of Ghana, and community organisations like Digital Umuganda in Rwanda.

Joyce Nakatumba-Nabende, a senior lecturer at Makerere University, links the dataset directly to local capacity. “For AI to have a real impact in Africa, it must speak our languages and understand our contexts,” she says.

Building speech datasets inside the communities that speak them also changes the quality of what can be built on top, and shifts who controls the outcome. In a notable departure from multinationals’ past practice, the African partner institutions retain ownership of the data. Google provided funding and technical guidance but has relinquished control.

The project demonstrates that African languages can be incorporated into modern AI systems through a process that is technically rigorous and socially grounded. Once that process exists, the remaining languages are no longer blocked by feasibility.

This is where the volunteers return to the centre of the story. Their contribution turns language support from an abstract aspiration into a practical pathway. Each recorded voice becomes part of a foundation that others can extend. In this sense, the dataset functions like infrastructure: invisible once it is in place, but decisive in shaping who can participate.

That is key. Progress here depended on trust and willingness. It required people to believe that contributing their voices would lead to something worthwhile, even if the benefits were not immediate.

Aisha Walcott-Bryant, head of Google Research Africa, says the ultimate impact lies in empowerment: “This dataset provides the critical foundation for students, researchers, and entrepreneurs to build technology on their own terms, in their own languages, finally reaching over 100 million people. We look forward to seeing African innovators use this data to create everything from new educational tools to voice-enabled services that create tangible economic opportunities across the continent.”

This includes access to education platforms that rely on voice interaction, health services that use automated triage, and agricultural advice delivered via speech.

WAXAL does not guarantee inclusion or solve the complexity of Africa’s linguistic landscape. But it shows that AI built with human participation produces different outcomes from AI built around human leftovers. It restores people to the beginning of the pipeline, rather than leaving them at the receiving end.

Arthur Goldstuck is CEO of World Wide Worx, editor-in-chief of Gadget.co.za, and author of “The Hitchhiker’s Guide to AI – The African Edge”.
