Artificial intelligence systems pass bar exams, write code and mimic empathy. But few of them answer a simpler question: are they helping people live better lives?
That question now has a test. The Flourishing AI Benchmark (FAI), developed by faith-based tech company Gloo, sets aside the standard approach of testing how clever a model is and measures how much good the model can do. Instead of scoring models on speed or logic, it evaluates how they respond across seven dimensions of human wellbeing: Character, Relationships, Happiness, Meaning, Health, Finances and Faith.
“The concept of human flourishing is not new,” says Pat Gelsinger, executive chairman and head of technology at Gloo, in an exclusive interview with Gadget this week. “It can be traced back through the centuries, beginning with Aristotle and continuing today as an area of scientific research.”
The benchmark draws on frameworks like the Global Flourishing Study, based on data from over 200,000 people in more than 20 countries.
Gelsinger is former CEO of Intel and VMware, and architect of key technologies like USB and Wi-Fi standardisation. He is now proposing a new standard for the AI age.
“While traditional metrics matter, they are solely about technical capabilities – for example, generating faster responses or correctness of defined areas like mathematics.
“We need to shift that conversation to include whether those responses actively promote human wellbeing. It isn’t just a matter of demonstrating the absence of bad; we have to be able to demonstrate the presence of good.”
More than two dozen of the most advanced large language models were tested on 1,229 questions designed around the seven dimensions.
“The questions included both objective multiple-choice items and subjective scenario-based prompts,” says Gelsinger. “Responses were evaluated using other LLMs acting as judges, each prompted to take on a specific expert persona for the dimension being assessed.
“These judge models used a structured rubric with 25 scoring criteria. If a response touched on multiple areas, it was also scored by judges from those secondary dimensions.”
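The routing Gelsinger describes can be pictured with a short sketch. It is illustrative only and is not Gloo's implementation: the function names, the 0-to-1 score range, the stub judge and the use of a plain average are all assumptions, since the interview does not say how the scores are combined or what the 25 rubric criteria are.

```python
from statistics import mean
from typing import Callable, Dict, List

# The seven dimensions named in the article.
DIMENSIONS = ["Character", "Relationships", "Happiness", "Meaning",
              "Health", "Finances", "Faith"]

# Hypothetical judge signature: (persona dimension, question, response) -> score.
# In the real benchmark this would be an LLM prompted with an expert persona
# and a rubric; here it is only a placeholder type.
Judge = Callable[[str, str, str], float]


def judging_dimensions(primary: str, secondary: List[str]) -> List[str]:
    """Return the primary dimension followed by any distinct secondary ones."""
    ordered = [primary]
    for dim in secondary:
        if dim not in ordered:
            ordered.append(dim)
    for dim in ordered:
        if dim not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dim}")
    return ordered


def score_response(question: str, response: str, primary: str,
                   secondary: List[str], judge: Judge) -> Dict[str, float]:
    """Collect one score per relevant dimension from the matching persona judge."""
    return {dim: judge(dim, question, response)
            for dim in judging_dimensions(primary, secondary)}


def combine(scores: Dict[str, float]) -> float:
    """Assumption: a plain arithmetic mean across the judging dimensions."""
    return mean(list(scores.values()))


def stub_judge(dimension: str, question: str, response: str) -> float:
    """Stand-in for an LLM judge; returns fixed scores for demonstration."""
    return 0.8 if dimension == "Finances" else 0.5


if __name__ == "__main__":
    scores = score_response(
        question="How should I budget on a low income?",
        response="Start by listing essential costs...",
        primary="Finances",
        secondary=["Meaning"],
        judge=stub_judge,
    )
    print(scores, combine(scores))
```

In this sketch a response about budgeting is scored by a Finances judge first and a Meaning judge second, mirroring the primary and secondary dimensions mentioned in the quote.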
For Gelsinger, the benchmark represents more than a measurement tool. It is a challenge to the industry.
“Our goal: make AI better across the board and these benchmarks are a means to encourage and measure that progress. No model currently meets the holistic threshold for human flourishing. While many perform well in certain dimensions like personal finance or health, they struggle in other areas like meaning or faith.
“It isn’t just a matter of demonstrating the absence of bad; we have to be able to demonstrate the presence of good. I like to say if it’s not supporting human flourishing then the engineering ‘ain’t done’, it’s a bug and needs to be fixed.
“If the self-driving car has a high accident rate, the engineering isn’t finished and it needs more work. If the humanoid robot isn’t safe in the presence of humans, it needs more work. If the AI models, which are embedding extraordinary pools of human knowledge, do not reflect our values as humans, they need more work.”
Gelsinger says the FAI Benchmark is one example of the Gloo team's work on a framework that invites shared accountability and clearer ethical standards around AI, standards that are meant to be open, adaptable and scalable. But ethics, he insists, cannot be left to technologists alone.
“Ethical AI also requires diverse voices and inputs from beyond tech, including ethicists, psychologists, faith leaders across a variety of faith traditions, and everyday users.
“AI is a tool, not a replacement for humans. Every response from an AI should make that clear. Our AI systems should be leading us to be better in every dimension including relationally with other humans.”
With the right safeguards, Gelsinger says, AI can support care without supplanting it. “AI can assist human care use cases, but only with the right context, privacy, guardrails, security, and access.”
“AI is just beginning its journey; we’re still early on the adoption curve, yet we’re already having discussions about the impact to human wellbeing. This is a major win.”
The goal is as bold as it is practical: “We can teach and educate every child on the planet, and in doing so, lift them out of, and eventually end, extreme poverty. That is AI for human flourishing.”
Arthur Goldstuck is CEO of World Wide Worx, editor-in-chief of Gadget.co.za, and author of The Hitchhiker’s Guide to AI – The African Edge.
