
From Weeks to Days: How AI Helped Us Cut Project Timelines by 75%

We partnered with BaldeCash, a Peruvian social enterprise, to pilot our first AI-fueled impact study with AI interviewer 'Beto'.

Measuring social impact at scale is about striking the balance between speed, quality, and reach. We’re constantly looking for the most effective ways to deliver that balance for our clients. Recent advances in Large Language Models (LLMs) intrigued us: we saw an opportunity to listen to customers more efficiently without sacrificing quality.

Our exploration began with a simple question: Could AI help us conduct high-quality interviews while maintaining the human touch that makes our impact measurement so valuable? The potential benefits – concurrent calls at a low cost, full standardization, and real-time validation of inconsistent or incomplete responses – were clear. But could we hit a quality standard that made for a delightful experience for respondents?

From Vision to Reality: Building an AI Enumerator

To move from theory to practice, we needed a real-world test. We partnered with BaldeCash, a Peruvian social enterprise that provides university students with access to laptops and other tech equipment through long-term financing.

Our pilot involved using a conversational AI agent (whom we named ‘Beto’) to conduct end-to-end interviews with BaldeCash’s users to understand BaldeCash’s impact on students’ education and career prospects. After designing a 25-question survey that combined quantitative and open-ended questions, we developed Beto as an AI interviewer built specifically for Peruvian Spanish speakers. Like any good interviewer, Beto needed the right voice, personality, and guardrails to conduct effective conversations with students. We carefully refined these elements so that Beto could engage naturally with respondents while gathering reliable data.
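
For readers curious about what ‘voice, personality, and guardrails’ look like in practice, here is a minimal sketch of how an AI enumerator like Beto might be configured. It is an illustration under assumptions: every class, field, and value below is hypothetical, not our actual implementation.

```python
# A minimal, hypothetical sketch of an AI enumerator's configuration.
# Nothing here is 60 Decibels' production code.

from dataclasses import dataclass, field


@dataclass
class InterviewerConfig:
    """Persona and guardrails for a voice-based AI interviewer."""
    name: str
    language: str                 # locale used for speech and prompts
    persona: str                  # tone and style guidance for the LLM
    guardrails: list[str] = field(default_factory=list)
    max_clarifications: int = 2   # follow-ups per question before moving on


beto = InterviewerConfig(
    name="Beto",
    language="es-PE",             # Peruvian Spanish
    persona=(
        "A warm, patient interviewer who speaks natural Peruvian Spanish, "
        "keeps questions short, and never rushes the respondent."
    ),
    guardrails=[
        "Stay on the survey script; do not offer advice or opinions.",
        "Never collect data beyond the approved questionnaire.",
        "Offer to end the call whenever the respondent asks.",
    ],
)

if __name__ == "__main__":
    print(f"{beto.name} ready: {beto.language}, {len(beto.guardrails)} guardrails")
```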

Results of Our AI Project Pilot

We designed our test to compare Beto’s performance with our human enumerators, focusing on two key metrics: response rates and data quality.

To maximize the likelihood of reaching our interview targets, we tested three different approaches to starting the survey (sketched in code after the list below). With each approach, customers first received an SMS introducing Beto and explaining the purpose of the study.

  1. Self-scheduled calls: Customers chose their preferred interview time through an online scheduler
  2. App-initiated calls: Customers started the interview through a mobile app whenever convenient
  3. Direct outreach: We called customers during likely free hours (evenings and weekends)
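
To make these three entry points concrete, here is a rough sketch of how the outreach modes and the introductory SMS could fit together. The enum, function, and placeholder phone number are all hypothetical, for illustration only.

```python
# Hypothetical sketch of the three interview entry points described above.

from enum import Enum, auto


class OutreachMode(Enum):
    SELF_SCHEDULED = auto()   # respondent books a time via an online scheduler
    APP_INITIATED = auto()    # respondent taps a link in the mobile app
    DIRECT_CALL = auto()      # we call during likely free hours


def notify_and_start(phone: str, mode: OutreachMode) -> str:
    """Send the introductory SMS, then describe how the call begins."""
    sms = f"SMS to {phone}: introducing Beto and the purpose of the study."
    if mode is OutreachMode.SELF_SCHEDULED:
        return sms + " Call placed at the respondent's chosen time."
    if mode is OutreachMode.APP_INITIATED:
        return sms + " Call starts when the respondent taps the in-app link."
    return sms + " Call attempted on evenings and weekends."


for mode in OutreachMode:
    print(notify_and_start("+51 9XX XXX XXX", mode))
```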

App-initiated calls proved most successful, followed closely by direct calls. Self-scheduled calls, interestingly, saw little uptake. Overall, while the response rate to AI surveys was significantly lower than to calls conducted by our Peruvian enumerators – much as text-message or WhatsApp surveys see lower response rates than live calls – it was still high enough to reach the target number of interviews within three days.

Quality-wise, Beto also showed promise. The AI interviewer performed similarly to a first-time human enumerator, though not yet matching our experienced interviewers’ ability to draw out nuanced responses. Below, we break down Beto’s performance across key quality metrics:

Criteria for evaluating the quality of data

  • Relevance: Does the response align with the question asked?

    Results: The majority of customers provided answers that aligned with the pre-determined answer options on the first try, particularly for categorical questions. When responses were ambiguous or incomplete, Beto would follow up – up to two times – to clarify and guide the customer toward one of the given answer options (a minimal sketch of this follow-up loop appears after this list).

    For open-ended questions, however, Beto faced some challenges. Responses were often less expansive. Without pre-set options and multiple examples to compare against, Beto operated with a lower threshold for deciding when to move on to the next question. As a result, responses sometimes lacked relevance, as Beto was not able to intuitively probe further or identify opportunities for richer engagement.

  • Depth: Does the response have a good level of detail?

    Results: Customers tended to provide shorter qualitative answers than those elicited by human enumerators, but Beto still achieved better depth than typical online surveys.

    The relatively shorter responses may reflect a trust and/or accountability gap between respondents and an AI-powered enumerator compared to human interviewers. Building mechanisms to foster greater trust and encourage openness could help close this gap and draw out more detailed responses in the future.
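
To illustrate the follow-up behavior described under Relevance, here is a minimal sketch of a clarification loop that re-asks an ambiguous categorical answer at most twice before moving on. The `scripted_ask` callback and `matches_option` check stand in for the real speech and LLM layers; they are assumptions for explanation, not our production logic.

```python
# Hypothetical sketch of Beto's "follow up at most twice" behavior.

MAX_CLARIFICATIONS = 2


def matches_option(answer: str, options: list[str]) -> bool:
    """Toy relevance check: does the answer mention a valid option?"""
    return any(opt.lower() in answer.lower() for opt in options)


def ask_with_clarification(ask, question: str, options: list[str]) -> str:
    """Ask a categorical question, following up at most twice."""
    answer = ask(question)
    for _ in range(MAX_CLARIFICATIONS):
        if matches_option(answer, options):
            break
        # Restate the options rather than repeating the question verbatim.
        answer = ask(f"Just to confirm, would you say: {', '.join(options)}?")
    return answer


if __name__ == "__main__":
    scripted = iter(["hmm, not sure", "it got better, I think", "improved"])

    def scripted_ask(question: str) -> str:
        print(f"Beto: {question}")
        return next(scripted)

    final = ask_with_clarification(
        scripted_ask,
        "Has your access to technology improved, stayed the same, or worsened?",
        ["improved", "stayed the same", "worsened"],
    )
    print(f"Recorded answer: {final}")
```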

We believe that closing the gap between Beto and a high-performing human enumerator is largely a matter of addressing specific areas for improvement. These include identifying new strategies to boost response rates, probing more effectively, and setting a more comprehensive structure for eliciting detailed and relevant responses to qualitative questions.

BaldeCash’s CEO, Ruben Montenegro, shared the following about their experience: “We were impressed by how quickly we were able to design the survey with 60 Decibels and roll it out. Initially, we had some doubts about the response rates, but we discovered that completing the surveys was faster than expected. The quality of the data was excellent, and 60 Decibels produced an outstanding report that met all our impact measurement needs while providing valuable insights to support the growth of our business.”

What’s next?

At 60dB, we are big believers in AI’s potential to improve the speed and quality of data collection. We are already working on:

  • adapting our AI enumerator to other languages, starting with Hindi and French
  • piloting other modalities for initiating interviews to boost our response rates
  • building new AI agents to enrich our qualitative insights

Faster insights mean our clients can make more informed decisions about their impact – decisions that translate into more effective and efficient solutions for end customers and beneficiaries. Beyond our AI enumerator, we are also exploring innovative ways to analyze our 10 million impact data points to uncover patterns and insights that weren’t visible before. This combination of AI-powered data collection and analysis represents a significant step forward in how we understand social impact.

If you are interested in collaborating or sharing ideas, we would love to hear from you!
