Do We Dare Use Generative AI for Mental Health?
Abstract
The article discusses the evolution of the mental health app Woebot, its approach to using AI technology, and its exploration of incorporating large language models (LLMs) like ChatGPT into its platform. It highlights Woebot's focus on evidence-based techniques, its rules-based conversational design, and the challenges it faced in evaluating whether generative AI could enhance its users' experiences while maintaining its core principles.
Q&A
[01] Woebot's Approach and Evolution
1. What is Woebot and how does it differ from chatbots like ChatGPT?
- Woebot is a mental health app that uses a rules-based conversational design, delivering evidence-based tools inspired by cognitive behavioral therapy (CBT).
- Unlike ChatGPT, which generates unpredictable statements, Woebot's content is written by conversational designers trained in evidence-based approaches and collaborating with clinical experts.
- Woebot's conversations follow a structured, rules-based approach, while ChatGPT uses statistical methods to determine its next words.
2. How has Woebot's technology evolved over time?
- Woebot initially used regular expressions (regexes) to understand user intent, but later replaced them with supervised learning classifiers to improve accuracy.
- The team has continuously evaluated and updated the AI models used, such as replacing fastText with the more advanced BERT model in 2019.
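The shift described above — from hand-written regexes to learned classifiers — can be illustrated with a minimal sketch. The pattern, intent label, and example phrases below are hypothetical, not Woebot's actual rules; the point is that a regex only matches phrasings its author anticipated, which is the brittleness that motivated the move to supervised models like fastText and BERT.

```python
import re

# Hypothetical regex for a "low mood" intent -- not Woebot's actual rule.
# It catches explicit keywords but misses paraphrases.
LOW_MOOD_PATTERN = re.compile(r"\b(sad|down|depressed|unhappy)\b", re.IGNORECASE)

def regex_intent(message: str) -> str:
    """Rules-based intent detection: either the pattern matches or it doesn't."""
    return "low_mood" if LOW_MOOD_PATTERN.search(message) else "unknown"

print(regex_intent("I've been feeling really down lately"))  # low_mood
print(regex_intent("I'm not doing great"))  # unknown -- paraphrase slips through
```

A supervised classifier trained on labeled user messages generalizes past exact keywords, which is why it yields higher accuracy on varied real-world phrasings.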
3. What were Woebot's core beliefs and principles that influenced its development?
- Woebot was designed to be an adjunct to human support, not a replacement, and to follow principles of "sitting with open hands" and encouraging user growth and self-discovery.
- Careful conversational design was crucial to ensure interactions aligned with these core beliefs, including techniques like "table reads" to refine the content.
[02] Exploring Generative AI Integration
1. What were the key considerations and challenges Woebot faced in exploring the use of LLMs like ChatGPT?
- Woebot was excited by the potential of LLMs to enable more fluid and complex conversations, but also concerned about the risks of LLMs providing unsafe or inappropriate responses.
- The team experimented with ChatGPT and found issues such as conversations ending quickly, lack of engagement in psychological processes, and a tendency to provide lists of information rather than guiding the user.
2. How did Woebot approach integrating LLMs into its platform?
- Woebot developed its own LLM prompt-execution engine to allow it to selectively incorporate LLMs into its rules-based system, while maintaining control and safeguards.
- The team implemented technical safeguards, such as using "best in class" LLMs, validation steps, and carefully crafted prompts to elicit appropriate responses from the LLMs.
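The safeguard pattern described above — calling an LLM selectively inside a rules-based flow and validating its output before showing it to the user — can be sketched as follows. This is an illustrative simplification, not Woebot's engine: `call_llm` is a hypothetical stand-in for any model API, and the keyword check is a deliberately naive placeholder for real safety validation.

```python
# Placeholder list of output topics to reject -- a real system would use far
# more sophisticated validation than substring matching.
BANNED_TOPICS = ("diagnos", "medication", "dosage")

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; in practice this would hit a model API."""
    return "It sounds like that situation left you feeling overwhelmed."

def is_safe(reply: str) -> bool:
    """Validation step: reject replies that drift into off-limits territory."""
    return not any(topic in reply.lower() for topic in BANNED_TOPICS)

def validated_reply(prompt: str, fallback: str) -> str:
    """Run the LLM, validate its output, and fall back to scripted content if needed."""
    reply = call_llm(prompt)
    return reply if is_safe(reply) else fallback

print(validated_reply("Reflect the user's feelings empathetically.",
                      "Let's try a breathing exercise together."))
```

The fallback keeps the rules-based system in control: when the generated text fails validation, the user sees vetted scripted content instead of raw model output.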
3. What were the results of Woebot's initial study on the LLM-augmented chatbot?
- The study found that users expressed similar satisfaction with the LLM-augmented and standard versions of Woebot, and both groups showed reductions in self-reported symptoms.
- The LLM-augmented chatbot was well-behaved, refusing to provide inappropriate responses like medical advice or endorsing maladaptive behaviors.