
Government Chatbot

Evaluating the usability of a public-facing federal government chatbot through primary and secondary research
Usability testing | Secondary research

Project overview

Chatbots can be valuable tools for communication between an organization and its customers. They provide people with immediate, judgment-free help, improve accessibility for users with some disabilities, and add business value by automating customer support. In this project, I worked as the lead UX researcher at a large federal government agency to evaluate the usability of their chatbot and identify actionable areas for improvement.

Challenge

Customer service communication was a known pain point with this government agency, which serves everyone living in the US. The chatbot was built to fix this problem, but it was designed with the business--rather than users--in mind. For instance, it was made up of 5 distinct designs owned across 4 business units, and it was only available on part of the agency's website. Against this backdrop, what usability issues emerge with the chatbot, and what's the best way to remedy them?

Solution

To assess this government chatbot, I conducted a literature review of chatbot design best practices, completed a heuristic evaluation of the chatbot, and ran usability testing with users. I identified 6 key insights about users' experiences with the chatbot and developed prioritized recommendations to improve the chatbot's design. This research sparked renewed agency-wide interest in the chatbot and laid the groundwork for ongoing cross-functional chatbot redesign efforts.

Methods

  • Literature review
  • Heuristic evaluation
  • Usability testing

Tools

  • FigJam
  • Microsoft Office & Teams

Team

  • 2 UX researchers
  • 2 notetakers
  • 2 chatbot developers
  • 5 product owners

My role

  • I served as the lead UX researcher, managing the project from start to finish.
  • I identified research needs and determined the appropriate research methods to address those questions.
  • I conducted secondary research through a literature review and heuristic evaluation.
  • I led usability testing--from developing the research materials to analyzing the data to presenting the findings.
  • I presented research insights and recommendations to executive leadership.

My process

1. Review
2. Prepare
3. Interview
4. Analyze
5. Share

Review

Because the user experience of chatbots has been explored extensively in the past several years--as chatbots have become widespread--I wanted to take advantage of existing research as much as possible. At the start of the project, I completed a literature review of trustworthy sources to identify best practices in chatbot design. These sources ranged from reputable online articles to published academic papers, all vetted for research rigor. Through this literature review, I compiled a list of 23 best practices for chatbot design, such as:

  • Clearly tell people what tasks the chatbot can do to avoid creating false expectations.
  • Be upfront about whether users are talking with a chatbot or a live agent.
  • Include stable links and floating buttons to access the chatbot, and make sure that floating buttons are placed on the right and contrast with the rest of the page.

These best practices helped me become attuned to the types of issues that people face when using a chatbot and laid the groundwork for subsequent research.

With these best practices in hand, I completed a heuristic evaluation of the agency's chatbot. Rather than relying solely on standard design heuristics, I evaluated the chatbot specifically against the best practices from the literature review, which were tailored to the chatbot context. In this work, I learned that the agency's chatbot struggled in two areas. The first was findability, since the chatbot wasn't available across the entire website and didn't appear on the "Help" page. The second was flexibility, since it didn't offer an escape hatch like live support when the user encountered a problem.

Prepare

These secondary research findings offered valuable insight, but I wanted to validate my conclusions and identify additional usability issues by observing users directly, so I decided to conduct usability testing at this point in the project.

An additional UX researcher joined the team to support usability testing, which enabled us to broaden the scope of the project. In addition to the agency's current chatbot, we also tested an AI-powered chatbot prototype--trained on data from the agency's website--to compare the current experience to a potential future state. We collaborated with 2 chatbot developers to build the prototype in line with best practices, and we worked together to draft research materials including a research plan, participant screener, and moderator's guide.

We recruited a demographically diverse set of 14 participants (7 using the current chatbot, 7 using the prototype chatbot) to ensure that our sample reflected the agency's customer base. We designed the moderator's guide to work for both the current and prototype chatbot research streams, so that we could ask all participants identical questions and collect comparable data.

I also involved product owners during the planning stage of the project to help them feel connected to the research and to democratize research at the agency. I invited the product owners to review research materials, share research questions, and attend research sessions to foster connection and empathy with our customers.

Interview

Next, we ran virtual usability testing sessions to collect 1-on-1 user feedback. I facilitated sessions on the current chatbot experience, while the other UX researcher led sessions on the prototype chatbot experience. These sessions had 3 sections:

  • Background questions about participants' experiences with the agency and with chatbots in general
  • Series of 3 tasks that prompted participants to use the chatbot naturally in high-priority situations
  • Open-ended participant feedback on the chatbot and the research experience

We asked participants to think aloud and share their screen while completing tasks, so that we could hear their perspectives and watch their actions, since these kinds of attitudinal and behavioral data complement each other. We created video recordings of each session and used a FigJam board for notetaking to facilitate rapid synthesis and analysis after the sessions.

Analyze

Within 2 weeks of completing the research sessions, we processed and synthesized data from all 14 participants. We reviewed over 1,600 data points in FigJam, manually coded them by theme, and identified 6 key insights about users' experiences with the chatbots:

  • Participants appreciated online communication but expected seamless escalation to live service when self-service was insufficient.
  • Participants struggled to find the webpages with chatbots, but once on those pages, most were able to access the chatbot.
  • Participants expected a single, persistent chat experience across the website and got frustrated when this expectation wasn't met.
  • Participants understood that the chatbot focused on general information, but they wanted more specific responses nonetheless.
  • Participants mainly used prompt buttons to communicate with the chatbot but expected an open-ended text field as well.
  • According to participants, the AI-powered chatbot prototype enabled more efficient and more human conversations than the current chatbot.

We made sure that our insights were supported by both primary and secondary research and that we didn't over-generalize the findings from this qualitative study.
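
As a rough illustration only, the tallying step of this kind of thematic coding could be sketched in Python as shown here. In the project itself, the coding was done manually in FigJam. The notes, theme labels, and participant IDs in this sketch are hypothetical placeholders, not data from the study.

```python
from collections import Counter, defaultdict

# Hypothetical coded notes as (participant_id, theme) pairs.
# These are illustrative placeholders, not real study data.
coded_notes = [
    ("P1", "escalation"),
    ("P1", "findability"),
    ("P2", "findability"),
    ("P2", "findability"),
    ("P3", "escalation"),
    ("P3", "free_text_input"),
]


def summarize_themes(notes):
    """Return (notes per theme, distinct participants per theme)."""
    note_counts = Counter(theme for _, theme in notes)
    participants = defaultdict(set)
    for participant_id, theme in notes:
        participants[theme].add(participant_id)
    reach = {theme: len(ids) for theme, ids in participants.items()}
    return note_counts, reach


note_counts, reach = summarize_themes(coded_notes)

# List themes from most to least frequently noted.
for theme, count in note_counts.most_common():
    print(f"{theme}: {count} notes from {reach[theme]} participant(s)")
```

Counting distinct participants alongside raw note counts is one way to check that a theme isn't driven by a single talkative participant, which supports the goal of not over-generalizing from a small qualitative sample.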

We then came up with actionable and prioritized recommendations--grouped by insight area--for improving the agency's chatbot. For instance, in line with the second insight above, we recommended that the chatbot be made accessible from any page on the agency's website, especially the "Help" page, since participants often went there looking for contact options.

Share

I presented these research insights and recommendations at an executive summit on the agency's synchronous communication options. I explained exactly what I did, why I did it, and what I learned in a way that was accessible to a non-research audience, while emphasizing a concise set of key take-aways for executive leadership. These take-aways centered on large-scale structural chatbot improvements--like creating centralized chatbot infrastructure and offering more widespread escalation to live support--which require executive buy-in.

This presentation was well received by executive leadership across the agency. It increased leadership buy-in to fast-track chatbot research, design, and development, and it informed the future-state vision for the agency's synchronous communication. My research process, which combined secondary sources and direct user feedback, was also recognized as a model workflow by the executive department that oversees my agency, prompting more of this kind of research at the agency.

Next steps

The findings from this project raised additional research questions, which led to targeted follow-up work.

01

Usability testing revealed that participants wanted relevant, detailed, specific, and actionable information from the chatbot. What does this kind of chatbot response look like in the context of this government agency? I conducted a literature review and comparative analysis to address these questions and lay the groundwork for additional user research.

02

Participants appreciated the humanness of conversation with the AI-powered chatbot, which sparked curiosity about the appropriate personality for this agency's chatbot. A chatbot's personality can make or break a user's trust, so pinpointing the right personality is key. I ran a survey with 107 participants to evaluate the helpfulness, trustworthiness, and desirability of different chatbot personalities to identify the correct approach for this agency.

Skills

In this project, I gained additional experience conducting evaluative UX research with a cross-functional team including other researchers, developers, and product owners in a large organization. I combined primary and secondary research to develop well-rounded insights and recommendations and advocated for those findings to executive leadership. I drove the project from start to finish and learned about the challenges that can arise when turning recommendations into actions, especially in a large government agency.

Thank you for reading my case study!

Want to work with me? Feel free to contact me.