Inefficient assessment tool slowed drug discovery and introduced errors.
At BenevolentAI, scientists struggled with an assessment tool that relied on memory and external notes, wasting hours during the costly drug discovery process and producing inconsistent input for AI models.
Example of the legacy assessment flow with a decision capture modal.
Scientists had to complete assessments after data analysis in a modal that obscured the data. This led to more mistakes, inconsistent results, longer analysis times, and reliance on third-party note-taking tools.
Expert feedback on early sketches guided the initial design, enabling timely delivery under a tight deadline.
Old assessment
I conducted contextual inquiries with 7 users (4 lead scientists, 3 contributing scientists) to understand assessment workflows and pain points.
Key insights:
Users needed a way to capture evidence while browsing data to reduce cognitive load.
AI models lacked the complete, consistent input needed for training.
New assessment
Goal:
Enable users to capture assessments of data in real time.
Design objectives:
Present consistent assessment options to support both users and AI.
Reduce reliance on memory and eliminate the need for separate note-taking tools.
Simplify navigation and maintain a clear, consistent workflow.
Mid-fidelity Figma prototypes were recreated in Google Sheets for interactive testing with 3 users.
Testing showed that users of the new design spent 4 minutes less on assessments and needed no external notes. All users reported that the flow would positively impact their work.

Redesign boosted satisfaction, efficiency, and adoption across teams.
The redesign increased user satisfaction by 32%, reduced time-on-task by 10%, decreased navigation difficulties, and lowered cognitive load.
It was later adopted by another internal tool, further reducing assessment time for multiple scientists at BenevolentAI.