Streamlining an ML feedback loop
TL;DR: Behavox’s ML model fine-tuning was too complex. It frustrated users and hurt risk detection accuracy. I streamlined workflows with embedded feedback tools, improving both usability and precision.
64%
Rise in risk alert accuracy
13%
Increase in user satisfaction
Improved compliance

The company
A compliance and risk management platform
Behavox offers SaaS tools to track, detect, and manage compliance risks. These tools help enterprises ensure regulatory adherence and reduce threat exposure.

The problem
A complex ML model fine-tuning process
Users struggled with making code changes to adjust risk detection models.
This inefficiency frustrated users and hurt adoption. It led to missed risks and lower compliance accuracy for organizations.
The challenge
How might we…
Make it easier for users to fine-tune ML models and improve usability and risk detection accuracy?
Discovery
Using diagrams to clarify the ML refining flow
I mapped the legacy ML model lifecycle to understand where the refining process fit.
These diagrams helped me understand the flow and validate it with engineers over Slack for quick feedback.
User interviews uncovered setbacks and opportunities
I worked with our researcher to scope 10 interviews, which revealed key insights:
"I spend half my time reviewing irrelevant alert content."
Alert overload overwhelmed users, driving the need for quicker marking and review.
Manual model updates were time-consuming and inefficient
Users had to update the system by hand, which made every model adjustment slow.
Adding new alert signals was critical for better accuracy
The inability to add new signals (phrases) reduced the model’s effectiveness.




We also identified a new user type
Workflows varied greatly based on company size:
Small companies: A few compliance officers managed operations.
Large organizations: A full team, led by a compliance risk manager, monitored performance and met regulations.
Definition
I turned insights into solutions with sketches and wireframes
1/4
These early drafts aligned the team and helped us prioritize effectively. We opted for a feedback-based approach to reduce coding needs. This allowed ML to leverage compliance officers' expertise for risk validation.
2/4
I explored three ways to improve ML risk alert reviews:
Review panel: Guided users through steps while tracking progress.
Inline modal dialog: Focused users on isolated, one-off decisions.
Justification dropdown: Kept feedback contextual and close to regulatory details.
3/4
Enhancing model scenario training: I introduced the "link to scenario" feature, enabling users to assign new risk content directly to ML scenarios. This ensured accurate categorization and refined models for specific contexts (a sketch of the idea follows this section).
4/4
I addressed compliance managers’ ML performance monitoring needs by designing dashboards. These centralized actionable insights and simplified workflows. They helped managers track performance and make faster, informed decisions.
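To make the "link to scenario" idea concrete, here is a minimal sketch of how such an assignment could be modeled, assuming a simple scenario record that accumulates signal phrases. All names and the schema are illustrative, not Behavox's actual implementation.

```python
# Minimal sketch of "link to scenario": a reviewer assigns missed risk content
# to an existing ML scenario, and the assignment both routes the content and
# banks a new signal phrase for the next retraining pass. Hypothetical schema.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Scenario:
    """A detection scenario, e.g. 'insider trading' or 'client complaints'."""
    scenario_id: str
    name: str
    signal_phrases: list[str] = field(default_factory=list)


@dataclass
class ScenarioLink:
    """A reviewer's assignment of risk content to a scenario."""
    content_id: str
    scenario_id: str
    reviewer_id: str
    linked_at: datetime


def link_to_scenario(content_id: str, text: str, scenario: Scenario,
                     reviewer_id: str) -> ScenarioLink:
    """Record the assignment and register the text as a new signal phrase."""
    if text not in scenario.signal_phrases:
        scenario.signal_phrases.append(text)  # new signal for the next retrain
    return ScenarioLink(content_id, scenario.scenario_id, reviewer_id,
                        datetime.now(timezone.utc))


# Usage: a compliance officer links a missed phrase to the right scenario.
scenario = Scenario("sc-17", "Insider trading")
link = link_to_scenario("msg-8841", "keep this off the books", scenario, "user-42")
print(scenario.signal_phrases)  # ['keep this off the books']
```

The point is the data shape: each link simultaneously categorizes the content and feeds the model a context-specific training signal.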
Usability testing
Testing solutions to enhance ML training and workflows
I collaborated with our researcher to validate three user-centered solutions:
A binary feedback system to simplify flagged content reviews. We tested a thumbs-up/down approach, aiming to simplify feedback while keeping users focused on flagged content (see the sketch after this list).
A side panel to assign new risk signals directly to ML model scenarios. We assessed how users assigned missed risk content to scenarios, which helped refine the ML models with accurate training inputs.
A manager analytics dashboard to track ML health and feedback contributors. I refined the dashboard around the available backend data so we could assess how managers used it to track model health and feedback contributors.
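Since the binary feedback system became the core of the shipped mechanism, here is a minimal sketch of how thumbs-up/down verdicts could be turned into labeled examples for fine-tuning. The event structure and label scheme are assumptions for illustration, not the production pipeline.

```python
# Minimal sketch of how thumbs-up/down reviews could become fine-tuning labels:
# confirming a flagged alert yields a positive example, dismissing it marks a
# false positive. Names and the label scheme are illustrative assumptions.
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONFIRM = 1   # thumbs up: the alert was a real risk
    DISMISS = 0   # thumbs down: the alert was irrelevant


@dataclass
class FeedbackEvent:
    alert_id: str
    alert_text: str
    verdict: Verdict


def to_training_examples(events: list[FeedbackEvent]) -> list[tuple[str, int]]:
    """Convert reviewer verdicts into (text, label) pairs for the next fine-tune."""
    return [(e.alert_text, e.verdict.value) for e in events]


events = [
    FeedbackEvent("a-1", "wire the funds offshore today", Verdict.CONFIRM),
    FeedbackEvent("a-2", "lunch at the usual place?", Verdict.DISMISS),
]
print(to_training_examples(events))
# [('wire the funds offshore today', 1), ('lunch at the usual place?', 0)]
```

A one-click verdict like this is what let compliance officers correct the model without touching code.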
Challenge
Overcoming backend hurdles to keep the project moving
As I finalized prototypes for testing, backend restrictions emerged as a major roadblock.
We had to cut the dashboard from the project due to inaccurate time estimates.
Management weighed pausing the project due to the unclear estimates. This risked further delays and missed deadlines.
To keep momentum, I worked with my PM to secure a two-week testing window. This let us evaluate the rest of the solution while devs updated estimates.
Usability testing
My assumptions about feedback placement didn’t hold up in testing
"I wasn't able to find how to give feedback on flagged risk signals."
I assumed users would click highlighted text to add feedback, as they were familiar with our text-highlight feature. Yet most users struggled to find the option, exposing a gap in discoverability.
"I always go to the justification… it helps me clarify flagged risk content."
Testing showed that users relied on justifications to evaluate flagged risk content. It was clear that justifications were key to their workflows and the best place to embed the feedback feature.
Iteration
Refining justifications to boost the feedback feature's usability
The justification area became the ideal place for feedback, but it had usability issues that needed fixing:
The cluttered layout, filled with dense text blocks, made scanning key details difficult for users. Key regulatory information was also hidden among secondary details.
To resolve these issues, I made a few improvements:
Enhanced spacing and grouping: Reduced clutter and improved scannability.
Added an expandable area for non-essential details: Minimized distractions.
Centralized regulatory data in a tooltip: Made critical details quicker to access.
User segmentation testing
Validating our new feedback approach
Users found giving feedback in the justification area intuitive. High clickthrough rates validated this approach.
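For context on that metric: clickthrough rate here is simply feedback interactions divided by alert views, broken out by user segment. The sketch below shows the calculation with made-up numbers and hypothetical segment names.

```python
# Rough illustration of the validation metric: clickthrough rate on the
# feedback control, split by user segment. All figures are invented.
from collections import Counter

views = Counter({"compliance_officer": 1200, "risk_manager": 300})
feedback_clicks = Counter({"compliance_officer": 840, "risk_manager": 195})

for segment in views:
    ctr = feedback_clicks[segment] / views[segment]
    print(f"{segment}: {ctr:.0%} clickthrough")
# compliance_officer: 70% clickthrough
# risk_manager: 65% clickthrough
```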




Handoff
Shipping the new ML feedback mechanism and planning ahead
After a final team review, I refined the designs and handed them off to the dev team. We were implementing a new design system at the time, so I also updated the project's designs to ensure consistency and scalability.
In the first phase, we launched the ML model feedback mechanism.
Dashboard designs were added to the backlog for future implementation.
Impact
Simpler feedback and smarter models
Our new approach simplified workflows. It made risk detection faster and more intuitive. This paved the way for reducing regulatory violations and improving efficiency.
64%
Increase in risk alert accuracy
13%
Rise in user satisfaction
Improved compliance
Key learning
Smart trade-offs drive progress
Backend challenges threatened project progress, yet teamwork and prioritization kept us moving forward. Testing part of the project uncovered valuable insights that helped us make smarter trade-offs.
Wanna hear the full story?
Email me