Streamlining an ML feedback loop

TL;DR: Behavox’s ML model fine-tuning was too complex. It frustrated users and hurt risk detection accuracy. I streamlined workflows with embedded feedback tools, improving both usability and precision.



64%

Rise in risk alert accuracy

13%

Increase in user satisfaction

Improved compliance


The company

A compliance and risk management platform

Behavox offers SaaS tools to track, detect, and manage compliance risks. These tools help enterprises ensure regulatory adherence and reduce threat exposure.


The problem

A complex ML model fine-tuning process

Users struggled with making code changes to adjust risk detection models.

This inefficiency frustrated users and hurt adoption. It led to missed risks and lower compliance accuracy for organizations.

The challenge

How might we…

Make it easier for users to fine-tune ML models, improving both usability and risk detection accuracy?

Discovery

Using diagrams to clarify the ML refining flow

I mapped the legacy ML model lifecycle to understand where the refining process fit.

These diagrams helped me understand the flow and validate it with engineers over Slack for quick feedback.

User interviews uncovered setbacks and opportunities

I worked with our researcher to scope 10 interviews, which uncovered key insights:

"I spend half my time reviewing irrelevant alert content."

Alert overload overwhelmed users, driving the need for quicker marking and review.

Manual model updates were time-consuming and inefficient

Users had to update the system by hand, which was slow and inefficient.

Adding new alert signals was critical for better accuracy

The inability to add new signals (phrases) reduced the model’s effectiveness.

We also identified a new user type

Workflows varied greatly based on company size:
  • Small companies: A few compliance officers managed operations.

  • Large organizations: A full team, led by a compliance risk manager, monitored performance and met regulations.

Definition

I turned insights into solutions with sketches and wireframes

Usability testing

Testing solutions to enhance ML training and workflows

I collaborated with our researcher to validate three user-centered solutions:

A binary feedback system to simplify flagged content reviews.

We tested a thumbs-up/down approach to simplify feedback while keeping users focused on flagged content.

A side panel to assign new risk signals directly to ML model scenarios.

We assessed how users assigned missed risk content to scenarios, helping refine the ML models with accurate training inputs.

A manager analytics dashboard to track ML health and feedback contributors.

I refined the dashboard based on available backend data to assess how managers tracked model health and feedback contributors.
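To make the binary feedback idea concrete, here is a minimal sketch of what a thumbs-up/down feedback event might look like. It assumes a hypothetical data shape; the type and field names are illustrative only, not Behavox's actual schema or API.

```typescript
// Hypothetical shape of a binary feedback event on a flagged alert.
// All names are illustrative, not Behavox's actual schema.
type Verdict = "relevant" | "irrelevant";

interface AlertFeedback {
  alertId: string;    // the flagged alert being reviewed
  signal: string;     // the phrase that triggered the alert
  scenario: string;   // the model scenario the signal belongs to
  verdict: Verdict;   // thumbs-up = relevant, thumbs-down = irrelevant
  reviewerId: string; // compliance officer giving the feedback
  reviewedAt: string; // ISO timestamp
}

// Example: a reviewer marks a flagged message as irrelevant;
// events like this would feed into later model fine-tuning.
const feedback: AlertFeedback = {
  alertId: "alert-1042",
  signal: "guaranteed returns",
  scenario: "market-manipulation",
  verdict: "irrelevant",
  reviewerId: "officer-17",
  reviewedAt: new Date().toISOString(),
};

console.log(JSON.stringify(feedback, null, 2));
```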


Challenge

Overcoming backend hurdles to keep the project moving

As I finalized prototypes for testing, backend restrictions emerged as a major roadblock.

We had to cut the dashboard from the project due to inaccurate time estimates.

Management weighed pausing the project due to the unclear estimates, risking further delays and missed deadlines.

To prevent a pause, I worked with my PM to secure a two-week testing window. This let us evaluate the rest of the solution while the devs updated their estimates.

Usability testing

My assumptions about feedback placement didn’t hold up in testing

"I wasn't able to find how to give feedback on flagged risk signals."

I assumed users would click highlighted text to add feedback, as they were familiar with our text-highlight feature. Yet most users struggled to find the option, exposing a gap in discoverability.

"I always go to the justification… it helps me clarify flagged risk content."

Testing showed that users relied on justifications to evaluate flagged risk content. It was clear that justifications were key to their workflows and the best place to embed the feedback feature.

Iteration

Refining justifications to boost the feedback feature's usability

The justification area became the ideal place for feedback, but it had usability issues that needed fixing:

The cluttered layout, filled with dense text blocks, made it hard for users to scan key details. Key regulatory information was also buried among secondary details.

To resolve these issues, I made a few improvements:

Enhanced spacing and grouping: Reduced clutter and improved scannability.

Added an expandable area for non-essential details: Minimized distractions.

Centralized regulatory data in a tooltip: Made critical details quicker to access.


User segmentation testing

Validating our new feedback approach

Users found giving feedback in the justification area intuitive. High clickthrough rates validated this approach.

Handoff

Shipping the new ML feedback mechanism and planning ahead

After a final team review, I refined the designs and handed them off to the dev team. We were implementing a new design system at the time, so I also updated the project's designs to ensure consistency and scalability.

In the first phase, we launched the ML model feedback mechanism.

Dashboard designs were added to the backlog for future implementation.


Impact

Simpler feedback and smarter models

Our new approach simplified workflows. It made risk detection faster and more intuitive. This paved the way for reducing regulatory violations and improving efficiency.

64%

Increase in risk alert accuracy

13%

Rise in user satisfaction

Improved compliance

Key learning

Smart trade-offs drive progress

Backend challenges threatened project progress, but teamwork and prioritization kept us moving forward. Testing part of the project uncovered valuable insights that helped us make smarter trade-offs.

Communication with Yanick was always plain and easy. His general ingenuity had a significant impact on Behavox UI.

Artsiom Mezin

Sr. Engineering Manager

Wanna hear the full story?

Email me

Thanks for reading!

Let's stay in touch:
