

Transforming ML tuning for clearer, faster risk detection
Behavox’s ML model feedback loop was slow, manual, and code-heavy. Users relied on company analysts to tune models. This led to missed risks and reduced trust in the system.
64%
Boost in risk alert accuracy
31%
Improvement in user satisfaction
Reduced reliance on company analysts


This case study reflects my work. Certain details were adjusted to honour confidentiality.
Problem
Analyst dependency slowed ML risk detection
Behavox’s ML model tuning required manual code changes that most Risk Compliance officers couldn’t make. This created a dependency on Behavox analysts, slowed feedback loops, and delayed ML risk prediction. Queues grew, alerts lagged, and exposure to real threats increased.


Low clarity: Users called the flow “slow and opaque”. User trust and tool usage dropped.


Unmet needs: Compliance risk managers lacked oversight tools. This hurt model health monitoring.


Stalled loops: Review and tuning sat in queues. Real threats could slip through.
Ideation
Four directions. One goal: get feedback closer to the model
My goal was to connect feedback, review, and oversight into a single flow. I aimed to enable faster tuning and clearer decisions.



Feedback-first design. Let compliance officers train ML models directly, in context.




Streamlined review paths. Reduced steps and errors by exploring panels, inline actions, and modals.


In-context scenario assignment. Let reviewers tag new signals in the moment, sharpening ML accuracy and cutting rework.


Manager oversight. A shared view of ML health to reduce analyst load and maintain quality at scale.




Testing
I made the wrong call: pattern over behavior
I reused a hover tooltip for ML feedback. I assumed users would find it since it matched an existing pattern. Users made decisions in the justification area, not in highlighted text.


"I wasn't able to find how to give feedback on flagged risk signals."




"I always go to the justification… it helps me clarify flagged risk content."


Test data confirmed users couldn’t find the feedback entry: the thumbs-up/down pattern blended with the content, so most never entered the loop. Feedback lived away from the decision point. I had designed around their mental model instead of into it.
"I wasn't able to find how to give feedback on flagged risk signals."

"I always go to the justification… it helps me clarify flagged risk content."

Feedback lived away from the decision point. I had designed around their mental model instead of into it.
Iteration
I moved ML feedback to where decisions happen
Testing revealed the gap. Users went to the justification first, every time. So I moved feedback entry there.


Before:
1. Flagged signal lacks emphasis, buried among secondary fields.
2. Regulatory data shows a raw URL, interrupting the decision flow.
3. All fields carry equal weight; nothing signals what matters most.
4. Competing secondary statistical metadata.

After:
1. Flagged signal leads. Elevated and visually distinct from supporting data.
2. Primary context surfaced inline. One scan, two key facts.
3. Secondary metadata hidden by default. Surface stays focused on the decision.
4. Raw URL replaced with a clean external link. Regulatory detail accessible, not disruptive.
Compliance officers got to the risk alert justification faster. With less friction.
Testing
Two clearer flows shipped as a result

Side panel for scenario assignment. Chosen for fit and fast build. Users stayed in context while tagging new signals.

Manager analytics dashboard. Built around two goals: track ML health and monitor review contributions.
Pivot
We prioritized tuning and deferred oversight to keep momentum
Aggregation pipeline limitations blocked a full dashboard build.
With the PM, I secured a two-week window to validate the officer flow and launch it. We moved the dashboard to Phase 2 to keep speed and cut rework.


Handoff
I delivered faster ML tuning and set the foundation for scale
Note: We were rolling out a new design system, so I updated my designs and contributed to the library.
Phase 1: After validating the Compliance Officer flow, I handed it to devs. We shipped ML feedback entry in justification, enabling faster tuning with less friction.




Phase 2: The Compliance Manager dashboard was prioritized in the backlog for a future release to restore oversight, guide quality, and balance load.





Learnings
Designing for clarity, speed, and momentum
Place actions where decisions happen. It lifts discoverability, and feedback starts immediately.
Validate now, scale later. Phased builds keep momentum when new constraints block the full build.
Standardize the fastest review path. It speeds completion and cuts errors.
Want the full story?
I help teams remove friction and ship faster.
