Transforming ML tuning for clearer, faster risk detection

64%

Boost in risk alert accuracy

31%

Improvement in user satisfaction

Reduced reliance on company analysts

This case study reflects my work. Certain details were adjusted to honour confidentiality.

Problem

Analyst dependency slowed ML risk detection

Behavox’s ML tuning relied on analysts, slowing detection and hurting model quality. Compliance officers couldn’t adjust rules themselves. Queues grew, alerts lagged, and exposure to real threats increased.

Behavox’s ML model tuning required manual code changes that most risk compliance officers couldn’t make.

This created a dependency on Behavox analysts. It slowed feedback loops and delayed ML risk prediction.

Illustration of unclear ML-tuning rules shown through a confusing code snippet. A compliance officer avatar looks puzzled, highlighting low clarity in the model-tuning flow. Visual used in a product design case study to show how complexity and poor rule transparency hurt user trust and workflow efficiency.

Low clarity: Users called the flow “slow and opaque”. User trust and tool usage dropped.

Product design visualization showing three compliance roles: Compliance Officer, Compliance Manager, and Behavox Analyst. The Compliance Manager card is highlighted to show an overlooked role in the ML model monitoring process.

Unmet needs: Compliance risk managers lacked oversight tools. This hurt model health monitoring.

UX flow diagram showing the compliance workflow for ML risk review at Behavox. Highlights a bottleneck between Compliance Officer and Behavox Analyst during model fine-tuning, illustrating inefficiencies in the feedback loop.

Stalled loops: Review and tuning sat in queues. Real threats could slip through.

Ideation

Four directions. One goal: get feedback closer to the model

My goal was to connect feedback, review, and oversight into a single flow.

In-context risk alert review modal allowing compliance officers to confirm or reject an ML bribery signal directly within highlighted text.

Feedback-first design. Let compliance officers train ML models directly, in context.

In-context risk alert review showing a compliance officer validating a bribery signal directly on highlighted text to train an ML model.
Side-panel workflow for reviewing ML risk signals, allowing compliance officers to confirm accuracy and provide structured feedback.

Streamlined review paths. Reduced steps and errors by exploring panels, inline actions, and modals.

End-to-end workflow showing highlighted text linked to a compliance scenario via an inline menu and a side panel form.

In-context scenario assignment. Let reviewers tag new signals in the moment, sharpening ML accuracy and cutting rework.

Inline context menu on highlighted email text with options to create a label or link the content to a compliance scenario.
Side panel form for linking highlighted content to a compliance scenario, with dropdowns for scenario and use case and save actions.
Dashboard UI showing high ML risk-signal volume at scale, with large numeric metrics (e.g., 1415 and 2057) and a trend chart. Visual represents growing alert volume and the need for faster, clearer ML tuning to improve risk detection accuracy and reduce analyst dependency.
Weekly breakdown of ML risk signals (Week 1–4) displayed as horizontal bars with counts (145, 267, 223, 198). Visual illustrates signal fluctuations over time and highlights the need for streamlined feedback loops to improve model accuracy and speed up compliance review workflows.

Manager oversight. A shared view of ML health to reduce analyst load and maintain quality at scale.

ML monitoring dashboard highlighting signal volume at scale and a weekly trend breakdown, showing how risk alerts change over time.
ML health dashboard showing overall model health trends over time and a validation progress snapshot for reviewed signals.

Testing

I made the wrong call: pattern over behavior

I reused a hover tooltip for ML feedback, assuming users would find it since it matched an existing pattern. In testing, the control blended into the content: users made decisions in the justification area, not in the highlighted text.

Screenshot of the Behavox alert review interface showing a hover tooltip titled "Review risk alert signal" with Accurate and Inaccurate buttons. An arrow points to the tooltip from a caption reading "Where I expected feedback entry," highlighting the original feedback placement that users failed to discover.

"I wasn't able to find how to give feedback on flagged risk signals."

Profile photo of a female compliance officer who participated in usability testing.
Screenshot of the Behavox alert panel showing the Risk scenario row with a "View justification" link highlighted. An arrow points to it from a caption reading "Where users actually went," showing the decision point where users naturally looked for context before taking action.

"I always go to the justification… it helps me clarify flagged risk content."

Profile photo of a male compliance officer who participated in usability testing.

Feedback lived away from the decision point. I had designed around their mental model instead of into it.

Iteration

I moved ML feedback to where decisions happen

Testing revealed the gap. Users went to the justification first, every time. So I moved feedback entry there.

Before state of Behavox risk alert justification component showing UX issues. Annotations highlight buried risk signal and raw regulatory URL disrupting compliance officer decision flow. Product design case study by Yanick, senior UX designer.

Before:

1

Flagged signal lacks emphasis, buried among secondary fields.

2

Regulatory data shows raw URL, interrupts the decision flow.

3

All fields carry equal weight, nothing signals what matters most.

4

Competing secondary statistical metadata.

Before state of Behavox risk alert justification component showing equal field weight and competing secondary metadata. Annotations 3 and 4 highlight information hierarchy failures in compliance alert review. UX case study by Yanick, senior product designer.

Improved Behavox risk alert justification component, collapsed state. Flagged insider trading signal elevated with orange left border. Risk scenario and investigation status surfaced inline. Secondary metadata hidden by default to reduce cognitive load. UX case study by Yanick, senior product designer.
After:

1

Flagged signal leads. Elevated and visually distinct from supporting data.

2

Primary context surfaced inline. One scan, two key facts.

Improved Behavox risk alert justification component, expanded state. Supporting evidence panel reveals regulatory context with a clean external link replacing the raw URL. Secondary metadata organized in a scannable grid. UX case study by Yanick, senior product designer.

3

Secondary metadata hidden by default. Surface stays focused on the decision.

4

Raw URL replaced with a clean external link. Regulatory detail accessible, not disruptive.

Compliance officers got to the risk alert justification faster, with less friction.

Testing

Two clearer flows shipped as a result

Side panel for scenario assignment. Chosen for fit and fast build. Users stayed in context while tagging new signals.

Manager analytics dashboard. Built around two goals: track ML health and monitor review contributions.

Pivot

We prioritized tuning and deferred oversight to keep momentum

Aggregation pipeline limitations blocked a full dashboard build.

With the PM, I secured a two-week window to validate the officer flow and launch it. We moved the dashboard to Phase 2 to keep speed and cut rework.

Handoff

I delivered faster ML tuning and set the foundation for scale

Note: We were rolling out a new design system, so I updated my designs and contributed to the library.

Phase 1: After validating the Compliance Officer flow, I handed it to devs. We shipped ML feedback entry in justification, enabling faster tuning with less friction.

Phase 2: We kept the Compliance Manager dashboard for future implementation to restore oversight, guide quality, and balance load.


Yanick pushed our products forward in terms of design. His general ingenuity had a significant impact on Behavox's UI.

Artsiom Mezin

Sr. Engineering Manager

Learnings

Designing for clarity, speed, and momentum

Place actions where decisions happen. It lifts discoverability and gets feedback flowing immediately.

Validate now, scale later. Phased builds keep momentum when new constraints emerge.

Standardize the fastest review path. It speeds completion and cuts errors.


Want the full story?

I help teams remove friction and ship faster.

Thanks for reading!

Let's stay in touch:

