Turning risk alert reviews into an ML feedback loop

Compliance officers saw Behavox’s risk evidence first, but their feedback reached model tuning too late. I moved feedback into the review moment so analysts could act on officer judgment faster, without removing oversight.

64%

Boost in risk alert accuracy

31%

Increase in user satisfaction

Reduced analyst dependency

ROLE

Sr. Product Designer

I led the review-flow redesign and tested where officers made feedback decisions. I helped phase the work around data readiness: officer feedback first, manager oversight later.

SCOPE

4 months

I partnered with PM, engineering, and research to ship the officer feedback path first. I prepared the manager oversight dashboard for a later data-ready phase.

PROBLEM

ML training stalled because the clearest feedback signal lived outside the tuning path

Compliance officers saw the evidence first, but analysts owned most tuning work

That created queues and slowed signal correction. Weaker alerts stayed in circulation longer than necessary.

UX flow diagram showing the compliance workflow for ML risk review at Behavox. Highlights a bottleneck between Compliance Officer and Behavox Analyst during model fine-tuning, illustrating inefficiencies in the feedback loop.

Stalled training loops

Review and tuning sat in queues. Real threats could be missed.

  • High operational costs

  • No clear visibility

  • Delayed risk detection

Oversight needs were missing from the feedback loop

Compliance risk managers lacked oversight tools. This hurt model health monitoring.

Product design visualization showing three compliance roles: Compliance Officer, Compliance Manager, and Behavox Analyst. The Compliance Manager card is highlighted to show an overlooked role in the ML model monitoring process.

CHALLENGE

The solution had to shorten tuning without weakening oversight

The new flow had to let officers act in context without breaking analyst oversight, model quality, or delivery speed.

Reduce analyst dependency

Officers had signal context. Analysts still owned tuning.

Preserve oversight

Managers needed visibility into review and tuning quality.

Fit data constraints

Pipeline limits shaped what could ship first.

OPTIONS

The strongest path moved feedback into the review decision

I compared each concept against four criteria: feedback speed, build effort, analyst dependency, and data oversight.

The strongest option moved feedback into the review moment while keeping oversight available for a later phase.

In-context risk alert review modal allowing compliance officers to confirm or reject an ML bribery signal directly within highlighted text.

Train the model from the alert. Officers could confirm or reject signals while reviewing evidence.
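
As a rough illustration only, a confirm-or-reject decision captured at review time could be stored as a small record like the Python sketch below. Every name here (`SignalFeedback`, `Verdict`, `record_feedback`, and the ID fields) is hypothetical and not taken from Behavox's product or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Verdict(Enum):
    """The two choices an officer has while reviewing a flagged signal."""
    CONFIRMED = "confirmed"
    REJECTED = "rejected"


@dataclass(frozen=True)
class SignalFeedback:
    """One officer decision on one flagged signal (hypothetical schema)."""
    alert_id: str
    signal_id: str
    officer_id: str
    verdict: Verdict
    recorded_at: datetime


def record_feedback(alert_id: str, signal_id: str, officer_id: str,
                    confirmed: bool) -> SignalFeedback:
    """Build a feedback record at the moment the officer decides."""
    verdict = Verdict.CONFIRMED if confirmed else Verdict.REJECTED
    return SignalFeedback(alert_id, signal_id, officer_id, verdict,
                          datetime.now(timezone.utc))


# Example: an officer rejects a signal while reading the evidence.
fb = record_feedback("alert-1", "signal-7", "officer-3", confirmed=False)
print(fb.verdict.value)  # rejected
```

Keeping the record immutable and timestamped is one way to leave an audit trail for later oversight, though the real pipeline may have worked differently.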

End-to-end workflow showing highlighted text linked to a compliance scenario via an inline menu and a side panel form.
Inline context menu on highlighted email text with options to create a label or link the content to a compliance scenario.

Classify signals in context. Reviewers could tag new signals without leaving the investigation path.

Side panel form for linking highlighted content to a compliance scenario, with dropdowns for scenario and use case and save actions.
A wireframe of the manager analytics dashboard designed to monitor ML health, track signal volume, display validation progress, and provide trend-level breakdowns for risk review oversight
A section of the manager analytics dashboard wireframe displaying ML health trends and validation progress, designed to help managers monitor review contributions and maintain oversight.
A detailed view of the manager analytics dashboard, highlighting key metrics including signal volume and a trend-level breakdown of data over four weeks to support risk review oversight.

Track model health at scale. Managers could monitor coverage, contributions, and tuning quality.

TESTING

I made the wrong call: I optimized for a familiar pattern over actual review behavior

I reused a hover tooltip because it matched an existing feedback pattern.

Testing showed officers made decisions in the justification area, not on highlighted text.

Screenshot of the Behavox alert review interface showing a hover tooltip titled "Review risk alert signal" with Accurate and Inaccurate buttons. An arrow points to the tooltip from a caption reading "Where I expected feedback entry," highlighting the original feedback placement that users failed to discover.

“I wasn’t able to find how to give feedback on flagged risk signals.”

A professional headshot representing the compliance officers who now directly contribute to ML signal tuning within the risk review flow.

I matched a familiar pattern, but missed the decision moment.

Screenshot of the Behavox alert panel showing the Risk scenario row with a "View justification" link highlighted. An arrow points to it from a caption reading "Where users actually went," showing the decision point where users naturally looked for context before taking action.

“I always go to the justification… it helps me clarify flagged risk content.”

A professional headshot representing the managers who oversee model health and review contributions, focusing on the system's Phase 2 oversight capabilities.

The issue was not discoverability alone. Feedback was outside the place where officers formed judgment.

SOLUTION

I moved ML feedback into the risk signal justification moment

Testing revealed the gap. Officers went to the justification first, every time. So I moved feedback entry there.

Before:

1

Flagged signal lacks emphasis, buried among secondary fields.

2

Regulatory data shows a raw URL and interrupts the decision flow.

3

All fields carry equal weight; nothing signals what matters most.

4

Competing secondary statistical metadata.

Improved risk alert justification component in collapsed state. Flagged compliance signal elevated with orange left border. Risk scenario and investigation status surfaced inline. Supporting evidence hidden by default to reduce cognitive load. UX case study by Yanick, senior product designer.

Collapsed

After:

1

Lead with the flagged signal.

2

Surface only the context needed to judge the signal.

Improved risk alert justification component showing expanded Supporting evidence panel. Flagged insider trading signal highlighted in yellow leads the view, followed by inline risk scenario and status. Secondary regulatory metadata accessible via clean external link. UX case study by Yanick, senior product designer.

Expanded

3

Hide secondary metadata until needed.

4

Keep regulatory evidence accessible without disrupting review.

Officers could address the signal while keeping secondary metadata available but out of the way.

PIVOT

We shipped tuning first and deferred oversight to protect momentum

A view of the manager monitoring dashboard, showcasing data on alert accuracy, weekly signal detection, scenario validation coverage, and review activity, with a label indicating that specific oversight features were deferred to Phase 2.

Deferred for Phase 2

Aggregation pipeline limits made the manager dashboard risky for launch.

I partnered with the PM to protect a two-week validation window for the officer flow. We shipped the tuning path first and moved oversight to Phase 2.

MEASUREMENT

We measured speed without ignoring oversight risk

The goal was not only faster feedback. We also tracked whether the new path improved signal quality without weakening oversight.

A UI card titled "Speed + Adoption" outlining key performance indicators for the feedback path, including feedback completion, review friction, and alert accuracy.
A UI card titled "Quality + Oversight" outlining key metrics for system health, specifically measuring the reduction of false positives and monitoring manager visibility.
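
To make the quality side concrete, here is a minimal Python sketch of how officer verdicts could be rolled up into a confirmed rate and a rejected rate. It is an assumption-laden illustration, not the method behind the figures in this case study: the function name `review_metrics`, the verdict strings, and the idea of treating rejected signals as likely false positives are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def review_metrics(verdicts: Iterable[str]) -> dict[str, Optional[float]]:
    """Summarize officer verdicts ("confirmed" or "rejected") for a batch of signals."""
    counts = Counter(verdicts)
    reviewed = counts["confirmed"] + counts["rejected"]
    if reviewed == 0:
        # No feedback yet: rates are undefined rather than zero.
        return {"reviewed": 0, "confirmed_rate": None, "rejected_rate": None}
    return {
        "reviewed": reviewed,
        # Share of reviewed signals that officers agreed with.
        "confirmed_rate": counts["confirmed"] / reviewed,
        # Share of reviewed signals that officers rejected (assumed false positives).
        "rejected_rate": counts["rejected"] / reviewed,
    }


print(review_metrics(["confirmed", "rejected", "confirmed", "confirmed"]))
# {'reviewed': 4, 'confirmed_rate': 0.75, 'rejected_rate': 0.25}
```

Tracking both rates over time is one way a manager view could show whether faster feedback was also improving signal quality.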

HANDOFF

I shipped compliance officer feedback first and prepared oversight for Phase 2

After a final testing round, I handed off the validated officer flow and kept the manager dashboard ready for Phase 2. We were implementing a new design system at the time, so I updated the project's designs to keep components consistent and scalable.

Phase 1: Officer feedback path

Shipped ML feedback inside the risk justification flow.

Phase 2: Manager oversight

Prepared dashboard patterns for ML health, review contributions, and tuning quality once data aggregation was ready.

“Yanick pushed our products forward in terms of design. His general ingenuity had a significant impact on Behavox's UI.”

Profile photo of Gustavo Pelaez, Senior Product Designer

Artsiom Mezin

Sr. Engineering Manager

LESSONS

Feedback systems work only when they meet real judgment behavior

1

Behavior beats familiar patterns

Compliance officers ignored the tooltip because they made decisions in the justification flow.

2

Feedback belongs at the judgment moment

Feedback quality improved once officers could act where they reviewed evidence.

3

Phasing protects delivery momentum

Shipping the tightest feedback path gave us the clearest signal before scaling oversight.

Want the full story?

This case study is the high-level view. Happy to go deeper in conversation.

A code snippet illustration showing a configuration screen with Python and pseudo-SQL logic, highlighting "tuning rules" for a machine learning model, with an icon of a person looking confused by the logic.

Opaque tuning process

Users described the flow as slow and confusing. They could not see how feedback improved the model.

Thanks for reading.
