Turning risk alert reviews into an ML feedback loop






Compliance officers saw Behavox’s risk evidence first, but their feedback reached model tuning too late. I moved feedback into the review moment so analysts could act on officer judgment faster, without removing oversight.
IMPACT
64%
Boost in risk alert accuracy
31%
Increase in user satisfaction
Reduced analyst dependency
ROLE
Sr. Product Designer
I led the review-flow redesign and tested where officers made feedback decisions. I helped phase the work around data readiness: officer feedback first, manager oversight later.
SCOPE
4 months
I partnered with PM, engineering, and research to ship the officer feedback path first. I prepared the manager oversight dashboard for a later data-ready phase.
PROBLEM
ML training stalled because the clearest feedback signal lived outside the tuning path
Compliance officers saw the evidence first, but analysts owned most tuning work
That created queues and slowed signal correction. Weaker alerts stayed in circulation longer than necessary.


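Opaque tuning process
Users described the flow as slow and confusing. They could not see how feedback improved the model.
Stalled training loops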
Review and tuning sat in queues. Real threats could be missed.
High operational costs
No clear visibility
Delayed risk detection
Oversight needs were missing from the feedback loop
Compliance risk managers lacked oversight tools, which made model health hard to monitor.


CHALLENGE
The solution had to shorten tuning without weakening oversight
The new flow had to let officers act in context without breaking analyst oversight, model quality, or delivery speed.
Reduce analyst dependency
Officers had signal context. Analysts still owned tuning.
Preserve oversight
Managers needed visibility into review and tuning quality.
Fit data constraints
Pipeline limits shaped what could ship first.
OPTIONS
The strongest path moved feedback into the review decision
I compared each concept against four criteria: feedback speed, build effort, analyst dependency, and data oversight.
The strongest option moved feedback into the review moment while keeping oversight available for a later phase.


Train the model from the alert. Officers could confirm or reject signals while reviewing evidence.



Classify signals in context. Reviewers could tag new signals without leaving the investigation path.






Track model health at scale. Managers could monitor coverage, contributions, and tuning quality.
TESTING
I made the wrong call: I optimized for a familiar pattern over actual review behavior
I reused a hover tooltip because it matched an existing feedback pattern.
Testing showed officers made decisions in the justification area, not on highlighted text.

“I wasn’t able to find how to give feedback on flagged risk signals.”

I matched a familiar pattern, but missed the decision moment.



“I always go to the justification… it helps me clarify flagged risk content.”


The issue was not discoverability alone. Feedback was outside the place where officers formed judgment.


SOLUTION
I moved ML feedback into the risk signal justification moment
Testing revealed the gap. Officers went to the justification first, every time. So I moved feedback entry there.


Before:
1
The flagged signal lacks emphasis and is buried among secondary fields.
2
Regulatory data shows a raw URL that interrupts the decision flow.
3
All fields carry equal weight, so nothing signals what matters most.
4
Secondary statistical metadata competes for attention.



Collapsed

After:
1
Lead with the flagged signal.
2
Surface only the context needed to judge the signal.

Expanded
3
Hide secondary metadata until needed.
4
Keep regulatory evidence accessible without disrupting review.
Officers could address the signal while keeping secondary metadata available but out of the way.
PIVOT
We shipped tuning first and deferred oversight to protect momentum

Deferred for Phase 2
Aggregation pipeline limits made the manager dashboard risky for launch.

I partnered with the PM to protect a two-week validation window for the officer flow.
MEASUREMENT
We measured speed without ignoring oversight risk
The goal was not only faster feedback. We also tracked whether the new path improved signal quality without weakening oversight.




HANDOFF
I shipped compliance officer feedback first and prepared oversight for Phase 2
After a final testing round, I handed off the validated officer flow, aligned components with the new design system, and kept the manager dashboard ready for Phase 2.
Phase 1: Officer feedback path
Shipped ML feedback inside the risk justification flow.


Phase 2: Manager oversight
Prepared dashboard patterns for ML health, review contributions, and tuning quality once data aggregation was ready.


We were implementing a new design system at the time. I updated the project's designs to ensure consistency and scalability.





LESSONS
Feedback systems work only when they meet real judgment behavior
1
Behavior beats familiar patterns
Compliance officers ignored the tooltip because they made decisions in the justification flow.
2
Feedback belongs at the judgment moment
Feedback quality improved once officers could act where they reviewed evidence.
3
Phasing protects delivery momentum
Shipping the tightest feedback path gave us the clearest signal before we scaled oversight.
Want the full story?
This case study is the high-level view. Happy to go deeper in conversation.