


Transforming ML tuning for clearer, faster risk detection
Behavox’s ML model feedback loop was slow, manual, and code-heavy. Users relied on company analysts to tune models. This led to missed risks and reduced trust in the system.

64%
Boost in risk alert accuracy
31%
Improvement in user satisfaction
Reduced reliance on company analysts


This project showcase reflects my work. Certain details were adjusted to honor confidentiality.
Problem
Analyst dependency slowed ML risk detection
Behavox’s ML model tuning required manual code changes that most Risk Compliance officers couldn’t make. This created a dependency on Behavox analysts, slowing feedback loops and delaying ML risk prediction.


Low clarity: Users called the flow “slow and opaque”. User trust and tool usage dropped.

Unmet needs: Compliance risk managers lacked oversight tools. This hurt model health monitoring.

Stalled loops: Review and tuning sat in queues. Real threats could slip through.
Ideation
I explored 4 design bets to cut tuning time and remove ML bottlenecks
My goal was to connect feedback, review, and oversight into a flow that unlocked faster tuning and clearer decisions.

1. Feedback-first design. Let compliance officers train ML models directly. This cuts wait times and improves risk signal accuracy.

2. Streamlined review paths. Cut steps to speed reviews. I explored panels, inline actions, and modals to reduce clicks and errors.

3. In-context scenario assignment. Tag new signals (new content) in the moment. This sharpens ML model accuracy and reduces rework.

4. Manager oversight. Give risk compliance managers a view of ML health. This reduces analyst load and guides quality.

Testing
Our assumption failed: users missed the feedback entry
We expected users to hover over flagged text to open a binary thumbs-up/down tooltip, a pattern already used in the product, and assumed simple inputs would speed reviews. But the pattern blended with the content, and most users didn’t discover it, which broke the feedback loop.




"I wasn't able to find how to give feedback on flagged risk signals."
"I wasn't able to find how to give feedback on flagged risk signals."


"I always go to the justification… it helps me clarify flagged risk content."
"I always go to the justification… it helps me clarify flagged risk content."


Testing
The new risk signal linking and oversight flows proved clearer
We selected patterns that were familiar to users and easier for devs to implement:

Side panel for scenario assignment. Chosen for fit and fast build. This kept users in context while tagging new signals.

Manager analytics dashboard. Scoped around 2 main goals: track ML health and monitor review contributions.
Iteration
I moved feedback to the decision point to boost visibility and speed
Users read risk alert justifications first when reviewing. I moved the feedback entry there to raise visibility and cut time to feedback. To achieve this, I made three changes:



I improved spacing so compliance officers could scan key details fast.



I collapsed secondary information to enhance focus and reduce visual noise.



I centralized regulatory details in a tooltip to improve access and speed up decisions.

Pivot
We prioritized tuning and deferred oversight to keep momentum
Aggregation pipeline limitations blocked a full dashboard build.
With the PM, I secured a two-week window to validate the officer flow and launch it. We moved the dashboard to Phase 2 to keep speed and cut rework.


Handoff
I delivered faster ML tuning and set the foundation for scale
After testing confirmed our direction, I applied the new design system before handoff, updating the project's designs to ensure consistency and scalability.

Phase 1: After validating the Compliance Officer flow, I handed it to devs. We shipped ML feedback entry in justification, enabling faster tuning with less friction.

Phase 2: We kept the Compliance Manager dashboard in the backlog for a future release to restore oversight, guide quality, and balance load.

Learnings
Designing for clarity, speed, and momentum
Place actions where decisions happen. It lifts discoverability, and feedback starts immediately.
Validate now, scale later. Phased builds keep momentum when new constraints block progress.
Standardize the fastest review path. It speeds completion and cuts errors.

Want the full story?
I help teams remove friction and ship faster.

