
Behavox

A streamlined ML model-tuning UX
Behavox’s ML model feedback loop was slow, manual, and code-heavy. Compliance officers relied on company analysts to tune models. This led to missed risks and reduced trust in the system.
64% boost in risk alert accuracy
31% improvement in user satisfaction
Reduced reliance on company analysts


The content of this case study is my own and does not represent the views of NBC or Valtech. Details have been modified to comply with my NDA.
DISCOVERY
Finding the breakdowns hurting ML model tuning
I mapped compliance officers' workflows and, with my research partner, identified their key pain points.


Feedback loop analysis revealed critical bottlenecks where tuning stalled or failed.


We discovered an overlooked user: Compliance Risk Managers, central to tuning oversight in large orgs.


Clients over-relied on analysts for tuning. This caused operational overhead and slow feedback cycles.
STRATEGY
Designing for usability, accuracy, and scale
To make ML refinement intuitive for non-technical users, I designed a system around four strategic moves:
1. Prioritizing feedback-first design
Sketching helped me align quickly with my team. We explored how compliance officers could train models directly.
2. Exploring multiple review flows
I tested review panels, inline modals, and dropdowns to find the best UX for reviewing flagged risk content.
3. Streamlining scenario assignment
The “link to scenario” feature let users categorize risk content in-context, improving model accuracy.
4. Empowering compliance managers
I designed dashboards to centralize ML health data. This would improve decision-making and reduce reliance on analysts.
TESTING
Validating solutions early
With my research partner, I tested the top 3 approaches our team had prioritized:
A binary feedback system to speed up ML model risk alert review.
A side panel to assign new risk signals directly to ML model scenarios.
A manager analytics dashboard to track ML health and feedback contributors.
CHALLENGE
Keeping momentum despite backend limits
To stay on track, I partnered with the PM to secure a 2‑week testing window for the compliance officer flow. We validated key interactions and deferred the dashboard to a future release.



ITERATION
Turning justifications into a feedback hub


Testing revealed that users didn’t expect to interact with text highlights for feedback. They focused on the justification section and assumed that’s where feedback would go.
• "I wasn't able to find how to give feedback on flagged risk signals."
• "I always go to the justification… it helps me clarify flagged risk content."
I redesigned the justification layout to better support decision-making:


I improved layout spacing to reduce clutter and make content easy to scan.


I collapsed secondary information to enhance focus and reduce visual noise.


I centralized regulatory details in a tooltip to improve access and reduce clutter.
HANDOFF
Shipping what mattered, planning what’s next
After testing confirmed our direction, I applied the new design system before handoff.




Phase 1: We shipped the new ML feedback flow, enabling compliance officers to tune models faster, with less friction.



Phase 2: The dashboard for compliance managers was prioritized in the backlog for a future release.


We were implementing a new design system at the time. I updated the project's designs to ensure consistency and scalability.
TAKEAWAYS
Removing ML roadblocks for faster risk detection
This wasn’t just a usability fix. It was a UX shift that made ML tuning faster, clearer, and more trusted.
Feedback tools work best when aligned with decision moments.
Discoverability can make or break UX in high-stakes workflows.
Smart scoping helps us move fast without compromising quality.
This is just a preview! The full case study dives deeper into trade-offs, design decisions, and strategic insights.
Want the full story?
Message me
DISCOVERY
Finding the breakdowns hurting ML model tuning

Compliance officers struggled to fine-tune ML models.


Users couldn’t add new alert signals. Enabling phrase input would improve model accuracy and adaptability.
USABILITY TESTING
Testing solutions to enhance ML training and workflows.
With my research partner, I tested the top 3 approaches our team had prioritized:
A binary feedback system to speed up ML model risk alert review.
We tested a thumbs-up/down approach, which aimed to simplify feedback while keeping users focused on flagged content.
A side panel to assign new risk signals directly to ML model scenarios.
We assessed how users assigned missed risk content to scenarios. This helped refine ML models by training them with accurate inputs.
A manager analytics dashboard to track ML health and feedback contributors.
I refined the dashboard based on available backend data. We wanted to assess how managers used it to track model health and feedback contributors.
