10 Case Studies — What "Good" Looks Like

These case studies serve as anchor examples for the AI judge and as targets for candidates. Each shows:

The 4-context input
The good composed message
The compulsion levers used
The dimension-by-dimension score (out of 50)

Two cases per category × 5 categories = 10 case studies. Mix of merchant-facing and customer-facing scopes.
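The five rubric dimensions and the 50-point total used throughout the case studies can be captured in a small data structure. This is a minimal Python sketch: the `Scorecard` class and its field names are assumptions made for illustration, not the judge's actual implementation.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Scorecard:
    """Hypothetical container for one case study's scores (each 0-10)."""
    specificity: int
    category_fit: int
    merchant_fit: int
    trigger_relevance: int
    engagement_compulsion: int

    def __post_init__(self) -> None:
        # Reject any dimension outside the 0-10 range used in this document.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0 <= value <= 10:
                raise ValueError(f"{f.name} must be 0-10, got {value}")

    def total(self) -> int:
        """Sum of the five dimensions, out of 50."""
        return sum(getattr(self, f.name) for f in fields(self))


# Case Study 4 (Salons / Curious Ask) as scored in this document.
case_4 = Scorecard(8, 9, 8, 10, 9)
print(case_4.total())  # 44
```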

Case Study 1 — Dentists / Research Digest (merchant-facing)

Category: dentists
Merchant: Dr. Meera's Dental Clinic, Lajpat Nagar Delhi, CTR 2.1% (below peer 3.0%), 124 high-risk adult patients in roster
Trigger: research_digest (JIDA Oct 2026 paper: 3-month fluoride recall vs 6-month, 38% caries reduction in high-risk adults)
Customer: none

Composed message (sent as Vera)

Dr. Meera, JIDA's Oct issue landed. One item relevant to your high-risk adult
patients — 2,100-patient trial showed 3-month fluoride recall cuts caries
recurrence 38% better than 6-month. Worth a look (2-min abstract). Want me
to pull it + draft a patient-ed WhatsApp you can share?  — JIDA Oct 2026 p.14

Compulsion levers: source citation (credibility), merchant-specific anchor (her high-risk-adult cohort), reciprocity ("I'll pull it for you"), low-friction CTA, specificity (2,100 / 38% / page number).

Score breakdown (50/50 = perfect)

Dimension	Score	Why
Specificity	10	Numbers + source + page reference
Category fit	10	Clinical vocabulary, peer tone, no overclaim
Merchant fit	10	"your high-risk adult patients" derived from her customer_aggregate
Trigger relevance	10	Explicitly references the digest as the reason for messaging
Engagement compulsion	10	Curiosity + reciprocity + clear next step

Case Study 2 — Dentists / Recall Reminder (customer-facing)

Category: dentists (customer-facing voice rules: no medical claims)
Merchant: Dr. Meera's Dental Clinic, active offer "Dental Cleaning @ ₹299", available slots Wed 5 Nov 6pm + Thu 6 Nov 5pm
Trigger: recall_due (Priya's 6-month cleaning recall window opened)
Customer: Priya, lapsed_soft (5mo since last visit), prefers weekday evenings, hi-en mix

Composed message (sent as merchant_on_behalf, from Dr. Meera's WA number)

Hi Priya, Dr. Meera's clinic here 🦷 It's been 5 months since your last visit
— your 6-month cleaning recall is due. Apke liye 2 slots ready hain: Wed 5
Nov, 6pm ya Thu 6 Nov, 5pm. ₹299 cleaning + complimentary fluoride. Reply 1
for Wed, 2 for Thu, or tell us a time that works.

Compulsion levers: name personalization, language-pref match (hi-en mix), specific dates + price + free-add ("complimentary fluoride"), low-friction multi-choice CTA, dental emoji for warmth.

Score breakdown (49/50)

Dimension	Score	Why
Specificity	10	Real slots + real price + concrete recall window
Category fit	10	No medical claims; warm-clinical tone
Merchant fit	10	Real offer from catalog; merchant's actual slots
Trigger relevance	10	Explicit 6-month recall framing
Engagement compulsion	9	Strong but Reply 1/2 multi-choice is borderline (binary preferred); the "or tell us a time that works" rescues it
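The "Reply 1 for Wed, 2 for Thu, or tell us a time that works" CTA implies a reply handler with a free-text fallback. The sketch below shows one way that could look in Python; `route_reply` and its return shape are hypothetical and not taken from the Vera codebase.

```python
def route_reply(text: str, slots: dict[str, str]) -> tuple[str, str]:
    """Map a customer reply to ('book', slot) or ('free_text', reply)."""
    choice = text.strip()
    if choice in slots:
        # A numbered choice books the matching slot directly.
        return ("book", slots[choice])
    # Anything else is treated as a proposed time for a human or bot follow-up.
    return ("free_text", choice)


# Slots as offered in the composed message above.
slots = {"1": "Wed 5 Nov, 6pm", "2": "Thu 6 Nov, 5pm"}
print(route_reply("2", slots))
```

The fallback branch is what keeps the multi-choice CTA low-friction: a customer who cannot make either slot still has a valid reply.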

Case Study 3 — Salons / Bridal Follow-up (customer-facing)

Category: salons
Merchant: Studio11 Family Salon, Kapra Hyderabad, calls +20% w/w, retention 62%, peer-leading
Trigger: bridal_followup (Kavya, bride-to-be, did a Bridal Trial 5 weeks ago and is now in the skin-prep window)
Customer: Kavya, new state, wedding date 2026-11-08, opted in for bridal followup

Composed message (sent as merchant_on_behalf)

Hi Kavya 💍 Lakshmi from Studio11 Kapra here. 196 days to your wedding — perfect
window to start the 30-day skin-prep program before serious bridal bookings
roll in. ₹2,499 covers 4 sessions + a take-home kit. Want me to block your
preferred Saturday 4pm slot for the first session next week?

Compulsion levers: relationship continuity (follow-up from her Bridal Trial), wedding-date specificity, urgency framing (window opens), preference honored (Saturday 4pm), single binary commit.

Score breakdown (47/50)

Dimension	Score	Why
Specificity	10	Days-to-wedding count, program structure, price
Category fit	9	Warm-practical tone; emojis used appropriately
Merchant fit	9	References Lakshmi (owner first name) + Studio11 + her trial
Trigger relevance	10	Explicitly the bridal followup window
Engagement compulsion	9	Strong but assumes the program exists in catalog (verify in MerchantContext.offers)
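A days-to-event figure such as "196 days to your wedding" should be computed from the context, not typed by hand. The following is a minimal sketch; the send date is an assumption chosen for illustration, since the case does not state when the message goes out.

```python
from datetime import date


def days_until(event: date, today: date) -> int:
    """Whole days from `today` to `event`; negative if the event has passed."""
    return (event - today).days


wedding = date(2026, 11, 8)             # wedding date from the customer context
assumed_send_date = date(2026, 4, 26)   # assumption: not stated in the case
print(days_until(wedding, assumed_send_date))  # 196
```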

Case Study 4 — Salons / Curious Ask (merchant-facing)

Category: salons
Merchant: Studio11 Family Salon, growing strongly, last Vera touch 3 days ago
Trigger: curious_ask_due (weekly "what's in demand?" cadence)
Customer: none

Composed message (sent as Vera)

Hi Lakshmi! Quick check — what service has been most asked-for this week
at Studio11? I'll turn the answer into a Google post + a 4-line WhatsApp
reply you can use when customers ask about pricing. Takes 5 min.

Compulsion levers: low-stakes question (no commitment), reciprocity offered up-front (Google post + reply draft), effort externalization (5 min), respects merchant time.

Score breakdown (44/50)

Dimension	Score	Why
Specificity	8	"5 min" anchor; could be sharper with a guess ("Is it the keratin treatment?")
Category fit	9	Warm tone, fellow-operator register
Merchant fit	8	Uses owner first name; could reference recent service mix
Trigger relevance	10	This IS the curious-ask family — exactly the right shape
Engagement compulsion	9	Asking-the-merchant lever is the strongest Cialdini hook for engaged merchants

Case Study 5 — Restaurants / IPL Match Day (merchant-facing)

Category: restaurants
Merchant: SK Pizza Junction, Sant Nagar Delhi, trial-tier subscriber, BOGO pizza Tue-Thu active
Trigger: ipl_match_today (DC vs MI at Arun Jaitley Stadium, 7:30pm match today; Saturday, not a weeknight)
Customer: none

Composed message (sent as Vera)

Quick heads-up Suresh — DC vs MI at Arun Jaitley tonight, 7:30pm. Important:
Saturday IPL matches usually shift -12% restaurant covers (people watch at
home). Skip the match-night promo today; instead push your BOGO pizza
(already active) as a delivery-only Saturday special. Want me to draft the
Swiggy banner + an Insta story? Live in 10 min.

Compulsion levers: counter-intuitive specific data ("Saturday IPL = -12% covers"), saves merchant from a bad decision, leverages existing offer, concrete deliverables (Swiggy banner + Insta story), 10-min commitment cap.

Score breakdown (50/50)

Dimension	Score	Why
Specificity	10	Match details, percentage delta, time-bound
Category fit	10	Operator-to-operator voice ("covers", "match-night")
Merchant fit	10	Existing BOGO referenced; Swiggy/Insta channels appropriate
Trigger relevance	10	Direct response to IPL trigger with informed nuance
Engagement compulsion	10	Loss aversion (-12%) + existing-offer leverage + 10-min effort cap

This case is illustrative because it shows the bot adding value beyond the trigger itself — the trigger said "IPL match today"; the bot interpreted it correctly and gave a contrarian recommendation.

Case Study 6 — Restaurants / Active Planning Intent (merchant-facing)

Category: restaurants
Merchant: Mylari South Indian Cafe, Indiranagar Bangalore, high-volume, weekday lunch thali doing 18 orders/day
Trigger: active_planning_intent (merchant asked about a corporate-bulk thali package; explicitly said "yes good idea, what would it look like")
Customer: none

Composed message (sent as Vera, in direct conversation)

Suresh, here's a starter version — you can edit:

Mylari Corporate Thali — for offices in Indiranagar
- 10 thalis @ ₹125 each (₹25 off retail) + free delivery
- 25 thalis @ ₹115 each + 2 free filter coffees
- 50+: ₹105 each + 1 free dosa platter
- WhatsApp the day-before by 5pm; we deliver between 12:30-1pm

3 offices in Indiranagar are in your delivery radius (Embassy Tech, RMZ Eco,
Sigma Soft). Want me to draft a 3-line WhatsApp to send their facilities
managers?

Compulsion levers: complete drafted artifact (no extra work for merchant), tiered pricing matches B2B logic, concrete radius + named buildings, follow-on offer to handle outreach.

Score breakdown (49/50)

Dimension	Score	Why
Specificity	10	Tier prices, radius, building names, time windows
Category fit	10	Operator language ("delivery radius", "facilities managers")
Merchant fit	10	Indiranagar locality; the filter coffee and dosa platter nod to the cafe's South Indian identity
Trigger relevance	10	Direct continuation of merchant's planning intent
Engagement compulsion	9	Strong, but the named buildings must come from MerchantContext; if they are fabricated, the judge will catch it
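The tiered pricing in the drafted package can be expressed as a simple lookup. This sketch assumes the tiers are minimum-quantity thresholds (the message could also be read as fixed pack sizes) and that retail is ₹150, as implied by "₹125 each (₹25 off retail)". The function names are hypothetical.

```python
# (minimum quantity, price per thali in rupees), highest tier first.
TIERS = [(50, 105), (25, 115), (10, 125)]
RETAIL_PRICE = 150  # assumption: implied by "₹125 each (₹25 off retail)"


def price_per_thali(quantity: int) -> int:
    """Return the per-thali price for an order of `quantity` thalis."""
    for min_qty, price in TIERS:
        if quantity >= min_qty:
            return price
    return RETAIL_PRICE  # below the smallest tier, no bulk discount


def order_total(quantity: int) -> int:
    """Total order value in rupees, excluding the free add-ons."""
    return quantity * price_per_thali(quantity)


print(order_total(25))  # 2875
```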

Case Study 7 — Gyms / Seasonal Dip Reframe (merchant-facing)

Category: gyms
Merchant: PowerHouse Fitness, HSR Layout Bangalore, views -30% w/w (April seasonal drop), 245 active members
Trigger: seasonal_perf_dip (expected April-June low; flagged as not-a-problem)
Customer: none

Composed message (sent as Vera)

Karthik, your views are down 30% this week — but I want to flag this is the
normal April-June acquisition lull (every metro gym sees -25 to -35% in this
window). Action: skip ad spend now, save it for Sept-Oct when conversion is
2x. For now, focus retention on your 245 members. Want me to draft a "summer
attendance challenge" to keep them through the dip?

Compulsion levers: anxiety pre-emption (the dip is normal), data anchor (-25 to -35%), reframe as opportunity (save spend), specific member count, action proposal.

Score breakdown (48/50)

Dimension	Score	Why
Specificity	10	Percentage range, months, member count
Category fit	10	Coach-to-operator tone; uses "ad spend", "conversion" correctly
Merchant fit	10	His exact dip number + his member count + HSR locality (peer benchmark)
Trigger relevance	10	Directly addresses the seasonal dip with reframe
Engagement compulsion	8	Could be sharper — "save it for Sept-Oct" is a bit hand-wavy without numbers
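The reframe above depends on recognizing that a -30% week falls inside the expected seasonal range. A rough sketch of such a check follows; the band and months come from the message and trigger, while the function name and the idea of a hard-coded band are assumptions for illustration.

```python
from datetime import date

SEASONAL_MONTHS = {4, 5, 6}      # April-June lull named in the trigger
EXPECTED_BAND = (-35.0, -25.0)   # week-over-week % range quoted in the message


def is_expected_seasonal_dip(change_pct: float, on: date) -> bool:
    """True if the dip falls inside the quoted band during the lull months."""
    low, high = EXPECTED_BAND
    return on.month in SEASONAL_MONTHS and low <= change_pct <= high


print(is_expected_seasonal_dip(-30.0, date(2026, 4, 26)))  # True
```

A dip outside the band, or the same dip outside April-June, would call for a different message than the reassurance shown in this case.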

Case Study 8 — Gyms / Customer Lapse Winback (customer-facing)

Category: gyms (customer-facing voice: warm, no-shame, no guilt-trip)
Merchant: PowerHouse Fitness, first month @ ₹499 offer active
Trigger: customer_lapsed_hard (Rashmi, 57 days since last visit, previous focus was weight loss, was a 5-month member)
Customer: Rashmi

Composed message (sent as merchant_on_behalf)

Hi Rashmi 👋 Karthik from PowerHouse here. It's been about 8 weeks — happens
to most members at some point, no judgment. We've added a Tue/Thu evening
HIIT class that fits weight-loss goals well (45 min, 6:30pm). Want me to
hold a free trial spot for you next Tue, 30 Apr? Reply YES — no commitment,
no auto-charge.

Compulsion levers: name + warmth, no-shame framing, addresses her past goal (weight loss), specific new offering matches goal, no-commitment trial, single binary CTA.

Score breakdown (50/50)

Dimension	Score	Why
Specificity	10	Class name, time, day, date, duration
Category fit	10	Coach voice, no shame, evidence-based
Merchant fit	10	Karthik (owner first name), PowerHouse, real new class
Trigger relevance	10	Explicitly addresses lapse without being needy
Engagement compulsion	10	"no judgment" + "no commitment, no auto-charge" removes 2 common barriers in one line

Case Study 9 — Pharmacies / Compliance Alert (merchant-facing)

Category: pharmacies
Merchant: Apollo Health Plus Pharmacy, Malviya Nagar Jaipur, engaged, has 240 chronic-Rx customers
Trigger: supply_alert (voluntary recall on atorvastatin batches AT2024-1102 + AT2024-1108 by Mfr Z)
Customer: none

Composed message (sent as Vera)

Ramesh, urgent: voluntary recall on 2 atorvastatin batches (AT2024-1102,
AT2024-1108) by Mfr Z — sub-potency, no safety risk, but customers should be
informed for replacement. Pulled your repeat-Rx list: 22 of your chronic-Rx
customers were dispensed these batches in last 90 days. Want me to draft
their WhatsApp note + the replacement-pickup workflow?

Compulsion levers: urgency + specificity (batch numbers), risk-but-bounded framing, derived count from merchant data (22 of 240), end-to-end workflow offer (note + pickup).

Score breakdown (50/50)

Dimension	Score	Why
Specificity	10	Batch numbers, manufacturer, count of affected customers
Category fit	10	Trustworthy-precise tone; correct use of "sub-potency" without alarming
Merchant fit	10	Pulls from his customer aggregate; uses his name
Trigger relevance	10	Direct compliance action
Engagement compulsion	10	Urgency + clear offer of complete artifact
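The "22 of your chronic-Rx customers" figure is a derived count, so it helps to see how such a number could be computed instead of invented. The sketch below assumes a hypothetical `Dispense` record shape; the real customer aggregate schema is not shown in this document.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Dispense:
    """Hypothetical dispensing record; field names are assumptions."""
    customer_id: str
    batch: str
    dispensed_on: date


RECALLED = {"AT2024-1102", "AT2024-1108"}  # batch numbers from the trigger


def affected_customers(records: list[Dispense], today: date,
                       window_days: int = 90) -> set[str]:
    """Distinct customers dispensed a recalled batch within the window."""
    cutoff = today - timedelta(days=window_days)
    return {
        r.customer_id
        for r in records
        if r.batch in RECALLED and cutoff <= r.dispensed_on <= today
    }
```

Returning a set of customer IDs, not a raw record count, avoids double-counting a customer who received both batches.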

Case Study 10 — Pharmacies / Chronic Refill Reminder (customer-facing)

Category: pharmacies (customer-facing voice: trustworthy, precise, respectful of senior)
Merchant: Apollo Health Plus Pharmacy, free home delivery >₹499, senior 15% off active
Trigger: chronic_refill_due (Mr. Sharma's metformin/atorvastatin/telmisartan run out 2026-04-28)
Customer: Mr. Sharma, 65-75 age band, senior citizen, channel via son's WhatsApp, address saved, opted in

Composed message (sent as merchant_on_behalf, addressed to son)

Namaste — Apollo Health Plus Malviya Nagar yahan. Sharma ji ki 3 monthly
medicines (metformin, atorvastatin, telmisartan) 28 April ko khatam hongi.
Same dose, same brand pack ready hai. Senior discount 15% applied — total
₹1,420 (₹240 saved). Free home delivery to saved address by 5pm tomorrow.
Reply CONFIRM to dispatch, or call 9876543210 if any change in dosage.

Compulsion levers: namaste salutation (respectful), full molecule names (precision), specific date, total + savings shown clearly, two-channel option (reply OR call), senior-citizen norms honored.

Score breakdown (49/50)

Dimension	Score	Why
Specificity	10	Three molecule names, exact date, total + savings, time window
Category fit	10	Trustworthy-precise voice; namaste salutation appropriate
Merchant fit	10	Apollo's actual offers (free delivery, senior 15%); Malviya Nagar locality
Trigger relevance	10	Refill due date is the central anchor
Engagement compulsion	9	Strong; could nudge with "stocks may take 24h" if scarcity were real

Cross-case patterns the judge looks for

Reading the 10 cases together, here are the patterns that consistently score 9-10/10:

Source citation when claiming research/compliance — JIDA p.14, DCI circular, batch numbers. No citation = score capped at 7.
Numbers from the contexts, not invented — "22 of your chronic-Rx customers" is computed from the merchant's customer_aggregate; "245 active members" is from MerchantContext directly. Numbers without provenance get scored as fabrication.
Owner/merchant first name when present — Dr. Meera, Suresh, Karthik, Ramesh. Generic "Hi" loses 1 point on merchant fit.
Single most important next step framed as low-friction commitment — "Want me to draft X? Live in 10 min" / "Reply YES — no commitment, no auto-charge". Multi-action asks dilute.
Customer-facing messages honor language preference + relationship state — Hindi-English mix for Priya, namaste for Mr. Sharma's son. Treating every customer the same loses 2 points on customer fit.
Domain-specific vocabulary used correctly — "covers", "AOV", "sub-potency", "fluoride varnish", "ad spend", "conversion". Wrong vocabulary or absent vocabulary signals the bot didn't actually use the CategoryContext.voice.
The bot adds judgment, not just templating — Case Study 5 (IPL) shows the bot recommending not to push the IPL promo on a Saturday. That kind of contrarian, data-informed call is the highest signal of category understanding.
The conversation_id is meaningful — conv_priya_recall_2026_11 is good (decodable, resumable). conv_001 is acceptable. UUIDs without context lose nothing but help nothing.
The rationale field is concise and reflects actual reasoning — judge cross-checks rationale against the message; mismatch = penalty.
No repetition, no fabrication: these are the operational floor. If either appears in the message, every dimension is capped at 5 regardless of quality.
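Two of the rules above are mechanical enough to sketch in code: the 1-point merchant-fit deduction for a generic greeting, and the 5-per-dimension cap for repetition or fabrication. The function below is a hypothetical illustration of how they might combine, not the judge's actual logic.

```python
FLOOR_CAP = 5  # per-dimension cap when the message repeats or fabricates


def apply_penalties(scores: dict[str, int], *, generic_greeting: bool,
                    repeats: bool, fabricates: bool) -> dict[str, int]:
    """Return adjusted scores; the input dict is left unchanged."""
    adjusted = dict(scores)
    if generic_greeting:
        # A generic "Hi" loses 1 point on merchant fit.
        adjusted["merchant_fit"] = max(0, adjusted["merchant_fit"] - 1)
    if repeats or fabricates:
        # Operational floor: every dimension is capped at 5.
        adjusted = {dim: min(value, FLOOR_CAP) for dim, value in adjusted.items()}
    return adjusted


perfect = {
    "specificity": 10,
    "category_fit": 10,
    "merchant_fit": 10,
    "trigger_relevance": 10,
    "engagement_compulsion": 10,
}
capped = apply_penalties(perfect, generic_greeting=False,
                         repeats=False, fabricates=True)
print(sum(capped.values()))  # 25
```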

How the judge uses these cases

For each submission, the judge LLM:

Reads the candidate's composition for the same (category, merchant, trigger, customer) tuple.
Compares against the case-study output above.
Scores each of the 5 dimensions on a 0-10 scale, citing what's better/worse.
Aggregates into the per-test-pair score.

Candidates can review these cases as a north star, but directly copying the body text of a case study counts as plagiarism: the judge runs a similarity check on submissions against the case studies and penalizes near-duplicates.
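The document does not say how the similarity check works. As a rough illustration of the idea, the sketch below uses Python's standard-library `difflib.SequenceMatcher`; the 0.85 threshold and the function names are assumptions, and the real judge may use a different method entirely.

```python
from difflib import SequenceMatcher

NEAR_DUPLICATE_THRESHOLD = 0.85  # assumed value; the real threshold is not documented


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting changes do not hide copying."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized messages."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_near_duplicate(submission: str, case_bodies: list[str],
                      threshold: float = NEAR_DUPLICATE_THRESHOLD) -> bool:
    """True if the submission is too close to any case-study message body."""
    return any(similarity(submission, body) >= threshold for body in case_bodies)
```

Candidates can run a check like this locally against the ten message bodies before submitting, treating a high ratio as a sign to rewrite in their own words.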

The cases are meant to teach the shape of good output: specificity, category fit, merchant fit, trigger relevance, compulsion. Your wording must be your own.