Persona Data Sources
This document lists the canonical source material behind each of the 14 personas in this project. The dataset is not generated from thin air: every persona is anchored either in a real person's published memoirs and interviews or in a well-documented fictional character from a novel, film, or long-running television series. Memory chunks are drawn from these sources; profile fields reflect documented biographical facts.
We do not claim to speak as these individuals. The prompt disclaims it explicitly ("your voice and thoughts are fully your own"). This project is a character-study and voice-exploration exercise, not an impersonation service, not a commercial product, and not a substitute for the individuals' own writing or performance.
Living persons in the roster: Tito Mukhopadhyay, Wendy Mitchell, Gabby Giffords, Jason Becker, Michael J. Fox. Chunks draw only from their published memoirs, blogs, and public record. We avoid invented political positions or private content that would contradict the public record.
Real-person personas
Stephen Hawking
Condition modelled: ALS (mid-stage, long-term survivor)
Primary sources
- Hawking, Stephen. My Brief History. Bantam Books, 2013. (Autobiography.)
- Hawking, Stephen. A Brief History of Time. Bantam Books, 1988. (Biographical passages.)
- Hawking, Jane. Travelling to Infinity: My Life with Stephen. Alma Books, 2007. (First wife's memoir.)
- Ferguson, Kitty. Stephen Hawking: An Unfettered Mind. Palgrave Macmillan, 2012.
- Public interviews and documented speeches (BBC, CNN, TED, Royal Society addresses).
- Documentary: Hawking (dir. Stephen Finnigan, 2013).
- Film basis (biographical, not canonical): The Theory of Everything (dir. James Marsh, 2014).
- Intel ACAT / Words Plus communication-system engineering documentation (public).
Michael J. Fox
Condition modelled: Young-onset Parkinson's disease
Primary sources
- Fox, Michael J. Lucky Man. Hyperion, 2002. (First memoir.)
- Fox, Michael J. Always Looking Up. Hyperion, 2009.
- Fox, Michael J. A Funny Thing Happened on the Way to the Future. Hyperion, 2010.
- Fox, Michael J. No Time Like the Future. Flatiron Books, 2020.
- Public social media (Twitter/X, Instagram — managed with his team).
- The Michael J. Fox Foundation for Parkinson's Research public communications.
- Documentary: Still: A Michael J. Fox Movie (dir. Davis Guggenheim, Apple TV+, 2023).
- Published interviews (Colbert, Sawyer, People magazine, New York Times).
Wendy Mitchell
Condition modelled: Young-onset Alzheimer's disease
Primary sources
- Mitchell, Wendy and Anna Wharton. Somebody I Used to Know. Bloomsbury, 2018.
- Mitchell, Wendy and Anna Wharton. What I Wish People Knew About Dementia. Bloomsbury, 2022.
- Mitchell, Wendy and Anna Wharton. One Last Thing: How to Live With the End in Mind. Bloomsbury, 2023.
- Mitchell, Wendy. Which Me Am I Today? blog (ongoing since 2014). https://whichmeamitoday.wordpress.com/
- DEEP (Dementia Engagement and Empowerment Project) published materials.
- Alzheimer's Society ambassadorial talks (recorded).
- BBC and Guardian interviews post-diagnosis.
Christopher Reeve
Condition modelled: C1-C2 complete spinal cord injury (quadriplegia), ventilator-dependent
Primary sources
- Reeve, Christopher. Still Me. Random House, 1998. (Autobiography.)
- Reeve, Christopher. Nothing Is Impossible: Reflections on a New Life. Random House, 2002.
- Reeve, Dana. Care Packages: Letters to Christopher Reeve from Strangers and Other Friends. Random House, 1999.
- Christopher & Dana Reeve Foundation public record (1996–present).
- Documented testimony before the US Senate on spinal cord research funding.
- 20/20 interview with Barbara Walters (ABC, August 1995).
- New York Times obituary (October 11, 2004).
- Documentary: Super/Man: The Christopher Reeve Story (dir. Ian Bonhôte & Peter Ettedgui, 2024).
Christy Brown
Condition modelled: Cerebral palsy (spastic quadriplegia, adult)
Primary sources
- Brown, Christy. My Left Foot. Martin Secker & Warburg, 1954. (Autobiography.)
- Brown, Christy. Down All the Days. Martin Secker & Warburg, 1970. (Autobiographical novel.)
- Brown, Christy. A Shadow on Summer. Martin Secker & Warburg, 1974.
- Brown, Christy. Poetry collections including Come Softly to My Wake (1971) and Background Music (1973).
- Collis, Robert. The Silver Fleece: An Autobiography. Thomas Nelson & Sons, 1936. (Brown's paediatrician and first advocate.)
- Film: My Left Foot (dir. Jim Sheridan, 1989, screenplay based on Brown's autobiography).
- Irish Times and Guardian obituaries (1981).
Gabby Giffords
Condition modelled: Aphasia and right-side hemiparesis following traumatic brain injury (2011 Tucson shooting)
Primary sources
- Giffords, Gabrielle and Mark Kelly, with Jeffrey Zaslow. Gabby: A Story of Courage and Hope. Scribner, 2011.
- Public speeches post-recovery (State of the Union address 2013; March For Our Lives 2018; DNC addresses; Giffords courage awards).
- Congressional voting record and speeches, AZ-08, 2007–2012.
- Giffords organization (formerly Americans for Responsible Solutions) published advocacy materials.
- Documentary: Gabby Giffords Won't Back Down (dir. Betsy West & Julie Cohen, 2022).
- Twitter/X posts, co-managed account.
- CNN, ABC, CBS, 60 Minutes interviews during and after recovery.
Jason Becker
Condition modelled: ALS (late-stage, long-term AAC user via eye-gaze)
Primary sources
- Documentary: Jason Becker: Not Dead Yet (dir. Jesse Vile, 2012).
- Cacophony. Speed Metal Symphony. Shrapnel Records, 1987. (Becker's pre-ALS work with Marty Friedman.)
- Albums and liner notes: Perpetual Burn (1988), Perspective (1996), Collection (2008), Triumphant Hearts (2018).
- Astley-Brown, Michael. "Jason Becker on his heroes, career regrets & unreleased music." Guitar World, 2023.
- Facebook and Instagram — Jason's active public accounts (co-managed with his family).
- Guitar World and Total Guitar magazine interviews over three decades.
- Public correspondence with Marty Friedman (Cacophony bandmate), Steve Vai, Joe Satriani.
- Warheads and David Lee Roth recording sessions for A Little Ain't Enough (1991) — documented in producer notes and band interviews.
Jean-Dominique Bauby
Condition modelled: Locked-in syndrome
Primary sources
- Bauby, Jean-Dominique. Le scaphandre et le papillon. Robert Laffont, 1997. (Original French.)
- English translation: The Diving Bell and the Butterfly, trans. Jeremy Leggatt. Alfred A. Knopf, 1997.
- French press coverage around the book's publication (Le Monde, Libération, March 1997).
- Obituary coverage (The Guardian, The New York Times, March 1997).
- Documentation of the ESARINTULOMDPCFBVHGJQZYXKW frequency-ordered alphabet method used with amanuensis Claude Mendibil at the Berck-sur-Mer Maritime Hospital.
- Film: Le scaphandre et le papillon (dir. Julian Schnabel, 2007) — based on the memoir.
- Notes: as Bauby died two days after the book's French publication, this memoir is the primary canonical source; no `social_post` chunks are used because Bauby is pre-internet.
Tito Rajarshi Mukhopadhyay
Condition modelled: Non-verbal autism spectrum disorder with apraxia
Primary sources
- Mukhopadhyay, Tito. Beyond the Silence: My Life, The World, and Autism. National Autistic Society, 2000. (Published at age 11.)
- Mukhopadhyay, Tito. The Mind Tree: A Miraculous Child Breaks the Silence of Autism. Arcade Publishing, 2003.
- Mukhopadhyay, Tito. How Can I Talk If My Lips Don't Move? Inside My Autistic Mind. Arcade Publishing, 2008.
- Mukhopadhyay, Tito. Plankton Dreams: What I Learned in Special-Ed. Open Humanities Press, 2015.
- Biklen, Douglas. Autism and the Myth of the Person Alone. New York University Press, 2005. (Includes a dedicated interview-chapter with Tito Mukhopadhyay, pp. 110-116.)
- Mukhopadhyay, Soma. Understanding Autism Through Rapid Prompting Method. Outskirts Press, 2008.
- HALO (Helping Autism through Learning and Outreach), Austin, TX — published materials.
- Feature articles: 60 Minutes segment (CBS, 2003); The Guardian; Douglas Biklen's research publications at Syracuse University.
Fictional-character personas
Abed Nadir
Condition modelled: Autism (verbal, meta-narrative voice; canonically coded, not explicitly diagnosed on the show)
Primary sources
- Community (TV series), created by Dan Harmon. NBC 2009–2014, Yahoo! Screen 2015. 110 episodes, 6 seasons.
- Character played by Danny Pudi throughout the series.
- Episodes drawn on heavily: "Pilot" (S1E1), "Contemporary American Poultry" (S1E21), "Modern Warfare" (S1E23), "A Fistful of Paintballs" / "For A Few Paintballs More" (S2E23–24), "Abed's Uncontrollable Christmas" (S2E11), "Remedial Chaos Theory" (S3E4), "Pillows and Blankets" (S3E14), "Paradigms of Human Memory" (S2E21), "Basic Lupine Urology" (S3E17), "Introduction to Finality" (S3E22), "Inspector Spacetime"-adjacent episodes, "Emotional Consequences of Broadcast Television" (S6E13, finale).
- Published showrunner commentary (Dan Harmon's podcast Harmontown).
- DVD/Blu-ray cast and writer commentaries.
Allie Hamilton Calhoun
Condition modelled: Late-stage Alzheimer's disease (progressive dementia, elderly, Southern US)
Primary sources
- Sparks, Nicholas. The Notebook. Warner Books, 1996. (Novel.)
- Sparks, Nicholas. The Wedding. Warner Books, 2003. (Sequel, narrated by Wilson Lewis, Noah and Allie's son-in-law; also referenced.)
- Film: The Notebook (dir. Nick Cassavetes, 2004). Allie played by Rachel McAdams (young) and Gena Rowlands (elderly).
- Notes: The Notebook has relatively thin canon compared with long memoirs or TV series; Allie's persona deliberately lands at a lower chunk count (~140) rather than being padded.
Forrest Gump
Condition modelled: Intellectual disability (IQ approximately 75, per the film's explicit framing)
Primary sources
- Groom, Winston. Forrest Gump. Doubleday, 1986. (Novel.)
- Groom, Winston. Gump & Co. Pocket Books, 1995. (Sequel.)
- Film: Forrest Gump (dir. Robert Zemeckis, 1994), screenplay by Eric Roth. Forrest played by Tom Hanks.
- Soundtrack and production notes (documented soundtrack choices anchor several chunks).
Walter "Flynn" White Jr.
Condition modelled: Cerebral palsy (spastic, ambulatory with crutches, teen American)
Primary sources
- Breaking Bad (TV series), created by Vince Gilligan. AMC 2008–2013. 62 episodes, 5 seasons.
- Character played by RJ Mitte (who himself has cerebral palsy).
- Episodes referenced: "Pilot" (S1E1), "Gray Matter" (S1E5), "Grilled" (S2E2), "Over" (S2E10), "Phoenix" (S2E12), "ABQ" (S2E13 — SaveWalterWhite.com storyline), "Gliding Over All" (S5E8), "Ozymandias" (S5E14), "Granite State" (S5E15), "Felina" (S5E16, finale).
- Published Gilligan/Moore showrunner commentary and episodic AV Club coverage.
- RJ Mitte interviews about his own CP and the character (People, Esquire, Cerebral Palsy Foundation).
Raymond Babbitt
Condition modelled: Autism with savant syndrome (memory and mathematical savantism)
Primary sources
- Film: Rain Man (dir. Barry Levinson, 1988). Screenplay by Ronald Bass and Barry Morrow. Raymond played by Dustin Hoffman.
- Morrow, Barry. Production notes and interviews on the genesis of the screenplay.
- Peek, Fran. The Real Rain Man, Kim Peek. Harkness Publishing Consultants, 1997. (Background on Kim Peek, the real megasavant who partly inspired Raymond; Peek had FG syndrome, not autism — the film blended traits.)
- Treffert, Darold A. and Daniel D. Christensen. "Inside the Mind of a Savant." Scientific American Mind, vol. 17, no. 3, 2006, pp. 50-55. (Treffert was Rain Man's scientific advisor; the article explicitly discusses Kim Peek as the film's inspiration.)
- Levinson, Barry. Director commentary (DVD release).
- Notes: Rain Man is the thinnest canon of any persona (a single 130-minute film). Raymond's persona deliberately lands below the 200-chunk target (119 chunks) rather than padding with invented events.
Chunk-count provenance
Per-persona chunk totals reflect the depth of available source material, not an arbitrary quota. Personas with rich multi-book canons (Fox, Hawking, Christy Brown) fall between 160 and 210 chunks. Personas with thinner canons (Raymond Babbitt from one film, Walter Jr. as a secondary Breaking Bad character, Allie from one novel plus adaptations) fall between 119 and 141 chunks. Two multi-book personas, Christopher Reeve (121) and Wendy Mitchell (130), also sit in this lower range, as the table shows. This is the intended "no-filler" outcome.
| Persona | Chunks | Canon depth |
|---|---|---|
| Abed Nadir | 207 | 6 seasons of TV (110 episodes) |
| Christy Brown | 202 | Autobiography + novel + film + poetry |
| Jason Becker | 200 | Documentary + 35-year ongoing public record |
| Gabby Giffords | 199 | Memoir + public record 2007–present |
| Tito Mukhopadhyay | 180 | 5 books + ongoing writing |
| Jean-Dominique Bauby | 180 | One memoir (finite by definition) |
| Stephen Hawking | 167 | Autobiography + lifelong public record |
| Michael J. Fox | 160 | 4 memoirs + ongoing public record |
| Forrest Gump | 141 | One film + two novels |
| Allie Calhoun | 140 | One novel + one film |
| Walter Jr. White | 133 | Secondary character across 62 episodes |
| Wendy Mitchell | 130 | 3 books + 10+ years of blog |
| Christopher Reeve | 121 | 2 memoirs + foundation record |
| Raymond Babbitt | 119 | Single film (thinnest canon) |
| Total | 2,279 | |
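The totals above can in principle be recomputed from the persona files. The following is a minimal sketch, assuming each persona is a dict with a `memory_buckets` mapping from bucket name to a list of `{"text", "type"}` chunks, as described under "Reproducing or extending the dataset". The function name `count_chunks` and the sample data are hypothetical and are not part of the project code.

```python
from collections import Counter

def count_chunks(persona: dict) -> Counter:
    """Count chunks per type across all memory buckets of one persona."""
    counts = Counter()
    for chunks in persona.get("memory_buckets", {}).values():
        for chunk in chunks:
            counts[chunk["type"]] += 1
    return counts

# Hypothetical miniature persona, for illustration only.
sample = {
    "memory_buckets": {
        "family": [{"text": "...", "type": "narrative"}],
        "social": [
            {"text": "...", "type": "chat_log"},
            {"text": "...", "type": "narrative"},
        ],
    }
}

counts = count_chunks(sample)
print(sum(counts.values()), dict(counts))
```

Summing the counter gives the per-persona total; the per-type breakdown is a by-product that may help when checking which personas have no `social_post` chunks.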
Chunk-type provenance
Each memory chunk is tagged as one of three types:
- `narrative`: first-person autobiographical passage, drawn from memoirs, interviews, or in the case of fictional characters, reconstructed from canonical events.
- `social_post`: short public-facing messages. For modern personas (Fox, Becker, Gabby, Wendy, Abed, Walter Jr.), these reflect real social-media voice. For historical or pre-internet personas (Bauby, Christy Brown, Raymond, Forrest, Allie), this category is either omitted or repurposed as handwritten notes, toasts, letters-to-editor excerpts, or similar era-appropriate artefacts. Bauby and Raymond have zero `social_post` chunks; others are tuned per-persona.
- `chat_log`: multi-turn dialogue exchanges, drawn from documented conversations in memoirs, transcribed interviews, or (for fictional characters) canonical scripted scenes. Rendered as `Me:` / `PartnerName:` lines.
The type field is stored in the retrieval index and surfaced to the LLM in the prompt as `[bucket/type]` so that the generation step can distinguish a tweet-style memory from a memoir-style memory from a scripted exchange.
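As an illustration of the `[bucket/type]` labelling, here is a minimal sketch of how a retrieved chunk might be rendered into prompt context. The helper `render_chunk`, the way retrieved chunks are paired with their bucket, and the sample texts are all assumptions for illustration; the project's actual prompt-assembly code may differ.

```python
def render_chunk(bucket: str, chunk: dict) -> str:
    """Prefix a memory chunk with its [bucket/type] label for the prompt."""
    return f"[{bucket}/{chunk['type']}] {chunk['text']}"

# Hypothetical retrieved chunks (bucket, chunk), for illustration only.
retrieved = [
    ("hobbies", {"text": "Placeholder memoir passage.", "type": "narrative"}),
    ("social", {"text": "Me: Hello.\nPartnerName: Hi.", "type": "chat_log"}),
]

context = "\n".join(render_chunk(bucket, chunk) for bucket, chunk in retrieved)
print(context)
```

Keeping the label on the same line as the text means the model sees the memory's register before it reads the content.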
Ethical notes
- No commercial claim. This project is a university coursework / research demo. It is not distributed as a product and makes no claim to represent the individuals named.
- Living persons treated with care. For Tito Mukhopadhyay, Wendy Mitchell, Gabby Giffords, Jason Becker, and Michael J. Fox, chunks draw only from published autobiographical and public-record material. No private or invented content.
- No invented political positions. Gabby Giffords and Michael J. Fox both have explicit public advocacy positions. Chunks reflect their documented positions and do not extrapolate to policy stances outside the public record.
- Fictional characters remain fictional. We do not conflate the actor with the role. Walter Jr. is the character; RJ Mitte is the actor. Dustin Hoffman's Raymond Babbitt is distinct from Kim Peek, the real person who partly inspired the role.
- The AAC-framing disclaimer is embedded in the system prompt on every turn: "You have {condition} and communicate through {access_method}, but your voice and thoughts are fully your own." This acknowledges the constraint while refusing to flatten the persona into their disability.
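The disclaimer template quoted above can be filled with ordinary string formatting. This is a minimal sketch: the constant `DISCLAIMER` and the function `build_disclaimer` are hypothetical names, and how `access_method` is derived from the profile is an assumption (it is not one of the listed profile fields).

```python
# Template text as quoted in the ethical notes.
DISCLAIMER = (
    "You have {condition} and communicate through {access_method}, "
    "but your voice and thoughts are fully your own."
)

def build_disclaimer(condition: str, access_method: str) -> str:
    """Fill the AAC-framing disclaimer for one persona."""
    return DISCLAIMER.format(condition=condition, access_method=access_method)

# Example values are illustrative, not copied from a persona file.
print(build_disclaimer("ALS", "eye-gaze AAC"))
```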
Reproducing or extending the dataset
To add or regenerate a persona:
- Draft a structured-profile JSON following the schema in any existing `data/memories/*.json` file. Required top-level profile fields: `id`, `name`, `age`, `gender`, `cultural_background`, `condition`, `diagnosis_details`, `communication_traits`, `access_needs`, `stylistic_preferences`, `personal_background`.
- Populate `memory_buckets` with all 5 keys (`family`, `medical`, `hobbies`, `daily_routine`, `social`). Each chunk is `{"text": "...", "type": "narrative" | "social_post" | "chat_log"}`.
- Draw chunk content from canonical sources. Do not invent biographical facts not in the public record (for real persons) or not established in canon (for fictional characters).
- Run `python data/generate_users.py` to refresh `data/users.json`.
- Run `python -m backend.retrieval.vector_store` to rebuild the vector index.
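Before running the generation and indexing steps, a drafted persona can be checked against the requirements listed above. The following is a minimal sketch, not the project's own validation: `validate_persona` is a hypothetical helper, and it assumes `memory_buckets` sits at the top level of the JSON alongside the profile fields.

```python
REQUIRED_FIELDS = {
    "id", "name", "age", "gender", "cultural_background", "condition",
    "diagnosis_details", "communication_traits", "access_needs",
    "stylistic_preferences", "personal_background",
}
BUCKETS = {"family", "medical", "hobbies", "daily_routine", "social"}
CHUNK_TYPES = {"narrative", "social_post", "chat_log"}

def validate_persona(persona: dict) -> list[str]:
    """Return a list of problems; an empty list means the draft looks valid."""
    problems = []
    missing = REQUIRED_FIELDS - set(persona)
    if missing:
        problems.append(f"missing profile fields: {sorted(missing)}")
    # Assumption: memory_buckets is a top-level key next to the profile fields.
    buckets = persona.get("memory_buckets", {})
    if set(buckets) != BUCKETS:
        problems.append(f"bucket keys must be exactly {sorted(BUCKETS)}")
    for name, chunks in buckets.items():
        for i, chunk in enumerate(chunks):
            text = chunk.get("text")
            if not isinstance(text, str) or not text.strip():
                problems.append(f"{name}[{i}]: empty or missing text")
            if chunk.get("type") not in CHUNK_TYPES:
                problems.append(f"{name}[{i}]: unknown type {chunk.get('type')!r}")
    return problems

# Hypothetical draft with one deliberately wrong chunk type.
draft = {field: "placeholder" for field in REQUIRED_FIELDS}
draft["memory_buckets"] = {bucket: [] for bucket in BUCKETS}
draft["memory_buckets"]["family"].append({"text": "Placeholder.", "type": "tweet"})
print(validate_persona(draft))  # ["family[0]: unknown type 'tweet'"]
```

In practice the dict would come from `json.load` on a file under `data/memories/`; the check only covers structure and cannot verify that chunk content is grounded in the canonical sources.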
The embedding model is `BAAI/bge-small-en-v1.5`. Switching to a larger model requires only changing `embed_model` in `backend/config/settings.py` and rebuilding the indexes.