A 10-14 week installation of Teresa Torres's Continuous Discovery Habits methodology, layered with T2D3 proprietary IP, for $5-30M ARR B2B SaaS teams stuck in the feature-factory pattern. Installs product trios, a weekly customer interview cadence, Opportunity Solution Trees (OSTs), assumption testing across the five risk categories, a research repository with tagging governance and traceability, and the Pain-Claim-Gain stakeholder operating mechanism. Reference methodology: https://www.producttalk.org/. Total scope: 7 modules, 28 sections, 62 tasks, 292 hours of canonical effort. Continues as an ongoing operating layer after installation.
Establish the smallest viable decision-making units (trios), pick a single product outcome per trio, and translate that outcome into an OKR. This module is the load-bearing first 2 weeks of the installation. OKR design captured per task: O1 (Become outcome-driven, KRs: interview cadence, OST coverage, assumption tests/opportunity, cycle time, repo coverage) and O2 (Improve product outcomes, KRs: activation lift, feature adoption, invalidation rate).
Run the executive pre-mortem on the feature-factory pattern and audit the last 5 features for outcome traceability. Establishes the leadership commitment that downstream cycle-time SLAs depend on.
Run pre-mortem on the feature-factory pattern with executive team
Run a 90-minute executive-level pre-mortem session to surface and document the feature-factory pattern in the current organization. Use Torres's framing that 'the culture is overwhelmingly solution-focused. Roadmaps are built around features, meetings are about prioritizing ideas, and success is measured by output' (https://www.producttalk.org/) as the diagnostic prompt. Capture specific examples of features shipped that did not move outcomes, identify the leadership behaviors that perpetuate the pattern, and produce a one-page feature-factory diagnosis artifact. This task is a leadership-alignment dependency for the entire playbook - without explicit executive acknowledgment, downstream cycle-time SLAs do not survive sprint-level pressure.
Audit current discovery practice and feature-to-outcome traceability
Audit the last 5 features shipped in the past 2 quarters: for each feature, document (a) the customer evidence that justified it, (b) whether it was tied to a documented opportunity, (c) the assumptions tested before build, (d) the outcome lift it produced. This baseline is required to set realistic OKR targets and to demonstrate progress against KR1.2 (>=80% of features tied to OST opportunity). Reference Torres's anti-pattern guidance on cycle-time vs counting metrics (https://www.producttalk.org/2020/06/measure-discovery/) - the audit should report cycle time between activities, not interview counts. The output is the baseline scorecard the playbook is measured against.
Define the trio roster (PM + Designer + Engineer per outcome) using the T2D3 Syntropy capacity model and document trio rules of engagement (decision rights, conflict resolution, working cadence).
Define trio roster using the Syntropy capacity model
Define the trio roster (PM + Designer + Engineer per outcome) using the T2D3 Syntropy capacity model - sub-$5M ARR: founder-led blended trio; $5-30M: dedicated trios per outcome; $30M+: trios + research-ops shared service. Apply Torres's smallest-team principle (https://www.producttalk.org/product-trios/): 'the bigger your decision-making team gets, the slower you will move.' Quad only if domain expertise (data scientist for ML, security engineer for sensitive data) is structurally required. Output: named trio roster with one product outcome per trio and the engineer-onboarding-ladder commitments.
Document trio rules of engagement and conflict-resolution playbook
Document trio rules of engagement covering decision rights (each member shares accountability across all 5 risk areas: desirability/viability/feasibility/usability/ethical per https://www.producttalk.org/2022/02/responsibility-in-a-product-trio/), conflict resolution (explore perspectives -> expertise check -> assumption test for high-stakes splits), working cadence (weekly trio working session, daily 15-min check-in), and the engineer-onboarding ladder. Surface and pre-empt the silo-relapse dysfunction where engineers default to feasibility-only and PMs default to roadmap-only.
Translate the company-level business outcome into a trio-level product outcome and formalize O1/O2 plus their 8 KRs in the OKR system. Discovery cadence KRs become leading indicators.
Translate business outcome to product outcome (T2D3 worksheet)
Translate the company-level business outcome (revenue, retention) to a trio-level product outcome (changes in customer behavior). Per Torres (https://www.producttalk.org/okrs-vs-outcomes/), product outcomes are leading indicators within trio influence. Use the T2D3 outcome-translation worksheet to walk from business outcome -> product outcome -> input metrics -> behaviors. Output: one product outcome per trio with a 12-18 month time horizon. The translation has to land on a metric the trio can directly influence within a quarter, not a downstream business metric they can only nudge.
Formalize O1/O2 + KRs in the OKR system
Formalize O1 (Become outcome-driven) and O2 (Improve product outcomes) as the playbook's anchor OKRs. Lock in 5 KRs under O1 as discovery-cadence leading indicators: interview cadence (KR1.1: 100% of trios interviewing >=1/wk for 12 wks), OST coverage (KR1.2: >=80% of features tied to an OST opportunity), assumption tests per opportunity (KR1.3: >=3 average), cycle time (KR1.4: <=14 days), and repo coverage (KR1.5: >=90%). Add 3 KRs under O2: activation lift +15%, feature adoption +25%, invalidation rate 30-50%. Mirror Torres's cycle-time emphasis (https://www.producttalk.org/2020/06/measure-discovery/) by anchoring KR1.4 on max days between activities, not counts.
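The O1 targets can be encoded as a small scoreboard check. The following is a minimal sketch, not a prescribed implementation: the thresholds come from the text, while the `KeyResult` type, the `scoreboard` function, and the shape of the measured data are hypothetical. It simplifies KR1.1 to a single percentage and ignores the 12-week window. O2 is left out because the text does not assign KR identifiers to it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class KeyResult:
    key: str                         # KR identifier as used in the text
    label: str                       # what the measured number represents
    is_met: Callable[[float], bool]  # target check taken from the text


# O1 targets as stated in the playbook. The data shape is an assumption.
O1_KRS: List[KeyResult] = [
    KeyResult("KR1.1", "trios interviewing >=1/wk (%)", lambda v: v >= 100),
    KeyResult("KR1.2", "features tied to an OST opportunity (%)", lambda v: v >= 80),
    KeyResult("KR1.3", "average assumption tests per opportunity", lambda v: v >= 3),
    KeyResult("KR1.4", "max days between discovery activities", lambda v: v <= 14),
    KeyResult("KR1.5", "repository coverage (%)", lambda v: v >= 90),
]


def scoreboard(measured: Dict[str, float]) -> Dict[str, bool]:
    """Return met / not met per KR. A KR with no measurement counts as not met."""
    return {
        kr.key: kr.key in measured and kr.is_met(measured[kr.key])
        for kr in O1_KRS
    }
```

Treating a missing measurement as "not met" is a deliberate choice in this sketch: a KR nobody is measuring should show up as a gap on the scoreboard rather than silently pass.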
Build the Pain-Claim-Gain executive pitch deck and the stakeholder influence/impact matrix with per-stakeholder communication SLAs. Locks in HIPPO-intervention response protocol.
Build Pain-Claim-Gain executive pitch deck
Build the Pain-Claim-Gain executive pitch deck (T2D3 IP). Pain = current outcome gap (use baseline-audit data); Claim = the discovery-habits installation bet; Gain = projected outcome lift backed by case-study benchmarks (Snagajob: weekly cadence, MVP launches without mid-sprint guidance per https://www.producttalk.org/2018/03/continuous-discovery-case-study/). Structure: 12 slides max, ends with the executive cadence commitment and the discovery KR scoreboard. The deck is the operating mechanism that defends discovery cadence under sprint pressure for the next 12 months.
Build stakeholder influence/impact map and per-stakeholder communication SLAs
Build stakeholder influence/impact matrix and per-stakeholder communication SLAs (executive monthly Pain-Claim-Gain update; mid-management bi-weekly OST review; engineering weekly digest). Surface the HIPPO-intervention risk and define the response protocol (Pain-Claim-Gain narrative + pre-mortem on every executive feature ask). Reference https://www.producttalk.org/ for the high-influence-stakeholder pattern. The map is the operating contract that prevents stakeholder churn from derailing the cadence.
T2D3 proprietary IP layer. The Torres methodology assumes interviewees show up; in practice, recruiting is the #1 reason continuous discovery fails. This module installs the recruiting machine: in-product prompts, CSM-referral SOP, panel-vendor protocol, scheduling automation, incentives policy, and GDPR/CCPA-compliant consent. Highest-leverage T2D3 IP layer in the playbook.
Stand up the four recruiting channels: in-product prompt (Pendo/Appcues/Sprig), CSM-referral SOP, panel-vendor contract (User Interviews / Respondent / Userlytics), and customer advisory community.
Build in-product recruit prompt (Pendo / Appcues / Sprig)
Stand up an in-product recruit prompt using Pendo, Appcues, or Sprig (https://docs.appcues.com/en_US/use-cases/use-cases-for-in-app-user-research-recruitment) that surfaces an interstitial after a target behavior (e.g., 3rd login, completed core flow). Pattern: 20-min ask, clear value statement, screener question, Calendly link. Echo the Snagajob example ($20 for 20 minutes, B2C hourly worker context per https://www.producttalk.org/2018/03/continuous-discovery-case-study/). Lock in frequency capping to avoid prompt fatigue. This channel is the bottom-of-funnel recruiting that survives CSM-referral droughts and panel-vendor cost spikes.
Build CSM-referral SOP with weekly quotas and warm-intro template
Build the customer-success referral SOP: weekly quota per CSM (1 referral / CSM / week), warm-intro email template, call-out triggers (NPS detractor, expansion candidate, churn risk), and feedback loop (CSM hears the resulting OST update). Address Torres's note (https://www.producttalk.org/2022/12/customer-interviews/) on overcoming sales/CS resistance by starting with personal-network warm intros and broadcasting learnings back. CSM referrals are the highest-conversion channel - 60-70% of warm intros convert to a scheduled interview, vs 10-15% for in-product prompts.
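The conversion figures above translate directly into how many invitations each channel needs to sustain the weekly quota. The sketch below shows the arithmetic only. The function name `invites_needed` and the optional `no_show` parameter are illustrative assumptions, and it treats conversion and no-show as independent rates, which real funnels may not satisfy.

```python
import math


def invites_needed(target_interviews: int, conversion: float, no_show: float = 0.0) -> int:
    """Invitations to send so that expected completed interviews reach the target.

    conversion: share of invitations that become a scheduled interview.
    no_show:    share of scheduled interviews that do not take place.
    """
    if not 0 < conversion <= 1:
        raise ValueError("conversion must be in (0, 1]")
    if not 0 <= no_show < 1:
        raise ValueError("no_show must be in [0, 1)")
    expected_per_invite = conversion * (1 - no_show)
    # Round up: a fraction of an invitation cannot be sent.
    return math.ceil(target_interviews / expected_per_invite)
```

Using the low end of each stated range gives a conservative plan: a warm-intro channel converting at 60% with a 20% no-show rate needs far fewer invitations per completed interview than an in-product prompt converting at 10% with a 35% no-show rate.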
Negotiate panel-vendor contract for top-up recruiting
Negotiate panel-vendor contract (User Interviews, Respondent, Userlytics) for top-up recruiting when in-product + CSM channels under-deliver. Define screener fidelity, incentive pass-through, GDPR contracts, and per-quarter spend caps. Vendor selection criteria: response SLA (<48h), audience fit, incentive flexibility, integrations with Calendly/Zoom. Reference https://www.userinterviews.com/blog/the-user-researchers-guide-to-gdpr for the data-protection requirements that the contract has to cover. Panel vendors are the third-tier recruiting layer behind in-product and CSM-referrals.
Stand up customer advisory community for repeated lightweight research
Stand up a customer advisory community / panel of 30-50 opt-in members for repeated lightweight research. Run on Slack, Circle, or Discourse with a quarterly perks/access cadence (early-access features, executive AMA, exclusive webinars). This is the long-tail recruiting layer that survives in-product prompt fatigue and panel-vendor cost spikes. Per Torres (https://www.producttalk.org/2022/12/customer-interviews/), a small opted-in community delivers higher-quality interviews than ad-hoc cold recruiting because rapport accelerates story extraction.
Define the B2C cash/credit/swag and B2B non-cash incentives ladder, and lock finance/procurement sign-off on the disbursement mechanism (Tremendous/Rybbon/direct payouts).
Define B2C incentives ladder and B2B non-cash menu
Define the incentives ladder: B2C cash/credit/swag (Snagajob's $20/20min as the floor benchmark per https://www.userinterviews.com/blog/how-to-interview-customers-continuously-with-teresa-torres-of-product-talk); B2B non-cash menu (exclusive webinar invites, premium support tier, priority roadmap access). Per Torres, default to 'no-incentive first' - a small time ask + clear customer benefit + explicit use-of-time framing often clears the bar without cash. The ladder sets one floor and one ceiling so individual interviewers do not over- or under-incentivize and skew recruitment funnels.
Get finance/procurement sign-off on incentive disbursement mechanism
Get finance/procurement sign-off on the incentive disbursement mechanism. Recommended: Tremendous or Rybbon for digital-gift-card delivery; direct payouts via vendor for cash; Stripe/PayPal for B2C credit. Lock in tax reporting (1099 thresholds in the US, equivalent abroad per https://www.userinterviews.com/blog/the-user-researchers-guide-to-gdpr) and the per-quarter spend ceiling. Without finance sign-off, every interview blocks on a one-off PO - the recruiting pipeline will not survive that friction past month 2.
Stand up Calendly + auto-Zoom-recording + transcript pipeline (Otter/Grain/Fathom) and define the no-show recovery flow with per-channel benchmarks.
Stand up Calendly + auto-Zoom-recording + transcript pipeline
Stand up Calendly + auto-Zoom-recording + transcript pipeline (Otter, Grain, or Fathom). Configure trio-shared availability, 20-min default slots, automated reminder cadence, and post-interview transcript drop into the research repository. Aim for zero-touch interview operations - the trio shows up, conducts, and the artifact pipeline runs itself. Reference https://www.userinterviews.com/blog/how-to-interview-customers-continuously-with-teresa-torres-of-product-talk for the time-budget rationale (recurring interviews demand zero scheduling overhead per session).
Define no-show recovery flow with per-channel benchmarks
Define the no-show recovery flow: auto-reschedule offer within 5 minutes; replacement-recruit threshold at 50% no-show rate; per-channel no-show benchmarks (in-product: 25-35%; CSM-referral: 10-20%; panel: 15-25%). Wire to an alert if the trio is below cadence for a given week. Without a no-show recovery flow, weekly cadence collapses every time 1 of 3 interviews bails - and that is the #1 way the cadence dies in week 4. Reference https://www.userinterviews.com/ux-research-field-guide-chapter/ongoing-customer-research for the per-channel benchmarks.
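The thresholds in this flow can be expressed as a small classifier. This is a sketch under stated assumptions: the channel keys, the status labels, and both function names are hypothetical. It treats "at 50%" as inclusive and uses only the top of each benchmark range as the alert line.

```python
from typing import Dict, Tuple

# Per-channel no-show benchmark ranges (low, high) as stated in the text.
NO_SHOW_BENCHMARKS: Dict[str, Tuple[float, float]] = {
    "in_product": (0.25, 0.35),
    "csm_referral": (0.10, 0.20),
    "panel": (0.15, 0.25),
}
REPLACEMENT_THRESHOLD = 0.50  # replacement-recruit trigger from the text


def no_show_status(channel: str, scheduled: int, no_shows: int) -> str:
    """Classify a channel's observed no-show rate against its benchmark."""
    if scheduled <= 0:
        raise ValueError("scheduled must be positive")
    if not 0 <= no_shows <= scheduled:
        raise ValueError("no_shows must be between 0 and scheduled")
    rate = no_shows / scheduled
    _, high = NO_SHOW_BENCHMARKS[channel]
    if rate >= REPLACEMENT_THRESHOLD:  # assumption: the threshold is inclusive
        return "recruit_replacements"
    if rate > high:
        return "above_benchmark"
    return "within_benchmark"


def below_cadence(completed_this_week: int, weekly_quota: int = 1) -> bool:
    """True when the trio should be alerted for missing the weekly quota."""
    return completed_this_week < weekly_quota
```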
Build the GDPR/CCPA-compliant consent template, define PII retention + deletion policy, and lock in the cadence quota: >=1 customer interview per trio per week, exec-sponsored. Includes the EU/GDPR-gated consent flow.
Build GDPR/CCPA-compliant consent template
Build the GDPR/CCPA-compliant consent template covering recording retention (default 18 months), repository inclusion (separate opt-in), withdrawal rights, third-party processors, and PII minimization. Reference Consent Kit (https://consentkit.com/gdpr-for-user-research) and Participant Kit (https://participantkit.com/gdpr-compliance-research-ops). Get legal review before deployment. Without a documented consent template, every interview is legally fragile - and a single GDPR complaint can shut down the recruiting pipeline for the EU customer base.
Define PII retention + deletion policy
Define PII retention + deletion policy: 18-month default for raw recordings, re-consent for repository inclusion, automated deletion job after retention window, named DPO accountability. Map the data flow from interview -> transcript -> repository -> snippet -> opportunity to identify all PII touchpoints. Reference https://www.userinterviews.com/blog/the-user-researchers-guide-to-gdpr for the data-flow obligations. The retention policy is what keeps the repository legal at the 18-month mark when the first set of recordings hits the deletion threshold.
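The automated deletion job needs a rule for when a recording leaves its retention window. A minimal sketch of that date arithmetic follows, using only the standard library. The function names are hypothetical, and clamping to the last day of a shorter month is an assumption that legal review should confirm.

```python
import calendar
from datetime import date

RETENTION_MONTHS = 18  # default retention window for raw recordings, from the text


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def deletion_due(recorded_on: date, today: date, retention_months: int = RETENTION_MONTHS) -> bool:
    """True once the recording has reached the end of its retention window."""
    return today >= add_months(recorded_on, retention_months)
```

A scheduled job could call `deletion_due` for every raw recording and queue the matches for deletion. How withdrawal requests and re-consent interact with that queue is outside this sketch.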
Train the trio in Torres's story-based interview craft. Most teams default to leading questions and feature-validation prompts; this module installs the muscle for 'Tell me about the last time you...' and the discipline to capture every interview as a one-page snapshot artifact.
Create the story-based interview guide template, run the 4-hour trio interviewing workshop with mock-interview replay, and define the engineer onboarding ladder (watch -> observe -> take notes -> suggest follow-ups -> conduct).
Create story-based interview guide template
Create the story-based interview guide template anchored on Torres's 'Tell me about the last time you...' pattern (https://www.producttalk.org/story-based-customer-interviews/). Include the research-question-to-prompt translator that converts what-you-want-to-learn into specific past-behavior prompts. Cover the broad-vs-focused scoping rule ('last time you did something fun' vs 'last time you watched Netflix on the go' per https://www.producttalk.org/2022/04/best-customer-interview-questions/). The guide is the trio's go-to-meeting artifact - never let interviewers improvise without it.
Run 4-hour trio interviewing workshop with mock-interview replay
Run a 4-hour live trio interviewing workshop covering: story-based question craft, follow-up techniques ('what happened first?' / 'then what?'), excavating-the-story drills, common mistakes (closed questions, hypotheticals, summary questions). Include 30 min of mock interviews + replay critique. Reference Erika Hall's interview principles ('open ended, probe, leave silences' per https://www.amazon.com/Just-Enough-Research-Erika-Hall/dp/1952616468) for complementary craft. The workshop is the only place the trio practices before the first real interview - skipping it sets up the silo-relapse pattern.
Define engineer onboarding ladder for interview craft
Define the engineer onboarding ladder per Torres's product-trio guidance (https://www.producttalk.org/product-trios/): watch interview recording -> observe interview live -> take notes -> suggest follow-up questions -> conduct interview. Lock in target weeks 1, 2, 3, 4, 5 for ladder steps. Treat engineer disengagement / fear-of-the-unknown as the #1 silo-relapse risk. The ladder is the explicit ramp that converts engineers from 'feasibility-only' silo participants into full trio members - without it, engineers default back to their old role within 6 weeks.
Adopt the Product Talk one-page interview snapshot template and run the first 4 pilot interviews per trio with full attendance and snapshot completion within 24 hours.
Adopt Product Talk one-page interview snapshot template
Adopt and customize Torres's one-page interview snapshot template (https://www.producttalk.org/2024/02/interview-snapshot/, https://miro.com/miroverse/interview-snapshot-template/). Sections: photo + quick facts, memorable quote, opportunities, insights, experience-map drawing. Customize quick-facts fields for the org (B2B: company size, role, account tier; B2C: persona, lifecycle stage). The snapshot is the interview's atomic artifact - it lives in the repository forever and is the unit of synthesis for the OST.
Run first 4 pilot interviews per trio with full attendance
Run the first 4 pilot interviews per trio with full attendance (PM + Designer + Engineer all present) and a snapshot completed within 24 hours of each interview. Apply Torres's 'scrappy first' mindset (https://www.producttalk.org/2018/03/continuous-discovery-case-study/) - done > polished. As a practitioner benchmark, each trio should surface recurring quick-fact patterns and 2-4 insights per interview. The pilot interviews are the first proof-of-concept that the trio can do this together, and they generate the seed data for the v1 OST in M4.
Build the interview replay rubric (closed-question count, leading-question flags, story-vs-summary ratio) and stand up the weekly trio interview retro. Surfaces craft drift fast at scale.
Build interview replay rubric with red-flag library
Build the interview replay rubric scoring closed-question count, leading-question flags, story-vs-summary ratio, and time-talking-vs-listening (target: participant talking >=75% of the session). Lock in red flags (asking about feature preferences, hypothetical futures, summary questions like 'how often do you usually...') per Torres's bias guidance (https://www.producttalk.org/story-based-customer-interviews/). The rubric is the operating mechanism for craft maintenance - it converts subjective gut-feel into a scored conversation the trio can have weekly.
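The rubric dimensions can be captured as a scored record per replayed interview. The sketch below is illustrative only: the `ReplayScore` type and its field names are hypothetical. Defining the story-vs-summary ratio as the share of story prompts among all story and summary prompts is an assumption, as are the flag rules other than the 75% talk-share target.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReplayScore:
    closed_questions: int
    leading_questions: int
    story_prompts: int        # "tell me about the last time you..." style
    summary_prompts: int      # "how often do you usually..." style
    participant_seconds: float
    interviewer_seconds: float

    def participant_talk_share(self) -> float:
        total = self.participant_seconds + self.interviewer_seconds
        return self.participant_seconds / total if total else 0.0

    def story_ratio(self) -> float:
        # Assumed definition: story prompts as a share of story + summary prompts.
        prompts = self.story_prompts + self.summary_prompts
        return self.story_prompts / prompts if prompts else 0.0

    def red_flags(self) -> List[str]:
        flags = []
        if self.participant_talk_share() < 0.75:  # target from the text
            flags.append("participant talked less than 75% of the session")
        if self.leading_questions > 0:
            flags.append("leading questions present")
        if self.summary_prompts > self.story_prompts:
            flags.append("more summary prompts than story prompts")
        return flags
```

The closed-question count is recorded but not flagged here, since the text gives no threshold for it. A trio could track its trend week over week instead.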
Stand up weekly 30-minute trio interview retro
Stand up the weekly 30-minute trio interview retro: review one randomly-selected interview against the rubric, capture coaching observations, log the cycle-time data point. Treat as the pair-programming equivalent for interview craft - surfaces drift fast, sustains skill at scale. Reference https://www.producttalk.org/2020/06/measure-discovery/ for the cycle-time-emphasis rationale. The weekly retro is the operating mechanism that defends interview quality past month 3 when novelty fades.
Install the OST as the trio's shared mental model and the bridge between discovery and roadmap. Covers tooling (Vistaly / Miro / FigJam), anti-pattern audits (solutions-disguised-as-opportunities, single-tree-per-trio), target-opportunity selection via importance x tractability, and the discipline of compare-and-contrast solution brainstorming.
Select the OST tool (Vistaly / Miro / FigJam) and build the OST template with anti-pattern callouts and sample-tree examples drawn from Snagajob and FCSAmerica.
Select OST tool (Vistaly / Miro / FigJam) and configure trio workspace
Select OST tool. Recommended: Vistaly (purpose-built per https://blog.vistaly.com/posts/metric-goal-guide), Miro with Torres's template (https://miro.com/blog/mapping-product-teams-teresa-torres/), or FigJam. Configure trio-level workspace, naming convention (one tree per trio per outcome), and version-history retention. The tool choice trades off purpose-built node typing (Vistaly) vs general-purpose flexibility (Miro/FigJam) vs price - a small organization can ship with FigJam for free, but Vistaly's structure pays off when 3+ trios run in parallel.
Build OST template with anti-pattern callouts and sample tree library
Build the OST template: Outcome (root) -> Opportunity (children) -> Solution (3+ per target opportunity) -> Assumption Test (per solution). Layer in anti-pattern callouts: solutions-disguised-as-opportunities check, single-tree-per-trio rule, prerequisites checklist (https://www.producttalk.org/opportunity-solution-trees/). Include sample-tree examples drawn from Snagajob (https://www.producttalk.org/2018/03/continuous-discovery-case-study/) and FCSAmerica (https://www.producttalk.org/2019/05/continuous-discovery-is-for-everyone/). The template is the trio's shared mental model - if it does not exist as an artifact, every trio reinvents the OST and inconsistency compounds.
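The template's node hierarchy and two of its structural rules can be sketched as a data structure with a validation pass. The class and function names below are hypothetical, and this is not how Vistaly, Miro, or FigJam store trees. It checks only the "3+ solutions per target opportunity" and "assumption test per solution" rules. The solutions-disguised-as-opportunities check needs human judgment and is not automated here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Solution:
    name: str
    assumption_tests: List[str] = field(default_factory=list)


@dataclass
class Opportunity:
    name: str
    is_target: bool = False
    children: List["Opportunity"] = field(default_factory=list)
    solutions: List[Solution] = field(default_factory=list)


@dataclass
class OutcomeTree:
    outcome: str  # the single root outcome for this trio
    opportunities: List[Opportunity] = field(default_factory=list)


def template_violations(tree: OutcomeTree) -> List[str]:
    """List structural gaps against the template rules described in the text."""
    problems: List[str] = []
    stack = list(tree.opportunities)
    while stack:
        opp = stack.pop()
        stack.extend(opp.children)  # walk nested opportunities as well
        if opp.is_target and len(opp.solutions) < 3:
            problems.append(f"{opp.name}: target opportunity has fewer than 3 solutions")
        for sol in opp.solutions:
            if not sol.assumption_tests:
                problems.append(f"{opp.name} / {sol.name}: no assumption test")
    return problems
```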
Synthesize the first 4 interview snapshots into a v1 opportunity space (8-15 nodes) and run the anti-pattern audit to remove solutions-disguised-as-opportunities.
Synthesize first 4 snapshots into v1 opportunity space
Synthesize the first 4 interview snapshots into a v1 opportunity space. Cluster snippets by theme; lift opportunities (unmet customer need / pain / desire) to OST nodes. Embrace the 'crummy first draft' - Torres explicitly warns against perfectionism in the first OST (https://www.producttalk.org/opportunity-solution-trees/). Target: 8-15 opportunity nodes, hierarchical (parent/child) where natural. This is the moment the trio's interview practice meets its first OST - the synthesis quality bar is low; the bar that matters is shipping a v1 within 1 week of the 4th pilot interview.
Run v1 OST anti-pattern audit
Run the v1 OST anti-pattern audit: (1) for every node, ask 'is there more than one way to address this?' - if no, it's a solution in disguise; (2) is there one outcome at the root or many? - split by segment/journey if many; (3) is the tree scoped to one trio? - split if not. Per https://www.producttalk.org/opportunity-solution-trees/, 'company-wide trees never turn out well.' The audit is the only quality gate the v1 OST gets before becoming the operating tree for the next 6 weeks - skipping it is how trios end up with 60-node solution-trees-disguised-as-OSTs.
Choose one target opportunity using the importance x tractability rubric (effort excluded) and brainstorm a minimum of 3 candidate solutions for compare-and-contrast.
Choose target opportunity using importance x tractability rubric
Choose one target opportunity using the importance x tractability rubric. Importance = expected outcome impact (informed by interview frequency + customer-segment value); tractability = solution feasibility within the cycle. Per Torres (https://www.producttalk.org/opportunity-solution-trees/), do NOT include effort in the opportunity assessment - 'solutions take effort, opportunities don't.' Picking the wrong opportunity dooms the cycle - the trio invests 6 weeks of testing on something that will not move the outcome. The rubric forces the conversation to land on impact and feasibility, not effort.
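One way to make the rubric concrete is to score each opportunity on both axes and compare the products. This is a sketch with assumed details: the 1-5 scale, the multiplication, and the function name `pick_target_opportunity` are not specified by the source, which only names the two dimensions and excludes effort.

```python
from typing import Dict, Tuple


def pick_target_opportunity(scores: Dict[str, Tuple[int, int]]) -> str:
    """Return the opportunity with the highest importance x tractability product.

    `scores` maps an opportunity name to (importance, tractability), each on an
    assumed 1-5 scale. Effort is deliberately not an input. Ties go to the
    opportunity listed first.
    """
    if not scores:
        raise ValueError("no opportunities to choose from")
    for name, (importance, tractability) in scores.items():
        if not (1 <= importance <= 5 and 1 <= tractability <= 5):
            raise ValueError(f"{name}: scores must be between 1 and 5")
    return max(scores, key=lambda name: scores[name][0] * scores[name][1])
```

The score is a conversation aid rather than a verdict: the trio should still be able to explain the pick in terms of interview evidence.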
Brainstorm minimum 3 candidate solutions per target opportunity
Brainstorm a minimum of 3 candidate solutions per target opportunity. Three-minimum forces compare-and-contrast decision-making (vs binary go/no-go on a single idea). Use crazy-eights or design-studio formats to widen the solution space. Per https://www.producttalk.org/opportunity-solution-trees/, this is 'where most learning happens.' If the trio cannot generate 3, the opportunity is too narrow or the trio is anchoring on a preferred solution - either way, the brainstorm is the prompt that surfaces the dysfunction.
Establish the bi-weekly OST update working session and the monthly cross-trio OST review. Stops the tree drifting into a stale wishlist and surfaces duplicate opportunities across trios.
Establish bi-weekly OST update working session
Establish the bi-weekly OST update working session: a 90-minute trio session every other week to synthesize new snapshots into the tree, prune stale solutions, and update assumption test results. Stops the OST from drifting into a stale wishlist. Reference Torres's guidance on measuring discovery (https://www.producttalk.org/2020/06/measure-discovery/) for the cycle-time emphasis - the bi-weekly cadence keeps OST cycle time under 14 days. Without the bi-weekly session, the OST becomes a museum piece by week 6.
Stand up monthly cross-trio OST review
Stand up the monthly cross-trio OST review: each trio walks its tree to peers in 15 minutes; surface duplicate opportunities, shared learnings, and cross-trio dependencies. Defends against silo formation when 3+ trios run discovery in parallel. Reference https://www.producttalk.org/product-trios/ for the multi-trio coordination patterns. The review converts the trios from independent silos into a coherent product organization - the cross-pollination is what unlocks compounding learning at the org level.
Surface and test assumptions across all 5 risk categories (desirability, viability, feasibility, usability, ethical) before building. Install the 12 lightweight test patterns and the cycle-time discipline that separates real discovery from theater. Achieves KR1.3 (>=3 tests per target opportunity) and KR1.4 (cycle time <=14 days).
Build the 5-category assumption-test design rubric, adopt David Bland's importance x evidence grid, and run the 2-hour trio workshop to surface >=10 assumptions per candidate solution.
Build 5-category assumption-test design rubric
Build the assumption-test design rubric covering all 5 categories (https://www.producttalk.org/five-types-of-assumptions/): desirability (will customers want this?), viability (will it be good for the business?), feasibility (can we build it?), usability (can customers use it?), ethical (any potential harm?). Each category gets prompt questions + example assumption statements + recommended test types (prototype, survey, data mining, research spike). Reference https://www.producttalk.org/assumption-testing/ for the rubric structure. The rubric is the operating mechanism that prevents trios from defaulting to feasibility-only assumption testing - the most common silo-relapse failure.
Adopt David Bland's importance x evidence grid
Adopt David Bland's importance x evidence grid for assumption prioritization (referenced at https://www.producttalk.org/assumption-testing/). High importance + low evidence -> test first. Visual placement, weekly review. The grid is the prioritization layer on top of the assumption-design rubric - without it, trios test the easiest assumptions first, not the riskiest. This is a 1-hour artifact that the trio uses for the next 12 months.
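The "high importance + low evidence -> test first" rule can be sketched as a sort order over scored assumptions. The 1-5 scales, the `Assumption` type, and the tie-breaking rule (importance first, then least evidence) are assumptions for illustration. The grid itself is a visual placement exercise and does not prescribe numeric scoring.

```python
from typing import List, NamedTuple


class Assumption(NamedTuple):
    statement: str
    importance: int  # 1 (low) to 5 (high), assumed scale
    evidence: int    # 1 (little evidence) to 5 (strong evidence), assumed scale


def prioritize(assumptions: List[Assumption]) -> List[Assumption]:
    """Order assumptions so the most important, least evidenced come first."""
    return sorted(assumptions, key=lambda a: (-a.importance, a.evidence))
```

Taking the first three items of `prioritize(...)` gives a candidate list for the first test cycle, which the trio can then sanity-check against the visual grid.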
Run trio workshop to surface 10+ assumptions per candidate solution
Run a 2-hour trio workshop to surface >=10 assumptions per candidate solution across the 5 categories. Engineer leads on feasibility prompts; designer on usability; PM on viability and desirability; whole trio on ethical. Per https://www.producttalk.org/2022/02/responsibility-in-a-product-trio/, all members remain accountable for all categories - leverage expertise without rebuilding silos. The workshop is the load-bearing surfacing exercise that determines what the cycle's testing program will look like.
Build the 12-pattern test library (prototype, survey, data mining, research spike, concierge, Wizard of Oz, fake door, smoke, painted door, 5-second, card sort, A/B) and the assumption-type -> recommended pattern decision tree.
Build 12 lightweight test pattern library
Build the 12 lightweight test pattern library: prototype, one-question survey, data mining, research spike (Torres's 4 base types per https://www.producttalk.org/assumption-testing/), concierge (https://learningloop.io/plays/concierge), Wizard of Oz (https://learningloop.io/plays/wizard-of-oz), fake door (https://learningloop.io/plays/fake-door-testing), smoke test, painted door, 5-second test, card sort, A/B test. Each pattern: when to use, setup steps, time/cost estimate, example. The library is the trio's test-pattern menu - without it, every test is reinvented and assumption testing collapses under setup overhead.
Build assumption-type to recommended test pattern decision tree
Build the assumption-type -> recommended test pattern decision tree. Desirability + early stage -> fake door. Desirability + mid stage -> concierge. Usability -> prototype + 5-second test + card sort. Feasibility -> research spike. Viability -> data mining + one-question survey. Ethical -> red-team + structured peer review. Reference https://www.producttalk.org/assumption-testing/ for the matching logic. The decision tree is the operating shortcut that converts the 12-pattern library into a 30-second pick.
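The mapping above is small enough to express as a lookup. The sketch below encodes exactly the pairings listed in the text. The function name `recommend_tests` and the string labels for types and stages are hypothetical.

```python
from typing import Dict, List, Optional

# Pairings for the types that do not depend on stage, as listed in the text.
_BY_TYPE: Dict[str, List[str]] = {
    "usability": ["prototype", "5-second test", "card sort"],
    "feasibility": ["research spike"],
    "viability": ["data mining", "one-question survey"],
    "ethical": ["red-team", "structured peer review"],
}


def recommend_tests(assumption_type: str, stage: Optional[str] = None) -> List[str]:
    """Return the recommended test patterns for an assumption type."""
    kind = assumption_type.lower()
    if kind == "desirability":
        # Desirability is the only type the text splits by stage.
        if stage == "early":
            return ["fake door"]
        if stage == "mid":
            return ["concierge"]
        raise ValueError("desirability needs stage 'early' or 'mid'")
    if kind not in _BY_TYPE:
        raise ValueError(f"unknown assumption type: {assumption_type}")
    return list(_BY_TYPE[kind])  # copy so callers cannot mutate the table
```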
Run the first 3 assumption tests per trio against the riskiest assumptions (KR1.3 first cycle) and document the readout per solution with go/iterate/pivot/new-opportunity decision.
Run first 3 assumption tests against riskiest assumptions
Run the first 3 assumption tests per trio against the 3 riskiest assumptions on the prioritization grid. Document each as: assumption -> predicted result -> actual result -> decision (go / iterate / pivot / new opportunity). This is KR1.3's first cycle (https://www.producttalk.org/assumption-testing/). Aim for 30-50% invalidation - high invalidation indicates real testing, low invalidation suggests confirmation bias. The first 3 tests are the proof that the trio can move from talking-about-discovery to doing-discovery.
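The test record and the invalidation-rate target can be sketched as follows. The `AssumptionResult` type and both function names are hypothetical. The text gives 30-50% as the target band and warns about low rates, but says nothing negative about rates above 50%, so the sketch labels that case neutrally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AssumptionResult:
    assumption: str
    predicted: str
    actual: str
    decision: str      # decision label recorded by the trio, e.g. "iterate"
    invalidated: bool  # True when the actual result contradicted the prediction


def invalidation_rate(results: List[AssumptionResult]) -> float:
    """Share of tested assumptions that were invalidated."""
    if not results:
        raise ValueError("no test results recorded")
    return sum(r.invalidated for r in results) / len(results)


def invalidation_signal(rate: float) -> str:
    """Interpret the rate against the 30-50% target band from the text."""
    if rate < 0.30:
        return "possible confirmation bias"
    if rate <= 0.50:
        return "within target band"
    return "above target band"
```

With only three tests in the first cycle the rate can take just four values, so it is more useful as a trend across cycles than as a verdict on any single one.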
Document test results readout per solution
Document the test results readout per solution with go/iterate/pivot/new-opportunity decision. Walk results back up the OST: if a solution invalidates, does the opportunity still hold? Capture as a structured artifact for the repository and the monthly stakeholder update. Reference https://www.producttalk.org/assumption-testing/ for the decision-record format. The readout converts test results into a stakeholder-readable record - without it, the test loop ends inside the trio and stakeholders never see invalidation evidence.
Operating cadence section. Build the cycle-time dashboard (max days between any two discovery activities) and run the first quarterly discovery retro. Two tasks - exempt from the 8-task floor per Decision 2.
Build cycle-time dashboard with max-day SLA
Build the cycle-time dashboard. Metrics: max days between any two interviews, max days between any two assumption tests, max days between any two prototypes, idea-abandonment cycle time. Per https://www.producttalk.org/2020/06/measure-discovery/, 'limit the maximum number of days you'll go between any two activities' - the dashboard surfaces gaming early. The dashboard is the operating mechanism for KR1.4 - without it, cycle time gets gamed within 6 weeks because counting interviews is easier than counting cycle gaps.
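The core dashboard metric, maximum days between any two consecutive activities of one kind, is a short computation. The sketch below assumes activity dates are available as plain `date` values. Counting the open gap from the most recent activity to the reporting date is an assumption added here, so that a trio that has simply stopped does not look healthy. The function names are hypothetical, and the 14-day default mirrors KR1.4.

```python
from datetime import date
from typing import Iterable


def max_gap_days(activity_dates: Iterable[date], as_of: date) -> int:
    """Largest gap in days between consecutive activities.

    Includes the open gap from the latest activity up to `as_of`.
    """
    ordered = sorted(set(activity_dates))
    if not ordered:
        raise ValueError("no activities recorded")
    # Append the reporting date, never earlier than the last activity.
    points = ordered + [max(as_of, ordered[-1])]
    return max((later - earlier).days for earlier, later in zip(points, points[1:]))


def breaches_sla(activity_dates: Iterable[date], as_of: date, sla_days: int = 14) -> bool:
    """True when the largest gap exceeds the cycle-time SLA."""
    return max_gap_days(activity_dates, as_of) > sla_days
```

Running this once per activity type (interviews, assumption tests, prototypes) gives the per-metric rows of the dashboard. Because it measures gaps rather than counts, five interviews crammed into one week followed by a month of silence still registers as a breach.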
Run first quarterly discovery retro
Run the first quarterly discovery retro. Two questions (https://www.producttalk.org/2020/06/measure-discovery/): 'what surprised us during the past quarter?' and 'how could we have learned that sooner?' Review cycle-time dashboard, idea-abandonment rate, KR scoreboard. Output: 3 cycle-time improvements for next quarter. The quarterly retro is the operating mechanism that prevents the cadence from going stale - without it, year-2 trios hit a quality plateau and never recover.
T2D3 proprietary IP layer. Snippets, taxonomy, governance, the 'research repository trap' defenses, and traceability from interview -> snapshot -> opportunity -> assumption test -> shipped feature. Second-highest-leverage T2D3 IP layer, drawing on Anderson's repository-trap critique and Dovetail's tagging governance.
Select repository tool (Dovetail / Notion / EnjoyHQ) by first running stakeholder-needs interviews to avoid Anderson's research repository trap, and design the cross-linked entity schema.
Select repository tool with stakeholder-needs interviews
Select repository tool by first running stakeholder-needs interviews (avoid Anderson's 'research repository trap' per https://www.userresearchstrategist.com/p/the-research-repository-trap, which warns 'I designed it for me, not for them'). Candidates: Dovetail (https://dovetail.com/blog/global-project-tags-taxonomy-research-repository/), Notion, EnjoyHQ. Compare: tagging governance, AI-assisted snippet extraction, Slack/Confluence embed, cost. The tool choice is downstream of the stakeholder-needs interviews - skipping that step is the #1 way the repository gets abandoned within 6 months.
Design cross-linked repository schema
Design the repository schema with cross-linked entities: snippets / interviews / opportunities / solutions / tests / decisions. Each entity has structured fields + free-text. Cross-link: snippet -> interview -> opportunity -> solution -> test -> decision -> shipped feature. This schema is the backbone of KR1.2 (OST coverage) and KR1.5 (repo coverage) traceability. Reference GitLab's public Dovetail SOP (https://handbook.gitlab.com/handbook/product/ux/dovetail/) and Dovetail's tagging-taxonomy guide (https://dovetail.com/blog/global-project-tags-taxonomy-research-repository/). The schema is the load-bearing T2D3 IP that converts the repository from a notebook into a knowledge system.
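One way to picture the cross-linked schema is as typed records whose links may only point to the next stage of the chain. The sketch below is tool-agnostic and illustrative only: the `Entity` shape, the `CHAIN` ordering as a strict sequence, and the IDs are assumptions, and Dovetail, Notion, or EnjoyHQ would each model this differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Chain from the task description: snippet -> ... -> shipped feature.
CHAIN = ["snippet", "interview", "opportunity", "solution",
         "test", "decision", "shipped_feature"]


@dataclass
class Entity:
    id: str
    kind: str                                                 # one of CHAIN
    structured: Dict[str, str] = field(default_factory=dict)  # structured fields
    notes: str = ""                                           # free-text
    links_to: List[str] = field(default_factory=list)         # ids of next-stage entities


def add_link(entities: Dict[str, Entity], src_id: str, dst_id: str) -> None:
    """Link src to dst, allowing only links to the next stage in CHAIN."""
    src, dst = entities[src_id], entities[dst_id]
    if CHAIN.index(dst.kind) != CHAIN.index(src.kind) + 1:
        raise ValueError(f"{src.kind} cannot link directly to {dst.kind}")
    src.links_to.append(dst_id)


def reaches_shipped_feature(entities: Dict[str, Entity], start_id: str) -> bool:
    """Follow links forward and report whether any path ends at a shipped feature."""
    stack = [start_id]
    while stack:
        entity = entities[stack.pop()]
        if entity.kind == "shipped_feature":
            return True
        stack.extend(entity.links_to)
    return False


# Hypothetical records, one per stage.
ids = ["sn-1", "int-1", "opp-1", "sol-1", "test-1", "dec-1", "feat-1"]
repo = {i: Entity(id=i, kind=k) for i, k in zip(ids, CHAIN)}
for src, dst in zip(ids, ids[1:]):
    add_link(repo, src, dst)

print(reaches_shipped_feature(repo, "sn-1"))
```

Because links only move forward along the chain, the traversal cannot loop, and any snippet that never reaches a shipped feature is easy to list.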
Define the global tag set (pain points / motivations / behaviors / quotes / decisions), the project-tag set per outcome, and the taxonomy dictionary with change-control governance.
Define global tag set
Define the global tag set: pain points / motivations / behaviors / quotes / decisions. These are the 'global tags' in Dovetail's framework (https://dovetail.com/blog/global-project-tags-taxonomy-research-repository/) - used across every project, governed centrally. Lock in tag naming conventions (lowercase-hyphenated, present tense for behaviors, noun for pain points). The global tag set is the canonical layer of taxonomy - changes require change-control governance, which is the discipline that keeps the taxonomy alive past month 12.
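The lowercase-hyphenated rule is mechanical enough to lint automatically. The sketch below is a hypothetical checker: the pattern and function name are assumptions, and it covers only the spelling convention, since tense and part of speech (present tense for behaviors, noun for pain points) still need human review.

```python
import re

# Lowercase letters or digits, in groups separated by single hyphens.
TAG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_tag(name: str) -> bool:
    """True when the whole tag name is lowercase-hyphenated."""
    return TAG_PATTERN.fullmatch(name) is not None


for candidate in ["friction-onboarding", "time-to-aha", "Time-To-Aha", "account_setup_blocker"]:
    print(candidate, is_valid_tag(candidate))
```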
Define project tag set per outcome
Define the project tag set per outcome - extends global tags with outcome-specific patterns (e.g., for activation outcome: friction-onboarding, time-to-aha, account-setup-blocker). Project tags are mutable; global tags require change control. Reference Dovetail's tagging-taxonomy framework (https://dovetail.com/blog/global-project-tags-taxonomy-research-repository/). The project layer keeps each trio's tagging granular enough to surface insight, without polluting the global tag set with outcome-specific noise.
Build taxonomy dictionary with change-control governance
Operating cadence section. Build the snapshot -> repository pipeline with auto-tagging assist and run the monthly cross-trio affinity-mapping session in MURAL/Miro. Two tasks - exempt from the 8-task floor per Decision 2.
Build interview-snapshot to repository pipeline with auto-tagging
Build the interview-snapshot -> repository pipeline with auto-tagging assist. Transcript drops in (Otter/Grain/Fathom) -> AI suggests snippets + tags -> trio reviews and confirms -> snippets land in repository with tag attribution. Reduces synthesis cycle time and lifts KR1.5 (repo coverage) above 90%. Reference current Dovetail AI capabilities (https://www.looppanel.com/blog/dovetail-ai) and Dovetail's tagging-taxonomy guide (https://dovetail.com/blog/global-project-tags-taxonomy-research-repository/). Without auto-tagging, repo-coverage stalls at 50-60% because the manual tagging burden is too high.
Run monthly cross-trio affinity-mapping session
Run the monthly cross-trio affinity-mapping session in MURAL or Miro to surface cross-trio patterns (https://www.mural.co/blog/continuous-product-design). 90 min: each trio brings top-5 surprising snippets; cluster across trios; identify shared opportunities. Output: any new global tag candidates and cross-trio dependencies. This session is the cross-pollination layer that turns 3 independent trios into a coherent product organization - without it, trios re-discover the same patterns in isolation for 6 months before noticing.
Build the insight-to-decision traceability log (each shipped feature linked to >=1 OST opportunity AND >=1 assumption test) and the OST-coverage report that measures KR1.2.
Build insight-to-decision traceability log
Build the insight-to-decision traceability log: each shipped feature is linked to >=1 OST opportunity AND >=1 assumption test. This is KR1.2's measurement spine (>=80% of features tied to an opportunity) per https://www.producttalk.org/2020/06/measure-discovery/. Components: a daily sync job, a Looker/Metabase visualization, and a Slack alert whenever a feature ships without traces. The log is the operating mechanism that converts KR1.2 from a quarterly self-report into a daily-truth surface - without it, the team can claim 95% coverage when actual coverage is 30%.
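The trace rule (at least one opportunity AND at least one assumption test per shipped feature) reduces to a small check that a daily job could run before raising an alert. The record shape, names, and sample features below are assumptions for illustration; the alerting integration itself is left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ShippedFeature:
    name: str
    opportunity_ids: List[str] = field(default_factory=list)
    assumption_test_ids: List[str] = field(default_factory=list)


def is_traced(feature: ShippedFeature) -> bool:
    """A feature is traced only if it has >=1 opportunity AND >=1 assumption test."""
    return bool(feature.opportunity_ids) and bool(feature.assumption_test_ids)


def untraced(features: List[ShippedFeature]) -> List[str]:
    """Names of shipped features that would trigger an alert."""
    return [f.name for f in features if not is_traced(f)]


# Hypothetical release log.
features = [
    ShippedFeature("guided-setup", ["opp-12"], ["test-31", "test-32"]),
    ShippedFeature("bulk-export", ["opp-14"], []),   # opportunity but no test
    ShippedFeature("dark-mode", [], []),             # no traces at all
]

print(untraced(features))
```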
Build OST-coverage report (KR1.2)
Build the OST-coverage report (KR1.2 - >=80% of shipped features tied to OST opportunity). Weekly auto-emit; surfaces any feature shipped without a documented opportunity. Used in the monthly Pain-Claim-Gain stakeholder update. Reference https://www.producttalk.org/opportunity-solution-trees/ for the coverage-as-discipline rationale. The report is the artifact that closes the loop between shipped features and the OST - without it, the OST drifts to a documentation exercise that nobody believes in.
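The KR1.2 figure itself is a ratio checked against the 80% target. A minimal sketch, assuming the report receives a mapping from feature name to linked opportunity IDs (the input shape, function names, and feature names are hypothetical):

```python
from typing import Dict, List


def ost_coverage(feature_opportunities: Dict[str, List[str]]) -> float:
    """Fraction of shipped features tied to at least one OST opportunity."""
    if not feature_opportunities:
        return 0.0
    tied = sum(1 for opps in feature_opportunities.values() if opps)
    return tied / len(feature_opportunities)


def coverage_report(feature_opportunities: Dict[str, List[str]], target: float = 0.80) -> dict:
    """Weekly report body: coverage, whether the target is met, and the gaps."""
    coverage = ost_coverage(feature_opportunities)
    return {
        "coverage": coverage,
        "meets_target": coverage >= target,
        "missing_opportunity": sorted(
            name for name, opps in feature_opportunities.items() if not opps
        ),
    }


# Hypothetical week: 5 features shipped, 4 tied to an opportunity.
shipped = {
    "guided-setup": ["opp-12"],
    "bulk-export": ["opp-14"],
    "role-templates": ["opp-12", "opp-15"],
    "usage-digest": ["opp-16"],
    "dark-mode": [],
}

print(coverage_report(shipped))
```

Listing the uncovered features by name, not just the percentage, gives the monthly stakeholder update something concrete to discuss.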
Repository onboarding roadshow, exec-sponsor demo, per-team office hours, embedded Slack #research-snippets channel, and weekly highlight digest. Publishing/hand-off section - exempt from the 8-task floor per Decision 2.
Interview stakeholders on repository needs
Interview >=5 stakeholders (PMs, designers, engineers, marketing, sales, exec) to learn how they currently find research, what they actually want, and where in their workflow research-access should surface. Per Anderson's repository-trap critique (https://www.userresearchstrategist.com/p/the-research-repository-trap), this step is what separates used repositories from abandoned ones. Embed surfaces in Slack and Confluence rather than forcing tool adoption. Without these interviews, the repository becomes the 'tool nobody uses' - a guaranteed failure mode within 6 months.
Run repository onboarding roadshow
Run the repository onboarding roadshow: exec-sponsor demo (Pain-Claim-Gain framing), per-team office hours, embedded Slack #research-snippets channel, weekly highlight digest. Drives KR1.5 adoption and counters the 'tool nobody uses' trap (https://www.userresearchstrategist.com/p/the-research-repository-trap). The roadshow is the operating mechanism that converts the repository from 'PM-and-designer-only' into a cross-functional asset - without it, repo coverage stalls at 50-60% because non-product teams never adopt.
Move discovery from project to BAU. Pain-Claim-Gain monthly stakeholder updates, Now/Next/Later roadmap tied to OST opportunities, OKR re-anchoring at quarterly planning, trio compensation/recognition (T2D3 IP gap fill that Torres explicitly leaves open), and the STOP-framework hand-off to BAU runbook covering all 22 playbook artifacts.
Operating cadence section. Build the monthly stakeholder-update template using the T2D3 Pain-Claim-Gain narrative walking the OST top-down, and deliver the first monthly update to executive team and board. Two tasks - exempt from the 8-task floor per Decision 2.
Build Pain-Claim-Gain monthly stakeholder update template
Build the monthly stakeholder-update template using the T2D3 Pain-Claim-Gain narrative (Pain = current outcome gap; Claim = the opportunity we're betting on this cycle; Gain = the experiment evidence to date) walking the OST top-down. 6 slides max: (1) outcome scoreboard, (2) Pain - quantified gap, (3) Claim - chosen opportunity + 3 candidate solutions, (4) Gain - assumption test results, (5) decision (build/iterate/pivot), (6) ask of execs. Reference https://www.producttalk.org/opportunity-solution-trees/ for the OST-walk discipline. The template is the operating mechanism that defends discovery cadence under sprint pressure for 12+ months.
Deliver first monthly Pain-Claim-Gain stakeholder update
Deliver the first monthly Pain-Claim-Gain update to the executive team. Pain-Claim-Gain framing forces the conversation away from 'did we ship the feature' and toward 'did we move the outcome / what did we learn / what's the next bet.' This is the operating mechanism that defends discovery cadence under sprint pressure. Reference https://www.producttalk.org/ for the cadence-defense rationale. The first update sets the tone for the next 12 months - landing it well is the difference between leadership buy-in and quiet skepticism that erodes the cadence in month 4.
Build the Now/Next/Later roadmap with each lane traced to OST opportunities + assumption tests, and re-anchor product OKRs to validated opportunities at quarterly planning.
Build Now/Next/Later roadmap tied to OST opportunities
Build the Now/Next/Later roadmap with each lane traced to OST opportunities + assumption tests (https://www.producttalk.org/okrs-vs-outcomes/). Now = current cycle (specific experiments running this week); Next = next 1-2 cycles; Later = backlog. Map: Outcomes -> Objectives, validated Opportunities/Solutions -> KRs, Experiments -> the way we learn. Reference https://roadmap.one/blog/posts/blog9-9-opportunity-solution-tree/ for the OST-to-roadmap translation. The roadmap is the artifact that converts the OST into a stakeholder-readable narrative - without it, the OST stays inside product and the rest of the org never adopts the outcome lens.
Re-anchor product OKRs at quarterly planning
Re-anchor product OKRs to validated opportunities at quarterly planning. Each KR ties to (a) a product outcome quantitative lift target and (b) the OST opportunity expected to drive it. Discovery KRs (cadence, OST coverage, assumption tests) carry quarter-to-quarter as discipline metrics. Reference https://www.producttalk.org/okrs-vs-outcomes/ for the OKR-to-outcome translation. The re-anchor is the quarterly operating discipline that prevents OKR drift - without it, KRs decay into output metrics within 2 quarters.
Sign-off section (T2D3 IP gap fill). Define the trio comp/recognition philosophy (reward outcome lift not feature ship-count) and the spot-bonus rubric for assumption-test wins. Two tasks - exempt from the 8-task floor per Decision 2.
Define trio compensation/recognition philosophy (T2D3 IP)
Define trio comp/recognition philosophy: reward outcome lift (activation, adoption), NOT feature ship-count or velocity. This is a T2D3 IP gap fill for a question Torres explicitly leaves open (https://www.producttalk.org/2020/06/measure-discovery/). Components: outcome-tied bonus pool, peer-nominated discovery awards (best-invalidated assumption, hardest cycle-time PR), public OKR scoreboard. Anchor in HR-policy-ready language. The philosophy is the cultural reinforcement layer that tells engineers and designers the org is serious about outcome accountability - without it, the comp signal stays output-tied and the trio's behavior eventually reverts.
Build discovery spot-bonus rubric
Build the discovery-spot-bonus rubric covering: validated invalidation (high-quality kill of a bad idea), opportunity discovery (newly identified high-importance opportunity), cycle-time PR (trio that closed cycle time fastest in a quarter), repository contribution (most-cited snippet). Quarterly cadence, peer-nominated, exec-approved. Reference https://www.producttalk.org/2020/06/measure-discovery/ for the cycle-time-PR rationale. The rubric is the operating mechanism that operationalizes the comp philosophy - without it, 'reward outcome lift' stays a slogan and never lands on actual paychecks.
Publishing / hand-off section. Run the STOP hand-off checklist (Standardize -> Templatize -> Optimize -> Productize) across all 22 playbook artifacts and publish the trio operating-cadence BAU runbook. Two tasks - exempt from the 8-task floor per Decision 2.
Run STOP hand-off checklist across 22 playbook artifacts
Run the STOP hand-off checklist (T2D3 IP framework: Standardize -> Templatize -> Optimize -> Productize) across all 22 playbook artifacts. For each artifact, mark its current STOP stage and the next-stage target. Identify which artifacts are mature enough to roll out beyond pilot trios. Reference https://www.producttalk.org/ for the BAU-handoff framing. The checklist is the operating mechanism that converts the playbook from a pilot project into a productized capability - without it, the artifacts stay 'project-scoped' and the discovery cadence collapses when the pilot ends.
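The per-artifact marking (current STOP stage plus next-stage target) can be kept as a simple ordered enumeration. The sketch below is illustrative only: the function names and the stage assignments are assumptions, and the three artifact names are examples rather than the full list of 22.

```python
from enum import IntEnum
from typing import Dict, List, Optional, Tuple


class StopStage(IntEnum):
    STANDARDIZE = 1
    TEMPLATIZE = 2
    OPTIMIZE = 3
    PRODUCTIZE = 4


def next_stage(stage: StopStage) -> Optional[StopStage]:
    """Next STOP stage, or None when the artifact is already productized."""
    if stage is StopStage.PRODUCTIZE:
        return None
    return StopStage(stage + 1)


def stop_checklist(artifacts: Dict[str, StopStage]) -> List[Tuple[str, str, str]]:
    """One row per artifact: (name, current stage, next-stage target)."""
    rows = []
    for name, stage in sorted(artifacts.items()):
        target = next_stage(stage)
        rows.append((name, stage.name, target.name if target else "done"))
    return rows


# Hypothetical stage assignments for three of the playbook artifacts.
artifacts = {
    "cycle-time dashboard": StopStage.TEMPLATIZE,
    "taxonomy dictionary": StopStage.OPTIMIZE,
    "feature-factory diagnosis": StopStage.STANDARDIZE,
}

for row in stop_checklist(artifacts):
    print(row)
```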
Publish trio operating-cadence BAU runbook
Publish the trio operating-cadence BAU runbook: weekly trio working session (1h), daily 15-min trio check-in, weekly interview retro (30 min), bi-weekly OST update (90 min), monthly cross-trio review (90 min), monthly stakeholder update (Pain-Claim-Gain), quarterly retro. Owners, decision rights, escalation paths, KR scoreboard wiring. Reference https://www.producttalk.org/product-trios/ for the cadence rationale. The BAU runbook is the artifact that hands the discovery cadence over to the org as standing operating procedure - without it, the next CPO unwinds it within 6 months.
Lock in 1 interview per trio per week quota
Lock in the cadence quota: >=1 customer interview per trio per week, exec-sponsored as an SLA on KR1.1. This is the single hardest discipline to maintain (https://www.producttalk.org/) - bottom of the funnel. Defend with the cycle-time dashboard (m5-s4-t1) and monthly Pain-Claim-Gain updates that surface any week the trio missed cadence. The quota rule converts the OKR target into a binding operating SLA the org defends in writing.
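Checking the weekly quota means finding calendar weeks with no interview, which is a different question from counting interviews. A minimal sketch, assuming ISO calendar weeks as the unit (the function name, week definition, and sample dates are assumptions):

```python
from datetime import date, timedelta
from typing import List, Tuple


def missed_weeks(interview_dates: List[date], start: date, end: date) -> List[Tuple[int, int]]:
    """ISO (year, week) pairs between start and end that contain no interview."""
    covered = {tuple(d.isocalendar()[:2]) for d in interview_dates}
    monday = start - timedelta(days=start.weekday())  # align to the week's Monday
    missed = []
    while monday <= end:
        week = tuple(monday.isocalendar()[:2])
        if week not in covered:
            missed.append(week)
        monday += timedelta(days=7)
    return missed


# Hypothetical trio: interviews in the first and third weeks of January 2024.
interviews = [date(2024, 1, 3), date(2024, 1, 17)]
print(missed_weeks(interviews, start=date(2024, 1, 1), end=date(2024, 1, 21)))
```

Any week returned here is a miss to surface in the monthly Pain-Claim-Gain update.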
Build the taxonomy dictionary with every tag's definition + 2 example snippets + change-control governance (who can promote a project tag to global, who can deprecate, change-log). Per Dovetail's tagging-taxonomy guidance (https://dovetail.com/blog/talking-tagging-taxonomies-bec-sareff-hibbert/, https://dovetail.com/blog/four-pitfalls-tagging-taxonomy-team-one/), governance is what separates living taxonomies from dead ones. The dictionary is the artifact that converts the tag set from 'a list' into 'a governed shared vocabulary' - without it, taxonomy drift kills the repository within 6 months.
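The dictionary's three rules (a definition plus 2 example snippets per tag, restricted promotion rights, and a change-log) can be expressed as a small data structure. The sketch below is an assumption-laden illustration: the class names, the `approvers` set, and the log format are hypothetical and not tied to any repository tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TagEntry:
    name: str
    definition: str
    examples: List[str]          # at least 2 example snippets
    scope: str = "project"       # "project" or "global"
    deprecated: bool = False


@dataclass
class TaxonomyDictionary:
    approvers: Set[str]                                  # who may promote or deprecate
    entries: Dict[str, TagEntry] = field(default_factory=dict)
    change_log: List[str] = field(default_factory=list)

    def add(self, entry: TagEntry, by: str) -> None:
        if len(entry.examples) < 2:
            raise ValueError("each tag needs a definition and 2 example snippets")
        self.entries[entry.name] = entry
        self.change_log.append(f"{by} added {entry.name} ({entry.scope})")

    def _require_approver(self, by: str) -> None:
        if by not in self.approvers:
            raise PermissionError(f"{by} is not allowed to change global tags")

    def promote(self, name: str, by: str) -> None:
        self._require_approver(by)
        self.entries[name].scope = "global"
        self.change_log.append(f"{by} promoted {name} to global")

    def deprecate(self, name: str, by: str) -> None:
        self._require_approver(by)
        self.entries[name].deprecated = True
        self.change_log.append(f"{by} deprecated {name}")


# Hypothetical usage with one approver and one project tag.
taxonomy = TaxonomyDictionary(approvers={"research-lead"})
taxonomy.add(TagEntry("time-to-aha", "Delay before first value moment",
                      ["Took me a week to see why", "Did not get it until the demo"]),
             by="trio-a-pm")
taxonomy.promote("time-to-aha", by="research-lead")
print(taxonomy.change_log)
```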