The Cognitive Continuity Problem
How accumulated human judgment disappears at the moment of greatest consequence, and the architecture we built to prevent it.
The Problem Nobody Has Solved
Every person who has ever lived carried irreplaceable knowledge inside them. Not information in the sense of facts and dates. Those can be written down, indexed, retrieved. Something harder than that. The way they evaluated risk. The pattern they recognized in a failing relationship before the other person did. The instinct about a person in the first five minutes that proved correct thirty years later.
This is tacit knowledge in the formal sense defined by philosopher Michael Polanyi in The Tacit Dimension (1966): knowledge that cannot be adequately expressed in words, that exists only in the practice of the person who holds it.[1] It is the knowledge that builds companies, holds families together, and shapes the character of the people around its owner. And it disappears completely at the moment of death or cognitive decline. Precisely when it is most needed.
"We can know more than we can tell."
MICHAEL POLANYI · The Tacit Dimension, 1966
The research on this loss points in one direction. A 2024 systematic review published in Heliyon analyzed 28 studies on organizational knowledge transfer and found that the loss of tacit knowledge during generational change is one of the defining challenges of 21st-century organizations.[2] Harvard Business Review research links poor succession planning, of which the failure to transfer tacit knowledge is a major component, to $1 trillion in lost market value annually.[3]
For families the calculus is not financial but it is no less real. Research presented at the 2025 CHI Conference on Human Factors in Computing Systems found that participants overwhelmingly identified the fading of family memory and the loss of generational wisdom as the central fear driving interest in cognitive preservation technology.[4]
The problem has three dimensions that previous approaches have failed to address simultaneously:
Basalith was built to address all three failure modes. The primary user participates intentionally. The cognitive reference is built from training data sourced exclusively from one specific person. And the underlying model layer improves continuously as AI advances, without requiring new input from the person after their death.
The Cognitive Fingerprint
The scientific basis for Basalith rests on a phenomenon documented in cognitive neuroscience: the cognitive fingerprint. Research published in Scientific Reports demonstrated that individual behavioral patterns in a controlled domain are measurably distinctive, establishing that cognitive signatures differ meaningfully between individuals.[5]
The study measured consistency in random number generation sequences, a constrained behavioral domain. It distinguished sequences produced by the same person from those produced by different people at 96.5% AUC, and the fingerprint remained stable over one week. Basalith applies the underlying principle, that individuals exhibit stable, distinctive behavioral patterns, across eleven broader dimensions of human cognition. Measuring consistency across domains as complex as Approach to Money or Relationship to Family presents a significantly harder problem than the original study addressed. We do not claim 96.5% accuracy across these dimensions. We claim that the patterns exist, are stable over time, and are meaningfully capturable through the methods described in Section 4.
Basalith builds an approximation of this fingerprint across eleven dimensions of human cognition. The word approximation is deliberate. What Basalith produces is an algorithmic reference model: a structured representation of how a person has been observed to think, decide, and respond. Not a simulation of consciousness. Not a reconstruction of a person. This distinction governs every design decision in the system.
Each dimension is scored on a continuous accuracy scale from 0 to 100. The score reflects not just the quantity of training data in that dimension but the specificity: how much the model has learned that is uniquely true of this person rather than generically true of people like them.
Human cognition is inherently contradictory. A person's Approach to Money may conflict sharply with their Core Values under financial pressure. The Basalith architecture does not resolve these contradictions. It preserves them. A person who held genuinely conflicting values is imperfectly represented by a model that smooths those conflicts away. The goal is fidelity, not coherence.
"The most revealing training data is not where a person was consistent. It is where they were not, and how they lived with that."
System Architecture
The Basalith platform is built on a multi-layer architecture designed for long-term data integrity, real-time cognitive reference interaction, and continuous learning from multiple input modalities.
The critical architectural point, addressed in detail in Section 5, is that the RAG layer and the behavioral fine-tuning layer are not sequential stages. They are permanent, parallel systems. The RAG layer provides factual grounding. The behavioral layer shapes expression. Neither replaces the other.
The stack: Vercel for edge compute, Supabase with PostgreSQL and Row Level Security for data persistence, and the Anthropic API for all language model operations. All storage is in private buckets. All tables enforce RLS at the policy level, independent of application code.
The Training Pipeline
The central technical challenge of building a cognitive reference model is not data collection. It is data quality. The Basalith training pipeline is built on one principle: specificity over volume. A single training pair that captures a person's response to a specific situation, with named people, real places, genuine consequences, is worth more than fifty training pairs of general opinion.
Scoring is performed by Claude Haiku. We acknowledge a known limitation of this approach: Haiku's capacity to evaluate emotional honesty and genuine emotional uniqueness, the difference between authentic understated truth and verbose cliché, is bounded by its model capacity. A lightweight economy model optimized for speed and cost efficiency is not the ideal instrument for evaluating the subtlest and most valuable category of training data, so high-stakes archives benefit from periodic review by a senior model.
The scoring system is model-agnostic. The rubric and database schema remain stable across scoring model upgrades.
The current pipeline addresses this through a two-tier escalation system. Pairs that score in the 60–75 range on the primary Haiku evaluation are the ambiguous cases, where the deposit may be significantly better or worse than the automated score suggests. These pairs are escalated to Claude Sonnet for secondary review before the included_in_training flag is set.
This two-tier architecture balances cost efficiency at volume with quality assurance for high-stakes scoring decisions. As archive density increases and fine-tuning decisions carry greater weight, the bar for Sonnet escalation is lowered so that more pairs receive secondary review, applying more rigorous evaluation precisely when it matters most.
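The escalation routing can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Basalith's implementation: the names Scorer, Review, review_pair, and widened_band are hypothetical, the two scorers are stubs standing in for Claude Haiku and Claude Sonnet, and only the 60–75 band comes from the text. How a final score translates into the included_in_training flag is not specified in this paper, so the sketch stops at the final score. The widened_band helper shows one possible reading of the escalation bar being lowered as archive density increases.

```python
from dataclasses import dataclass
from typing import Callable

# A scorer maps a training pair's text to a 0-100 quality score.
# Plain callables are used so the sketch runs without any API access.
Scorer = Callable[[str], float]


@dataclass
class Review:
    primary_score: float  # score from the first-tier model
    final_score: float    # score used downstream
    escalated: bool       # True if the second-tier model was consulted


def review_pair(text: str, primary: Scorer, secondary: Scorer,
                band: tuple[float, float] = (60.0, 75.0)) -> Review:
    """Score with the primary model; re-score ambiguous pairs with the secondary."""
    low, high = band
    first = primary(text)
    if low <= first <= high:
        # Ambiguous case: the secondary review supplies the final score.
        return Review(first, secondary(text), escalated=True)
    return Review(first, first, escalated=False)


def widened_band(band: tuple[float, float], margin: float) -> tuple[float, float]:
    """Hypothetical helper: widen the band so that more pairs are escalated."""
    low, high = band
    return (max(0.0, low - margin), min(100.0, high + margin))


# Usage with stub scorers (assumed values, for illustration only).
haiku_stub: Scorer = lambda text: 68.0
sonnet_stub: Scorer = lambda text: 82.0
result = review_pair("The day I sold the shop to Maria.", haiku_stub, sonnet_stub)
print(result.escalated, result.final_score)  # True 82.0
```

Passing the scorers in as arguments mirrors the model-agnostic design described above: the routing logic does not change when either scoring model is replaced.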
Multi-perspective training is a core differentiator. The contributor network provides data self-report cannot generate. When a contributor's observation is confirmed by the primary user through the Wisdom Exchange correction mechanism, the resulting training pair carries the highest weight in the system.
RAG and Fine-Tuning: Two Permanent Layers
Fine-tuning a large language model on 500 training pairs does not inject episodic memory into the model's weights. It cannot. What fine-tuning at this scale accomplishes, and this is genuinely valuable, is modification of behavioral patterns: the model's tone, linguistic signature, reasoning style, and the characteristic ways it frames uncertainty and weighs competing values.
Factual grounding, the specific memories, the named people, the real events, must come from retrieval. This is what RAG provides.
"Fine-tuning teaches the model how this person thinks. RAG reminds it what they actually thought. A cognitive reference model needs both."
A response from a Basalith entity with only RAG and no behavioral fine-tuning will be accurate but generic in expression. A response with behavioral fine-tuning but no RAG will be stylistically accurate but factually unreliable. The model generates plausible content that was never said. The two layers are not alternatives. They are components of a single system that degrades meaningfully if either is removed.
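A minimal sketch of how the two layers might be combined at request time follows. Everything here is illustrative: Deposit, retrieve, build_request, and the model name behavioral-model-v1 are hypothetical, word overlap stands in for whatever retrieval method the platform actually uses, and no real API is called. The point is only the division of labor: retrieval selects what the person actually said, and the behaviorally tuned model is the component asked to express it.

```python
import re
from dataclasses import dataclass


@dataclass
class Deposit:
    text: str  # something the person actually said or wrote


def tokens(s: str) -> set[str]:
    """Lowercase word set, used as a crude similarity signal."""
    return set(re.findall(r"[a-z0-9']+", s.lower()))


def retrieve(query: str, archive: list[Deposit], k: int = 2) -> list[Deposit]:
    """Rank deposits by word overlap with the query (stand-in for real retrieval)."""
    q = tokens(query)
    scored = [(len(q & tokens(d.text)), d) for d in archive]
    scored = [(s, d) for s, d in scored if s > 0]        # drop non-matches
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep order
    return [d for _, d in scored[:k]]


def build_request(query: str, archive: list[Deposit], behavioral_model: str) -> dict:
    """Pair retrieved facts (RAG layer) with the tuned model (behavioral layer)."""
    facts = retrieve(query, archive)
    grounding = "\n".join(f"- {d.text}" for d in facts) or "- (no matching deposits)"
    return {
        "model": behavioral_model,  # shapes tone and reasoning style
        "system": "Ground every factual claim in these deposits:\n" + grounding,
        "user": query,
    }


archive = [
    Deposit("I refused to sell the farm in 1998 because my father built it."),
    Deposit("I always tipped well at restaurants."),
    Deposit("Selling the boat was the easiest decision I made."),
]
request = build_request("Why did you keep the farm?", archive, "behavioral-model-v1")
print(request["system"])
```

Removing either half reproduces the failure modes described above: without the grounding text the model can only produce plausible content, and without the tuned model the grounded answer is expressed generically.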
This is why the engagement system (daily sparks, contributor questions, memory games, wisdom exchanges) is not an optional feature. It is core infrastructure investment in RAG quality.
Accuracy, Measurement, and Contradiction
Measuring the accuracy of a cognitive reference model is inherently imperfect. Ground truth in this domain is not a fact verifiable against an external record. It is a judgment: does this response reflect how this person would actually respond?
Basalith uses three proxy mechanisms:
On contradiction: The Basalith architecture does not resolve conflicting values. It preserves them as data, for the reason given in The Cognitive Fingerprint: the goal is fidelity, not coherence.
Continuity Across Model Generations
The most important architectural decision in Basalith is the separation of training data from the model that processes it. The training pairs accumulated over years of deposits are stored in a structured database, independent of any specific AI model. When Claude Sonnet 4 is superseded, the archive does not revert. It migrates forward.
A precise framing: a more capable foundation model, applied to the same training data, will produce more articulate, more contextually sensitive responses. It will express the person's cognitive patterns with greater fidelity. What it will not do, and cannot do, is generate new cognitive content the person never provided. The model becomes a better instrument. It cannot add to what is in the archive.
"The data is the permanent asset. The model is the instrument. As instruments improve, the data they work with becomes more fully expressed. But the data itself does not change."
Model migration introduces a non-trivial risk that this framing understates. A new foundation model is not a passive lens applied to the same data. It brings different baseline reasoning patterns, systemic biases, moral weights, and cross-lingual handling. Applied to the same training data, a fundamentally different architecture may alter the perceived character of the entity in ways that are difficult to predict.
Basalith mitigates this through a regression testing protocol. During onboarding and at each Stage milestone, the archive owner approves a curated suite of prompt-response pairs that serve as a behavioral baseline. In approving them, the owner confirms that these responses accurately reflect how they think and speak.
Before any model migration is deployed to a live archive, the new model's outputs are tested against this baseline. Significant divergence from approved responses triggers human review before the migration is applied to the live archive.
This protocol does not eliminate the risk of persona drift across model generations. It establishes a documented, owner-approved reference point that allows drift to be detected, measured, and mitigated before the archive owner's family experiences it.
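The baseline comparison can be sketched as a simple regression check. This is an assumption-laden illustration rather than the protocol itself: similarity, migration_report, and the 0.8 threshold are hypothetical, and character-level similarity from Python's difflib is a crude stand-in for whatever divergence measure is actually used. A production check would more plausibly compare meaning and tone than surface text.

```python
from difflib import SequenceMatcher
from typing import Callable


def similarity(a: str, b: str) -> float:
    """Surface similarity in [0, 1]; 1.0 means identical after lowercasing."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def migration_report(baseline: dict[str, str],
                     candidate: Callable[[str], str],
                     threshold: float = 0.8) -> list[tuple[str, float]]:
    """Return (prompt, score) for each prompt whose new output diverges.

    baseline  : owner-approved prompt -> response pairs
    candidate : the new model, represented here as a plain function
    threshold : assumed cutoff below which human review is triggered
    """
    flagged = []
    for prompt, approved in baseline.items():
        score = similarity(approved, candidate(prompt))
        if score < threshold:
            flagged.append((prompt, round(score, 2)))
    return flagged


# Usage with a stub standing in for the new foundation model.
baseline = {
    "Should I take the job offer?": "I would wait a year before deciding.",
    "How do you handle debt?": "Pay it off before you buy anything new.",
}
new_outputs = {
    "Should I take the job offer?": "Take it now, immediately!",
    "How do you handle debt?": "Pay it off before you buy anything new.",
}
flagged = migration_report(baseline, lambda prompt: new_outputs[prompt])
needs_human_review = bool(flagged)  # a non-empty report blocks automatic migration
```

A non-empty report does not decide anything by itself. It marks which approved responses have drifted so that a person can review them before the migration reaches the live archive.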
This is why the early years of an archive, while the primary user is alive, are irreplaceable. Every deposit made during this period is a permanent asset that will be expressed more fully by every model generation that follows.
Voice portrait generation operates on the same principle. The voice recordings captured during a person's life are the permanent asset. Voice synthesis technology will produce increasingly accurate results as it advances. The recordings do not improve. The technology that processes them does.
Post-Mortem Governance
Version 1.0 of this paper did not address who controls the archive after the primary user dies. This omission has been corrected here.
An archive that can be altered, commercialized, or deleted by heirs against the owner's wishes is not a legacy tool. It is a liability. The primary user who builds a Basalith archive over years of intentional deposits has a reasonable expectation that what they built will be preserved in the form they built it.
We expect this to be an evolving area of law and ethics as cognitive legacy technology matures. The framework here represents Heritage Nexus Inc.'s current position, which we will update as the field develops.
Privacy and Data Architecture
At the database level, Row Level Security is enforced on all tables. The database enforces isolation at the query level, independent of the application layer, so an application bug that issues the wrong query cannot read data across archive boundaries as long as the application connects under a role that is subject to RLS.
Research on LLM fine-tuning and privacy identified unintended memorization of sensitive information as a key risk.[7] The Basalith approach mitigates this by maintaining training data in a structured database rather than embedding it in model weights during the prompt-engineering phase.
Application to Organizational Succession
The cognitive continuity problem exists in organizations as acutely as it does in families. A 2024 systematic review in Heliyon reported that 68% of organizations attempt to transfer both tacit and explicit knowledge, yet effective methods for capturing tacit knowledge remain elusive.[2]
The tacit knowledge most critical to organizational performance cannot be documented in a process manual. The judgment a founder brings to an acquisition decision. The instinct a senior executive has about a key hire. The pattern recognition that has guided a company through three economic cycles.
"Organizations face a potential knowledge vacuum due to the retirement of the baby boomer generation. Effective knowledge transfer strategies remain elusive in many organizations."
IGOA-IRAOLA & DIEZ · Heliyon, 2024
For the full enterprise treatment: basalith.ai/succession
Research Foundations
Five research areas underpin the Basalith architecture:
The Roadmap
The version of Basalith that exists today is the least capable version that will ever exist. Advances in foundation model quality, voice synthesis, and cognitive modeling are expected to improve the fidelity of every existing archive without new deposits from the archive owner, subject to the migration review described under Continuity Across Model Generations.
The clinical baseline application deserves emphasis. The same architecture that preserves cognitive patterns for legacy purposes can establish a documented cognitive baseline before decline begins. A person who builds a Basalith archive at 60 creates a measurable record of how they thought at peak cognitive function.
"You never truly leave if you leave enough of yourself behind."
BASALITH · 2026
Cited Research
[1] Polanyi, M. (1966). The Tacit Dimension. University of Chicago Press.
Foundational text establishing tacit knowledge: knowledge that cannot be fully articulated in words.
[2] Igoa-Iraola, E., & Diez, F. (2024). Procedures for transferring organizational knowledge during generational change: A systematic review. Heliyon, 10(4). doi:10.1016/j.heliyon.2024.e27092
28-study PRISMA review. 68% of organizations attempt tacit and explicit knowledge transfer. Effective methods remain elusive.
[3] Harvard Business Review research cited in: University of Vermont. (2025). Knowledge Transfer and Succession Planning Certificate. learn.uvm.edu/program/knowledge-transfer-succession-planning-certificate
Poor succession planning linked to $1 trillion in lost market value annually.
[4] Lei, Y. et al. (2025). AI Afterlife as Digital Legacy: Perceptions, Expectations, and Concerns. Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. doi:10.1145/3706598.3713933
Participants identified AI-generated agents preserving family memories as central value proposition. Interviews conducted June-August 2024.
[5] Schulz, M-A., Baier, S., Timmermann, B., Bzdok, D., & Witt, K. (2021). A cognitive fingerprint in human random number generation. Scientific Reports, 11. doi:10.1038/s41598-021-98315-y
Same-author vs. different-author behavioral sequences distinguished at 96.5% AUC from 300 data points. Fingerprint stable over one week.
[6] Brickman, J., Gupta, M., & Oltmanns, J.R. (2025). Large Language Models for Psychological Assessment: A Comprehensive Overview. Advances in Methods and Practices in Psychological Science. doi:10.1177/25152459251343582
Reviewing Simchon et al. (2023): fine-tuned model predicting personality traits from social media posts.
[7] Unintended Memorization of Sensitive Information in Fine-Tuned Language Models. (2025). arXiv:2601.17480.
LLMs memorize training samples even when seen once. Curated training pipelines recommended.
[8] Au, S. et al. (2025). A Survey of Personalized Large Language Models: Progress and Future Directions. arXiv:2502.11528.
Comprehensive taxonomy of PLLM approaches. Per-user PEFT paradigm.
[9] Large Language Models for Oral History Understanding with Text Classification and Sentiment Analysis. (2025). arXiv:2508.06729.
Effective annotation across 92,191 sentences from 1,002 interviews in the JAIOH oral history collection.