Beyond Knowledge: The Four-Type Framework for LLM Wiki Reasoning
- Authors

- Name
- Nino
- Occupation
- Senior Tech Editor
Andrej Karpathy’s concept of the 'LLM Wiki' is transformative. The premise is simple yet powerful: take raw data, use a Large Language Model (LLM) to extract concepts and relationships, and build a structured, navigable personal knowledge base. It solves the fragmentation problem of traditional note-taking. However, as many developers building on this foundation have discovered, there is a glass ceiling. You can build a system that knows everything, yet understands nothing about how to apply that knowledge.
When we talk about 'Judgment' in AI, we are talking about the gap between a system that can recite a textbook and a system that can act as a mentor, a diagnostician, or a strategist. This article explores the evolution of the LLM Wiki from a simple fact-retrieval system into a reasoning engine, utilizing a four-type knowledge framework and the 'Mine' operation. To implement these advanced structures, developers require access to the world's most capable models, such as Claude 3.5 Sonnet and DeepSeek-V3, which are readily available via n1n.ai.
The Failure of Declarative Knowledge
Imagine an AI programming tutor. You feed it thousands of pages of documentation. When a student asks, 'What is a Promise?', the AI responds with a perfect MDN definition: 'A Promise is an object representing the eventual completion or failure of an asynchronous operation.'
Technically, the AI is correct. Pedagogically, it has failed. A human mentor knows that if a student is asking about Promises, the problem isn't a lack of a definition; it's likely a lack of understanding regarding the event loop or callback hell. The mentor doesn't give a definition; they ask a diagnostic question: 'Do you understand how the call stack handles asynchronous tasks?'
LLM Wiki 1.0 focuses on Declarative Knowledge—facts, definitions, and summaries. This is the 'What.' But expertise is built on the 'How,' the 'Why,' and the 'When.'
The Four-Type Knowledge Framework
To bridge the gap between information and judgment, we must categorize knowledge into four distinct buckets. Each requires a different extraction strategy and a different retrieval logic.
1. Declarative Knowledge (The 'What')
This is the foundation. It includes facts, concepts, and definitions.
- Example: The syntax for a Python decorator.
- Storage: Standard RAG (Retrieval-Augmented Generation) works well here.
2. Procedural Knowledge (The 'How to Reason')
This is the sequence of expert decisions. It’s not just knowing the facts; it’s knowing the order in which to apply them.
- Example: In medical diagnosis, you check for vital sign stability before looking at lab results. In programming, you verify the environment before debugging the logic.
- Storage: Reasoning paths and decision trees.
3. Experiential Knowledge (The 'Worked Examples')
This involves complete, unedited transcripts of experts at work, including their mistakes and pivots.
- Example: A 40-turn Socratic dialogue where a teacher sets a 'trap' to expose a student's misconception.
- Storage: High-context windows are required to process these long-form interactions, making models like Claude 3.5 Sonnet on n1n.ai ideal for this task.
4. Interaction Knowledge (The 'How to Guide')
This is the meta-knowledge of engagement. When do you tell the answer? When do you stay silent? When do you provide a hint?
- Example: If a student fails a 'Probe' twice, switch to a 'Tell' strategy.
- Storage: Pattern-based templates for agentic behavior.
The 'Mine' Operation: Beyond Simple Ingestion
In Karpathy's original framework, the primary operation is Ingest. You ingest a PDF, and the LLM extracts facts. To capture the other three types of knowledge, we introduce a second operation: Mine.
While 'Ingest' looks for facts, 'Mine' looks for decisions.
| Feature | Ingest (Wiki 1.0) | Mine (Wiki 2.0) |
|---|---|---|
| Target | Facts & Entities | Decision Points & Pivots |
| Output | Declarative Knowledge | Procedural & Experiential |
| Goal | Information Retrieval | Judgment & Reasoning |
| Model Requirement | Standard LLM (GPT-4o mini) | High-Reasoning LLM (DeepSeek-V3) |
Implementation Guide: Mining Decisions with n1n.ai
To build a Wiki 2.0, you need to process raw material (like dialogue transcripts or case studies) through a specific 'Mining' prompt. Using n1n.ai, you can switch between models to find the best extraction logic for your specific domain.
Step 1: The Mining Prompt
# Example of a 'Mining' prompt using n1n.ai API structure
mining_prompt = """
Analyze the following expert-student dialogue.
1. Identify every point where the expert made a choice (e.g., asked a question instead of answering).
2. Explain the 'Why' behind that choice (the hidden procedural rule).
3. Extract the 'Interaction Pattern' (e.g., If X happens, then the expert does Y).
Output this as a JSON object structured for a Procedural Knowledge Base.
"""
Step 2: Processing at Scale
When mining thousands of documents, cost and speed become critical. DeepSeek-V3 provides exceptional reasoning capabilities at a fraction of the cost of other frontier models. By using the unified API at n1n.ai, you can route 'Ingest' tasks to cheaper models and 'Mine' tasks to high-reasoning models like OpenAI o1 or DeepSeek-V3.
Case Study: Socratic Tutoring
Researchers at WashU analyzed nearly 100 TA sessions. They found that even trained TAs defaulted to 'Telling' (Declarative) 75% of the time, while Socratic questioning (Procedural/Interaction) happened in less than 1% of the interactions.
Why? Because under pressure, humans (and LLMs) default to the easiest path: giving the fact. By explicitly 'Mining' Socratic rules (like Anderson’s 23 rules) and storing them in your LLM Wiki, you provide the AI with a 'Reasoning Rail' that prevents it from defaulting to simple fact-dumping.
Technical Comparison of Models for Knowledge Mining
| Model | Reasoning Depth | Context Window | Best Use Case in Wiki 2.0 |
|---|---|---|---|
| Claude 3.5 Sonnet | Extreme | 200k | Mining long Experiential transcripts |
| DeepSeek-V3 | High | 128k | Cost-effective Procedural extraction |
| GPT-4o | High | 128k | General Declarative Ingestion |
| o1-preview | Elite | 128k | Complex Interaction logic synthesis |
All these models can be accessed via a single integration point at n1n.ai, allowing you to build a hybrid pipeline that is both smart and cost-efficient.
The Future of AI Judgment
Judgment isn't a problem of 'not knowing enough.' It is a problem of 'knowing the wrong things.' If your LLM Wiki only contains facts, your AI will only ever be a search engine.
By implementing the Four-Type Framework—Declarative, Procedural, Experiential, and Interaction—you transform your knowledge base into a wisdom base. You move from an AI that can pass a test to an AI that can perform a job.
As you begin building your next-generation LLM Wiki, remember that the quality of your 'Mine' operation depends on the quality of the model you use. Experiment with the latest frontier models to see which one extracts the most nuanced decision paths for your specific industry.
Get a free API key at n1n.ai.