Progressive Summarization for Technical Knowledge Management
The Challenge of Information Overload
Principal engineers face a relentless stream of technical information: API documentation, architecture proposals, research papers, code reviews, incident reports, conference talks, and blog posts. Traditional approaches to knowledge management—highlighting everything, taking verbatim notes, or saving links “to read later”—fail because:
- Signal drowns in noise: Too much information makes it impossible to find what matters
- Context fades: Detailed notes lose meaning weeks later without key insights highlighted
- Retrieval fails: Comprehensive notes are too dense to scan when you need quick answers
- Application stalls: Without distilled insights, knowledge doesn’t transfer to new situations
Progressive Summarization offers a solution: a layered approach to note-taking that compresses information in stages, making it progressively easier to extract value over time.
What is Progressive Summarization?
Developed by productivity expert Tiago Forte, Progressive Summarization is a technique for processing information through multiple passes, each adding a layer of highlighting or compression. Instead of trying to “process” information perfectly on first encounter, you interact with it multiple times, each pass guided by genuine need.
The Four Layers (Plus an Optional Fifth)
Layer 1: Original Content
Save the raw source—article, documentation, meeting notes, code comments—in a note or personal knowledge base. No editing, just capture.
Layer 2: Bold Highlights
When you return to the note (because you need it), bold the most important 10-20% of passages—the sentences that capture key ideas, surprising insights, or actionable recommendations.
Layer 3: Highlighted Key Points
On a subsequent visit (driven by a project or question), highlight the most critical 10-20% of the bold text—the absolute essence. These are the phrases you’d want to remember 5 years from now.
Layer 4: Executive Summary
When the note proves highly valuable across multiple contexts, write a brief summary (3-5 sentences) at the top capturing the core insight and how it applies to your work.
Layer 5: Remix (Optional)
Create new artifacts: blog posts, architecture decision records, design docs, or presentations synthesizing insights from multiple progressively summarized notes.
Why It Works: Cognitive Science
Progressive Summarization aligns with several evidence-based learning principles:
Spaced Repetition
Each pass through the material constitutes a retrieval event. Neuroscience research shows that spaced retrieval strengthens memory formation more effectively than massed practice. By returning to notes when genuinely needed (not on an arbitrary schedule), you create organic spacing.
Desirable Difficulty
The act of deciding what to bold or highlight requires active judgment about relevance and importance. This cognitive effort (a “desirable difficulty”) strengthens learning compared to passive re-reading.
Levels of Processing
Psychologists Craik and Lockhart demonstrated that deeper semantic processing creates stronger memories than shallow perceptual processing. Progressive layers force increasingly semantic engagement: Layer 1 is shallow, Layer 4 requires deep comprehension.
Just-in-Time Learning
Each summarization pass is triggered by genuine information need, creating intrinsic motivation and clear application context. Knowledge extracted when you need it transfers better than knowledge acquired “just in case.”
Implementation for Technical Leaders
Setting Up Your System
Choose Your Tool
- Obsidian: Markdown files, local storage, bidirectional links
- Notion: Rich formatting, databases, team collaboration
- Roam Research: Outliner format, block references
- Standard Notes: Privacy-focused, encrypted, markdown
- Plain text files + Git: Maximum portability and longevity
Key requirement: Easy text styling (bold/highlight) and fast search.
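The fast-search requirement is easy to satisfy even with plain files. As a minimal sketch (the directory layout and `#tag` syntax are this example's own assumptions, not features of any particular tool), a few lines of Python can find every note carrying a given tag:

```python
# Minimal sketch: find notes carrying a given tag in a folder of
# Markdown files. The folder layout and #tag syntax are illustrative.
from pathlib import Path

def notes_with_tag(notes_dir: str, tag: str) -> list[str]:
    """Return filenames of notes whose text contains the given #tag."""
    matches = []
    for path in Path(notes_dir).glob("*.md"):
        if tag in path.read_text(encoding="utf-8"):
            matches.append(path.name)
    return sorted(matches)
```

Any tool that can do the equivalent in under a second—`grep`, Obsidian search, a Notion database filter—meets the bar.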
Capture Layer (Layer 1)
When encountering valuable technical content:
# Distributed Tracing with OpenTelemetry
Source: OpenTelemetry Documentation
Date: 2025-11-16
Tags: #observability #distributed-systems #go
OpenTelemetry provides a unified standard for instrumenting applications
to generate telemetry data (traces, metrics, logs). Unlike vendor-specific
APM tools, OTel is open-source and vendor-neutral, allowing you to switch
backends (Jaeger, Zipkin, DataDog, Honeycomb) without code changes.
Key concepts:
- Spans represent operations, containing start/end times, tags, logs
- Traces are trees of spans representing end-to-end request flow
- Context propagation passes trace IDs across service boundaries
- Sampling reduces data volume while maintaining statistical accuracy
Go implementation uses the go.opentelemetry.io/otel package...
[Continue with raw content]
Capture rules:
- Include source and date for credibility assessment later
- Add tags for discoverability
- Don’t edit—preserve original language and structure
- Capture generously—better to have too much than miss key details
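The capture rules above can be folded into a small helper that stamps every new note with source, date, and tags. This is a sketch under assumptions of its own making—the filename scheme and header layout are illustrative conventions, not part of any tool:

```python
# Sketch of a Layer 1 capture helper: saves raw content as a Markdown
# note with a source/date/tags header. No editing, just capture.
# The filename scheme and header layout are illustrative conventions.
from datetime import date
from pathlib import Path

def capture_note(notes_dir: str, title: str, source: str,
                 tags: list[str], body: str) -> Path:
    """Write a Layer 1 note and return its path."""
    header = (
        f"# {title}\n"
        f"Source: {source}\n"
        f"Date: {date.today().isoformat()}\n"
        f"Tags: {' '.join('#' + t for t in tags)}\n\n"
    )
    path = Path(notes_dir) / (title.lower().replace(" ", "-") + ".md")
    path.write_text(header + body, encoding="utf-8")
    return path
```

The point of scripting capture is to make it frictionless: if saving a note takes one command, "capture generously" actually happens.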
First Summarization Pass (Layer 2)
When a project requires distributed tracing knowledge, return to the note and bold key passages:
# Distributed Tracing with OpenTelemetry
Source: OpenTelemetry Documentation
Date: 2025-11-16
Tags: #observability #distributed-systems #go
**OpenTelemetry provides a unified standard for instrumenting applications
to generate telemetry data (traces, metrics, logs).** Unlike vendor-specific
APM tools, **OTel is open-source and vendor-neutral, allowing you to switch
backends (Jaeger, Zipkin, DataDog, Honeycomb) without code changes.**
Key concepts:
- **Spans represent operations, containing start/end times, tags, logs**
- **Traces are trees of spans representing end-to-end request flow**
- **Context propagation passes trace IDs across service boundaries**
- Sampling reduces data volume while maintaining statistical accuracy
**Go implementation uses the go.opentelemetry.io/otel package** with
middleware for automatic HTTP/gRPC instrumentation...
Bolding criteria:
- Definitions of core concepts
- Surprising or counterintuitive insights
- Actionable recommendations
- Code snippets worth revisiting
- Trade-offs or limitations
Second Summarization Pass (Layer 3)
When debugging a specific production incident involving trace propagation:
# Distributed Tracing with OpenTelemetry
Source: OpenTelemetry Documentation
Date: 2025-11-16
Tags: #observability #distributed-systems #go
**OpenTelemetry provides a unified standard for instrumenting applications
to generate telemetry data (traces, metrics, logs).** Unlike vendor-specific
APM tools, **==OTel is open-source and vendor-neutral, allowing you to switch
backends== (Jaeger, Zipkin, DataDog, Honeycomb) without code changes.**
Key concepts:
- **Spans represent operations, containing start/end times, tags, logs**
- **Traces are trees of spans representing end-to-end request flow**
- **==Context propagation passes trace IDs across service boundaries==**
- Sampling reduces data volume while maintaining statistical accuracy
**Go implementation uses the go.opentelemetry.io/otel package**...
Highlighting criteria (10-20% of bold text):
- The absolute core insight
- Phrases you’d want to rediscover in 5 seconds when scanning
- Concepts relevant across multiple projects
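The 10-20% targets for Layers 2 and 3 can even be sanity-checked mechanically. A rough sketch, assuming the `**bold**` and `==highlight==` markers used in the examples above:

```python
# Rough check of the 10-20% compression rule: what fraction of a note
# is bold (Layer 2), and what fraction of the bold text is highlighted
# (Layer 3)? Marker syntax (** and ==) follows the examples above.
import re

def layer_ratios(text: str) -> tuple[float, float]:
    """Return (bold chars / total chars, highlighted chars / bold chars)."""
    bold = "".join(re.findall(r"\*\*(.+?)\*\*", text, re.DOTALL))
    marked = "".join(re.findall(r"==(.+?)==", text, re.DOTALL))
    total = len(text)
    return (len(bold) / total if total else 0.0,
            len(marked) / len(bold) if bold else 0.0)
```

If either ratio creeps well above 0.2, you are annotating rather than compressing.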
Executive Summary (Layer 4)
After using this note across multiple observability projects:
# Distributed Tracing with OpenTelemetry
## Summary
OpenTelemetry is a vendor-neutral instrumentation standard for distributed
tracing. Key advantage: switch backends without code changes. Context
propagation across services requires explicit header passing (HTTP) or
metadata (gRPC). Go SDK provides automatic middleware for common frameworks.
**OpenTelemetry provides a unified standard for instrumenting applications
to generate telemetry data (traces, metrics, logs).**...
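The summary's point about context propagation deserves a concrete sketch. This is not the OpenTelemetry API—it is a hypothetical, minimal illustration of the mechanism: the caller injects a trace ID into outgoing headers, and the callee extracts it, so spans on both sides join the same trace (real OTel uses the W3C `traceparent` header and dedicated propagator types):

```python
# Hypothetical sketch of trace-context propagation; NOT the real
# OpenTelemetry API. The caller injects a trace ID into outgoing HTTP
# headers; the callee extracts it, so both services share one trace.
import uuid

TRACE_HEADER = "X-Trace-Id"  # illustrative; OTel uses W3C `traceparent`

def inject(ctx: dict, headers: dict) -> None:
    """Caller side: copy the trace ID into outgoing request headers."""
    headers[TRACE_HEADER] = ctx["trace_id"]

def extract(headers: dict) -> dict:
    """Callee side: rebuild the context, or start a new trace if absent."""
    return {"trace_id": headers.get(TRACE_HEADER) or uuid.uuid4().hex}

# Service A starts a trace and calls service B.
ctx_a = {"trace_id": uuid.uuid4().hex}
outgoing = {}
inject(ctx_a, outgoing)
ctx_b = extract(outgoing)  # service B sees the same trace ID
```

Missing this inject/extract step at any hop is exactly how traces break at service boundaries.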
Example: Architecture Decision Record
Here’s how progressive summarization transforms into a decision artifact:
Layer 1 Capture (from various sources):
- OTel documentation on instrumentation
- Blog post comparing APM solutions
- Incident report on missing traces
- Team discussion notes on observability gaps
Layer 2 Bold (project-specific): During architecture planning for a new microservices platform
Layer 3 Highlight (decision-critical): When evaluating observability solutions
Layer 5 Remix:
# ADR-023: Adopt OpenTelemetry for Distributed Tracing
## Context
Our microservices platform lacks end-to-end request visibility. Recent
incidents took 4+ hours to diagnose due to missing trace context across
service boundaries.
## Decision
Adopt OpenTelemetry for distributed tracing with Jaeger backend initially,
preserving option to switch to managed solution (Honeycomb, DataDog) later.
## Rationale
- Vendor-neutral standard prevents lock-in
- Go SDK provides automatic HTTP/gRPC instrumentation
- Context propagation solves cross-service debugging
- Open-source Jaeger deployment for MVP, commercial options later
## Consequences
- Engineers must instrument services using otel.Tracer API
- 2-week migration period for existing services
- Improves MTTR for production incidents by an estimated 60%
This ADR synthesizes insights from progressively summarized notes on OTel, Jaeger, incident reports, and team discussions.
Best Practices for Technical Leaders
1. Start Minimal
Don’t retroactively summarize old notes. Start with new captures and let layers accumulate organically through genuine use.
2. Trigger by Need
Only add layers when you return to a note for a real purpose: solving a problem, answering a question, making a decision. Need creates context and judgment.
3. Be Ruthless
Each layer should be 10-20% of the previous layer. If you’re bolding 50%, you’re not compressing enough. Force prioritization.
4. Link Notes
Connect related notes bidirectionally. Progressive summarization + linked notes = powerful knowledge network.
5. Review Executive Summaries
Periodically scan Layer 4 summaries across all notes—this creates unexpected connections and reveals knowledge gaps.
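That periodic scan is also easy to script. A sketch that pulls the `## Summary` section (the header convention from the Layer 4 example above) out of every note in a folder:

```python
# Sketch: collect the "## Summary" section from every Markdown note so
# all Layer 4 summaries can be reviewed in one pass. The "## Summary"
# header follows the Layer 4 example earlier in this article.
import re
from pathlib import Path

def collect_summaries(notes_dir: str) -> dict[str, str]:
    """Map note filename -> its Summary section text (if present)."""
    summaries = {}
    pattern = re.compile(r"^## Summary\n(.*?)(?=^#|\Z)", re.M | re.S)
    for path in sorted(Path(notes_dir).glob("*.md")):
        m = pattern.search(path.read_text(encoding="utf-8"))
        if m:
            summaries[path.name] = m.group(1).strip()
    return summaries
```

Reading the output in one sitting is where cross-note connections tend to surface.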
6. Share Strategically
Layer 4 summaries and Layer 5 remixes are perfect for sharing with teams (architecture docs, design proposals, retrospective insights).
Common Pitfalls
Over-summarizing on first pass: Resist the urge to immediately compress. Capture generously, summarize later when context is clearer.
Premature organization: Don’t spend hours designing taxonomies and folder structures. Tags and search are sufficient; organization emerges through links over time.
Perfectionism: Notes are working documents, not publications. Embrace messiness in early layers.
Neglecting revisitation: If notes never get revisited, they have no value. Choose what to capture based on likelihood of future relevance.
Hoarding without application: Progressive summarization serves action, not collection. Knowledge management is a means, not an end.
Tools and Workflows
Obsidian Workflow
- Layer 2: Bold with `**text**`
- Layer 3: Highlight with `==text==`
- Layer 4: Summary under a `## Summary` header
- Tags: `#distributed-systems #go #architecture`
- Links: `[[Context Propagation]]`, `[[Go Instrumentation]]`
Notion Workflow
- Toggle blocks for collapsing long Layer 1 content
- Colored highlights for Layer 3
- Summary callout boxes for Layer 4
- Database properties for tags and status
Plain Text + Git Workflow
- Bold: `**text**`
- Highlight: `<mark>text</mark>`
- Summary: first paragraph of the note
- Version control: Git commit messages track summarization passes
Measuring Success
Progressive summarization is working when:
- Retrieval time drops: You find relevant information in seconds, not minutes
- Application increases: Captured knowledge influences decisions and artifacts
- Connections emerge: You notice patterns across disparate notes
- Team impact grows: Your summaries become team resources (ADRs, wikis, onboarding docs)
- Confidence improves: You trust your notes to capture what matters
Conclusion
For principal engineers navigating constant information flow, Progressive Summarization offers a sustainable approach to knowledge management. By deferring comprehension and compression until genuinely needed, you avoid wasted effort on information that never becomes relevant. Each interaction with a note compounds its value, creating a personal knowledge base that actually serves you when it matters.
Start small: pick one note today and make a Layer 2 pass. Let the system grow organically through use, not through elaborate upfront setup. Your future self—debugging a production issue, writing an architecture proposal, or mentoring a junior engineer—will thank you for the compounded insights readily at hand.