Chapter 7 The Canon Awakens

· rcliao's blog

A routine refactor uncovers a 2019 "Canon" document that appears to still control production, with a 91% staleness risk and no owner.


Chapter 7: The Canon Awakens #

[Image: Blue gopher, Le Auditor, and Scout-nut reading a book about the Canon]

From the series: The Living Docs Library

Any sprint can uncover outdated documentation. In this one, the team found an unverified document with conflicting information that predates the current 'Canon', or source of truth.


1. The Mystery Begins #

The ticket title was: “Refactor Loyalty-Sync jobs”

The ticket, seemingly for routine cleanup, referenced four conflicting documents detailing the loyalty point synchronization flow between the ERP system and Salesforce Commerce. These documents lacked creation dates, ownership, and a definitive version.

Sprint Planning Document

Note: Ensure all team members review the flagged articles to avoid reliance on outdated sources.

Reference: Sprint planning template available in team documentation repository (/templates/sprint-planning-template.md).


2. Scout-nut’s Crawl #

A terminal command was executed:

scout-nut crawl --target="loyalty-sync"

The 'Scout-nut' tool, an AI for archival and analysis, processed Git repositories and Confluence pages to extract metadata.

Scout-nut Output (Representative Example):

[
  {
    "doc_id": "doc_erp_webhook",
    "last_edit": "2021-08-14",
    "code_refs_count": 3,
    "linked_files": ["erp_webhook.py", "erp_utils.js"]
  },
  {
    "doc_id": "doc_temp_migration",
    "last_edit": "2022-03-10",
    "code_refs_count": 0,
    "linked_files": []
  },
  {
    "doc_id": "doc_data_flow",
    "last_edit": "2024-01-12",
    "code_refs_count": 5,
    "linked_files": ["data_flow_diagram.png", "data_flow_spec.md"]
  },
  {
    "doc_id": "doc_ancient_canon",
    "last_edit": "2019-06-02",
    "code_refs_count": 12,
    "linked_files": ["ancient_canon_notes.txt", "fax_api_cron.sh"]
  }
]

Note: The linked_files field provides additional context for engineers to trace the origin of metadata.
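The scoring step needs an age in days rather than a raw last_edit date. A minimal sketch of that conversion follows, using two records copied from the output above. The helper name days_since_edit and the reference date are assumptions made for illustration; the article does not say which date Scout-nut or Le Auditor measures against.

```python
import json
from datetime import date

# Two records copied from the representative Scout-nut output.
SCOUT_NUT_OUTPUT = """
[
  {"doc_id": "doc_data_flow", "last_edit": "2024-01-12",
   "code_refs_count": 5,
   "linked_files": ["data_flow_diagram.png", "data_flow_spec.md"]},
  {"doc_id": "doc_ancient_canon", "last_edit": "2019-06-02",
   "code_refs_count": 12,
   "linked_files": ["ancient_canon_notes.txt", "fax_api_cron.sh"]}
]
"""

def days_since_edit(record, as_of):
    """Days between a record's last_edit date and the reference date as_of."""
    last_edit = date.fromisoformat(record["last_edit"])
    return (as_of - last_edit).days

records = json.loads(SCOUT_NUT_OUTPUT)
as_of = date(2024, 6, 1)  # hypothetical reference date, not taken from the article
ages = {r["doc_id"]: days_since_edit(r, as_of) for r in records}
```

Passing the reference date explicitly, instead of calling date.today() inside the helper, keeps the ages reproducible between runs.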


3. Le Auditor Ranks the Truth #

Next came 'Le Auditor', the scoring agent.

The development team ran its evaluator script:

import math

def freshness_weight(days_since_edit):
    """Return 1.0 for an edit made today, decreasing linearly to 0.0 at 365 days and beyond."""
    return max(0.0, 1 - days_since_edit / 365)

def calculate_relevancy(code_refs, days_since_edit):
    """Combine the log-scaled reference count with the freshness weight."""
    return math.log10(code_refs + 1) + freshness_weight(days_since_edit)

def staleness_risk(days_since_edit, max_days=1825):
    """Return staleness as a percentage of max_days.

    max_days=1825 (5 years) represents the maximum practical document
    lifecycle, based on technology refresh cycles.
    """
    return round(100 * days_since_edit / max_days)

Results poured in:

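| Rank | Doc Title | Relevancy | Staleness | Action |
|------|-----------|-----------|-----------|--------|
| 1 | Loyalty-Sync Data Flow | 92% | 10% | 👍 Read |
| 2 | Legacy ERP Webhook | 78% | 84% | ⚠ Update |
| 3 | Temp Migration Plan | 25% | 60% | 🚮 Decommission |
| 4 | The Canon | 38% | 91% | 🚨 CRITICAL REVIEW |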

Note: Table shows calculated results based on Scout-nut metadata using Le Auditor scoring algorithm.

Sample Relevancy Calculation for Loyalty-Sync Data Flow:

Example Metadata Analysis:

Insight: Relevancy combines the log-scaled count of code references with edit freshness, so a large number of references alone does not put a document at the top. A staleness risk greater than 70% indicates outdated documentation.
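As a rough illustration of how the two terms interact, the sketch below scores a lightly referenced but fresh document against a heavily referenced but old one. The reference counts 5 and 12 come from the Scout-nut output; the ages of 30 and 1800 days are hypothetical values chosen for the example. The article does not show how raw scores map to the percentages in the table, so the sketch stops at the raw score.

```python
import math

def freshness_weight(days_since_edit):
    """1.0 for an edit made today, falling linearly to 0.0 at 365 days."""
    return max(0.0, 1 - days_since_edit / 365)

def calculate_relevancy(code_refs, days_since_edit):
    """Log-scaled reference count plus the freshness weight."""
    return math.log10(code_refs + 1) + freshness_weight(days_since_edit)

# Ages in days are hypothetical; only the reference counts come from the article.
fresh_doc = calculate_relevancy(code_refs=5, days_since_edit=30)
stale_doc = calculate_relevancy(code_refs=12, days_since_edit=1800)

# The old document keeps only its log10(13) reference term, because its
# freshness weight has already dropped to 0.0.
print(f"fresh: {fresh_doc:.3f}  stale: {stale_doc:.3f}")
```

Because the reference count is log-scaled, more than doubling the references (5 to 12) adds less to the score than the fresh document gains from its recent edit.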


4. The Canon Resurfaces #

The 'Scout-nut' tool found a note in 'doc_ancient_canon':

“See Fax API Cron (status: active?) for loyalty patch override.”

Reconstructed Code Example (based on legacy documentation references):

#!/bin/bash
# Fax API Cron Job
# Last updated: 2019-05-30

if [ "$LOYALTY_PATCH" == "active" ]; then
  echo "Applying loyalty patch override..."
  # Additional commands here
fi

The entry indicated a potentially active, forgotten rule that might still be affecting production. This discovery immediately triggered architecture team consultation and emergency dependency mapping.

The team attempted to contact the last editor but received no response.

Action Required: Schedule comprehensive documentation integrity audit to address identified legacy dependencies.

🚨 Emergency Protocol: Immediately escalate discovered active legacy dependencies to system architecture team and product owners.


5. Blueprint for Engineers #

Engineers can replicate this workflow for any sprint:

🧭 Sprint Planning Workflow #

  1. Pre-refinement: Run Scout-nut crawl on story keywords.
  2. Extract doc metadata: last_edit, code_refs_count.
  3. Score & rank with Le Auditor.
  4. Review only the top-ranked documents.
  5. Open cleanup tickets for documents with a staleness risk greater than 70%.
  6. Repeat this process weekly using an automated sweep.

⚠️ Critical: All automated scores require domain expert validation before making archival or deprecation decisions.
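The scoring, ranking, and flagging steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not the team's actual tooling: the function names score and rank, the flag_for_cleanup field, and the reference date are all hypothetical, and the formulas are the ones from the Le Auditor script.

```python
import math
from datetime import date

STALENESS_THRESHOLD = 70  # percent, the cutoff named in step 5

def score(record, as_of, max_days=1825):
    """Score one Scout-nut record; field names follow the Scout-nut output."""
    days = (as_of - date.fromisoformat(record["last_edit"])).days
    relevancy = math.log10(record["code_refs_count"] + 1) + max(0.0, 1 - days / 365)
    staleness = round(100 * days / max_days)
    return {
        "doc_id": record["doc_id"],
        "relevancy": relevancy,
        "staleness": staleness,
        # Hypothetical field: a flag is a candidate for a cleanup ticket,
        # still subject to domain expert validation.
        "flag_for_cleanup": staleness > STALENESS_THRESHOLD,
    }

def rank(records, as_of):
    """Return scored records, most relevant first."""
    scored = [score(r, as_of) for r in records]
    return sorted(scored, key=lambda s: s["relevancy"], reverse=True)

records = [
    {"doc_id": "doc_temp_migration", "last_edit": "2022-03-10", "code_refs_count": 0},
    {"doc_id": "doc_data_flow", "last_edit": "2024-01-12", "code_refs_count": 5},
    {"doc_id": "doc_ancient_canon", "last_edit": "2019-06-02", "code_refs_count": 12},
]
as_of = date(2024, 6, 1)  # hypothetical reference date
ranked = rank(records, as_of)
flagged = [r["doc_id"] for r in ranked if r["flag_for_cleanup"]]
```

The sketch only produces a list of flagged document IDs; it deliberately does not archive or deprecate anything, which leaves that decision to a reviewer.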


6. Security Considerations #

When implementing automated documentation analysis, teams must address several critical security and privacy concerns:

Data Boundary Management #

Sensitive Content Handling #

Access Control & Permissions #

Implementation Guidelines #


7. Practical Prompts #

[Scout-nut] Doc-to-Code Correlation Prompt:

For each linked document:
1. Extract file paths and class names.
2. Search the repository for matches.
3. Return the following details: {doc_id, last_edit, code_refs_count}.
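The article does not say how Scout-nut derives code_refs_count, so the following is only one plausible reading of the "search the repository for matches" step: count the files in a checkout whose text mentions any name extracted from a document. The function name count_code_refs and the plain substring match are assumptions made for illustration.

```python
from pathlib import Path

def count_code_refs(repo_root, names):
    """Count files under repo_root whose text mentions any of the given names."""
    count = 0
    for path in Path(repo_root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        if any(name in text for name in names):
            count += 1
    return count
```

A call such as count_code_refs("path/to/checkout", ["fax_api_cron.sh"]) would return the number of files that mention the script by name. Substring matching can over-count on short or generic names, so results are best treated as a starting point for manual review.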

[Le Auditor] Relevancy & Staleness Scoring:

Relevancy = log₁₀(code_refs+1) + freshness_weight(days_since_edit)
where freshness_weight(d) = max(0, 1 - d/365)

Staleness Risk = (days_since_edit / max_days) × 100
where max_days = 1825 (5-year technology refresh cycle)

Documents are ranked based on these scores, and those with Staleness Risk > 70% are flagged for review.