How to Visualize Your Codebase Architecture

Vaibhav Verma

April 6, 2026

Tags: architecture, visualization, developer tools, codebase understanding, diagrams

I spent a week creating a beautiful architecture diagram in Lucidchart when I joined my third company. Color-coded boxes, clean arrows, labeled data flows. By the time I presented it to the team, two services had been renamed and a new database had been added. The diagram was wrong before it was finished.

That experience taught me something I should have learned years earlier: hand-drawn architecture diagrams are a maintenance burden disguised as documentation. The only diagrams worth having are the ones generated from the code itself.

Why Manual Diagrams Fail

Manual diagrams fail for the same reason all static documentation fails: code changes faster than docs. But diagrams are worse than written docs because they're harder to update. Editing a paragraph takes 30 seconds. Rearranging boxes and arrows in a diagramming tool takes 30 minutes.

The result is predictable. Teams create diagrams during planning, maybe update them once after launch, and then abandon them. Six months later, a new hire finds the diagram, assumes it's accurate, and builds mental models on a foundation of lies.

The Four Types of Useful Visualizations

Not all visualizations are created equal. After experimenting with dozens of approaches, I've settled on four types that actually provide ongoing value.

Type 1: Dependency Graphs (Auto-Generated)

These show which modules import from which other modules. They're the most reliably useful visualization because they can be generated entirely from source code with zero manual input.

For JavaScript/TypeScript:

bash

# Install madge (the --image option also needs Graphviz installed)
npm install -g madge

# Generate a dependency graph as SVG
madge --image dependency-graph.svg src/index.ts

# Find circular dependencies
madge --circular src/index.ts

# Generate a graph for a specific module
madge --image orders.svg src/orders/index.ts

Madge produces clean, readable graphs for projects up to about 200 modules. Beyond that, the graphs become spaghetti. For larger projects, filter by directory:

bash

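# Only show dependencies within the orders module. Single quotes stop the shell
# from expanding "!"; the pattern is matched against paths as madge reports them.
madge --image orders-internal.svg --exclude '^(?!src/orders)' src/orders/index.ts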

For Go:

bash

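# Module-level graph. modgraphviz is a separate install, and dot comes from Graphviz:
#   go install golang.org/x/exp/cmd/modgraphviz@latest
go mod graph | modgraphviz | dot -Tsvg -o deps.svg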

For Python:

bash

pip install pydeps
pydeps myproject --max-bacon=2 --cluster
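
If your stack has no ready-made tool, the core idea is small enough to sketch by hand. The snippet below is a minimal illustration, not how madge or pydeps are implemented: it assumes the project is handed in as a dict of dotted module name to source text, and the names import_graph and to_dot and the demo modules are my own. It only resolves absolute imports that name a module directly; relative imports and "from package import submodule" are deliberately left out.

```python
import ast


def import_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each module name to the project modules it imports.

    `sources` maps a dotted module name to its source text. Imports of
    anything not in `sources` (stdlib, third-party) are ignored.
    """
    graph: dict[str, set[str]] = {name: set() for name in sources}
    for name, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                targets = [node.module]
            else:
                continue
            # Keep only edges that point at other modules of this project.
            graph[name].update(t for t in targets if t in sources and t != name)
    return graph


def to_dot(graph: dict[str, set[str]]) -> str:
    """Render the graph as Graphviz DOT text."""
    lines = ["digraph deps {"]
    for module in sorted(graph):
        lines.append(f'  "{module}";')
        for dep in sorted(graph[module]):
            lines.append(f'  "{module}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical three-module project, inlined so the sketch is self-contained.
    demo = {
        "orders.api": "import orders.service\nimport json\n",
        "orders.service": "from orders.db import save\n",
        "orders.db": "import sqlite3\n",
    }
    print(to_dot(import_graph(demo)))
```

Piping the printed text into dot -Tsvg renders it. In a real project you would fill the dict by walking the source tree rather than inlining strings.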

Type 2: C4 Model Diagrams (Semi-Automated)

The C4 model (Context, Containers, Components, Code) by Simon Brown provides four zoom levels for architecture visualization. The key insight is that you need different diagrams for different audiences.

Level	Audience	Shows	Update Frequency
Context	Executives, new hires	System + external actors	Quarterly
Container	Architects, DevOps	Services, databases, queues	Monthly
Component	Developers	Modules within a service	As needed
Code	Developers	Classes/functions	Never (use IDE)

For C4 diagrams, I use Structurizr DSL because it's code-based and lives in the repo:

workspace {
  model {
    user = person "Customer"
    system = softwareSystem "E-Commerce Platform" {
      webapp = container "Web App" "" "Next.js (TypeScript)"
      api = container "API Server" "" "Express (TypeScript)"
      db = container "Database" "" "PostgreSQL"
      cache = container "Cache" "" "Redis"
      queue = container "Message Queue" "" "RabbitMQ"
    }

    user -> webapp "Browses products, places orders"
    webapp -> api "API calls" "HTTPS/JSON"
    api -> db "Reads/writes data" "SQL"
    api -> cache "Caches sessions, product data"
    api -> queue "Publishes order events"
  }

  views {
    container system {
      include *
      autolayout lr
    }
  }
}

This DSL file lives in docs/architecture/workspace.dsl. It can be rendered with the Structurizr CLI or the free Structurizr Lite tool. Because it's code, it goes through PR review when updated.

Type 3: Data Flow Diagrams (Manual but Stable)

Some visualizations need human judgment. Data flow diagrams show how data moves through the system: where it enters, how it transforms, where it persists. These change less frequently than code structure because data flows tend to be stable even as implementation details change.

I use Mermaid for these because it's text-based, renders in GitHub markdown, and is easy to update:

graph LR
    A[Client Browser] -->|POST /api/orders| B[API Gateway]
    B -->|Validate + Auth| C[Order Service]
    C -->|Check stock| D[Inventory Service]
    C -->|Create order| E[(Orders DB)]
    C -->|Publish event| F[Message Queue]
    F -->|order.created| G[Payment Service]
    F -->|order.created| H[Notification Service]
    G -->|Payment result| E
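
If you would rather review the flow as data than as diagram syntax, one option is to keep the edges in a list and generate the Mermaid text from it. This is only a sketch under my own assumptions: the to_mermaid helper and the (source, label, target) tuple shape are hypothetical, and it covers plain rectangular nodes only, not the cylinder shape used for the database above.

```python
def to_mermaid(edges: list[tuple[str, str, str]], direction: str = "LR") -> str:
    """Render (source, label, target) triples as Mermaid flowchart text."""
    ids: dict[str, str] = {}

    def node(name: str) -> str:
        # Assign short ids (n0, n1, ...) in order of first appearance.
        if name not in ids:
            ids[name] = f"n{len(ids)}"
        return f'{ids[name]}["{name}"]'

    lines = [f"graph {direction}"]
    for source, label, target in edges:
        lines.append(f"    {node(source)} -->|{label}| {node(target)}")
    return "\n".join(lines)


if __name__ == "__main__":
    flow = [
        ("Client Browser", "POST /api/orders", "API Gateway"),
        ("API Gateway", "Validate + Auth", "Order Service"),
        ("Order Service", "Publish event", "Message Queue"),
    ]
    print(to_mermaid(flow))
```

The human judgment still lives in the list of edges; the function only removes the chance of a syntax slip when someone edits it.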

Type 4: Hotspot Maps (Auto-Generated)

These visualize which parts of your codebase get the most attention. Built from git history, they show file size as bubble size and change frequency as color intensity.

bash

# File churn from git history (the same signal code-maat, Adam Tornhill's tool, reports)
git log --format=format: --name-only --since="6 months ago" | \
  grep -v '^$' | sort | uniq -c | sort -rn > churn.txt

# Combine with LOC counts for a treemap visualization
cloc --by-file --csv src/ > loc.csv

Tools like CodeScene generate these automatically and provide interactive treemaps. For a free alternative, you can feed the data into D3.js or use the git-of-theseus tool.
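
Before reaching for a treemap, it can be enough to join the two files and sort. The sketch below assumes churn.txt holds "count path" lines as produced by uniq -c, and that loc.csv follows cloc's by-file CSV layout with language, filename, blank, comment and code columns. The function names and the churn times lines-of-code score are my own simplification, not CodeScene's or code-maat's actual metric.

```python
import csv


def parse_churn(text: str) -> dict[str, int]:
    """Parse `uniq -c` output, where each line is '<count> <path>'."""
    churn: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.strip().split(maxsplit=1)
        if len(parts) == 2 and parts[0].isdigit():
            churn[parts[1]] = int(parts[0])
    return churn


def parse_loc(text: str) -> dict[str, int]:
    """Parse cloc's by-file CSV into {path: lines of code}."""
    lines = text.splitlines()
    # Skip anything printed before the CSV header.
    start = next(
        (i for i, line in enumerate(lines) if line.startswith("language,filename")),
        None,
    )
    if start is None:
        return {}
    loc: dict[str, int] = {}
    for row in csv.DictReader(lines[start:]):
        path = row.get("filename") or ""
        code = row.get("code") or ""
        if path and code.isdigit():  # the SUM row has an empty filename
            loc[path] = int(code)
    return loc


def hotspots(churn: dict[str, int], loc: dict[str, int], top: int = 10):
    """Rank files present in both inputs by churn * LOC, highest first."""
    scored = [
        (path, churn[path], loc[path], churn[path] * loc[path])
        for path in churn.keys() & loc.keys()
    ]
    scored.sort(key=lambda row: (-row[3], row[0]))
    return scored[:top]


if __name__ == "__main__":
    # Made-up sample data standing in for churn.txt and loc.csv.
    churn_text = "     12 src/orders/service.ts\n      3 src/util.ts\n"
    loc_text = (
        "language,filename,blank,comment,code\n"
        "TypeScript,src/orders/service.ts,10,5,400\n"
        "TypeScript,src/util.ts,2,1,50\n"
    )
    for path, changes, size, score in hotspots(parse_churn(churn_text), parse_loc(loc_text)):
        print(f"{score:>8}  {changes:>4} changes  {size:>6} loc  {path}")
```

To run it on real data, read churn.txt and loc.csv into strings and pass them to the two parsers. Files near the top are both large and frequently changed, which is usually where a closer look pays off first.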

My Contrarian Take: Most Architecture Diagrams Should Be Ugly

I was wrong for years about diagram aesthetics. I spent hours making diagrams pretty: consistent colors, aligned boxes, curved arrows. Beautiful diagrams. Useless diagrams, because they were too expensive to update.

The diagrams that actually get maintained are the ugly ones. A Mermaid diagram in a markdown file. A text-based dependency graph. A whiteboard photo saved to the repo. These get updated because updating them takes 5 minutes, not 50.

Optimize for accuracy and maintainability, not beauty. A correct ugly diagram is worth infinitely more than a beautiful outdated one.

The Visualization Decision Tree

Use this to decide which visualization to create:

Do you need to show module dependencies? Use auto-generated dependency graphs (madge, go mod graph).
Do you need to explain the system to non-developers? Use C4 Context and Container diagrams.
Do you need to show how data flows through the system? Use Mermaid data flow diagrams.
Do you need to find refactoring targets? Use hotspot maps from git history.
Do you need to show class relationships? Don't. Use your IDE's "Go to Definition" instead.

Setting Up Automated Visualization

Here's a practical CI pipeline that generates architecture visualizations on every merge to main:

yaml

# .github/workflows/architecture-viz.yml
name: Generate Architecture Visualizations
on:
  push:
    branches: [main]
    paths: ["src/**"]

permissions:
  contents: write  # lets the final step push the generated files

jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history for git analysis

      - uses: actions/setup-node@v4
        with:
          node-version: 22

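      - name: Install madge and Graphviz (madge needs Graphviz for --image)
        run: |
          sudo apt-get update && sudo apt-get install -y graphviz
          npm install -g madge

      - name: Generate dependency graph
        run: |
          mkdir -p docs/generated
          madge --image docs/generated/dependency-graph.svg src/index.ts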

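      - name: Check for circular dependencies
        run: |
          # madge exits non-zero when it finds cycles, so branch on the exit
          # status instead of on whether it printed anything.
          if ! CIRCULAR=$(madge --circular --no-spinner src/index.ts); then
            echo "::warning::Circular dependencies found"
            echo "$CIRCULAR"
          fi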

      - name: Commit generated diagrams
        run: |
          git config user.name "github-actions"
          git config user.email "actions@github.com"
          git add docs/generated/
          git diff --staged --quiet || git commit -m "Update architecture visualizations"
          git push

This keeps the committed dependency graph in step with main on every merge that touches src/. No human maintenance required.
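
The circular-dependency step in the workflow leans on madge. For a stack without an equivalent tool, the same check is a depth-first search over the import graph. Here is a minimal sketch, assuming the graph is a dict of module name to the set of modules it imports; find_cycles is a hypothetical helper of mine, not madge's algorithm, and it reports one cycle per back edge rather than every possible cycle.

```python
def find_cycles(graph: dict[str, set[str]]) -> list[list[str]]:
    """Return one cycle for each back edge found by depth-first search."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    stack: list[str] = []
    cycles: list[list[str]] = []

    def visit(node: str) -> None:
        color[node] = GRAY
        stack.append(node)
        for dep in sorted(graph.get(node, ())):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so the path loops back.
                cycles.append(stack[stack.index(dep):] + [dep])
            elif state == WHITE:
                visit(dep)
        stack.pop()
        color[node] = BLACK

    for node in sorted(graph):
        if color[node] == WHITE:
            visit(node)
    return cycles


if __name__ == "__main__":
    # Made-up graph: a -> b -> c -> a is a cycle, d only points into it.
    demo = {"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"}}
    for cycle in find_cycles(demo):
        print(" -> ".join(cycle))
```

In CI you would exit non-zero when the returned list is non-empty. The recursion is fine for a sketch, but a very deep import chain would need an iterative version.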

Tools Worth Your Time

Tool	Type	Language	Cost
madge	Dependency graph	JS/TS	Free
Structurizr	C4 diagrams	Any	Free (Lite)
Mermaid	Flow diagrams	Any	Free
CodeScene	Hotspot maps	Any	Commercial
Dependency Cruiser	Dependency graph	JS/TS	Free
Arkit	Architecture diagram	JS/TS	Free
code-maat	Git history analysis	Any	Free
Graphviz/dot	Graph rendering	Any	Free

Start with one auto-generated dependency graph and one C4 Container diagram written as text in the repo (Structurizr DSL or Mermaid) rather than drawn by hand. That's enough for most teams. Add more visualizations only when you have a specific question that existing visualizations can't answer.

The goal isn't to create diagrams. It's to make the codebase understandable. Sometimes the best visualization is a three-line Mermaid diagram in a README. Sometimes it's a generated SVG in CI. Match the tool to the need.