Building a Personal Knowledge System with Obsidian and LLMs

January 9, 2026

The Problem

I take a lot of notes. Articles I read, things I learn at work, random ideas, meeting notes - it all goes somewhere. For years that "somewhere" was a graveyard. Notion databases I never looked at again. Apple Notes full of half-finished thoughts. Physical notebooks in a drawer.

The issue wasn't capturing information. I was great at that. The issue was that none of it ever came back out in a useful way. I'd vaguely remember "I read something about this once" and then spend 20 minutes failing to find it.

So I tried to fix it. Here's what I ended up with.

Why Obsidian?

I landed on Obsidian after trying Notion, Roam, and a few others. The main reasons:

Your notes are just markdown files sitting on your computer. If Obsidian disappears tomorrow, you still have your notes.
Linking between notes is dead simple - you just type [[note name]] and it links.
There's a graph view that shows how your notes connect. I was skeptical of this but it's actually useful (more on that later).

There are loads of plugins but honestly I don't use many of them. The core app does most of what I need.

Taking Notes Like a Lab Notebook

Here's the thing that actually made the difference for me. I started treating my notes like a lab notebook.

In science, researchers don't just write down results. They write down their reasoning, what they tried, what failed, what confused them. Every entry is dated. You're supposed to be able to hand your notebook to someone else and they can follow your thinking.

I started doing the same thing with my notes and it changed everything.

Date your entries

When I revisit a note, I add a new dated section rather than just editing the old content. So a note might look like:

# Vector Databases

## 2025-12-15
Read the pinecone docs. Still don't understand why you'd use 
this over just postgres with pgvector. Something about scale?

## 2026-01-09
Okay I get it now. The HNSW indexing thing means you trade 
accuracy for speed. You might not get the actual nearest 
neighbor but you get something close, fast. That matters 
when you have millions of vectors.

This is way more useful than a single "Vector Databases" note that I've edited into something polished. The dated entries show me how my understanding evolved.
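If you'd rather script the "add a new dated section" step than type the heading by hand, here's a minimal sketch. The function names `add_dated_section` and `append_entry` are made up for illustration, and it assumes your notes use `## YYYY-MM-DD` headings like the example above.

```python
from datetime import date
from pathlib import Path


def add_dated_section(note_text: str, entry: str, day: date) -> str:
    """Return note_text with a new '## YYYY-MM-DD' section appended."""
    heading = f"## {day.isoformat()}"
    body = note_text.rstrip("\n")
    return f"{body}\n\n{heading}\n{entry.strip()}\n"


def append_entry(note_path: Path, entry: str) -> None:
    """Append an entry dated today to a note, creating the note if needed."""
    if note_path.exists():
        existing = note_path.read_text(encoding="utf-8")
    else:
        # Assumption: a new note starts with its file name as the title.
        existing = f"# {note_path.stem}"
    updated = add_dated_section(existing, entry, date.today())
    note_path.write_text(updated, encoding="utf-8")
```

The old sections are never touched, so the note only grows. That is the point: the history of what you understood on which day stays intact.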

Write down what confuses you

This sounds obvious but I never used to do it. If I read something and didn't understand part of it, I'd just skip over it or write down the bits I did understand.

Now I explicitly write "I don't get X" or "This seems wrong because Y". Half the time when I come back to the note, I immediately know the answer. The other half, at least I know what I need to look up.

Link stuff together

When I make a new note, I spend 30 seconds asking "what existing notes does this connect to?" and add those links. Obsidian makes this easy with the [[brackets]] syntax. It also shows you backlinks - notes that link TO the current note - which surfaces connections you forgot about.
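Backlinks are easy to reason about once you see that they're just the link graph read in reverse. Here's a rough sketch of that idea outside Obsidian. It assumes links use bare note names (no folder paths), and `extract_links`, `build_backlinks` and `load_notes` are hypothetical helper names, not anything Obsidian exposes.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches [[Note]], [[Note|alias]] and [[Note#heading]]; captures the note name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def extract_links(text: str) -> set[str]:
    """Names of all notes that this text links to."""
    return {name.strip() for name in WIKILINK.findall(text)}


def build_backlinks(notes: dict[str, str]) -> dict[str, set[str]]:
    """Map each link target to the set of notes that link TO it."""
    backlinks = defaultdict(set)
    for name, text in notes.items():
        for target in extract_links(text):
            backlinks[target].add(name)
    return dict(backlinks)


def load_notes(vault: Path) -> dict[str, str]:
    """Read every .md file under the vault, keyed by file name without extension."""
    return {p.stem: p.read_text(encoding="utf-8") for p in vault.rglob("*.md")}
```

Calling `build_backlinks(load_notes(Path("vault")))` gives you roughly what the backlinks pane shows, as a plain dictionary you can poke at.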

My Folder Setup

I kept this simple because I've seen people spend weeks building elaborate folder hierarchies and then abandon the whole system. Here's what I use:

vault/
├── inbox/        # stuff I haven't processed yet
├── daily/        # daily notes
├── projects/     # active projects
├── resources/    # reference material, evergreen notes
└── archive/      # done projects, old stuff

New stuff goes in inbox. Once a day (ish) I go through inbox and either delete things, move them to the right folder, or link them to existing notes. That's it.
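The layout above is small enough to create by hand, but if you want to script it, something like this would do. It's only a sketch: `create_vault` and `unprocessed` are names I've invented for the example, and it assumes inbox notes are markdown files sitting directly in `inbox/`.

```python
from pathlib import Path

# The five top-level folders from the layout above.
FOLDERS = ["inbox", "daily", "projects", "resources", "archive"]


def create_vault(root: Path) -> list[Path]:
    """Create the folder skeleton under root; safe to run more than once."""
    created = []
    for name in FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def unprocessed(root: Path) -> list[Path]:
    """Markdown files still sitting in inbox/, oldest first."""
    return sorted((root / "inbox").glob("*.md"), key=lambda p: p.stat().st_mtime)
```

Listing the inbox oldest-first is a small nudge to deal with the stuff that has been sitting there longest.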

Actually Using the Graph View

The graph view shows all your notes as dots with lines connecting linked notes. I thought this was just a gimmick but it's genuinely useful for a few things.

When you're working on a note, you can open a "local graph" that shows just the notes connected to your current note. I use this all the time when writing - it reminds me of related stuff I'd forgotten about.

You can also filter the graph. I tag notes with types like #book or #concept or #project and then filter to show only certain types. Want to see how all the books you've read connect? Filter to #book.

The other useful thing is finding orphan notes - notes with no links. These are either new and need to be integrated, useless and should be deleted, or things you forgot about. I go through orphans every few weeks.
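If you want an orphan list without opening the graph, the check is simple: a note is an orphan when it links to nothing and nothing links to it. A minimal sketch, assuming `notes` is a dictionary of note name to note text (built however you like) and that links use bare note names; `find_orphans` is a hypothetical helper.

```python
import re

# Matches [[Note]], [[Note|alias]] and [[Note#heading]]; captures the note name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def find_orphans(notes: dict[str, str]) -> list[str]:
    """Names of notes with no outgoing links and no incoming links."""
    linked = set()
    for name, text in notes.items():
        targets = {t.strip() for t in WIKILINK.findall(text)}
        targets.discard(name)  # a link to itself doesn't connect a note to anything
        if targets:
            linked.add(name)        # this note links out
            linked.update(targets)  # and these notes are linked to
    return sorted(name for name in notes if name not in linked)
```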

Adding LLMs to the Mix

So you've got all these notes. Now what? This is where I started experimenting with LLMs.

The simple version

The simplest thing that works: copy a few related notes into Claude or ChatGPT and ask it to summarise them or answer a question. I do this a lot when returning to a topic I haven't looked at in months. Something like:

Here are my notes on distributed systems:

[paste notes]

What's my current understanding of consensus algorithms 
based on these notes? What gaps do I have?

It's not fancy but it works.
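If the copy-pasting gets tedious, the prompt is easy to assemble with a few lines of Python. This is just a sketch of the same template; `build_prompt` and `read_notes` are made-up names, and the `--- name ---` separators are my own convention so the model can tell the notes apart.

```python
from pathlib import Path


def build_prompt(topic: str, notes: dict[str, str], question: str) -> str:
    """Assemble the 'here are my notes, here is my question' prompt."""
    sections = [f"--- {name} ---\n{text.strip()}" for name, text in notes.items()]
    return (
        f"Here are my notes on {topic}:\n\n"
        + "\n\n".join(sections)
        + f"\n\n{question}"
    )


def read_notes(paths: list[Path]) -> dict[str, str]:
    """Read the chosen note files, keyed by file name without extension."""
    return {p.stem: p.read_text(encoding="utf-8") for p in paths}
```

You'd then paste the result of `build_prompt(...)` into whichever chat window you use, exactly as before.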

Obsidian plugins

There are plugins that integrate LLMs directly into Obsidian. I've tried a few:

Smart Connections - finds semantically related notes using embeddings. Useful for surfacing notes you forgot about.
Obsidian Copilot - lets you query your notes with an LLM without leaving Obsidian.

I use Smart Connections occasionally. Copilot and the other plugins I tried felt like overkill for my needs, but your mileage may vary.
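"Semantically related" sounds fancier than it is. The general idea behind embedding-based tools is to turn each note into a vector and rank other notes by how closely their vectors point in the same direction. Here's a toy sketch of that ranking step. It is not how any particular plugin is implemented, and `most_related` is a name I've made up; the vectors would come from a real embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def most_related(current: str, embeddings: dict[str, list[float]], top_k: int = 3):
    """Rank every other note by similarity to the current one, best first."""
    scores = [
        (name, cosine_similarity(embeddings[current], vec))
        for name, vec in embeddings.items()
        if name != current
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]
```

Because the comparison is on meaning-ish vectors rather than shared words, two notes can come out as related even if they never link to each other.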

Building a proper RAG setup

If you want to go further, you can build a retrieval system over your notes. The idea: embed all your notes into vectors, store them in a vector database, then when you ask a question, find the relevant notes automatically and feed them to an LLM.

I hacked together a simple version of this with Python:

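from chromadb import Client
from sentence_transformers import SentenceTransformer
import os

# Small embedding model that runs locally after a one-time download.
model = SentenceTransformer('all-MiniLM-L6-v2')
# Client() keeps the index in memory, so it is rebuilt on every run.
client = Client()
# get_or_create_collection avoids an error if the collection already exists.
collection = client.get_or_create_collection("obsidian_notes")

def index_vault(vault_path):
    """Embed every markdown note under vault_path, one vector per note."""
    for root, dirs, files in os.walk(vault_path):
        # Skip hidden folders such as .obsidian and .trash.
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for file in files:
            if file.endswith('.md'):
                filepath = os.path.join(root, file)
                with open(filepath, 'r', encoding='utf-8') as f:
                    content = f.read()
                if not content.strip():
                    continue  # nothing to embed in an empty note
                # Long notes are truncated to the model's input limit.
                embedding = model.encode(content)
                # upsert replaces an existing entry with the same id on re-index.
                collection.upsert(
                    embeddings=[embedding.tolist()],
                    documents=[content],
                    ids=[filepath]
                )

def query_notes(question, n_results=5):
    """Return the text of the notes most similar to the question."""
    query_embedding = model.encode(question)
    results = collection.query(
        query_embeddings=[query_embedding.tolist()],
        n_results=n_results
    )
    # One query was sent, so take the first (and only) result list.
    return results['documents'][0]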

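The indexing and retrieval run locally, so your notes never leave your machine at this stage. Is it the most elegant code? No. Does it work? Yes. You can then pipe the results into whatever LLM you want; if that's a hosted model, it will see whichever notes you send it.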
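The "pipe the results into an LLM" step is mostly string assembly. Here's one way it could look, as a sketch: `build_rag_prompt` is a hypothetical helper, the instruction wording is my own, and the `max_chars` budget is an arbitrary stand-in for whatever context limit your model has. It takes the list that `query_notes` returns.

```python
def build_rag_prompt(question: str, documents: list[str], max_chars: int = 8000) -> str:
    """Pack retrieved notes into a prompt, stopping once the budget is used up."""
    context_parts = []
    used = 0
    for i, doc in enumerate(documents, start=1):
        block = f"[Note {i}]\n{doc.strip()}"
        if used + len(block) > max_chars:
            break  # documents are ranked, so drop the least relevant ones first
        context_parts.append(block)
        used += len(block)
    context = "\n\n".join(context_parts)
    return (
        "Answer the question using only the notes below. "
        "If the notes do not cover it, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Usage would be something like `build_rag_prompt(q, query_notes(q))`, with the resulting string sent to the model of your choice.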

Mistakes I Made

A few things I got wrong that might save you some time:

I spent way too long organising at first. I had nested folders, tagging taxonomies, the works. Most of that was wasted effort. Links between notes matter more than which folder something is in.

I also used to just save articles and highlights without processing them. This is useless. If you're not writing something in your own words and connecting it to existing notes, you're just hoarding. Better to have 50 notes you've actually thought about than 500 highlights you'll never look at again.

Conclusion

That's basically it. Obsidian for the notes, lab notebook style for making them useful, LLMs for getting stuff back out.

The main thing is just doing it consistently. Even 10 minutes a day of writing notes and linking them together adds up. After a year you've got something actually useful - a record of your thinking that you can search, query, and build on.

Is this the perfect system? Probably not. Are there better ways to do some of this? Almost certainly. But it works for me, and that's really all that matters.