Starting my January 2025 project: a Local RAG System from Scratch

This January, I’m taking on a new challenge: creating my very own local RAG system to search through my notes with natural language. I’ve been using Obsidian for everything - blog drafts, social media ideas, and work notes - but sometimes I'm too lazy to actually look for the right note. So, I’m building a system that lets me ask questions like, “What were my ideas for improving RAG workflows?” and instantly get the most relevant snippets.

The twist? In the end, it should run entirely on my computer, no cloud required. I’ll start simple, using cloud APIs for the AI components, and gradually transition to open-source tools to make it all fully local.

Why I’m Doing This

Aside from making my life easier, this project also lines up with skills I’m exploring for work. At my job, we’ve been working on RAG pipelines with enterprise tools like Azure AI Search Service, but they’re expensive and a bit of a black box. I want to understand the “how” behind it all by building something similar from scratch - starting simple and growing as I learn.

I've also been itching to build something that is purely "mine". I haven't really done that since I was a student learning web development - partly because machine learning tends to live inside very complex projects, and it's hard to build a "full" application without also doing frontend work. At least that was the excuse ... until now.

What It Will Look Like

By the end of January, I want to have a working system that:

  • Stores my notes as embeddings (fancy AI representations of text) in a local database.
  • Lets me type in questions or keywords and pulls up the most relevant snippets.
  • Generates summaries or answers using AI.

Like I said, ideally this will be fully local and open-source in the end. I’ll start with APIs like OpenAI’s for embeddings and LLMs, but I’m excited to eventually test smaller, open-source models.

The Components of the Project

Here’s roughly what parts this project will involve - with some rough code sketches after the list:

  • Get a database up and running: I’m planning to use PostgreSQL with an extension called pgvector to store my notes as embeddings. I hope there are some ready-to-use Docker images for this.
  • Parse my notes: I’ll write a script to convert my Markdown files (the format used in Obsidian) into plain text that’s ready for embedding.
  • Chunk the text: Large notes need to be split into smaller pieces so each embedding stays focused on a specific topic and the context stays a reasonable size. Think digestible paragraphs, not full books.
  • Generate embeddings: This is the first transformer-based task. Embeddings are like the AI’s way of capturing the “essence” of a text, stored as a vector of numbers. I’ll start with OpenAI’s API and explore open-source options later.
  • Search and retrieve: For any question I ask, the system will compare embeddings to find the most relevant chunks of text - using similarity search. This will probably be a very simple algorithm to start with.
  • Get answers: Once the system finds relevant chunks, it’ll feed them into a language model (again, starting with OpenAI) to generate answers or summaries.
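
To make the database part a bit more concrete, here’s the kind of setup I have in mind. This is just a sketch, not the final schema: it assumes Postgres running locally with the pgvector extension available (the pgvector/pgvector Docker image ships with it), a database called notes, and 1536-dimensional embeddings like OpenAI’s text-embedding-3-small.

```python
# Rough sketch of the database setup (psycopg 3). Connection string, table
# and column names are placeholders, not the final design.
import psycopg

DB_URL = "postgresql://postgres:postgres@localhost:5432/notes"

with psycopg.connect(DB_URL) as conn:
    # Enable the pgvector extension and create a table for note chunks.
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS note_chunks (
            id BIGSERIAL PRIMARY KEY,
            source_file TEXT NOT NULL,
            chunk_text TEXT NOT NULL,
            embedding vector(1536)  -- dimension of OpenAI's text-embedding-3-small
        );
    """)
```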
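
For the parsing and chunking step, I suspect something as naive as splitting on blank lines will do for a first version. A minimal sketch, treating the Markdown as plain text and using an arbitrary 1,000-character budget per chunk:

```python
# Naive chunker sketch: read a note as plain text, split it into paragraphs,
# and merge small paragraphs until a rough size budget is reached.
from pathlib import Path


def chunk_note(path: Path, max_chars: int = 1000) -> list[str]:
    text = path.read_text(encoding="utf-8")
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if len(current) + len(para) + 2 <= max_chars:
            current = f"{current}\n\n{para}" if current else para
        else:
            if current:
                chunks.append(current)
            current = para  # oversized paragraphs simply become their own chunk
    if current:
        chunks.append(current)
    return chunks
```

Headings, links, and Obsidian-specific syntax will probably need smarter handling later, but this should be enough to get embeddings flowing.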
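
Generating the embeddings should then be little more than a thin wrapper around the API. A sketch under the same assumptions as above (the note_chunks table, OpenAI’s text-embedding-3-small); the batching is deliberately simple:

```python
# Sketch: embed each chunk of a note with the OpenAI API and store it next to
# its text in Postgres. Assumes the note_chunks table from the earlier sketch.
import psycopg
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
DB_URL = "postgresql://postgres:postgres@localhost:5432/notes"


def embed_and_store(source_file: str, chunks: list[str]) -> None:
    # The embeddings endpoint accepts a list, so one call covers the whole note.
    response = client.embeddings.create(model="text-embedding-3-small", input=chunks)
    with psycopg.connect(DB_URL) as conn:
        for chunk, item in zip(chunks, response.data):
            # pgvector accepts vectors as '[x1,x2,...]' text literals.
            vec_literal = "[" + ",".join(map(str, item.embedding)) + "]"
            conn.execute(
                "INSERT INTO note_chunks (source_file, chunk_text, embedding) "
                "VALUES (%s, %s, %s::vector);",
                (source_file, chunk, vec_literal),
            )
```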
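
And for search and answers: pgvector’s <=> operator computes cosine distance, so ordering by it already gives a basic similarity search, and the retrieved chunks can be stuffed straight into a chat prompt. Again just a sketch - the model names, prompt, and top_k value are placeholders I made up for illustration:

```python
# Sketch of retrieval plus answering: embed the question, rank stored chunks by
# cosine distance with pgvector, then hand the top hits to a chat model.
import psycopg
from openai import OpenAI

client = OpenAI()
DB_URL = "postgresql://postgres:postgres@localhost:5432/notes"


def ask(question: str, top_k: int = 5) -> str:
    q_emb = client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding
    vec_literal = "[" + ",".join(map(str, q_emb)) + "]"
    with psycopg.connect(DB_URL) as conn:
        rows = conn.execute(
            "SELECT chunk_text FROM note_chunks "
            "ORDER BY embedding <=> %s::vector LIMIT %s;",
            (vec_literal, top_k),
        ).fetchall()
    context = "\n\n---\n\n".join(row[0] for row in rows)
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided notes."},
            {"role": "user", "content": f"Notes:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content
```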

There's not much to see there as of this initial post, but here is the repo:

https://github.com/GalaxyInfernoCodes/local-rag-system

Why Share This?

I’ve been feeling really stuck lately about what to share with you all. I’ve got this tendency to overthink, and I don't feel advanced enough to give "expert-level advice". And talking about my job projects is difficult since they're mostly confidential.

This project is meant to change that: I want to take you along on this journey - not just to share the final results, but to show the messy, exciting, and sometimes overwhelming process of learning and building something new.

If this sounds like your kind of project, let me know! I’d love to hear your thoughts or questions. And if you’re curious about the tech behind this, I’ll be breaking down each step in future posts. Let’s build this together (virtually)!