rags/docs/setup.org
2026-04-19 12:22:46 +08:00

4.9 KiB

Setup Notes

What We're Building and Why

Private learning tool. Ingest study materials → query concepts → export to Anki.

Five RAG frameworks were considered: LightRAG, Graphiti, Morphik, R2R, Agentset.

Why LightRAG

Graph-based RAG — it builds a knowledge graph from your documents, not just a flat vector index. Queries traverse relationships between concepts, which maps naturally to Anki's card/tag structure. File-based storage, minimal deps, works with Ollama.

Why Graphiti

Temporal knowledge graph designed for agent memory. Tracks when facts were learned and how they change over time. Complements LightRAG: LightRAG indexes your source material, Graphiti tracks your evolving understanding of it.

What Was Skipped and Why

| Project  | Reason skipped                                              |
|----------+-------------------------------------------------------------|
| Morphik  | Multimodal (ColPali); only useful if materials have images  |
| R2R      | 6+ services (MinIO, RabbitMQ, Hatchet, 2x Postgres)         |
| Agentset | Bun/TypeScript monorepo, needs Supabase + Trigger.dev       |

Project Structure

Git repo with two submodules:

git init
git submodule add https://github.com/hkuds/lightrag lightrag
git submodule add https://github.com/getzep/graphiti graphiti

Nix Flake Design

Goal: impure but reproducible shells

Packaging Python with Nix properly (buildPythonPackage, wheels in the nix store) is slow and often breaks on native extensions. The tradeoff chosen:

  • Nix provides the runtime: Python 3.12, uv, Neo4j, curl
  • uv sync installs PyPI deps into a .venv outside the nix store at shell entry
  • .venv dirs are gitignored, recreated on first nix develop

This is impure — the .venv contents aren't pinned by Nix — but uv.lock in each submodule pins the exact PyPI versions, so it's reproducible enough.

Two devShells

devShells.${system} = {
  lightrag = pkgs.mkShell { ... };
  graphiti = pkgs.mkShell { ... };
};

Each shell:

  1. Sets UV_PYTHON to the nix-provided Python 3.12 binary
  2. Sets UV_PROJECT_ENVIRONMENT so uv puts the venv in the project dir
  3. Sets LD_LIBRARY_PATH for native wheel compatibility (see below)
  4. Runs uv sync on first entry
  5. Sources .env.<project> for runtime config
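The five steps above can be sketched as a plain bash function. This is an illustration, not the flake's actual shellHook: NIX_PYTHON and NIX_LIB_PATH are hypothetical stand-ins for the values Nix would interpolate, "first entry" is assumed to mean "no .venv directory exists yet", and the .env.<project> file is assumed to live in the repo root.

```shell
# Sketch of the per-project shell entry steps.
# NIX_PYTHON and NIX_LIB_PATH are hypothetical names standing in for the
# Nix-interpolated Python path and makeLibraryPath result.
enter_project_shell() {
  local project="$1"
  local root="${RAGS_ROOT:-$PWD}"

  # 1. Pin the interpreter uv should use.
  export UV_PYTHON="${NIX_PYTHON:-python3.12}"
  # 2. Keep the venv inside the project directory.
  export UV_PROJECT_ENVIRONMENT="$root/$project/.venv"
  # 3. Make native libraries visible to PyPI wheels.
  export LD_LIBRARY_PATH="${NIX_LIB_PATH:+$NIX_LIB_PATH:}${LD_LIBRARY_PATH:-}"

  # 4. Run uv sync only when the venv does not exist yet (assumption).
  if [ ! -d "$UV_PROJECT_ENVIRONMENT" ]; then
    (cd "$root/$project" && uv sync) || return 1
  fi

  # 5. Load runtime config, exporting every variable the file defines.
  if [ -f "$root/.env.$project" ]; then
    set -a
    . "$root/.env.$project"
    set +a
  fi
}
```

Calling enter_project_shell lightrag from the repo root would mirror what the lightrag devShell does on entry, under the assumptions above.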

Neo4j as nix apps

apps.${system} = {
  neo4j-start = { type = "app"; program = "${startNeo4j}"; };
  neo4j-stop  = { type = "app"; program = "${stopNeo4j}"; };
};

pkgs.neo4j (version 2026.02.2) is in nixpkgs. The startup script writes a neo4j.conf to ./data/neo4j/conf/ at runtime and sets NEO4J_CONF to point there. Neo4j respects NEO4J_CONF as a directory containing neo4j.conf.

Auth is disabled (dbms.security.auth_enabled=false) for local dev.
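A minimal sketch of the config-writing part of the startup script, assuming it is a plain shell script. The function name write_neo4j_conf is hypothetical, and only the one setting mentioned in these notes is written; the real script may set more (for example data directories).

```shell
# Sketch: generate a local neo4j.conf and point NEO4J_CONF at its directory.
# write_neo4j_conf is a hypothetical helper name, not part of Neo4j.
write_neo4j_conf() {
  local data_dir="${1:-$PWD/data/neo4j}"
  mkdir -p "$data_dir/conf"
  cat > "$data_dir/conf/neo4j.conf" <<EOF
# Local development only: authentication is off.
dbms.security.auth_enabled=false
EOF
  # Neo4j reads neo4j.conf from the directory named by NEO4J_CONF.
  export NEO4J_CONF="$data_dir/conf"
}
```

After calling the function, the Neo4j launcher would be started from the same shell so that it inherits NEO4J_CONF.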

Problems Solved

Wrong Python version (3.14 instead of 3.12)

The system Python on this machine is 3.14. uv was picking it up instead of the nix-provided python312. Fix: pin UV_PYTHON explicitly in the shellHook:

export UV_PYTHON="${pkgs.python312}/bin/python3.12"
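To confirm the pin took effect, a small check like the following could be run inside the shell. It is a sketch; python_is_version is a hypothetical helper, not something uv or Nix provides.

```shell
# Returns 0 when the interpreter at $1 reports major.minor version $2.
# python_is_version is a hypothetical helper name.
python_is_version() {
  local py="$1" want="$2"
  local got
  got="$("$py" -c 'import sys; print("%d.%d" % sys.version_info[:2])')" || return 1
  [ "$got" = "$want" ]
}

# Usage: python_is_version "$UV_PYTHON" 3.12 || echo "wrong Python" >&2
```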

libstdc++.so.6 not found

PyPI wheels for numpy and other native extensions link against libstdc++.so.6. On NixOS this library isn't in standard paths. Fix: add to LD_LIBRARY_PATH in the shellHook:

export LD_LIBRARY_PATH="${pkgs.lib.makeLibraryPath [
  pkgs.stdenv.cc.cc
  pkgs.zlib
]}:$LD_LIBRARY_PATH"

LightRAG server missing fastapi

uv sync alone doesn't install the API server deps — they're behind an optional extra. Fix: use uv sync --extra api in the lightrag shellHook.

Runtime paths in shellHook

builtins.toString ./. in a Nix flake evaluates to the flake's path in the nix store, not the user's working directory. Using it for cd and venv paths would point into /nix/store/.... Fix: use $PWD (the directory where the user runs nix develop) for all runtime paths:

RAGS_ROOT="$PWD"
export VIRTUAL_ENV="$RAGS_ROOT/lightrag/.venv"
cd "$RAGS_ROOT/lightrag"

Graphiti + Ollama

Graphiti's LLM and embedder clients are OpenAI SDK wrappers. Ollama exposes an OpenAI-compatible API at http://localhost:11434/v1. So Graphiti can use Ollama by setting:

OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=ollama  # SDK requires a non-empty value; Ollama ignores it

The embedder uses nomic-embed-text (768 dimensions). EMBEDDING_DIM must be set to match or Graphiti's index creation will use the wrong size.
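One way to guard against a mismatched dimension is to validate the env file when loading it. The sketch below assumes the three variables live in a file such as .env.graphiti; load_graphiti_env is a hypothetical helper, and 768 is the nomic-embed-text dimension stated above.

```shell
# Expected file contents (sketch):
#   OPENAI_BASE_URL=http://localhost:11434/v1
#   OPENAI_API_KEY=ollama
#   EMBEDDING_DIM=768

# Source the env file, export its variables, and fail if EMBEDDING_DIM
# does not match the embedder's dimension. Hypothetical helper name.
load_graphiti_env() {
  local env_file="$1" expected_dim="${2:-768}"
  [ -f "$env_file" ] || { echo "missing $env_file" >&2; return 1; }
  set -a
  . "$env_file"
  set +a
  if [ "${EMBEDDING_DIM:-}" != "$expected_dim" ]; then
    echo "EMBEDDING_DIM=${EMBEDDING_DIM:-unset}, expected $expected_dim" >&2
    return 1
  fi
}
```

Switching to a different embedding model would mean passing that model's dimension as the second argument.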

Testing Done

| Test                                  | Result |
|---------------------------------------+--------|
| import lightrag in Python 3.12        | ok     |
| lightrag-server starts, binds port    | ok     |
| import graphiti_core in Python 3.12   | ok     |
| Neo4j starts, responds on port 7474   | ok     |
| Graphiti connects to Neo4j via bolt   | ok     |
| Neo4j stops cleanly                   | ok     |
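These checks were run by hand; if they need repeating, a small helper along these lines could print the same ok/FAIL table. It is a sketch: run_check is a hypothetical name, and the commands in the usage comments are assumptions about how each check might be expressed.

```shell
# Run a command quietly and print a one-line result for it.
# run_check is a hypothetical helper name.
run_check() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    printf '%-40s ok\n' "$name"
  else
    printf '%-40s FAIL\n' "$name"
    return 1
  fi
}

# Usage (assumed commands, run inside the matching devShell):
#   run_check "import lightrag in Python 3.12" python -c 'import lightrag'
#   run_check "Neo4j responds on port 7474" curl -sf http://localhost:7474
```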