4495a3cc62
- Add AGENTS.md with repo guidelines
- Add lightrag-mcp: FastMCP server exposing insert_documents() + query_documents() to LLM agents via stdio transport, talks to LightRAG REST API
- Add scripts/patch-vllm-cpu.py for CPU inference patching
- Add .env.vllm for vLLM configuration
- Update flake.nix with expanded dev shell
- Update .env.lightrag
- Remove CLAUDE.md (replaced by AGENTS.md)
14 lines
395 B
Bash
# vllm server configuration
# Used by: nix run .#vllm-start-llm and nix run .#vllm-start-embed

# Force CPU backend — no CUDA/ROCm GPU on this machine
VLLM_TARGET_DEVICE=cpu

VLLM_LLM_MODEL=Qwen/Qwen3-0.6B
VLLM_LLM_PORT=8000
# VLLM_LLM_EXTRA_ARGS=--dtype bfloat16 --max-model-len 4096

VLLM_EMBED_MODEL=Qwen/Qwen3-Embedding-0.6B
VLLM_EMBED_PORT=8001
# VLLM_EMBED_EXTRA_ARGS=--dtype bfloat16
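The flake's launcher scripts are not shown here, so the following is only a sketch of how the variables above might map onto `vllm serve` invocations. The helper name `build_serve_cmd` is hypothetical, the defaults simply mirror this file, and the real `vllm-start-llm` / `vllm-start-embed` apps may pass different or additional flags (the embedding server in particular may need a task or runner flag, depending on the vLLM version). The script only prints the commands; it does not start a server.

```shell
#!/bin/sh
# Sketch only: print the `vllm serve` commands implied by .env.vllm.
# build_serve_cmd is a hypothetical helper, not part of the repository.

# Fall back to the values from .env.vllm when the variables are not exported.
VLLM_TARGET_DEVICE="${VLLM_TARGET_DEVICE:-cpu}"
VLLM_LLM_MODEL="${VLLM_LLM_MODEL:-Qwen/Qwen3-0.6B}"
VLLM_LLM_PORT="${VLLM_LLM_PORT:-8000}"
VLLM_EMBED_MODEL="${VLLM_EMBED_MODEL:-Qwen/Qwen3-Embedding-0.6B}"
VLLM_EMBED_PORT="${VLLM_EMBED_PORT:-8001}"

# $1 = model name, $2 = port, $3 = optional extra arguments as one string
build_serve_cmd() {
  printf 'vllm serve %s --port %s' "$1" "$2"
  # Append extra arguments only when some were given.
  if [ -n "${3:-}" ]; then
    printf ' %s' "$3"
  fi
  printf '\n'
}

# Dry run: show what would be launched for the LLM and the embedding server.
build_serve_cmd "$VLLM_LLM_MODEL" "$VLLM_LLM_PORT" "${VLLM_LLM_EXTRA_ARGS:-}"
build_serve_cmd "$VLLM_EMBED_MODEL" "$VLLM_EMBED_PORT" "${VLLM_EMBED_EXTRA_ARGS:-}"
```

If the file is ever loaded by sourcing it in a shell, the commented-out `*_EXTRA_ARGS` values would need quoting before being enabled (for example `VLLM_LLM_EXTRA_ARGS="--dtype bfloat16 --max-model-len 4096"`), because an unquoted value containing spaces is parsed as a variable assignment followed by a command.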