Category
Local LLMs: a getting-started guide to running AI on your own PC (Ollama / LM Studio / private RAG)
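Local LLMs let you use AI for free, privately (your data never leaves your machine), and offline, which makes them the most in-demand topic right now. This cluster covers how to get started with Ollama / LM Studio, a quick answer to the biggest question, 'which model runs on my GPU (VRAM)?', an honest cost and privacy comparison with ChatGPT, and a minimal implementation of private RAG that answers from your own documents. The articles are written from the perspective of an engineer who actually runs LLMs in production, with type-safe code, so that the path runs continuously from a first trial on an ordinary PC to production operation with your own GPU, vLLM, and in-house RAG.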
3 articles in total
Foundational guide
Foundational guide (start here)
The complete guide to getting started with local LLMs: run AI on your own PC with Ollama / LM Studio (with model selection by VRAM)
An engineer who actually runs LLMs in production explains how to get started with 'local LLMs': running AI for free, privately, and offline on your own PC. The guide covers choosing between Ollama and LM Studio, a model-selection table by VRAM that answers the biggest question, 'which model runs on my GPU (VRAM)?', quantization (Q4_K_M), the reality of speed, and code for building your own app with the Ollama API.
Related practical articles
- Generative AI, LLM, Ollama, Local LLM, Cost optimization
Local LLM vs ChatGPT: an honest comparison of cost, privacy, and quality (which is the better deal)
Which is the better deal: a local LLM you run on your own PC, or ChatGPT (cloud)? The article examines the misconception that 'local is free' with an honest cost estimate that includes hardware and electricity. It explains the differences in privacy, offline use, quality, and speed, and how to find your own break-even point, from the perspective of an engineer who actually runs LLMs in production.
9 min read
- Generative AI, RAG, Ollama, Local LLM, Self-hosted
Build an AI that answers from your own documents, locally: an intro to private RAG (your data never leaves)
An engineer who actually runs RAG in production introduces private RAG: building an AI that answers questions about your local PDFs, notes, and internal materials entirely locally, without sending any data outside. The article covers how RAG works, a minimal implementation with Ollama's embedding API and cosine similarity, tips for raising accuracy, and the path to production, with type-safe code.
8 min read