Add training workflow, datasets, and runbook

2025-12-23 21:17:22 -08:00
commit 619e87aacc
2140 changed files with 2513895 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,30 @@
+# Ollama gpt-oss-20b Options Training (RTX 5060 Ti)
+
+This repo contains the full workflow, scripts, and runbook to fine-tune
+`openai/gpt-oss-20b` on options-trading ebooks and deploy the merged model into
+Ollama as `trained-options-model` on a TrueNAS SCALE box with an NVIDIA GPU.
+
+Start here: `AGENTS.md` (detailed step-by-step instructions, failures, fixes,
+commands, and paths).
+
+## Contents
+- `tools/` scripts for extraction, relevance filtering, dataset building, and
+  LoRA fine-tuning.
+- `training_data/` curated training dataset and the resulting LoRA outputs.
+- `remote/ollama/` copies of the remote Modelfile and template used to enable
+  tool calls in Ollama.
+- `Modelfile.trained-options-model` (local reference; see `AGENTS.md` for the
+  exact remote Modelfile used).
+
+## What is NOT included
+- Raw ebooks (`eBooks/`) are intentionally excluded.
+- Base model weights are not committed; download via Ollama or Hugging Face as
+  described in `AGENTS.md`.
+
+## Quick pointers
+- Training runs on TrueNAS SCALE, with GPU access through TrueNAS Apps/middlewared.
+- Ollama runs in a TrueNAS App and stores models under
+  `/mnt/fast.storage.rushg.me/datasets/apps/ollama.models`.
+- Tool-call support requires the correct TEMPLATE block (see `remote/ollama/`).
+
+For full instructions and exact commands, see `AGENTS.md`.