Add training workflow, datasets, and runbook

This commit is contained in:
2025-12-23 21:17:22 -08:00
commit 619e87aacc
2140 changed files with 2513895 additions and 0 deletions

30
README.md Normal file
View File

@@ -0,0 +1,30 @@
# Ollama gpt-oss-20b Options Training (RTX 5060 Ti)
This repo contains the full workflow, scripts, and runbook to fine-tune
`openai/gpt-oss-20b` on options-trading ebooks and deploy the merged model into
Ollama as `trained-options-model` on a TrueNAS SCALE box with an NVIDIA GPU.
Start here: `AGENTS.md` (detailed step-by-step, failures, fixes, commands, and
paths).
## Contents
- `tools/` scripts for extraction, relevance filtering, dataset building, and
LoRA fine-tuning.
- `training_data/` curated dataset + LoRA outputs used for training.
- `remote/ollama/` copies of the remote Modelfile and template used to enable
tool calls in Ollama.
- `Modelfile.trained-options-model` (local reference; see AGENTS for the exact
remote Modelfile used).
## What is NOT included
- Raw ebooks (`eBooks/`) are intentionally excluded.
- Base model weights are not committed; download via Ollama or Hugging Face as
described in `AGENTS.md`.
## Quick pointers
- Training runs on TrueNAS SCALE (GPU via TrueNAS Apps/middlewared).
- Ollama runs in a TrueNAS App and stores models under
`/mnt/fast.storage.rushg.me/datasets/apps/ollama.models`.
- Tool-call support requires the correct TEMPLATE block (see `remote/ollama`).
For full instructions and exact commands, see `AGENTS.md`.