# Codex TrueNAS Helper
This project is a collection of scripts, configurations, and applications for managing and enhancing a TrueNAS SCALE server, with a focus on running and interacting with large language models (LLMs) served by `llama.cpp` and Ollama.
## Features
- **`llama.cpp` Wrapper**: A wrapper for the `llama.cpp` TrueNAS application that provides:
  - An OpenAI-compatible API for chat completions and embeddings.
  - A web-based UI for managing models (listing, downloading).
  - The ability to hot-swap models, without restarting the `llama.cpp` container, by interacting with the TrueNAS API.
- **TrueNAS Inventory**: A snapshot of the TrueNAS server's configuration, including hardware, storage, networking, and running applications.
- **Automation Scripts**: PowerShell and Python scripts for tasks such as deploying the wrapper and testing remote endpoints.
- **LLM Integration**: Tools and configurations for working with various LLMs.
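Because the wrapper's API is OpenAI-compatible, any OpenAI-style client should work against it. A minimal sketch using only the standard library (the host name and model name are assumptions; 9093 is the wrapper's default API port):

```python
import json
import urllib.request

API_BASE = "http://truenas.local:9093/v1"  # assumed host; 9093 is the default API port


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(model: str, prompt: str) -> str:
    """POST the payload to the wrapper and return the assistant's reply."""
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    # Mirror the OpenAI response shape: first choice's message content.
    return body["choices"][0]["message"]["content"]
```

`chat("some-model", "Hello")` (model name hypothetical) would return the reply text, so existing OpenAI client code should need little more than a base-URL change.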
## Directory Structure
- `AGENTS.md` & `AGENTS.full.md`: Detailed information and a complete inventory of the TrueNAS server's configuration.
- `llamaCpp.Wrapper.app/`: A Python-based application that wraps the `llama.cpp` TrueNAS app with an OpenAI-compatible API and a model management UI.
- `scripts/`: Various scripts for deployment, testing, and other tasks.
- `inventory_raw/`: Raw data dumps from the TrueNAS server, used to generate the inventory in `AGENTS.full.md`.
- `reports/`: Generated reports, test results, and other artifacts.
- `llamacpp_runs_remote/` & `ollama_runs_remote/`: Logs and results from running LLMs.
- `modelfiles/`: Modelfiles for different language models.
- `tests/`: Python tests for `llamaCpp.Wrapper.app`.
## llamaCpp.Wrapper.app
This is the core component of the project: a Python application that acts as a proxy to the `llama.cpp` server running on TrueNAS while adding the features listed above.
### Running Locally
1. Install the required Python packages:

   ```bash
   pip install -r llamaCpp.Wrapper.app/requirements.txt
   ```

2. Run the application:

   ```bash
   python -m llamaCpp.Wrapper.app.run
   ```

This starts two web servers: one for the API (default port 9093) and one for the UI (default port 9094).
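Once started, a quick smoke check can confirm both servers are listening. A minimal sketch, assuming the default ports and that each server answers HTTP on its root path:

```python
import urllib.error
import urllib.request


def is_listening(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if an HTTP server answers on host:port (any status code)."""
    try:
        urllib.request.urlopen(f"http://{host}:{port}/", timeout=2)
        return True
    except urllib.error.HTTPError:
        return True   # server answered, just not with a 2xx status
    except OSError:
        return False  # connection refused, timeout, DNS failure, etc.


for name, port in (("API", 9093), ("UI", 9094)):
    print(f"{name} on :{port}: {'up' if is_listening(port) else 'down'}")
```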
### Docker (TrueNAS)
The wrapper can be run as a Docker container on TrueNAS. See `llamaCpp.Wrapper.app/README.md` for a detailed example of the `docker run` command. The wrapper must be configured with the appropriate environment variables to connect to the TrueNAS API and the `llama.cpp` container.
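The actual environment variable names are documented in `llamaCpp.Wrapper.app/README.md`; the names below are hypothetical placeholders, used only to sketch how such configuration is typically read at startup:

```python
import os


def load_config(env=os.environ) -> dict:
    """Collect wrapper settings from environment variables.

    The variable names here are hypothetical; consult the wrapper's
    README for the names it actually expects.
    """
    return {
        "truenas_url": env.get("TRUENAS_URL", "http://truenas.local"),
        "truenas_api_key": env.get("TRUENAS_API_KEY", ""),
        "llamacpp_base_url": env.get("LLAMACPP_BASE_URL", "http://llamacpp:8080"),
        "api_port": int(env.get("WRAPPER_API_PORT", "9093")),
        "ui_port": int(env.get("WRAPPER_UI_PORT", "9094")),
    }
```

Passing each of these with `-e NAME=value` in the `docker run` command would then configure the container without baking secrets into the image.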
### Model Hot-Swapping
The wrapper can switch models in the `llama.cpp` server by updating the application's command via the TrueNAS API, allowing dynamic model management without manual intervention.
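The mechanism can be sketched as: compose a new `llama.cpp` command line pointing at the desired model file, then push it to the app through the TrueNAS API. The endpoint path, payload shape, and auth header below are assumptions for illustration, not the wrapper's actual implementation:

```python
import json
import urllib.request

TRUENAS_URL = "http://truenas.local"  # assumed host
API_KEY = "<truenas-api-key>"         # placeholder credential


def build_llamacpp_command(model_path: str) -> list:
    """Compose llama.cpp server flags for the chosen model file."""
    return ["--model", model_path, "--host", "0.0.0.0", "--port", "8080"]


def build_swap_request(app_name: str, model_path: str) -> urllib.request.Request:
    """Sketch a PUT that rewrites the app's command via the TrueNAS API.

    The URL and payload shape are illustrative assumptions; the real
    endpoint depends on the TrueNAS version and the app's schema.
    """
    payload = {"values": {"command": build_llamacpp_command(model_path)}}
    return urllib.request.Request(
        f"{TRUENAS_URL}/api/v2.0/app/id/{app_name}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```

Sending such a request (`urllib.request.urlopen(build_swap_request(...))`) would cause TrueNAS to redeploy the container with the new command, i.e. the new model.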
## Scripts
- `deploy_truenas_wrapper.py`: A Python script to deploy `llamaCpp.Wrapper.app` to TrueNAS.
- `remote_wrapper_test.py`: A Python script for testing the remote wrapper.
- `update_llamacpp_flags.ps1`: A PowerShell script to update the `llama.cpp` flags.
- `llamacpp_remote_test.ps1` & `ollama_remote_test.ps1`: PowerShell scripts for testing `llama.cpp` and Ollama remote endpoints.
## Getting Started
1. **Explore the Inventory**: Start by reading `AGENTS.md` and `AGENTS.full.md` to understand the TrueNAS server's configuration.
2. **Set up the Wrapper**: To use the `llama.cpp` wrapper, follow the instructions in `llamaCpp.Wrapper.app/README.md` to run it locally or as a Docker container on TrueNAS.
3. **Use the Scripts**: The scripts in the `scripts/` directory automate various tasks.
## Development
`llamaCpp.Wrapper.app` has a suite of tests in the `tests/` directory. Run them with `pytest`:

```bash
pytest
```