# ucsinfer
Universal Category System LLM toolkit.
## Install
Since this project is still experimental and not ready for production, it is
not packaged on PyPI. Clone the project to your local machine and
do an [editable install](https://pip.pypa.io/en/stable/topics/local-project-installs/#editable-installs)
in a [virtual environment](https://docs.python.org/3/library/venv.html).
Note: You will also need `ffmpeg` and `ffprobe` to interrogate audio
files for their metadata.
```sh
$ brew install ffmpeg # macOS with Homebrew; use your platform's package manager otherwise
$ git clone https://git.squad51.us/jamie/ucsinfer.git
$ cd ucsinfer
$ git submodule update --init
$ python -m venv .venv
$ source .venv/bin/activate # or whatever command is appropriate for your shell
$ pip install -e .
```
Alternatively, this module is packaged with the [poetry][py-poetry] dependency
manager and can be run within a poetry virtualenv.
```sh
$ poetry run python -m ucsinfer
```
[py-poetry]: https://python-poetry.org/docs/1.8/cli/#run
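The install note says `ffprobe` is needed to interrogate audio files for their
metadata. As a rough illustration of what that can look like from Python, here
is a minimal sketch. It is not taken from the ucsinfer source: the function
names `build_ffprobe_command`, `parse_format_tags` and `read_tags` are
hypothetical, and it assumes the metadata of interest sits in the
container-level tags that `ffprobe -show_format` reports.

```python
import json
import subprocess


def build_ffprobe_command(path):
    """Return an ffprobe argument list that prints container metadata as JSON."""
    return [
        "ffprobe",
        "-v", "quiet",            # suppress the banner and log output
        "-print_format", "json",  # ask for machine-readable output
        "-show_format",           # include the container-level "format" section
        path,
    ]


def parse_format_tags(ffprobe_json):
    """Extract the tag dictionary from ffprobe's JSON output (empty if absent)."""
    data = json.loads(ffprobe_json)
    return data.get("format", {}).get("tags", {})


def read_tags(path):
    """Run ffprobe on one file and return its format-level tags.

    Hypothetical helper: requires ffprobe on PATH and raises
    subprocess.CalledProcessError if ffprobe exits with an error.
    """
    result = subprocess.run(
        build_ffprobe_command(path),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_format_tags(result.stdout)
```

Keeping the command construction and the JSON parsing separate from the
subprocess call means the parsing can be checked without `ffprobe` installed.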
## Running
```sh
python -m ucsinfer [command]
```
Pass `--help` to see a summary of subcommands and options.
The subcommands available at this time are `gather` and `evaluate`.
## Functions
* ~recommend~ (in-progress)
Infer a UCS category for a text description.
* gather
Scan files to capture existing text descriptions and UCS categories
and save them as a dataset. This function is used to construct datasets
that `evaluate` can use to test models and `finetune` can use to
refine them.
* ~finetune~ (planned)
Fine-tune an existing sentence embedding model with training data.
* evaluate
Use datasets to evaluate the performance of a model and the effect of fine-tuning.
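
The functions above revolve around matching a text description to a category
with a sentence embedding model. The general idea can be sketched as a
nearest-neighbour lookup by cosine similarity. This is only an illustration
under loud assumptions: `embed` is a toy bag-of-words stand-in for a real
embedding model, `nearest_category` is a hypothetical name, and the labels and
reference texts are placeholders rather than real UCS categories. None of it
reflects how ucsinfer is actually implemented.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_category(description, category_texts):
    """Return the label whose reference text is most similar to the description."""
    query = embed(description)
    return max(
        category_texts,
        key=lambda label: cosine(query, embed(category_texts[label])),
    )


# Placeholder labels and reference texts, not real UCS categories.
categories = {
    "doors": "door open close slam creak hinge",
    "rain": "rain drops water falling on roof",
}

print(nearest_category("heavy wooden door slam", categories))
```

With a real model, `embed` would return dense vectors and fine-tuning would
aim to move descriptions closer to their correct category; the lookup step
would keep the same shape.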