ucsinfer
Universal Category System LLM toolkit.
Install
Since this project is still experimental and not intended for production, it is not packaged on PyPI. Clone the project to your local machine and do an editable install in a virtual environment.
Note: You will also need ffmpeg and ffprobe in order to interrogate audio files for their metadata.
$ brew install ffmpeg
$ git clone https://git.squad51.us/jamie/ucsinfer.git
$ cd ucsinfer
$ git submodule update --init
$ python -m venv .venv
$ source .venv/bin/activate # or whatever command is appropriate for your shell
$ pip install -e .
Alternatively, this module is packaged with the Poetry dependency manager and can be run within a Poetry virtualenv.
$ poetry run python -m ucsinfer
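The note above says ffprobe is needed to interrogate audio files for their metadata. The following is a minimal sketch of what such a call could look like from Python, not the project's actual code: the function names `ffprobe_command`, `parse_tags`, and `read_tags` are hypothetical, and only the standard ffprobe options `-v quiet`, `-print_format json`, and `-show_format` are used.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe invocation that prints container metadata as JSON."""
    return [
        "ffprobe",
        "-v", "quiet",            # suppress the banner and log output
        "-print_format", "json",  # machine-readable output
        "-show_format",           # include the container-level "format" section
        path,
    ]


def parse_tags(ffprobe_json):
    """Return the format-level tags from ffprobe's JSON output, or {} if none."""
    data = json.loads(ffprobe_json)
    return data.get("format", {}).get("tags", {})


def read_tags(path):
    """Run ffprobe on a file and return its text metadata tags (hypothetical helper)."""
    result = subprocess.run(
        ffprobe_command(path), capture_output=True, text=True, check=True
    )
    return parse_tags(result.stdout)
```

Calling `read_tags("some_file.wav")` requires ffprobe on your PATH. Splitting the parsing into `parse_tags` keeps that step testable without any audio files present.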
Running
python -m ucsinfer [command]
Pass --help to see a summary of subcommands and options.
The subcommands available at this time are gather and evaluate.
Functions
- recommend: Infer a UCS category for a text description. Text metadata is extracted from audio files, and the language model recommends a list of appropriate categories, ranked by how well the description aligns with each category definition.
- gather: Scan files to capture existing text descriptions and UCS categories and save them as a dataset. This function is used to construct datasets that evaluate can use to test models and finetune can use to refine them.
- finetune (planned): Fine-tune an existing sentence embedding model with training data.
- evaluate: Use datasets to evaluate the performance of a model and of fine-tuning.
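
To illustrate the ranking idea behind recommend, here is a toy sketch under stated assumptions. It is not the project's implementation: a bag-of-words count vector stands in for a real sentence embedding model, and the category names and definitions are made up for the example rather than taken from the UCS. The function names `embed`, `cosine`, and `rank_categories` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_categories(description, definitions):
    """Return category names ordered from best to worst match."""
    query = embed(description)
    scored = [(cosine(query, embed(text)), name) for name, text in definitions.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical categories and definitions, for illustration only.
definitions = {
    "rain": "rain water drops falling",
    "doors": "door open close slam creak",
}
ranking = rank_categories("wooden door slam", definitions)
```

Here `ranking` lists "doors" ahead of "rain", because the description shares two terms with the first definition and none with the second. A real embedding model would capture meaning beyond exact word overlap.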