# ucsinfer

Tools for applying UCS categories to sounds using large language models.

## Install

Since this project is still experimental and not intended for production, it is not packaged on PyPI. Clone the project to your local machine and do an [editable install](https://pip.pypa.io/en/stable/topics/local-project-installs/#editable-installs) in a [virtual environment](https://docs.python.org/3/library/venv.html).

Note: You will also need `ffmpeg` and `ffprobe` in order to interrogate audio files for their metadata.

```sh
$ brew install ffmpeg
$ git clone https://git.squad51.us/jamie/ucsinfer.git
$ cd ucsinfer
$ git submodule update --init
$ python -m venv .venv
$ source .venv/bin/activate # or whatever command is appropriate for your shell
$ pip install -e .
```

Alternatively, this module is packaged with the [poetry][py-poetry] dependency manager and can be run within a poetry virtualenv.

```sh
$ poetry run python -m ucsinfer
```

[py-poetry]: https://python-poetry.org/docs/1.8/cli/#run

## Running

```sh
python -m ucsinfer [command]
```

Pass `--help` to see a summary of subcommands and options.

## Functions

* recommend

  Infer a UCS category for a text description. Text metadata is extracted from audio files, and the language model recommends a list of appropriate categories, ranked by how closely the description aligns with each category definition.

* gather

  Scan files to capture existing text descriptions and UCS categories and save them as a dataset. This function is used to construct datasets that `evaluate` can use to test models and `finetune` can use to refine them.

* ~finetune~ (planned)

  Fine-tune an existing sentence embedding model with training data.

* evaluate

  Use datasets to evaluate the performance of a model and of fine-tuning.

# Demos and More Reading

* [Category Inference Experiments With UCSINFER](https://squad51.us/notebook/category_inference_experiments_ucsinfer/)
* [UCSINFER for Renaming Sounds](https://squad51.us/notebook/ucsinfer_to_rename_sounds/)
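
To make the ranking idea behind `recommend` more concrete, here is a minimal sketch of scoring a text description against category definitions and sorting by similarity. It is not the project's implementation: a toy bag-of-words vector stands in for a sentence embedding model so the example runs with the standard library alone, and the category names, definitions, and the `rank_categories` helper are hypothetical placeholders rather than real UCS entries or `ucsinfer` APIs.

```python
import math
import re
from collections import Counter

# Hypothetical category definitions for illustration only; these are NOT
# the real UCS category list or its wording.
CATEGORIES = {
    "DOORS-WOOD": "wooden door open close creak slam",
    "RAIN-GENERAL": "rain falling on ground roof window",
    "FOOTSTEPS-HUMAN": "person walking footsteps shoes floor",
}


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_categories(description: str, categories=CATEGORIES, top_n: int = 3):
    """Return up to top_n (category, score) pairs, best match first."""
    query = embed(description)
    scored = [
        (name, cosine(query, embed(definition)))
        for name, definition in categories.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for name, score in rank_categories("heavy wooden door slam in hallway"):
        print(f"{name}: {score:.3f}")
```

A real sentence embedding model would capture meaning beyond exact word overlap, which is why descriptions that share no vocabulary with a definition score zero in this toy version.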