## Model Discovery
Before you can do anything useful with `speaches`, you'll want to download a machine learning model for your specific task. You can see all the supported models by querying the `/v1/registry` endpoint. This returns a list of all available models, including their IDs, types, and other metadata.
> **Note:** Make sure you have the `SPEACHES_BASE_URL` environment variable set to the URL of your `speaches` server. If you're running it locally on the default port, this will be `http://localhost:8000`.
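For example:

```bash
# Point the CLI and curl examples below at a local speaches server
export SPEACHES_BASE_URL=http://localhost:8000
```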
```bash
# CLI
uvx speaches-cli registry ls

# cURL
curl "$SPEACHES_BASE_URL/v1/registry"
```
The above command will display a list of all available models. You can filter the models by task type, such as `automatic-speech-recognition`, `text-to-speech`, etc. To do this, specify the `task` query parameter.
```bash
# CLI
uvx speaches-cli registry ls --task automatic-speech-recognition

# cURL
curl "$SPEACHES_BASE_URL/v1/registry?task=automatic-speech-recognition"
```
> **Tip:** You may further filter the response by piping the output to `jq` and using the `select` function.
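For example, assuming the registry returns an OpenAI-style list object with the models under a `data` key (check your server's actual response shape), you could keep only the `Systran` models:

```bash
# Hedged sketch: filter registry entries by owner; assumes the response
# wraps the model list in a "data" array (OpenAI-style list object)
curl -s "$SPEACHES_BASE_URL/v1/registry?task=automatic-speech-recognition" | \
  jq '[.data[] | select(.id | startswith("Systran/"))]'
```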
You'll then want to download the model you want to use. You can do this by making a `POST /v1/models` request.
```bash
# CLI
uvx speaches-cli model download Systran/faster-distil-whisper-small.en

# cURL
curl "$SPEACHES_BASE_URL/v1/models/Systran/faster-distil-whisper-small.en" -X POST
```
The downloaded model will now be included in the list of available models when you query the `/v1/models` endpoint (OpenAI-compatible).
```bash
# CLI
uvx speaches-cli model ls

# cURL
curl "$SPEACHES_BASE_URL/v1/models"
```
## Model Aliasing
`speaches` supports model aliasing, which allows you to use friendly names for models instead of their full repository paths. This is particularly useful for maintaining compatibility with OpenAI's API naming conventions.
Model aliases are defined in the `model_aliases.json` file in the root directory of your `speaches` installation. The file contains a JSON object mapping alias names to actual model IDs:
```json
{
  "tts-1": "speaches-ai/Kokoro-82M-v1.0-ONNX",
  "tts-1-hd": "speaches-ai/Kokoro-82M-v1.0-ONNX",
  "whisper-1": "Systran/faster-whisper-large-v3"
}
```
### Using Model Aliases
Once configured, you can use the alias name anywhere you would normally use a full model ID:
```bash
# Use alias instead of full model path
uvx speaches-cli model download whisper-1
# This is equivalent to:
uvx speaches-cli model download Systran/faster-whisper-large-v3

# Use alias in API requests
curl "$SPEACHES_BASE_URL/v1/models/whisper-1" -X POST
# This is equivalent to:
curl "$SPEACHES_BASE_URL/v1/models/Systran/faster-whisper-large-v3" -X POST
```
### Configuring Model Aliases
To add or modify model aliases:
- Edit the `model_aliases.json` file in your `speaches` root directory
- Add or modify entries using the format `"alias_name": "actual_model_id"`
- Restart the server for changes to take effect
Example:
```json
{
  "my-whisper": "openai/whisper-large-v3",
  "fast-tts": "speaches-ai/Kokoro-82M-v1.0-ONNX"
}
```
> **Note:** Model aliases are loaded only once when the server starts. You must restart the server after modifying the `model_aliases.json` file.
### Docker Deployment
If you're using a Docker deployment, you'll need to bind-mount your local `model_aliases.json` file into the container:
```bash
# Mount your local model_aliases.json file
docker run -v /path/to/your/model_aliases.json:/home/ubuntu/speaches/model_aliases.json speaches
```
Or in your `compose.yaml`:
```yaml
services:
  speaches:
    volumes:
      - ./model_aliases.json:/home/ubuntu/speaches/model_aliases.json
```
## FAQ
### Where do the models go?
The models are downloaded into `~/.cache/huggingface/hub` by default, which is the default cache directory for Hugging Face models. You can change this by setting the `HF_HUB_CACHE` environment variable to a different directory. If you are using Docker, you can mount a volume at this directory to persist the models across container restarts (the provided Docker Compose files do this).
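For example, with a plain `docker run` (a sketch; the in-container cache path assumes the image runs as the `ubuntu` user, matching the bind-mount paths above):

```bash
# Persist downloaded models across container restarts with a named volume
docker run -v hf-hub-cache:/home/ubuntu/.cache/huggingface/hub speaches
```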
### Can I download the models directly using the `huggingface-cli` command?
Theoretically, you could in some cases, but it's not recommended. The `speaches` server manages the models and their versions. If you download the models directly, you may run into issues with model compatibility or versioning. It's best to use the `speaches` CLI or API to download and manage the models.
### How can I add a new model to the registry?
Model discovery works by querying the Hugging Face Hub with specific tags. If you want to add a new model to the registry, you can do so by adding the appropriate tags to your model on Hugging Face and ensuring your Hugging Face repository follows the expected structure (take a look at `src/speaches/executors` to find out what the expected tags and repository structure are). The server will automatically pick up the new model when it queries the registry.
### Can I use private models from Hugging Face?
Yes. First set the `HF_TOKEN` environment variable to your Hugging Face token; you may need to restart the server after setting it. This will allow the server to access private models.
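For example (the token value is a placeholder):

```bash
# Make the token available to the server process before starting it
export HF_TOKEN=hf_xxxxxxxxxxxx

# Or pass it to a Docker container
docker run -e HF_TOKEN=hf_xxxxxxxxxxxx speaches
```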