Releases · LlamaEdge/whisper-api-server
LlamaEdge-Whisper 0.3.9
Major changes:
- (New) Support API key (see the `start server` section of the README for details)
- (New) Provide the `/v1/files` endpoint for uploading, removing, and listing files (see the sketch below)
- Upgrade dependencies:
  - `wavup v0.1.5`
  - `endpoints v0.23.2`
  - `llama-core v0.25.3`
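
For illustration, a minimal Python sketch of exercising the new `/v1/files` endpoint with an API key. The base URL, the Bearer auth scheme, the multipart field name, and the response shape are assumptions; the README's start-server section has the authoritative details.

```python
import requests

BASE_URL = "http://localhost:8080"                   # assumed default host/port
HEADERS = {"Authorization": "Bearer YOUR-API-KEY"}   # assumed Bearer scheme; placeholder key

# Upload an audio file to the files endpoint
with open("audio.wav", "rb") as f:
    upload = requests.post(f"{BASE_URL}/v1/files", headers=HEADERS, files={"file": f})
upload.raise_for_status()
file_id = upload.json().get("id")  # response shape assumed to mirror the OpenAI Files API

# List all uploaded files
print(requests.get(f"{BASE_URL}/v1/files", headers=HEADERS).json())

# Remove the uploaded file by id
requests.delete(f"{BASE_URL}/v1/files/{file_id}", headers=HEADERS)
```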
LlamaEdge-Whisper 0.3.8
Major changes:
- Upgrade `wavup` to `v0.1.4`
- Improve the support for multi-channel audio in the `trim_ending_silence` func
- Upgrade `llama-core` to `v0.24.1`
- Keep the metadata state of whisper-api-server the same as the whisper plugin
LlamaEdge-Whisper 0.3.7
Major changes:
- Added the `use_new_context` field to transcription and translation requests to decide whether to create a new whisper computation context for each request (sketched below)
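
A minimal sketch of passing the new field, assuming `use_new_context` is sent as a multipart form value on the `/v1/audio/transcriptions` endpoint alongside the audio file:

```python
import requests

# Transcription request that asks for a fresh whisper computation context.
with open("audio.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:8080/v1/audio/transcriptions",  # assumed default host/port
        files={"file": f},
        data={"use_new_context": "true"},  # new field from this release
    )
print(resp.json())
```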
LlamaEdge-Whisper 0.3.6
Major changes:
- Improve the `language` and `detect_language` request params
- Upgrade the `wavup` dep to `0.1.3`
- Improve the design of trimming trailing silence from audio samples
LlamaEdge-Whisper 0.3.5
Major changes:
- Upgrade the `wavup` dep to optimize the trimming process
LlamaEdge-Whisper 0.3.3
LlamaEdge-Whisper 0.3.2
Major changes:
- New endpoints (see the sketch below):
  - `GET /v1/files/{file_id}`: Retrieve information about a specific file by id
  - `DELETE /v1/files/{file_id}`: Remove a specific file by id
- Upgrade to `llama-core v0.22.0`
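
A short sketch of the two endpoints, assuming a server on the default port and a previously obtained file id (the id and the response shape are placeholders/assumptions):

```python
import requests

BASE_URL = "http://localhost:8080"  # assumed default host/port
file_id = "file-abc123"             # placeholder id from a previous upload

# Retrieve information about the file
info = requests.get(f"{BASE_URL}/v1/files/{file_id}")
print(info.json())

# Remove the file
resp = requests.delete(f"{BASE_URL}/v1/files/{file_id}")
print(resp.status_code)
```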
LlamaEdge-Whisper 0.3.1
Major changes:
- New CLI options:
  - `threads`: Number of threads to use during computation. Defaults to 4.
  - `processors`: Number of processors to use during computation. Defaults to 1.
  - `task`: Task type. Defaults to `full`. Possible values: `transcribe`, `translate`, `full`.
  - `port`: Port number. Defaults to `8080`.
- Support new fields of transcription requests (see the sketch after this list):
  - `language`: The language of the input audio. Defaults to `en`.
  - `temperature`: Sampling temperature, between 0 and 1. Defaults to 0.00.
  - `prompt`: Text to guide the model's style or continue a previous audio segment. Defaults to `none`.
  - `max_len`: Maximum number of tokens that the model can generate in a single transcription segment (or chunk). Defaults to 0.
  - `split_on_word`: Split audio chunks on word rather than on token. Defaults to false.
  - `detect_language`: Automatically detect the spoken language in the provided audio input. Defaults to false.
  - `offset_time`: Time offset in milliseconds. Defaults to 0.
  - `duration`: Length of audio (in seconds) to be processed, starting from the point defined by the `offset_time` field (or from the beginning by default). Defaults to 0.
- Support new fields for translation requests:
  - `detect_language`: Automatically detect the spoken language in the provided audio input. Defaults to false.
  - `offset_time`: Time offset in milliseconds. Defaults to 0.
  - `duration`: Length of audio (in seconds) to be processed, starting from the point defined by the `offset_time` field (or from the beginning by default). Defaults to 0.
  - `max_len`: Maximum number of tokens that the model can generate in a single transcription segment (or chunk). Defaults to 0.
  - `split_on_word`: Split audio chunks on word rather than on token. Defaults to false.
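
A sketch of a transcription request using several of the new fields. It assumes the fields are sent as multipart form values on `/v1/audio/transcriptions`; the values shown are only illustrative, and the defaults match the list above.

```python
import requests

# Transcription request exercising several of the new fields.
with open("audio.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:8080/v1/audio/transcriptions",  # assumed default host/port
        files={"file": f},
        data={
            "language": "en",         # language of the input audio
            "temperature": "0.0",     # sampling temperature in [0, 1]
            "max_len": "64",          # max tokens per transcription segment
            "split_on_word": "true",  # split chunks on words rather than tokens
            "offset_time": "5000",    # start 5000 ms into the audio
            "duration": "30",         # process 30 seconds from that offset
        },
    )
print(resp.json())
```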
LlamaEdge-Whisper 0.3.0
Major change:
- Remove support for the `/v1/audio/speech` endpoint. The endpoint will be supported in the coming `tts-api-server`.
LlamaEdge-Whisper 0.2.2
Major change:
- Migrate to WasmEdge v0.14.1