Java idiomatic client for Cloud Speech.
If you are using Maven with the BOM, add this to your pom.xml file:

```xml
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>8.1.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-speech</artifactId>
  </dependency>
</dependencies>
```
If you are using Maven without the BOM, add this to your dependencies:

```xml
<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-speech</artifactId>
  <version>1.24.0</version>
</dependency>
```
If you are using Gradle, add this to your dependencies:

```groovy
compile 'com.google.cloud:google-cloud-speech:1.24.0'
```
If you are using SBT, add this to your dependencies:

```scala
libraryDependencies += "com.google.cloud" % "google-cloud-speech" % "1.24.0"
```
See the Authentication section in the base directory's README.
You will need a Google Cloud Platform Console project with the Cloud Speech API enabled.
Follow these instructions to get your project set up. You will also need to set up the local development environment by installing the Google Cloud SDK and running the following commands on the command line:

```sh
gcloud auth login
gcloud config set project [YOUR PROJECT ID]
```
You'll need to obtain the google-cloud-speech library. See the Quickstart section to add google-cloud-speech as a dependency in your code.
Cloud Speech enables easy integration of Google speech recognition technologies into developer applications. Send audio and receive a text transcription from the Speech-to-Text API service.
See the Cloud Speech client library docs to learn how to use this client library.
The following code sample shows how to recognize speech using an audio file from a Cloud Storage bucket as input. First, add the following imports at the top of your file:

```java
import com.google.cloud.speech.v1.RecognitionAudio;
import com.google.cloud.speech.v1.RecognitionConfig;
import com.google.cloud.speech.v1.RecognitionConfig.AudioEncoding;
import com.google.cloud.speech.v1.RecognizeResponse;
import com.google.cloud.speech.v1.SpeechClient;
```
Then add the following code to do the speech recognition:

```java
try (SpeechClient speechClient = SpeechClient.create()) {
  // Describe the audio to be recognized: FLAC-encoded, 44.1 kHz, US English.
  RecognitionConfig.AudioEncoding encoding = RecognitionConfig.AudioEncoding.FLAC;
  int sampleRateHertz = 44100;
  String languageCode = "en-US";
  RecognitionConfig config = RecognitionConfig.newBuilder()
      .setEncoding(encoding)
      .setSampleRateHertz(sampleRateHertz)
      .setLanguageCode(languageCode)
      .build();

  // Reference the audio file stored in a Cloud Storage bucket.
  String uri = "gs://bucket_name/file_name.flac";
  RecognitionAudio audio = RecognitionAudio.newBuilder()
      .setUri(uri)
      .build();

  // Perform synchronous speech recognition on the file.
  RecognizeResponse response = speechClient.recognize(config, audio);
}
```
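The `setUri` call above expects a Cloud Storage URI of the form `gs://bucket_name/object_name`. As a quick sanity check before calling the API, you could validate the string locally; the helper below is a hypothetical sketch using only the standard library (it is not part of the client, and the bucket-name pattern is a simplified approximation of the real naming rules):

```java
import java.util.regex.Pattern;

// Hypothetical helper: checks that a string looks like a gs:// URI
// of the form gs://bucket_name/object_name. The bucket pattern here is
// a simplified approximation, not the full Cloud Storage naming spec.
public class GcsUriCheck {
  private static final Pattern GCS_URI =
      Pattern.compile("^gs://[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]/.+$");

  public static boolean isLikelyGcsUri(String uri) {
    return GCS_URI.matcher(uri).matches();
  }

  public static void main(String[] args) {
    System.out.println(isLikelyGcsUri("gs://bucket_name/file_name.flac")); // true
    System.out.println(isLikelyGcsUri("http://example.com/a.flac"));       // false
  }
}
```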
RecognizeSpeech.java contains a quickstart example that shows how to use the Google Speech API to automatically recognize speech from a local file.
For an example audio file, you can use the audio.raw file from the samples repository.
Note: to play the file on a Unix-like system, you can use the following command:

```sh
play -t raw -r 16k -e signed -b 16 -c 1 audio.raw
```
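As the `play` flags above indicate, the sample file is headerless LINEAR16 audio: 16,000 samples per second, 16 bits (2 bytes) per sample, one channel. Its duration therefore follows directly from its byte length. A small sketch of that arithmetic (the 320,000-byte size below is an assumed example, not the size of the actual sample file):

```java
// Computes the duration of a headerless raw PCM (LINEAR16) file from its
// byte length: seconds = bytes / (sampleRate * bytesPerSample * channels).
public class RawAudioDuration {
  public static double durationSeconds(
      long bytes, int sampleRateHertz, int bytesPerSample, int channels) {
    return (double) bytes / ((long) sampleRateHertz * bytesPerSample * channels);
  }

  public static void main(String[] args) {
    // A 16 kHz, 16-bit, mono file of 320,000 bytes lasts 10 seconds.
    System.out.println(durationSeconds(320_000, 16_000, 2, 1)); // 10.0
  }
}
```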
Samples are in the samples/ directory. The samples' README.md has instructions for running the samples.
| Sample | Source Code | Try it |
| --- | --- | --- |
| Transcribe Audio File using Long Running Operation (Local File) (LRO) | source code | |
| Transcribe Audio File using Long Running Operation (Cloud Storage) (LRO) | source code | |
| Getting word timestamps (Cloud Storage) (LRO) | source code | |
| Using Enhanced Models (Local File) | source code | |
| Selecting a Transcription Model (Local File) | source code | |
| Selecting a Transcription Model (Cloud Storage) | source code | |
| Multi-Channel Audio Transcription (Local File) | source code | |
| Multi-Channel Audio Transcription (Cloud Storage) | source code | |
| Transcribe Audio File (Local File) | source code | |
| Transcribe Audio File (Cloud Storage) | source code | |
| Speech Adaptation (Cloud Storage) | source code | |
| Using Context Classes (Cloud Storage) | source code | |
| Quickstart Beta | source code | |
| Getting punctuation in results (Local File) (Beta) | source code | |
| Separating different speakers (Local File) (LRO) (Beta) | source code | |
| Detecting language spoken automatically (Local File) (Beta) | source code | |
| Adding recognition metadata (Local File) (Beta) | source code | |
| Enabling word-level confidence (Local File) (Beta) | source code | |
| Infinite Stream Recognize | source code | |
| Infinite Stream Recognize Options | source code | |
| Quickstart Sample | source code | |
| Recognize | source code | |
| Recognize Beta | source code | |
| Speech Adaptation | source code | |
| Transcribe Context Classes | source code | |
| Transcribe Diarization | source code | |
| Transcribe Diarization Gcs | source code | |
To get help, follow the instructions in the shared Troubleshooting document.
Cloud Speech uses gRPC for the transport layer.
Java 7 or above is required for using this client.
This library follows Semantic Versioning.
Contributions to this library are always welcome and highly encouraged.
See CONTRIBUTING for more information on how to get started.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms. See Code of Conduct for more information.
Apache 2.0 - See LICENSE for more information.
| Java Version | Status |
| --- | --- |
| Java 7 | |
| Java 8 | |
| Java 8 OSX | |
| Java 8 Windows | |
| Java 11 | |