Skip to content

Commit

Permalink
Merge pull request #369 from MicrosoftDocs/main
Browse files Browse the repository at this point in the history
9/19 11:00 AM IST Publish
  • Loading branch information
PhilKang0704 authored Sep 19, 2024
2 parents e496207 + 5650f00 commit b02418c
Show file tree
Hide file tree
Showing 14 changed files with 582 additions and 66 deletions.
2 changes: 1 addition & 1 deletion articles/ai-services/openai/includes/api-surface.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,5 +38,5 @@ Azure OpenAI provides two methods for authentication. You can use either API Ke
The service APIs are versioned using the ```api-version``` query parameter. All versions follow the YYYY-MM-DD date structure. For example:

```http
POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/completions?api-version=2024-06-01
POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-06-01
```
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,11 @@ manager: nitinme
ms.service: azure-ai-speech
ms.custom: devx-track-extended-java, devx-track-go, devx-track-js, devx-track-python
ms.topic: quickstart
ms.date: 01/30/2024
ms.date: 9/18/2024
ms.author: eur
zone_pivot_groups: programming-languages-speech-services
keywords: speech to text, speech to text software
#customer intent: As a developer, I want to create speech to text applications that use diarization to improve readability of multiple person conversations.
#customer intent: As a developer, I want to create speech to text applications that use diarization to identify speakers in multiple person conversations.
---

# Quickstart: Create real-time diarization
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
author: eric-urban
ms.service: azure-ai-speech
ms.topic: include
ms.date: 08/16/2023
ms.date: 9/18/2024
ms.author: eur
---

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
author: eric-urban
ms.service: azure-ai-speech
ms.topic: include
ms.date: 01/30/2024
ms.date: 9/18/2024
ms.author: eur
---

Expand Down Expand Up @@ -60,6 +60,7 @@ Follow these steps to create a console application and install the Speech SDK.
}

auto speechConfig = SpeechConfig::FromSubscription(speechKey, speechRegion);
speechConfig->SetProperty(PropertyId::SpeechServiceResponse_DiarizeIntermediateResults, "true");

speechConfig->SetSpeechRecognitionLanguage("en-US");

Expand All @@ -73,14 +74,15 @@ Follow these steps to create a console application and install the Speech SDK.
conversationTranscriber->Transcribing.Connect([](const ConversationTranscriptionEventArgs& e)
{
std::cout << "TRANSCRIBING:" << e.Result->Text << std::endl;
std::cout << "Speaker ID=" << e.Result->SpeakerId << std::endl;
});

conversationTranscriber->Transcribed.Connect([](const ConversationTranscriptionEventArgs& e)
{
if (e.Result->Reason == ResultReason::RecognizedSpeech)
{
std::cout << "TRANSCRIBED: Text=" << e.Result->Text << std::endl;
std::cout << "Speaker ID=" << e.Result->SpeakerId << std::endl;
std::cout << "\n" << "TRANSCRIBED: Text=" << e.Result->Text << std::endl;
std::cout << "Speaker ID=" << e.Result->SpeakerId << "\n" << std::endl;
}
else if (e.Result->Reason == ResultReason::NoMatch)
{
Expand Down Expand Up @@ -152,18 +154,170 @@ Follow these steps to create a console application and install the Speech SDK.
The transcribed conversation should be output as text:

```output
TRANSCRIBED: Text=Good morning, Steve. Speaker ID=Unknown
TRANSCRIBED: Text=Good morning. Katie. Speaker ID=Unknown
TRANSCRIBED: Text=Have you tried the latest real time diarization in Microsoft Speech Service which can tell you who said what in real time? Speaker ID=Guest-1
TRANSCRIBED: Text=Not yet. I've been using the batch transcription with diarization functionality, but it produces diarization result until whole audio get processed. Speaker ID=Guest-2
TRANSRIBED: Text=Is the new feature can diarize in real time? Speaker ID=Guest-2
TRANSCRIBED: Text=Absolutely. Speaker ID=GUEST-1
TRANSCRIBED: Text=That's exciting. Let me try it right now. Speaker ID=GUEST-2
CANCELED: Reason=EndOfStream
TRANSCRIBING:good morning
Speaker ID=Unknown
TRANSCRIBING:good morning steve
Speaker ID=Unknown
TRANSCRIBING:good morning steve how are you doing
Speaker ID=Guest-1
TRANSCRIBING:good morning steve how are you doing today
Speaker ID=Guest-1

TRANSCRIBED: Text=Good morning, Steve. How are you doing today?
Speaker ID=Guest-1

TRANSCRIBING:good
Speaker ID=Unknown
TRANSCRIBING:good morning
Speaker ID=Unknown
TRANSCRIBING:good morning kat
Speaker ID=Unknown
TRANSCRIBING:good morning katie i hope you're having a
Speaker ID=Guest-2
TRANSCRIBING:good morning katie i hope you're having a great start to your day
Speaker ID=Guest-2

TRANSCRIBED: Text=Good morning, Katie. I hope you're having a great start to your day.
Speaker ID=Guest-2

TRANSCRIBING:have you
Speaker ID=Unknown
TRANSCRIBING:have you tried
Speaker ID=Unknown
TRANSCRIBING:have you tried the latest
Speaker ID=Unknown
TRANSCRIBING:have you tried the latest real
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft speech
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft speech service
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft speech service which
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft speech service which can
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft speech service which can tell you
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft speech service which can tell you who said
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft speech service which can tell you who said what
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft speech service which can tell you who said what in
Speaker ID=Guest-1
TRANSCRIBING:have you tried the latest real time diarization in microsoft speech service which can tell you who said what in real time
Speaker ID=Guest-1

TRANSCRIBED: Text=Have you tried the latest real time diarization in Microsoft Speech Service which can tell you who said what in real time?
Speaker ID=Guest-1

TRANSCRIBING:not yet
Speaker ID=Unknown
TRANSCRIBING:not yet i
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch trans
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization function
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces di
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is processed
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is processed is the
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is processed is the new
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is processed is the new feature
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is processed is the new feature able to
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is processed is the new feature able to di
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is processed is the new feature able to diarize
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is processed is the new feature able to diarize in real
Speaker ID=Guest-2
TRANSCRIBING:not yet i've been using the batch transcription with diarization functionality but it produces diarization results after the whole audio is processed is the new feature able to diarize in real time
Speaker ID=Guest-2

TRANSCRIBED: Text=Not yet. I've been using the batch transcription with diarization functionality, but it produces diarization results after the whole audio is processed. Is the new feature able to diarize in real time?
Speaker ID=Guest-2

TRANSCRIBING:absolutely
Speaker ID=Unknown
TRANSCRIBING:absolutely i
Speaker ID=Unknown
TRANSCRIBING:absolutely i recom
Speaker ID=Guest-1
TRANSCRIBING:absolutely i recommend
Speaker ID=Guest-1
TRANSCRIBING:absolutely i recommend you
Speaker ID=Guest-1
TRANSCRIBING:absolutely i recommend you give it a try
Speaker ID=Guest-1

TRANSCRIBED: Text=Absolutely, I recommend you give it a try.
Speaker ID=Guest-1

TRANSCRIBING:that's exc
Speaker ID=Unknown
TRANSCRIBING:that's exciting
Speaker ID=Unknown
TRANSCRIBING:that's exciting let me
Speaker ID=Guest-2
TRANSCRIBING:that's exciting let me try
Speaker ID=Guest-2
TRANSCRIBING:that's exciting let me try it right now
Speaker ID=Guest-2

TRANSCRIBED: Text=That's exciting. Let me try it right now.
Speaker ID=Guest-2
```

Speakers are identified as Guest-1, Guest-2, and so on, depending on the number of speakers in the conversation.

> [!NOTE]
> You might see `Speaker ID=Unknown` in some of the early intermediate results when the speaker is not yet identified. Without intermediate diarization results (if you don't set the `PropertyId::SpeechServiceResponse_DiarizeIntermediateResults` property to "true"), the speaker ID is always "Unknown".
## Clean up resources

[!INCLUDE [Delete resource](../../common/delete-resource.md)]
Expand Down
Loading

0 comments on commit b02418c

Please sign in to comment.