diff --git a/articles/ai-services/speech-service/how-to-audio-content-creation.md b/articles/ai-services/speech-service/how-to-audio-content-creation.md index ddc8bf9af8..34d1ae3fca 100644 --- a/articles/ai-services/speech-service/how-to-audio-content-creation.md +++ b/articles/ai-services/speech-service/how-to-audio-content-creation.md @@ -1,24 +1,24 @@ --- -title: Audio Content Creation - Speech service +title: Audio Content Creation tool titleSuffix: Azure AI services -description: Audio Content Creation is an online tool that allows you to run Text to speech synthesis without writing any code. +description: Audio Content Creation is an online tool that allows you to run text to speech synthesis without writing any code. author: eric-urban manager: nitinme ms.service: azure-ai-speech ms.topic: how-to -ms.date: 1/18/2024 +ms.date: 9/9/2024 ms.author: eur --- -# Speech synthesis with the Audio Content Creation tool +# Text to speech with the Audio Content Creation tool -You can use the [Audio Content Creation](https://speech.microsoft.com/portal/audiocontentcreation) tool in Speech Studio for Text to speech synthesis without writing any code. You can use the output audio as-is, or as a starting point for further customization. +You can use the [Audio Content Creation](https://speech.microsoft.com/portal/audiocontentcreation) tool in Speech Studio for text to speech without writing any code. The Audio Content Creation tool might provide the final speech audio that you want. You can use the output audio as-is, or as a starting point for further customization. -Build highly natural audio content for various scenarios, such as audiobooks, news broadcasts, video narrations, and chat bots. With Audio Content Creation, you can efficiently fine-tune Text to speech voices and design customized audio experiences. +Build highly natural audio content for various scenarios, such as audiobooks, news broadcasts, video narrations, and chat bots. With Audio Content Creation, you can efficiently fine-tune text to speech voices and design customized audio experiences. -The tool is based on [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md). It allows you to adjust Text to speech output attributes in real-time or batch synthesis, such as voice characters, voice styles, speaking speed, pronunciation, and prosody. +The tool is based on [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md). It allows you to adjust text to speech output attributes in real-time or batch synthesis, such as voice characters, voice styles, speaking speed, pronunciation, and prosody. -- No-code approach: You can use the Audio Content Creation tool for Text to speech synthesis without writing any code. The output audio might be the final deliverable that you want. For example, you can use the output audio for a podcast or a video narration. +- No-code approach: You can use the Audio Content Creation tool for text to speech synthesis without writing any code. The output audio might be the final deliverable that you want. For example, you can use the output audio for a podcast or a video narration. - Developer-friendly: You can listen to the output audio and adjust the SSML to improve speech synthesis. Then you can use the [Speech SDK](speech-sdk.md) or [Speech CLI](spx-basics.md) to integrate the SSML into your applications. For example, you can use the SSML for building a chat bot. You have easy access to a broad portfolio of [languages and voices](language-support.md?tabs=tts). These voices include state-of-the-art prebuilt neural voices and your custom neural voice, if you built one. @@ -64,7 +64,7 @@ It takes a few moments to deploy your new Speech resource. After the deployment ## Use the tool -The following diagram displays the process for fine-tuning the Text to speech outputs. +The following diagram displays the process for fine-tuning the text to speech outputs. :::image type="content" source="media/audio-content-creation/audio-content-creation-diagram.jpg" alt-text="Diagram of the sequence of steps for fine-tuning text to speech outputs."::: @@ -78,13 +78,13 @@ Each step in the preceding diagram is described here: > [!NOTE] > Gated access is available for custom neural voice, which allows you to create high-definition voices that are similar to natural-sounding speech. For more information, see [Gating process](./text-to-speech.md). -1. Select the content you want to preview, and then select **Play** (triangle icon) to preview the default synthesis output. +1. Select the content you want to preview, and then select **Play** (via the triangle icon) to preview the default synthesis output. If you make any changes to the text, select the **Stop** icon, and then select **Play** again to regenerate the audio with changed scripts. Improve the output by adjusting pronunciation, break, pitch, rate, intonation, voice style, and more. For a complete list of options, see [Speech Synthesis Markup Language](speech-synthesis-markup.md). - For more information about fine-tuning speech output, view the [How to convert Text to speech using Microsoft Azure AI voices](https://youtu.be/ygApYuOOG6w) video. + For more information about adjusting the speech output, see the [how to convert text to speech video on YouTube](https://youtu.be/ygApYuOOG6w). However, the video might not be available in all regions and might not be up to date by the time you watch it. 1. Save and [export your tuned audio](#export-tuned-audio). @@ -94,46 +94,47 @@ Each step in the preceding diagram is described here: You can get your content into the Audio Content Creation tool in either of two ways: -* **Option 1** - 1. Select **New** > **Text file** to create a new audio tuning file. +### Option 1: Create a new audio tuning file - 1. Enter or paste your content into the editing window. The allowable number of characters for each file is 20,000 or fewer. If your script contains more than 20,000 characters, you can use Option 2 to automatically split your content into multiple files. - - 1. Select **Save**. +1. Select **New** > **Text file** to create a new audio tuning file. -* **Option 2** +1. Enter or paste your content into the editing window. The allowable number of characters for each file is 20,000 or fewer. If your script contains more than 20,000 characters, you can use Option 2 to automatically split your content into multiple files. - 1. Select **Upload** > **Text file** to import one or more text files. Both plain text and SSML are supported. +1. Select **Save**. - If your script file is more than 20,000 characters, split the content by paragraphs, by characters, or by regular expressions. +### Option 2: Upload an audio tuning file - 1. When you upload your text files, make sure that they meet these requirements: +1. Select **Upload** > **Text file** to import one or more text files. Both plain text and SSML are supported. - | Property | Description | - |----------|---------------| - | File format | Plain text (.txt)\*
SSML text (.txt)\**
Zip files aren't supported. | - | Encoding format | UTF-8 | - | File name | Each file must have a unique name. Duplicate files aren't supported. | - | Text length | Character limit is 20,000. If your files exceed the limit, split them according to the instructions in the tool. | - | SSML restrictions | Each SSML file can contain only a single piece of SSML. | - + If your script file is more than 20,000 characters, split the content by paragraphs, by characters, or by regular expressions. - \* **Plain text example**: +1. When you upload your text files, make sure that they meet these requirements: - ```txt - Welcome to use Audio Content Creation to customize audio output for your products. - ``` + | Property | Description | + |----------|---------------| + | File format | Plain text (.txt) or SSML text (.txt)

Zip files aren't supported. | + | Encoding format | UTF-8 | + | File name | Each file must have a unique name. Duplicate files aren't supported. | + | Text length | Character limit is 20,000. If your files exceed the limit, split them according to the instructions in the tool. | + | SSML restrictions | Each SSML file can contain only a single piece of SSML. | + - \** **SSML text example**: + Here's a plain text example: - ```xml - - - Welcome to use Audio Content Creation to customize audio output for your products. - - - ``` + ```txt + Welcome to use Audio Content Creation to customize audio output for your products. + ``` + + Here's an SSML example: + + ```xml + + + Welcome to use Audio Content Creation to customize audio output for your products. + + + ``` ## Export tuned audio @@ -150,7 +151,6 @@ After you review your audio output and are satisfied with your tuning and adjust | wav | riff-8khz-16bit-mono-pcm | riff-16khz-16bit-mono-pcm | riff-24khz-16bit-mono-pcm |riff-48khz-16bit-mono-pcm | | mp3 | N/A | audio-16khz-128kbitrate-mono-mp3 | audio-24khz-160kbitrate-mono-mp3 |audio-48khz-192kbitrate-mono-mp3 | - 1. To view the status of the task, select the **Task list** tab. If the task fails, see the detailed information page for a full report. @@ -177,20 +177,26 @@ The users you grant access to need to set up a [Microsoft account](https://accou To add users to a Speech resource so that they can use Audio Content Creation, do the following: - -1. In the [Azure portal](https://portal.azure.com/), select **All services**. -1. Then select the **Azure AI services**, and navigate to your specific Speech resource. +1. In the [Azure portal](https://portal.azure.com/), select **All services** from the left navigation pane, and then search for **Azure AI services** or **Speech**. +1. Select your Speech resource. > [!NOTE] - > You can also set up Azure RBAC for whole resource groups, subscriptions, or management groups. Do this by selecting the desired scope level and then navigating to the desired item (for example, selecting **Resource groups** and then clicking through to your wanted resource group). + > You can also set up Azure RBAC for whole resource groups, subscriptions, or management groups. Do this by selecting the desired scope level and then navigating to the desired item (for example, selecting **Resource groups** and then selecting your resource group). 1. Select **Access control (IAM)** on the left navigation pane. -1. Select **Add** -> **Add role assignment**. -1. On the **Role** tab on the next screen, select a role you want to add (in this case, **Owner**). +1. Select **Add** > **Add role assignment**. +1. On the **Role** tab on the next screen, select a role (such as **Owner**) that you want to add. 1. On the **Members** tab, enter a user's email address and select the user's name in the directory. The email address must be linked to a Microsoft account that's trusted by Microsoft Entra ID. Users can easily sign up for a [Microsoft account](https://account.microsoft.com/account) by using their personal email address. 1. On the **Review + assign** tab, select **Review + assign** to assign the role. Here's what happens next: -An email invitation is automatically sent to users. They can accept it by selecting **Accept invitation** > **Accept to join Azure** in their email. They're then redirected to the Azure portal. They don't need to take further action in the Azure portal. After a few moments, users are assigned the role at the Speech resource scope, which gives them access to this Speech resource. If users don't receive the invitation email, you can search for their account under **Role assignments** and go into their profile. Look for **Identity** > **Invitation accepted**, and select **(manage)** to resend the email invitation. You can also copy and send the invitation link to them. +1. An email invitation is automatically sent to users. + + > [!NOTE] + > If users don't receive the invitation email, you can search for their account under **Role assignments** and go into their profile. Look for **Identity** > **Invitation accepted**, and select **(manage)** to resend the email invitation. You can also copy and send the invitation link to them. + +1. They can accept it by selecting **Accept invitation** > **Accept to join Azure** in their email. +1. They're then redirected to the Azure portal. They don't need to take further action in the Azure portal. +1. After a few moments, users are assigned the role at the Speech resource scope, which gives them access to this Speech resource. Users now visit or refresh the [Audio Content Creation](https://aka.ms/audiocontentcreation) product page, and sign in with their Microsoft account. They select **Audio Content Creation** block among all speech products. They choose the Speech resource in the pop-up window or in the settings at the upper right. @@ -200,22 +206,23 @@ Users who are in the same Speech resource see each other's work in the Audio Con ### Remove users from a Speech resource +To remove a user's permission from a Speech resource, do the following: 1. Search for **Azure AI services** in the Azure portal, select the Speech resource that you want to remove users from. 1. Select **Access control (IAM)**, and then select the **Role assignments** tab to view all the role assignments for this Speech resource. 1. Select the users you want to remove, select **Remove**, and then select **OK**. - :::image type="content" source="media/audio-content-creation/remove-user.png" alt-text="Screenshot of the 'Remove' button on the 'Remove role assignments' pane."::: + :::image type="content" source="media/audio-content-creation/remove-user.png" alt-text="Screenshot of the 'Remove' button on the 'Remove role assignments' pane."::: ### Enable users to grant access to others If you want to allow a user to grant access to other users, you need to assign them the owner role for the Speech resource and set the user as the Azure directory reader. 1. Add the user as the owner of the Speech resource. For more information, see [Add users to a Speech resource](#add-users-to-a-speech-resource). - :::image type="content" source="media/audio-content-creation/add-role.png" alt-text="Screenshot showing the 'Owner' role on the 'Add role assignment' pane. "::: + :::image type="content" source="media/audio-content-creation/add-role.png" alt-text="Screenshot showing the 'Owner' role on the 'Add role assignment' pane. "::: 1. In the [Azure portal](https://portal.azure.com/), select the collapsed menu at the upper left, select **Microsoft Entra ID**, and then select **Users**. 1. Search for the user's Microsoft account, go to their detail page, and then select **Assigned roles**. -1. Select **Add assignments** > **Directory Readers**. If the **Add assignments** button is unavailable, it means that you don't have access. Only the global administrator of this directory can add assignments to users. +1. Select **Add assignments** > **Directory Readers**. If the **Add assignments** button is unavailable, it means that you don't have access. You must have the role of **Owner** or **User Access Administrator** to assign roles to users. ## Next steps diff --git a/articles/ai-services/speech-service/how-to-configure-azure-ad-auth.md b/articles/ai-services/speech-service/how-to-configure-azure-ad-auth.md index ef5975c235..79c1051197 100644 --- a/articles/ai-services/speech-service/how-to-configure-azure-ad-auth.md +++ b/articles/ai-services/speech-service/how-to-configure-azure-ad-auth.md @@ -2,14 +2,13 @@ title: How to configure Microsoft Entra authentication titleSuffix: Azure AI services description: Learn how to authenticate using Microsoft Entra authentication -author: rhurey +author: eric-urban manager: nitinme ms.service: azure-ai-speech ms.topic: how-to -ms.date: 1/18/2024 -ms.author: rhurey +ms.date: 9/9/2024 +ms.author: eur zone_pivot_groups: programming-languages-set-two -ms.devlang: cpp ms.custom: devx-track-azurepowershell, devx-track-extended-java, devx-track-python, devx-track-azurecli --- @@ -186,10 +185,10 @@ For ```SpeechRecognizer```, ```SpeechSynthesizer```, ```IntentRecognizer```, ``` ::: zone pivot="programming-language-csharp" ```C# string resourceId = "Your Resource ID"; -string aadToken = "Your Azure AD access token"; +string aadToken = "Your Microsoft Entra access token"; string region = "Your Speech Region"; -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. var authorizationToken = $"aad#{resourceId}#{aadToken}"; var speechConfig = SpeechConfig.FromAuthorizationToken(authorizationToken, region); ``` @@ -198,10 +197,10 @@ var speechConfig = SpeechConfig.FromAuthorizationToken(authorizationToken, regio ::: zone pivot="programming-language-cpp" ```C++ std::string resourceId = "Your Resource ID"; -std::string aadToken = "Your Azure AD access token"; +std::string aadToken = "Your Microsoft Entra access token"; std::string region = "Your Speech Region"; -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. auto authorizationToken = "aad#" + resourceId + "#" + aadToken; auto speechConfig = SpeechConfig::FromAuthorizationToken(authorizationToken, region); ``` @@ -212,7 +211,7 @@ auto speechConfig = SpeechConfig::FromAuthorizationToken(authorizationToken, reg String resourceId = "Your Resource ID"; String region = "Your Region"; -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. String authorizationToken = "aad#" + resourceId + "#" + token; SpeechConfig speechConfig = SpeechConfig.fromAuthorizationToken(authorizationToken, region); ``` @@ -222,7 +221,7 @@ SpeechConfig speechConfig = SpeechConfig.fromAuthorizationToken(authorizationTok ```Python resourceId = "Your Resource ID" region = "Your Region" -# You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +# You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. authorizationToken = "aad#" + resourceId + "#" + aadToken.token speechConfig = SpeechConfig(auth_token=authorizationToken, region=region) ``` @@ -235,10 +234,10 @@ For the ```TranslationRecognizer```, build the authorization token from the reso ::: zone pivot="programming-language-csharp" ```C# string resourceId = "Your Resource ID"; -string aadToken = "Your Azure AD access token"; +string aadToken = "Your Microsoft Entra access token"; string region = "Your Speech Region"; -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. var authorizationToken = $"aad#{resourceId}#{aadToken}"; var speechConfig = SpeechTranslationConfig.FromAuthorizationToken(authorizationToken, region); ``` @@ -247,10 +246,10 @@ var speechConfig = SpeechTranslationConfig.FromAuthorizationToken(authorizationT ::: zone pivot="programming-language-cpp" ```cpp std::string resourceId = "Your Resource ID"; -std::string aadToken = "Your Azure AD access token"; +std::string aadToken = "Your Microsoft Entra access token"; std::string region = "Your Speech Region"; -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. auto authorizationToken = "aad#" + resourceId + "#" + aadToken; auto speechConfig = SpeechTranslationConfig::FromAuthorizationToken(authorizationToken, region); ``` @@ -261,7 +260,7 @@ auto speechConfig = SpeechTranslationConfig::FromAuthorizationToken(authorizatio String resourceId = "Your Resource ID"; String region = "Your Region"; -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. String authorizationToken = "aad#" + resourceId + "#" + token; SpeechTranslationConfig translationConfig = SpeechTranslationConfig.fromAuthorizationToken(authorizationToken, region); ``` @@ -272,58 +271,12 @@ SpeechTranslationConfig translationConfig = SpeechTranslationConfig.fromAuthoriz resourceId = "Your Resource ID" region = "Your Region" -# You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +# You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. authorizationToken = "aad#" + resourceId + "#" + aadToken.token translationConfig = SpeechTranslationConfig(auth_token=authorizationToken, region=region) ``` ::: zone-end -### DialogServiceConnector - -For the ```DialogServiceConnection``` object, build the authorization token from the resource ID and the Microsoft Entra access token and then use it to create a ```CustomCommandsConfig``` or a ```BotFrameworkConfig``` object. - -::: zone pivot="programming-language-csharp" -```C# -string resourceId = "Your Resource ID"; -string aadToken = "Your Azure AD access token"; -string region = "Your Speech Region"; -string appId = "Your app ID"; - -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. -var authorizationToken = $"aad#{resourceId}#{aadToken}"; -var customCommandsConfig = CustomCommandsConfig.FromAuthorizationToken(appId, authorizationToken, region); -``` -::: zone-end - -::: zone pivot="programming-language-cpp" -```cpp -std::string resourceId = "Your Resource ID"; -std::string aadToken = "Your Azure AD access token"; -std::string region = "Your Speech Region"; -std::string appId = "Your app Id"; - -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. -auto authorizationToken = "aad#" + resourceId + "#" + aadToken; -auto customCommandsConfig = CustomCommandsConfig::FromAuthorizationToken(appId, authorizationToken, region); -``` -::: zone-end - -::: zone pivot="programming-language-java" -```Java -String resourceId = "Your Resource ID"; -String region = "Your Region"; -String appId = "Your AppId"; - -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. -String authorizationToken = "aad#" + resourceId + "#" + token; -CustomCommandsConfig dialogServiceConfig = CustomCommandsConfig.fromAuthorizationToken(appId, authorizationToken, region); -``` -::: zone-end - -::: zone pivot="programming-language-python" -The DialogServiceConnector is not currently supported in Python -::: zone-end - ### VoiceProfileClient To use the ```VoiceProfileClient``` with Microsoft Entra authentication, use the custom domain name created above. @@ -331,11 +284,11 @@ To use the ```VoiceProfileClient``` with Microsoft Entra authentication, use the ```C# string customDomainName = "Your Custom Name"; string hostName = $"https://{customDomainName}.cognitiveservices.azure.com/"; -string token = "Your Azure AD access token"; +string token = "Your Microsoft Entra access token"; var config = SpeechConfig.FromHost(new Uri(hostName)); -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. var authorizationToken = $"aad#{resourceId}#{aadToken}"; config.AuthorizationToken = authorizationToken; ``` @@ -344,11 +297,11 @@ config.AuthorizationToken = authorizationToken; ::: zone pivot="programming-language-cpp" ```cpp std::string customDomainName = "Your Custom Name"; -std::string aadToken = "Your Azure AD access token"; +std::string aadToken = "Your Microsoft Entra access token"; auto speechConfig = SpeechConfig::FromHost("https://" + customDomainName + ".cognitiveservices.azure.com/"); -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. auto authorizationToken = "aad#" + resourceId + "#" + aadToken; speechConfig->SetAuthorizationToken(authorizationToken); ``` @@ -356,12 +309,12 @@ speechConfig->SetAuthorizationToken(authorizationToken); ::: zone pivot="programming-language-java" ```Java -String aadToken = "Your Azure AD access token"; +String aadToken = "Your Microsoft Entra access token"; String customDomainName = "Your Custom Name"; String hostName = "https://" + customDomainName + ".cognitiveservices.azure.com/"; SpeechConfig speechConfig = SpeechConfig.fromHost(new URI(hostName)); -// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and AAD access token. +// You need to include the "aad#" prefix and the "#" (hash) separator between resource ID and Microsoft Entra access token. String authorizationToken = "aad#" + resourceId + "#" + token; speechConfig.setAuthorizationToken(authorizationToken);