Created pipeline for ASR data creation. #141
base: dev
Conversation
Review skipped: the selected files did not have any reviewable changes (1 file selected).
Walkthrough: the changes introduce four new dependencies.
Actionable comments posted: 1
workflow/training/whisper.py (Outdated)
```python
def process_and_upload_dataset(self, dataset, dataset_name):
    temp_dir = tempfile.mkdtemp()
    speech_config = speechsdk.SpeechConfig(
        subscription=os.environ.get('AZURE_TTS_KEY'),
        region=os.environ.get('AZURE_TTS_REGION'),
    )
    speech_config.speech_synthesis_voice_name = 'en-US-AvaMultilingualNeural'

    def text_to_audio(text):
        # File name is derived from the text's hash
        audio_path = os.path.join(temp_dir, f"audio_{hash(text)}.wav")
        audio_config = speechsdk.audio.AudioOutputConfig(filename=audio_path)
        speech_synthesizer = speechsdk.SpeechSynthesizer(
            speech_config=speech_config, audio_config=audio_config
        )
        result = speech_synthesizer.speak_text_async(text).get()
        if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
            # Augment the synthesized audio with loudness variation and noise
            audio_path = self.output_varied_word_loudness_with_noise(audio_path)
            return audio_path
        else:
            print(f"Error synthesizing audio: {result.reason}")
            return None

    def process_split(split):
        audio_data = []
        sentences = []
        remaining = ""
        for example in split:
            text = example[split.column_names[0]]
            if len(text) > 250:
                # Defer long texts; they are re-chunked below
                remaining += text + " "
                continue
            audio_path = text_to_audio(text)
            audio_data.append({"path": audio_path})
            sentences.append(text)
        if remaining:
            while len(remaining) > 250:
                # Split at the last space within the first 250 characters
                split_index = remaining[:250].rfind(' ')
                if split_index == -1:
                    split_index = 250
                text = remaining[:split_index].strip()
                audio_path = text_to_audio(text)
                audio_data.append({"path": audio_path})
                sentences.append(text)
                remaining = remaining[split_index:].strip()
            if remaining:
                audio_path = text_to_audio(remaining)
                audio_data.append({"path": audio_path})
                sentences.append(remaining)
        processed_split = Dataset.from_dict({
            "audio": audio_data,
            "sentence": sentences
        })
        processed_split = processed_split.cast_column("audio", Audio())
        return processed_split

    def create_test_split(dataset, test_size=0.2):
        data = list(dataset)
        random.shuffle(data)
        split_index = int(len(data) * (1 - test_size))
        train_data = data[:split_index]
        test_data = data[split_index:]
        return Dataset.from_list(train_data), Dataset.from_list(test_data)

    train_dataset = process_split(dataset['train'])
    if 'test' in dataset:
        test_dataset = process_split(dataset['test'])
    else:
        train_dataset, test_dataset = create_test_split(train_dataset)

    processed_dataset = DatasetDict({
        'train': train_dataset,
        'test': test_dataset
    })
    repo_id = f"{dataset_name}_audio"
    try:
        create_repo(repo_id, repo_type="dataset", token=os.environ.get('HUGGING_FACE_TOKEN'))
    except Exception as e:
        print(f"Repo already exists or couldn't be created: {e}")
    processed_dataset.push_to_hub(repo_id, token=os.environ.get('HUGGING_FACE_TOKEN'))
    print(f"Dataset uploaded successfully to {repo_id}")

    return processed_dataset
```
Add error handling for external calls.
The function makes several external calls, such as to Azure's Cognitive Services and Hugging Face Hub, without error handling.
Add try-except blocks around external calls to handle exceptions and log errors.
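As a minimal sketch, this is what that could look like for the nested `text_to_audio` helper (the logger setup is an assumption; the current code uses `print()`):

```python
import logging

logger = logging.getLogger(__name__)  # assumed; not part of this PR

def text_to_audio(text):
    audio_path = os.path.join(temp_dir, f"audio_{hash(text)}.wav")
    audio_config = speechsdk.audio.AudioOutputConfig(filename=audio_path)
    speech_synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config
    )
    try:
        # .get() blocks on the async request and can raise on network/SDK errors
        result = speech_synthesizer.speak_text_async(text).get()
    except Exception as e:
        logger.error(f"Azure TTS request failed: {e}")
        return None
    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        return self.output_varied_word_loudness_with_noise(audio_path)
    logger.error(f"Error synthesizing audio: {result.reason}")
    return None
```

The same pattern applies to the `push_to_hub` call, which can fail on network or authentication errors.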
Remove unused import `HfApi`.
The import `HfApi` is not used in the code. Remove it to clean up the imports:

```diff
-from huggingface_hub import HfApi, create_repo
+from huggingface_hub import create_repo
```
Optimize the logic for handling long text.
The long-text handling in `process_split` can be simplified: a text-wrapping library or a dedicated chunking helper would split text into manageable chunks more cleanly than the manual `rfind(' ')` loop.
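One possible shape for this, sketched with the standard library's `textwrap` (the helper name is illustrative and the 250-character limit mirrors the PR; this function is not part of the diff):

```python
import textwrap

def chunk_text(text, max_len=250):
    # textwrap.wrap splits on whitespace, so chunks end on word
    # boundaries and each returned chunk is at most max_len characters.
    return textwrap.wrap(text, width=max_len)

# Usage, replacing the manual rfind(' ') loop in process_split:
# for text in chunk_text(remaining):
#     audio_path = text_to_audio(text)
```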
Actionable comments posted: 2
```python
def output_varied_word_loudness_with_noise(self, input_audio_path):
    audio = AudioSegment.from_wav(input_audio_path)

    def gaussian_kernel(size, sigma=1.0):
        # 1-D Gaussian kernel, normalized to sum to 1
        x = np.linspace(-size // 2, size // 2, size)
        kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
        return kernel / np.sum(kernel)

    def apply_gaussian_filter(audio_segment, kernel_size=21, sigma=1.0):
        audio_array = np.array(audio_segment.get_array_of_samples()).astype(np.float32)
        if audio_segment.channels == 2:
            # Downmix stereo to mono by averaging the two channels
            audio_array = audio_array.reshape((-1, 2)).mean(axis=1)
        audio_array = audio_array / np.max(np.abs(audio_array))
        kernel = gaussian_kernel(kernel_size, sigma)
        filtered_signal = signal.convolve(audio_array, kernel, mode='same')
        filtered_signal = filtered_signal / np.max(np.abs(filtered_signal))
        filtered_signal = (filtered_signal * 32767).astype(np.int16)
        filtered_audio = AudioSegment(
            filtered_signal.tobytes(),
            frame_rate=audio_segment.frame_rate,
            sample_width=2,
            channels=1
        )
        return filtered_audio

    def generate_varied_noise(length, max_amplitude):
        # Gaussian noise shaped by a random amplitude envelope
        base_noise = np.random.normal(0, 1, length)
        envelope = np.random.uniform(0, 1, length)
        return base_noise * envelope * max_amplitude

    def split_into_words(audio):
        chunks = split_on_silence(audio,
                                  min_silence_len=50,
                                  silence_thresh=-40,
                                  keep_silence=50)
        return chunks

    def adjust_random_word_volumes(chunks, min_adjustment=0.3, max_adjustment=10.0):
        adjusted_chunks = []
        for chunk in chunks:
            if np.random.random() < 0.8:
                # Apply a random gain, converted from a linear factor to dB
                adjustment = np.random.uniform(min_adjustment, max_adjustment)
                chunk = chunk + (10 * np.log10(adjustment))
            adjusted_chunks.append(chunk)
        return adjusted_chunks

    word_chunks = split_into_words(audio)
    adjusted_chunks = adjust_random_word_volumes(word_chunks)
    varied_loudness_audio = sum(adjusted_chunks)
    kernel_size = 21
    sigma = 1.0
    filtered_audio = apply_gaussian_filter(varied_loudness_audio, kernel_size, sigma)
    audio_data, sample_rate = sf.read(filtered_audio.export(input_audio_path, format="wav"))
    max_noise_amplitude = 0.01
    noise = generate_varied_noise(len(audio_data), max_noise_amplitude)
    noisy_audio = audio_data + noise
    noisy_audio = np.clip(noisy_audio, -1, 1)
    sf.write(input_audio_path, noisy_audio, sample_rate)
    return input_audio_path
```
Enhance error handling and logging.
The function could benefit from additional error handling and logging for key steps.
Apply this diff to add error handling and logging:
```diff
 def output_varied_word_loudness_with_noise(self, input_audio_path):
-    audio = AudioSegment.from_wav(input_audio_path)
+    try:
+        audio = AudioSegment.from_wav(input_audio_path)
+    except Exception as e:
+        logger.error(f"Error loading audio file: {e}")
+        return None

     def gaussian_kernel(size, sigma=1.0):
         x = np.linspace(-size // 2, size // 2, size)
         kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
         return kernel / np.sum(kernel)

     def apply_gaussian_filter(audio_segment, kernel_size=21, sigma=1.0):
         audio_array = np.array(audio_segment.get_array_of_samples()).astype(np.float32)
         if audio_segment.channels == 2:
             audio_array = audio_array.reshape((-1, 2)).mean(axis=1)
         audio_array = audio_array / np.max(np.abs(audio_array))
         kernel = gaussian_kernel(kernel_size, sigma)
         filtered_signal = signal.convolve(audio_array, kernel, mode='same')
         filtered_signal = filtered_signal / np.max(np.abs(filtered_signal))
         filtered_signal = (filtered_signal * 32767).astype(np.int16)
         filtered_audio = AudioSegment(
             filtered_signal.tobytes(),
             frame_rate=audio_segment.frame_rate,
             sample_width=2,
             channels=1
         )
         return filtered_audio

     def generate_varied_noise(length, max_amplitude):
         base_noise = np.random.normal(0, 1, length)
         envelope = np.random.uniform(0, 1, length)
         return base_noise * envelope * max_amplitude

     def split_into_words(audio):
         chunks = split_on_silence(audio,
                                   min_silence_len=50,
                                   silence_thresh=-40,
                                   keep_silence=50)
         return chunks

     def adjust_random_word_volumes(chunks, min_adjustment=0.3, max_adjustment=10.0):
         adjusted_chunks = []
         for chunk in chunks:
             if np.random.random() < 0.8:
                 adjustment = np.random.uniform(min_adjustment, max_adjustment)
                 chunk = chunk + (10 * np.log10(adjustment))
             adjusted_chunks.append(chunk)
         return adjusted_chunks

     word_chunks = split_into_words(audio)
     adjusted_chunks = adjust_random_word_volumes(word_chunks)
     varied_loudness_audio = sum(adjusted_chunks)
     kernel_size = 21
     sigma = 1.0
     filtered_audio = apply_gaussian_filter(varied_loudness_audio, kernel_size, sigma)
-    audio_data, sample_rate = sf.read(filtered_audio.export(input_audio_path, format="wav"))
+    try:
+        audio_data, sample_rate = sf.read(filtered_audio.export(input_audio_path, format="wav"))
+    except Exception as e:
+        logger.error(f"Error reading filtered audio: {e}")
+        return None
     max_noise_amplitude = 0.01
     noise = generate_varied_noise(len(audio_data), max_noise_amplitude)
     noisy_audio = audio_data + noise
     noisy_audio = np.clip(noisy_audio, -1, 1)
-    sf.write(input_audio_path, noisy_audio, sample_rate)
+    try:
+        sf.write(input_audio_path, noisy_audio, sample_rate)
+    except Exception as e:
+        logger.error(f"Error writing noisy audio: {e}")
+        return None
     return input_audio_path
```
Completes https://github.com/BharatSahAIyak/ai-tools/issues/98
Key Features
Process Flow (a usage sketch follows this list):
- Validation: checks that the input dataset has only one column, i.e. the text column.
- Text-to-Audio Conversion: each sentence is synthesized with Azure TTS (en-US-AvaMultilingualNeural) and then augmented with varied word loudness and background noise.
- Dataset Processing: each split is converted into paired audio files and sentences.
- Chunk Management: texts longer than 250 characters are pooled and re-split on word boundaries into chunks of at most 250 characters.
- Dataset Structure: a DatasetDict with train and test splits, each holding audio and sentence columns; a 20% test split is carved out if the source has none.
- Upload: a {dataset_name}_audio dataset repo is created and the dataset is pushed to the Hugging Face Hub.
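A minimal usage sketch of the pipeline (the class name and dataset id are hypothetical; only `process_and_upload_dataset(self, dataset, dataset_name)` appears in this PR, and it assumes AZURE_TTS_KEY, AZURE_TTS_REGION, and HUGGING_FACE_TOKEN are set in the environment):

```python
from datasets import load_dataset

# Hypothetical owner class for the new methods in workflow/training/whisper.py;
# the actual class name does not appear in this excerpt.
pipeline = WhisperPipeline()

text_dataset = load_dataset("my-org/my-text-dataset")  # hypothetical dataset id
processed = pipeline.process_and_upload_dataset(text_dataset, "my-text-dataset")
# Pushes a DatasetDict with 'train' and 'test' splits to "my-text-dataset_audio".
```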
Key Changes
1. Added process_and_upload_dataset function.
2. Added output_varied_word_loudness_with_noise function.

Summary by CodeRabbit
- New Features: pipeline for creating ASR (audio) datasets from text datasets.
- Dependencies: four new dependencies introduced.