❔ Transform bark (30 Mo) output wav to mp3/webm (pydub) #288

adriens · 2023-05-13T07:55:48Z

❔ About

I'm currently trying to compress a 30 MB WAV output, but pydub always fails with "error: Size should be 1, 2, 3 or 4".

👣 Steps to reproduce

Below is my code sample (I built the WAV with Bark):

!pip install pydub
!pip install ffmpeg-python

from pydub import AudioSegment
import ffmpeg


audio = AudioSegment.from_file('/kaggle/working/auptitcafe.wav', format='wav')
audio = audio.set_sample_width(2)
# Export the audio to MP3 format
audio.export('auptitcafe.mp3', format='mp3')

Then I get the following error message:

error                                     Traceback (most recent call last)
Cell In[22], line 9
      5 import ffmpeg
      8 audio = AudioSegment.from_file('/kaggle/working/auptitcafe.wav', format='wav')
----> 9 audio = audio.set_sample_width(2)
     10 # Export the audio to MP3 format
     11 audio.export('auptitcafe.mp3', format='mp3')

File /opt/conda/lib/python3.10/site-packages/pydub/audio_segment.py:1008, in AudioSegment.set_sample_width(self, sample_width)
   1003     return self
   1005 frame_width = self.channels * sample_width
   1007 return self._spawn(
-> 1008     audioop.lin2lin(self._data, self.sample_width, sample_width),
   1009     overrides={'sample_width': sample_width, 'frame_width': frame_width}
   1010 )

error: Size should be 1, 2, 3 or 4


adriens · 2023-05-13T08:00:23Z

Is there something special about Bark's WAV output?

C0untFloyd · 2023-05-13T08:15:08Z

I'm unfamiliar with that module, but Bark's output format is mono, so perhaps try:
audio.set_channels(1)

I guess it wouldn't hurt to also specify the sample rate (which is 24000 Hz).
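
One way to check what a WAV file actually contains (channels, sample rate, bytes per sample) is to read its `fmt ` chunk directly. The sketch below is not from the thread: `read_wav_format` and `make_header` are hypothetical helper names, and the demo header only assumes a mono, 24000 Hz, 64-bit float file for illustration. It uses nothing but the standard library.

```python
import io
import struct


def read_wav_format(stream):
    """Return the fields of the 'fmt ' chunk of a RIFF/WAVE byte stream."""
    riff, _size, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            raise ValueError("no 'fmt ' chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
        if chunk_id == b"fmt ":
            fmt = stream.read(chunk_size)
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack(
                "<HHIIHH", fmt[:16]
            )
            return {
                "format_tag": tag,          # 1 = integer PCM, 3 = IEEE float
                "channels": channels,
                "sample_rate": rate,
                "bits_per_sample": bits,
                "sample_width": bits // 8,  # bytes per sample
            }
        # Skip any other chunk; chunks are padded to an even number of bytes.
        stream.seek(chunk_size + (chunk_size & 1), 1)


def make_header(format_tag, channels, sample_rate, bits):
    """Build a minimal RIFF/WAVE header (no audio data), for the demo only."""
    align = channels * bits // 8
    fmt = struct.pack(
        "<HHIIHH", format_tag, channels, sample_rate, sample_rate * align, align, bits
    )
    body = b"WAVE" + b"fmt " + struct.pack("<I", len(fmt)) + fmt
    return b"RIFF" + struct.pack("<I", len(body)) + body


# Demo on an in-memory header: mono, 24000 Hz, 64-bit float (assumed values).
header = make_header(format_tag=3, channels=1, sample_rate=24_000, bits=64)
info = read_wav_format(io.BytesIO(header))
print(info)
```

To inspect a real file, pass a binary file object instead, for example `read_wav_format(open(path, "rb"))`. A `sample_width` above 4 is what the error message is complaining about.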

adriens · 2023-05-13T08:35:49Z

Hmm, yes, thanks. I'm giving it a try and will report back on this issue 🙏

dnrico1 · 2023-05-14T09:22:59Z

Are you perhaps adding a silence as per the longform audio generation notebook? I ran into the same error yesterday when doing that, because np.zeros produces 64-bit values by default (float64), which end up as 64-bit WAV samples. The error says that audioop only accepts samples of 1, 2, 3 or 4 bytes, which correspond to 8, 16, 24 or 32-bit WAVs. Inserting a 64-bit silence in the middle messes things up.

The solution was to specify a 16-bit int dtype for np.zeros.
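
The effect of the silence dtype on the concatenated array can be checked with NumPy alone. This is a minimal sketch, not code from the notebook: it assumes the generated audio chunks are float32 arrays, and `speech` is a hypothetical stand-in for one such chunk.

```python
import numpy as np

SAMPLE_RATE = 24_000  # Bark's sample rate, as mentioned earlier in the thread

# Hypothetical stand-in for one generated audio chunk (assumed to be float32).
speech = np.zeros(SAMPLE_RATE, dtype=np.float32)

# np.zeros without a dtype argument returns float64 (8 bytes per sample).
default_silence = np.zeros(int(0.25 * SAMPLE_RATE))
int16_silence = np.zeros(int(0.25 * SAMPLE_RATE), dtype=np.int16)

# Concatenation promotes to a dtype that can hold every input.
widened = np.concatenate([speech, default_silence])  # float32 + float64
kept = np.concatenate([speech, int16_silence])       # float32 + int16

print(default_silence.dtype, default_silence.itemsize)
print(widened.dtype, widened.itemsize)
print(kept.dtype, kept.itemsize)
```

Under that assumption, the default silence widens the whole array to 8 bytes per sample, while the int16 silence leaves it at 4 bytes per sample, which is within the range audioop accepts.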

adriens · 2023-05-14T21:46:17Z

> Are you perhaps adding a silence as per the longform audio generation notebook?

Yes, indeed, this is how I'm adding silences. 😅

> The solution was to specify a 16-bit int dtype for np.zeros.

Would you share some code snippets ❔ 🙏

C0untFloyd · 2023-05-16T18:45:23Z

Not the poster above, but you would just specify it in np.zeros like this:
dtype=np.int16

dnrico1 · 2023-05-16T19:07:27Z

Yes, exactly: dtype=np.int16 as an argument to np.zeros.


adriens · 2023-05-16T21:58:08Z

Thanks a lot to both of you, @C0untFloyd and @dnrico1, it really did the trick. The WAV file can now be converted into much smaller files without audible quality loss ❣️

import numpy as np
from bark import SAMPLE_RATE

# Encode the silence with an explicit 16-bit dtype
# https://github.com/suno-ai/bark/issues/288
silence = np.zeros(int(0.25 * SAMPLE_RATE), dtype=np.int16)

| format | size |
| --- | --- |
| `.wav` | 20.2 MB |
| `.mp3` | 1.7 MB |
| `.webm` | 1.8 MB |

... so output wav can be easily previewed on various platforms and easily compressed to mp3/webm suno-ai#288
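
A related option, not the fix adopted in this thread, is to convert the whole float array to 16-bit PCM before writing the WAV, so the file has a 2-byte sample width regardless of how the silence was created. The sketch below assumes mono float audio in the range [-1.0, 1.0]; `to_int16_pcm` and `write_wav_16bit` are hypothetical helper names, and the tone is synthetic demo data rather than Bark output.

```python
import io
import wave

import numpy as np

SAMPLE_RATE = 24_000  # Bark's sample rate, as mentioned earlier in the thread


def to_int16_pcm(audio):
    """Scale float audio in [-1.0, 1.0] to little-endian 16-bit signed PCM."""
    clipped = np.clip(audio, -1.0, 1.0)
    return (clipped * 32767).astype("<i2")


def write_wav_16bit(target, audio, sample_rate=SAMPLE_RATE):
    """Write mono float audio as a 16-bit PCM WAV to a path or file object."""
    pcm = to_int16_pcm(audio)
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())


# Demo: one second of a 440 Hz tone followed by a quarter second of silence.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
tone = (0.5 * np.sin(2 * np.pi * 440 * t)).astype(np.float32)
silence = np.zeros(int(0.25 * SAMPLE_RATE), dtype=np.float32)

buffer = io.BytesIO()
write_wav_16bit(buffer, np.concatenate([tone, silence]))

# Read the header back to confirm the sample width and length.
buffer.seek(0)
with wave.open(buffer, "rb") as wav_file:
    sample_width = wav_file.getsampwidth()
    frame_count = wav_file.getnframes()
print(sample_width, frame_count)
```

Passing a file path instead of the `io.BytesIO` buffer writes the WAV to disk, where it can then be loaded with `AudioSegment.from_file` and exported as in the original snippet.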

adriens · 2023-05-17T22:59:43Z

👇

🤏 Better silence management/encoding #305

adriens changed the title ~~❔ Transform output wav to mp3/webm (pydub)~~ ❔ Transform (30 Mo) output wav to mp3/webm (pydub) May 13, 2023

adriens changed the title ~~❔ Transform (30 Mo) output wav to mp3/webm (pydub)~~ ❔ Transform bark (30 Mo) output wav to mp3/webm (pydub) May 13, 2023

adriens closed this as completed May 16, 2023

adriens added a commit to adriens/bark that referenced this issue May 17, 2023

Better silence management/encoding

ecaec48


adriens mentioned this issue May 17, 2023

🤏 Better silence management/encoding #305

Open
