Do you think it would be possible to briefly explain how it works internally? E.g., how can an image be turned into sound? A paragraph or two should be fine; no need to write a 20-page article. Just something to get people going who don't know about it (which happens to be my case; this is the first time I've seen it).
Hey @rubyFeedback, I'm not the author, but I've been messing with this code, so I can try to explain what's going on.
A spectrogram shows frequency on the vertical axis and time on the horizontal axis. If a certain frequency is loud at a certain time, it shows up as a bright spot on the spectrogram.
This program breaks down an image and maps each pixel's vertical position, horizontal position, and brightness to the frequency, time, and loudness of an audio tone, respectively. For example, a pixel at the top of the image becomes a tone with a high frequency, and a pixel at the bottom becomes a tone with a low frequency. A pixel toward the left of the image comes earlier in the resulting audio than a pixel toward the right. If the pixel is bright, its tone will be louder, and if it's dark, its tone will be quieter.
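To make the mapping concrete, here's a minimal sketch of it. This is not the code from the repo: the function name `pixel_to_tone`, the linear frequency scale, and the default values for `min_freq`, `max_freq`, and `duration` are all assumptions picked just for illustration.

```python
def pixel_to_tone(row, col, brightness, height, width,
                  min_freq=200.0, max_freq=8000.0, duration=4.0):
    """Map one pixel to (frequency_hz, start_time_s, amplitude).

    Hypothetical helper, not taken from the project.
    row 0 is the top of the image; brightness is in the range 0..255.
    """
    # Top row -> max_freq, bottom row -> min_freq (assumed linear scale).
    frac = 1.0 - row / (height - 1) if height > 1 else 0.0
    frequency = min_freq + frac * (max_freq - min_freq)

    # Left columns play earlier, right columns play later.
    start_time = col * duration / width

    # Brighter pixels are louder: scale brightness to 0.0..1.0.
    amplitude = brightness / 255.0
    return frequency, start_time, amplitude
```

With these assumed defaults, the top-left pixel of a 10x20 image at full brightness maps to the highest frequency, time zero, and full loudness.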
The image is processed pixel by pixel, and a sine wave tone is added at the corresponding frequency, time, and loudness so that the pixel is visible in the spectrogram. All the sine waves are then added together to create a single waveform. A sine wave is probably used because it contains a single frequency, unlike waveforms such as square or saw, which have overtones at many frequencies. The file that does most of this work is base.py, and the inline comments explain generally what's going on, if you wanna take a look for yourself. Hope that helps.
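If it helps, the summing step can be sketched in plain Python roughly like this. Again, this is only an illustration under my own assumptions, not what base.py actually does: `image_to_samples` and all of its parameters are hypothetical names, the image is assumed to be a list of rows of grayscale values (0..255, top row first), and the frequency scale is assumed to be linear.

```python
import math

def image_to_samples(pixels, sample_rate=8000, column_seconds=0.05,
                     min_freq=200.0, max_freq=3000.0):
    """Turn a grayscale image into a list of audio samples in -1.0..1.0.

    Hypothetical sketch: each image column becomes a short time slice,
    and each row contributes one sine wave at that row's frequency.
    """
    height = len(pixels)
    width = len(pixels[0])
    samples_per_column = round(sample_rate * column_seconds)
    samples = [0.0] * (width * samples_per_column)

    for row in range(height):
        # Top row -> max_freq, bottom row -> min_freq (assumed linear).
        frac = 1.0 - row / (height - 1) if height > 1 else 0.0
        freq = min_freq + frac * (max_freq - min_freq)
        for col in range(width):
            amp = pixels[row][col] / 255.0  # brightness -> loudness
            if amp == 0.0:
                continue  # black pixels add no sound
            start = col * samples_per_column
            for n in range(start, start + samples_per_column):
                # Add this pixel's sine wave into the shared waveform.
                samples[n] += amp * math.sin(2 * math.pi * freq * n / sample_rate)

    # Normalize so the summed waveform stays within -1.0..1.0.
    peak = max(abs(s) for s in samples)
    if peak > 0:
        samples = [s / peak for s in samples]
    return samples
```

The resulting samples could then be scaled to integers and written to a WAV file, for example with Python's standard `wave` module. Note that `max_freq` has to stay below half of `sample_rate`, otherwise the tones would alias.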