Identifying pattern in Audio File #44
Replies: 5 comments 40 replies
-
Interesting. So what would a pattern in an audio file look like? For images, we have low-level features such as lines or edges, then blobs and shapes, and from those we can build higher-level applications such as recognition or detection. For audio, would the applications be things like background noise removal or auto-tuning? For visualization, what other options are there to 'see' the frequency response besides a spectrogram? And would you mind sharing the code as well?
-
Not sure the DFT is directly applicable here; let's do without it first. Maybe the algorithm will rediscover it.
-
I just checked for the first time: uncompressed audio is PCM, which represents only amplitude over time, not frequency.
-
PP is a span of matching Ps, just as P is a span of matching pixels. That match is computed for each P parameter, including L and neg_L.
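To make "a span of matching items" concrete, here is a toy sketch. It is not the project's implementation: the matching rule (adjacent values differ by at most `max_diff`) and the names `form_spans` and `max_diff` are assumptions made up for illustration. It only shows the same grouping applied twice, first to raw values and then to one parameter (length L) of the resulting spans.

```python
def form_spans(values, max_diff):
    """Group adjacent values into spans; neighbors 'match' if they differ by <= max_diff.

    Assumption: this matching rule is a stand-in, not the rule used in the project.
    """
    if not values:
        return []
    spans = []
    current = [values[0]]
    for prev, cur in zip(values, values[1:]):
        if abs(cur - prev) <= max_diff:
            current.append(cur)      # still matching: extend the current span
        else:
            spans.append(current)    # match broken: close the span, start a new one
            current = [cur]
    spans.append(current)
    return spans


# First level: spans of matching values (the analogue of Ps).
values = [10, 11, 12, 50, 51, 52, 90, 10]
p_spans = form_spans(values, max_diff=2)

# Second level: compare one parameter of each span, here its length L,
# and group spans whose L matches (the analogue of PPs over the L parameter).
lengths = [len(span) for span in p_spans]
pp_spans_by_length = form_spans(lengths, max_diff=0)
```

The same function would be run once per parameter; only L is shown here.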
-
And I am assuming all of our discussion is about mono audio files. Am I right, @boris-kz?
-
How to find patterns in an audio file:
There are two approaches:
Directly convert the audio file into an ndarray of sample values and apply all the 1-D algorithms we have already created (line patterns and line PPs).
Each sample is stored as a number (two bytes).
Range of available combinations:
– 16 bits, 2^16 = 65,536
– We want both positive and negative values (to indicate compressions and rarefactions)
– The top bit indicates the sign: 0 for non-negative, 1 for negative (two's complement)
– That leaves 15 bits, 2^15 = 32,768 combinations for each sign
– One of the non-negative combinations stands for zero, which leaves one fewer pattern for positives, so the range is from -32,768 to 32,767
A sound has many values in it
– Numbers that represent the amplitude of the sound at each sample time
• We can get an array of SoundSample objects
– SoundSample[] sampleArray = sound1.getSamples();
Once we have the array, can we run the line-pattern and line-PP algorithms on it? (What do you say?)
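For reference, a rough Python equivalent of getting the sample array, using only the standard library. This is a sketch under assumptions: the input is a 16-bit mono PCM WAV file, and `read_pcm16_mono` and `make_demo_wav` are hypothetical helper names, not part of any existing code here. The demo WAV is built in memory so the example runs without a file on disk.

```python
import io
import struct
import wave


def read_pcm16_mono(wav_file):
    """Return the samples of a 16-bit mono PCM WAV as a list of signed ints."""
    with wave.open(wav_file, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        raw = wav.readframes(wav.getnframes())
    # "<h" = little-endian signed 16-bit, the sample layout of 16-bit PCM WAV
    return [sample for (sample,) in struct.iter_unpack("<h", raw)]


def make_demo_wav(samples, rate=8000):
    """Hypothetical helper: write samples to an in-memory 16-bit mono WAV."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    buf.seek(0)
    return buf


# Round trip covering both ends of the 16-bit range.
samples = read_pcm16_mono(make_demo_wav([-32768, -1, 0, 1, 32767]))
```

With a real file, `read_pcm16_mono("some_file.wav")` would return a plain list that any 1-D algorithm can consume; the file name is a placeholder.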
Calculate the DFT of the audio signal (via FFT) and store it in an array; we can then proceed on that array too.
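A minimal sketch of that step, assuming a short block of samples. It uses a direct O(n^2) DFT in pure Python rather than an FFT, which gives the same values but is only practical for small inputs. The names `dft_magnitudes` and `peak_bin` are made up for this example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..n/2 for a real-valued signal (direct DFT, not FFT)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def peak_bin(mags):
    """Index of the strongest frequency bin."""
    return max(range(len(mags)), key=mags.__getitem__)


# Test signal: a sine that completes exactly 5 cycles in 64 samples.
n = 64
signal = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
mags = dft_magnitudes(signal)
strongest = peak_bin(mags)
```

The resulting `mags` list is itself a 1-D array, so it could be fed to the same line algorithms as the raw samples.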
OR
Going further from the DFT, apply windows, then create and plot a spectrogram. (In a spectrogram plot, one axis represents time, the second axis represents frequency, and the color represents the magnitude (amplitude) of the observed frequency at a particular time.) This converts the audio into an image like the ones we have now, so we can find patterns or lines of patterns just as we do in 1-D line pattern and 2-D frame pattern identification. (Your thoughts?)
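A rough sketch of the windowed version, to show what the 2-D array would look like before plotting. Assumptions: a Hann window, a frame length of 64 samples with a hop of 32, and a direct DFT per frame instead of an FFT. The function names and parameter values are illustrative only.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames; each frame is a list of magnitudes for bins 0..frame_len/2."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * t / frame_len)
                      for t, x in enumerate(frame))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def peak_bin(mags):
    """Index of the strongest frequency bin in one frame."""
    return max(range(len(mags)), key=mags.__getitem__)


# Test signal: 128 samples of a low tone followed by 128 samples of a higher tone.
low = [math.sin(2 * math.pi * 4 * t / 64) for t in range(128)]
high = [math.sin(2 * math.pi * 12 * t / 64) for t in range(128, 256)]
spec = spectrogram(low + high)
```

Here `spec[time][frequency]` is the magnitude, so the array can be treated as an image with one row per time frame, which is the form the 2-D frame algorithms expect.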