diff --git a/src/routes/(content)/learn/batch-processing/+page.svx b/src/routes/(content)/learn/batch-processing/+page.svx
index d3b9b786..a5b66fd4 100644
--- a/src/routes/(content)/learn/batch-processing/+page.svx
+++ b/src/routes/(content)/learn/batch-processing/+page.svx
@@ -55,7 +55,7 @@ There is also [this](https://discourse.flucoma.org/t/supernova-supercollider-una
-In this example a single audio file is sliced into a collection of segments. Each segment will be analysed with an audio descriptor and the result for that segment will be stored in a [DataSet](/reference/dataset).The example illustrates how one can iterate through several portions of the source buffer and use the segmentation result to control a process that only analyses specific portions of time in the source file.
+In this example a single audio file is sliced into a collection of segments. Each segment will be analysed with an audio descriptor and the result for that segment will be stored in a [DataSet](/reference/dataset). The example illustrates how one can iterate through several portions of the source buffer and use the segmentation result to control a process that only analyses specific portions of time in the source file.

 The step-by-step workflow is as follows:
@@ -181,7 +181,7 @@
 src='/examples/batch/add-list.png'
 alt='Using vexpr to add two lists together'
 />

-The complexity of the patching hasn't increased greatly, and by replacing `+` with `vexpr` (vector expression), we can process entire lists of values. The patch will also still work with single value inputs (scalars) making it much more flexible than what we had originally. While the patching itself is not too complicated, knowing about the existence of these objects and how they work is not greatly emphasised in the way visual programming is taught wich is often concerned entirely with _streams_ of scalar values. Furthermore, while this is just a toy example, it demonstrates that the notion of processing groups of things together is not the **first** and **most obvious** way to patch. Conversely, SuperCollider has the language (sclang) which offers built-in methods for dealing with arrays of data. Most operators are also _overloaded_, meaning the interface for adding two integers is roughly the same as adding two arrays of integers. For example:
+The complexity of the patching hasn't increased greatly, and by replacing `+` with `vexpr` (vector expression), we can process entire lists of values. The patch will also still work with single value inputs (scalars), making it much more flexible than what we had originally. While the patching itself is not too complicated, knowing about the existence of these objects and how they work is not greatly emphasised in the way visual programming is taught, which is often concerned entirely with _streams_ of scalar values. Furthermore, while this is just a toy example, it demonstrates that the notion of processing groups of things together is not the **first** and **most obvious** way to patch. Conversely, SuperCollider has the language (sclang) which offers built-in methods for dealing with arrays of data. Most operators are also _overloaded_, meaning the interface for adding two integers is roughly the same as adding two arrays of integers. For example:
diff --git a/src/routes/(content)/learn/bufnmf/+page.svx b/src/routes/(content)/learn/bufnmf/+page.svx
index b0fc900f..de6b621e 100644
--- a/src/routes/(content)/learn/bufnmf/+page.svx
+++ b/src/routes/(content)/learn/bufnmf/+page.svx
@@ -42,7 +42,7 @@ As you can see (and hear in the playback), these different drum hits overlap qui
 This is where non-negative matrix factorisation can be quite useful.

-We can use BufNMF to analyze the drum loop’s spectrogram and determine which parts of the spectrum always (or often) occur together in time. For example, in the snare hits (the green boxes) you can see the different partials of the drum head’s resonant tone (the horizontal bands from ~300 Hz to ~2000 Hz) always occurring together. Sometimes there is also some cymbal sound happening (above that in a red box), but because the cymbal sound is often not there, NMF will try to decompose the spectrogram in such a way that these can be considered separate components. In general, the more that different sound objects are separated out in time and the more repetitions of each sound object, the better BufNMF will be at identifying and separating them.
+We can use BufNMF to analyze the drum loop’s spectrogram and determine which parts of the spectrum always (or often) occur together in time. For example, in the snare hits (the blue boxes) you can see the different partials of the drum head’s resonant tone (the horizontal bands from ~300 Hz to ~2000 Hz) always occurring together. Sometimes there is also some cymbal sound happening (above that in a yellow box), but because the cymbal sound is often not there, NMF will try to decompose the spectrogram in such a way that these can be considered separate components. In general, the more that different sound objects are separated out in time and the more repetitions of each sound object, the better BufNMF will be at identifying and separating them.

 Processing this mono drum loop with BufNMF (specifying that we want it to try to decompose the sound into three components) returns these three buffers, one with each component it was able to separate:
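Since the page above leans on what the factorisation does rather than how, here is a minimal Python sketch of the algorithm family BufNMF belongs to: classic multiplicative-update NMF, approximating a non-negative matrix `V` (a magnitude spectrogram) as `W @ H`. This is an illustration under our own assumptions, not FluCoMa's implementation, and the random matrix standing in for a spectrogram is ours:

```python
# Toy NMF via multiplicative updates: V (bins x frames) ~= W (bases) @ H (activations).
import numpy as np

def nmf(V, components=3, iterations=200, eps=1e-9):
    rng = np.random.default_rng(0)
    bins, frames = V.shape
    W = rng.random((bins, components))    # spectral bases (one spectrum per component)
    H = rng.random((components, frames))  # temporal activations (one envelope per component)
    for _ in range(iterations):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update activations, staying non-negative
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update bases, staying non-negative
    return W, H

# stand-in spectrogram: 513 bins x 100 frames of non-negative values
V = np.abs(np.random.default_rng(1).standard_normal((513, 100)))
W, H = nmf(V, components=3)
print(np.linalg.norm(V - W @ H))  # reconstruction error shrinks as iterations increase
```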
diff --git a/src/routes/(content)/learn/decomposition-overview/+page.svx b/src/routes/(content)/learn/decomposition-overview/+page.svx
index 7c84b510..fcd89e2e 100644
--- a/src/routes/(content)/learn/decomposition-overview/+page.svx
+++ b/src/routes/(content)/learn/decomposition-overview/+page.svx
@@ -14,7 +14,7 @@ featuredimage: /img/flucoma-magical.jpg
 ## Introduction

-One of the strongest arms of the toolkit is its *decomposition* capabilities. If you’re not familiar with the term, the basic idea is that any mixed sound is likely to be made up of many other sounds. Unfortunately it can be really hard to find and listen to those “hidden” sounds so decomposition offers some strategies, algorithms and manoeuvres that help you discover and explore the internal structure of a sound.
+One of the strongest arms of the toolkit is its *decomposition* capabilities. If you’re not familiar with the term, the basic idea is that any mixed sound is likely to be made up of many other sounds. Unfortunately it can be really hard to find and listen to those “hidden” sounds, so decomposition offers some strategies, algorithms and manoeuvres that help you discover and explore the internal structure of a sound.

 Here on the team, we tend to think of there being two types of objects that deal with this process distinctly.
@@ -38,4 +38,4 @@ These objects take sounds and *deconstruct* them, returning to you a set of new
 Separation techniques often open the musical question: “What are you hoping to find?”. Sometimes these objects can serve a direct and clear purpose such as wanting to extract a specific sound from a mixed source. On the other hand, they can also function as speculative tools, and can be used to listen to what might be lurking in the mixture. In this vein, there is a wonderful object called [BufNMF](/reference/bufnmf). Perhaps one of, if not THE most powerful tool in the toolkit, on its own it can be used for “blind source separation”, classification and feature learning amongst many other things. We have a great set of articles talking about how you can use [BufNMF](/reference/bufnmf), such as [Seeding BufNMF with Bases and Activations](/learn/seeding-nmf), and [Audio Decomposition Using BufNMF](/learn/bufnmf).

-Lastly is the world of hybridisation through separation. Imagine you want to cross the bright timbre of a hi-hat with the envelope of a synthesiser. This is where objects like [AudioTransport](/reference/audiotransport) and [BufNMFCross](/reference/bufnmfcross) come into play. These are definitely not easy points of entry though so try out the other objects first!
\ No newline at end of file
+Lastly is the world of hybridisation through separation. Imagine you want to cross the bright timbre of a hi-hat with the envelope of a synthesiser. This is where objects like [AudioTransport](/reference/audiotransport) and [BufNMFCross](/reference/bufnmfcross) come into play. These are definitely not easy points of entry though, so try out the other objects first!
diff --git a/src/routes/(content)/learn/mlp-parameters/+page.svx b/src/routes/(content)/learn/mlp-parameters/+page.svx
index fb0d8e9c..2dadbd1a 100644
--- a/src/routes/(content)/learn/mlp-parameters/+page.svx
+++ b/src/routes/(content)/learn/mlp-parameters/+page.svx
@@ -39,7 +39,7 @@ Because changing `hiddenLayers` changes the internal structure of the neural net
 Generally speaking, the larger the internal structure of a neural network is, the more data points it needs to train. A general-principle is to have about ten times as many data points as there are internal parameters. The number of internal parameters in a neural network is the total number of weights + the total number of biases. The total number of weights equals the sum of the products of the sizes of each pair of adjacent layers. The total number of biases is equal to the number of hidden neurons + the number of output neurons.

-For example, consider a neural network with 13 input neurons, two hidden layers of 5 and 4 neurons, and an output layer of 3 neurons. The total number of weights is `( 13 * 5 ) + ( 5 * 4 ) + ( 4 * 3 ) = 97`. The total number of biases is `5 + 4 + 3 = 12`. Which makes the total number of parameters `97 + 12 = 109`. The general-principle suggests that it would be good to have around 1,000 data points (about ten times as many) to successfully train this neural network. In practice, a neural network of this size may be able to learn from many fewer points. It completely depends on how clearly patterns in the data are represented. This general-principle may become important if your training is not going well--consider how many point you have and how that compares to how may internal parameters are in your neural nentwork. Based on these numbers consider changing the size of your neural network or using a different number of data points and see if your neural network improves.
+For example, consider a neural network with 13 input neurons, two hidden layers of 5 and 4 neurons, and an output layer of 3 neurons. The total number of weights is `( 13 * 5 ) + ( 5 * 4 ) + ( 4 * 3 ) = 97`. The total number of biases is `5 + 4 + 3 = 12`. Which makes the total number of parameters `97 + 12 = 109`. The general-principle suggests that it would be good to have around 1,000 data points (about ten times as many) to successfully train this neural network. In practice, a neural network of this size may be able to learn from many fewer points. It completely depends on how clearly patterns in the data are represented. This general-principle may become important if your training is not going well--consider how many points you have and how that compares to how many internal parameters are in your neural network. Based on these numbers, consider changing the size of your neural network or using a different number of data points and see if your neural network improves.
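The parameter arithmetic in the hunk above is easy to script; this short Python sketch (ours, purely illustrative) reproduces the `97 + 12 = 109` count and the rule-of-thumb data-point estimate for any list of layer sizes:

```python
# Count internal parameters of an MLP: weights between adjacent layers plus
# one bias per hidden/output neuron. Layer sizes match the example above.
layers = [13, 5, 4, 3]  # inputs, two hidden layers, outputs

weights = sum(a * b for a, b in zip(layers, layers[1:]))  # (13*5)+(5*4)+(4*3) = 97
biases = sum(layers[1:])                                  # 5+4+3 = 12
total = weights + biases

print(weights, biases, total)  # 97 12 109
print(total * 10)              # ~1090 data points by the "ten times" rule of thumb
```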
-During the NNDSVD process, any zeros in the ``soruce`` buffer won't be able to be updated because the process multiplies these values by a scalar (and ``0 * anything = 0``). Therefore, it may be useful to change any zeros to something else before this process. BufNMFSeed has four options for managing values in the ``source`` buffer before processing:
+During the NNDSVD process, any zeros in the ``source`` buffer won't be able to be updated because the process multiplies these values by a scalar (and ``0 * anything = 0``). Therefore, it may be useful to change any zeros to something else before this process. BufNMFSeed has four options for managing values in the ``source`` buffer before processing:

 * **NMF-SVD** Nonnegative Double Singular Value Decomposition where any negative values are first converted to their absolute value. This is likely to be quicker than the other options, but less rigorous. (This is the default.)
 * **NNDSVDar** Nonnegative Double Singular Value Decomposition where any elements that are zero are first filled with a random value between 0 and the ``average * 0.01`` (essentially small random values). This may be slightly faster but slightly less accurate than other options.
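As a rough illustration of the NNDSVDar-style option described above, here is a hedged Python sketch that fills zeros with small random values between 0 and ``average * 0.01``. Multiplicative updates can never move a value away from zero, which is why this pre-fill matters; the function name and interface here are ours, not BufNMFSeed's:

```python
# Replace exact zeros with small random values so a multiplicative process
# (new = old * scalar) is able to update them. Illustrative sketch only.
import numpy as np

def fill_zeros(source):
    rng = np.random.default_rng(0)
    out = source.copy()
    avg = out.mean()
    zeros = out == 0
    out[zeros] = rng.uniform(0, avg * 0.01, size=zeros.sum())  # small random values
    return out

V = np.array([[0.0, 2.0], [4.0, 0.0]])
print(fill_zeros(V))  # the two zeros become tiny positive numbers
```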
diff --git a/src/routes/(content)/reference/bufscale/+page.svx b/src/routes/(content)/reference/bufscale/+page.svx
index 63dd4589..4ef99b82 100644
--- a/src/routes/(content)/reference/bufscale/+page.svx
+++ b/src/routes/(content)/reference/bufscale/+page.svx
@@ -13,6 +13,6 @@ category: Utility Objects

-The [BufScale]() object scaling the value of data stored in a buffer and copies it to a new buffer. The interface is relatively lean: all you provide are minima and maxima for the input and output. Optionally you can clip the output, such that after the scaling is performed the numbers never exceed the provided output range. Use the widget below to see how the scaling process processes data.
+The [BufScale]() object scales the values of data stored in a buffer and copies them to a new buffer. The interface is relatively lean: all you provide are minima and maxima for the input and output. Optionally you can clip the output, such that after the scaling is performed the numbers never exceed the provided output range. Use the widget below to see how the scaling process transforms the data.

-
\ No newline at end of file
+
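The mapping described above is plain linear scaling from an input range to an output range, with optional clipping. A minimal sketch, assuming the usual formula and illustrative parameter names (the object's actual interface is documented on its reference page):

```python
# Map x from [in_min, in_max] to [out_min, out_max]; optionally clip the result.
# Assumes in_min != in_max and out_min <= out_max for simplicity.
def scale(x, in_min, in_max, out_min, out_max, clip=False):
    y = (x - in_min) / (in_max - in_min) * (out_max - out_min) + out_min
    if clip:
        y = max(out_min, min(out_max, y))  # never exceed the output range
    return y

print(scale(0.5, 0.0, 1.0, -1.0, 1.0))             # 0.0
print(scale(2.0, 0.0, 1.0, -1.0, 1.0, clip=True))  # 1.0 (clipped)
```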
diff --git a/src/routes/(content)/reference/bufstft/+page.svx b/src/routes/(content)/reference/bufstft/+page.svx
index 720f0ea4..293e776f 100644
--- a/src/routes/(content)/reference/bufstft/+page.svx
+++ b/src/routes/(content)/reference/bufstft/+page.svx
@@ -27,12 +27,12 @@ Because a wealth of [resources](/learn/fourier-transform) explaining the Fourier
 ## Re-synthesising

 One reason the STFT is so commonly used is that it is possible to transform the information from the frequency domain back into the time domain. For any given FFT frame, if no magnitudes or phases are changed at all, it is possible to perfectly reconstruct the original signal of `windowSize` samples. We can then sum together our overlapping segments of `windowSize` samples (spaced out by `hopSize` samples) to reconstruct the original buffer (this process is called [overlap-add](https://en.wikipedia.org/wiki/Overlap%E2%80%93add_method)). BufSTFT can perform this by supplying a buffer for both `magnitude` and `phase`, specifying `inverse` = 1, and passing a buffer to `resynth` for writing the re-synthesized signal into.

-When we make our segments, it is common to apply an envelope to them, called a [window function](https://en.wikipedia.org/wiki/Window_function). Different window functions have different requirements for being able to provide perfect reconstruction, usually specifying what `hopSize` to use in relation to a `windowSize` (also called an overlap factor which is `windowSize` / `hopSize`). FluCoMa Uses a Hann Window, for which, the overlap factor should be an integer greater than or equal to 2 (the `windowSize` should be at least twice as big as the `hopSize`). This is why FluCoMa's default `hopSize` of -1 indicates to use a `hopSize` equal to `windowSize` / 2 (which is an overlap factor of 2).
+When we make our segments, it is common to apply an envelope to them, called a [window function](https://en.wikipedia.org/wiki/Window_function). Different window functions have different requirements for being able to provide perfect reconstruction, usually specifying what `hopSize` to use in relation to a `windowSize` (also called an overlap factor, which is `windowSize` / `hopSize`). FluCoMa uses a Hann window, for which the overlap factor should be an integer greater than or equal to 2 (the `windowSize` should be at least twice as big as the `hopSize`). This is why FluCoMa's default `hopSize` of -1 indicates to use a `hopSize` equal to `windowSize` / 2 (which is an overlap factor of 2).

 ## Changing Values in the Frequency Domain

-As said above, if no magnitudes or phases are changed at all, the original signal can be re-created using an inverse FFT (and an inverse STFT if there are may windows of analysis). However, if even one magnitude or phase is modified the original cannot be exactly reconstructed. This may be useful, for example, modifying some of the magnitudes before re-synthesis can adjust the loudness of a band of frequencies in the original signal (sometimes called spectral filtering). Generally speaking this is how many of the spectral processing algorithms in FluCoMa operate ([AudioTransport](/reference/audiotransport), [BufNMF](/reference/bufnmf), [HPSS](/reference/hpss), and more).
+As said above, if no magnitudes or phases are changed at all, the original signal can be re-created using an inverse FFT (and an inverse STFT if there are many windows of analysis). However, if even one magnitude or phase is modified the original cannot be exactly reconstructed. This can be useful: for example, modifying some of the magnitudes before re-synthesis can adjust the loudness of a band of frequencies in the original signal (sometimes called spectral filtering). Generally speaking, this is how many of the spectral processing algorithms in FluCoMa operate ([AudioTransport](/reference/audiotransport), [BufNMF](/reference/bufnmf), [HPSS](/reference/hpss), and more).

 One common artefact of these adjustments is _spectral smearing_, which is when the time position of an event within an analysis window becomes less clear. When all the analysis windows of an inverse STFT contain spectral smearing, it gives the signal a blurry, or chorus-y sound. If the analysis window contains a transient that is "smeared" it can be called "pre-ring", meaning that the spectral sound of the transient can be heard _prior_ to where it originally occurred in the analysis window. It sometimes sounds like a brief crescendo to the transient, caused by the fade-in of the window function.
@@ -46,4 +46,4 @@ label="Spectrogram of a section of the Nicol drum loop with no evidence of smear
-/>
\ No newline at end of file
+/>
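To see why a Hann window with an overlap factor of 2 permits reconstruction, here is a small Python sketch of the windowed-analysis / overlap-add round trip described above. It is illustrative only (FluCoMa's internals may differ); the key property is that a periodic Hann window at 50% overlap sums to a constant:

```python
# Analysis windows -> FFT -> inverse FFT -> overlap-add. With untouched
# magnitudes/phases, the interior of the signal is recovered exactly
# (up to floating-point error); the edges lack their overlapping partners.
import numpy as np

window_size, hop_size = 1024, 512  # overlap factor = windowSize / hopSize = 2
w = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(window_size) / window_size)  # periodic Hann

x = np.random.default_rng(0).standard_normal(8192)
y = np.zeros_like(x)

for start in range(0, len(x) - window_size + 1, hop_size):
    frame = x[start:start + window_size] * w   # windowed segment
    spectrum = np.fft.rfft(frame)              # to the frequency domain...
    frame_back = np.fft.irfft(spectrum)        # ...and straight back, unchanged
    y[start:start + window_size] += frame_back # overlap-add

print(np.allclose(x[window_size:-window_size], y[window_size:-window_size]))  # True
```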
diff --git a/src/routes/(content)/reference/kdtree/+page.svx b/src/routes/(content)/reference/kdtree/+page.svx
index 37ab13a3..f843e164 100644
--- a/src/routes/(content)/reference/kdtree/+page.svx
+++ b/src/routes/(content)/reference/kdtree/+page.svx
@@ -14,7 +14,7 @@ category: Analyse Data
 The KDTree facilitates the rapid querying of data stored in a [DataSet](/reference/dataset). It _can_ make querying by distance more efficient and faster than if you did it by manual lookup. Before a KDTree can be queried it has to be fitted to a [DataSet](/reference/dataset) containing the points that it will look up.

-One use of the KDTree is for doing _nearest neighbour lookups_. The interactive example below contains a collection of data points, whose x and y values range between 0.0 and 1.0, which are plotted on to a canvas-like scatterplot. Using the coordinates of our mouse inside this scatterplot, we can search for the point, or group of points closest to our mouse. We can also constrain the radius of the search: only points within a certain distance of our query point, and not just whichever are highlighted, are highlighted. Before you use the example you will have to `fit` the underlying [KDTree](/reference/kdtree) by pressing the button labelled _fit_.
+One use of the KDTree is for doing _nearest neighbour lookups_. The interactive example below contains a collection of data points, whose x and y values range between 0.0 and 1.0, which are plotted on to a canvas-like scatterplot. Using the coordinates of our mouse inside this scatterplot, we can search for the point, or group of points, closest to our mouse. We can also constrain the radius of the search: only points within a certain distance of our query point, and not just whichever are nearest, are highlighted. Before you use the example you will have to `fit` the underlying [KDTree](/reference/kdtree) by pressing the button labelled _fit_.
@@ -22,4 +22,4 @@ You can change the number of neighbours and the radius constraint with the slide
 ## Why a Kdtree Is Not Always Faster Than a Brute Force Search

-Whilst k-d trees can offer very good performance relative to naïve search algorithms, they suffer from something called ["the curse of dimensionality"](https://en.wikipedia.org/wiki/Curse_of_dimensionality) (like many algorithms for multi-dimensional data). In practice, this means that as the number of dimensions of your data goes up, the relative performance gains of a k-d tree go down.
\ No newline at end of file
+Whilst k-d trees can offer very good performance relative to naïve search algorithms, they suffer from something called ["the curse of dimensionality"](https://en.wikipedia.org/wiki/Curse_of_dimensionality) (like many algorithms for multi-dimensional data). In practice, this means that as the number of dimensions of your data goes up, the relative performance gains of a k-d tree go down.
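For a feel of the radius-constrained nearest-neighbour lookup the kdtree page describes, here is a sketch using SciPy's k-d tree. It is the same idea, not FluCoMa's API; the point count, query position and radius are arbitrary stand-ins for the interactive example:

```python
# Fit a k-d tree to 2D points in [0, 1], query the 5 nearest neighbours of a
# "mouse" position, then keep only those within a search radius.
import numpy as np
from scipy.spatial import cKDTree

points = np.random.default_rng(0).random((200, 2))  # x, y in [0.0, 1.0]
tree = cKDTree(points)                               # the "fit" step

query = [0.5, 0.5]                                   # stand-in mouse coordinates
dists, idx = tree.query(query, k=5)                  # 5 nearest neighbours

radius = 0.1
keep = dists <= radius                               # constrain the search radius
print(idx[keep], dists[keep])                        # only nearby neighbours survive
```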
diff --git a/src/routes/(content)/reference/mds/+page.svx b/src/routes/(content)/reference/mds/+page.svx
index 2be88650..7d4893a7 100644
--- a/src/routes/(content)/reference/mds/+page.svx
+++ b/src/routes/(content)/reference/mds/+page.svx
@@ -26,11 +26,11 @@ First, MDS computes a distance matrix by calculating the distance between every
 Unlike the other dimensionality reduction algorithms, MDS does not have a `fit` or `transform` method, nor does it have the ability to transform data points in buffers. This is essentially because the algorithm needs to do the fit & transform as one process using just the data provided in the source DataSet and therefore incorporating new data points would require a re-fitting of the model.

-What makes MDS more flexible than the other dimensionality reduction algorithms in FluCoMa ([PCA](/reference/pca) and [UMAP](/reference/umap)) is that MDS allows for different measures of distance to be used when computing the distance matrix (see the list below). This allows you to explore different ways of measuring the distance of the data points (i.e., comparing their similarity or difference) during the during the dimensionality reduction process. Exploring different measures of difference may create different musical relationships between points in the data.
+What makes MDS more flexible than the other dimensionality reduction algorithms in FluCoMa ([PCA](/reference/pca) and [UMAP](/reference/umap)) is that MDS allows for different measures of distance to be used when computing the distance matrix (see the list below). This allows you to explore different ways of measuring the distance of the data points (i.e., comparing their similarity or difference) during the dimensionality reduction process. Exploring different measures of difference may create different musical relationships between points in the data.

 ## Comparing Measures of Distance

-Below are plots different two dimensional representations of the same MFCC analyses (originally in 13 dimensions) using all the distance metrics available in MDS. The colour is arbitrarily assigned so that you can track the location changes of each point in space.
+Below are different plots of two-dimensional representations of the same MFCC analyses (originally in 13 dimensions) using all the distance metrics available in MDS. The colour is arbitrarily assigned so that you can track the location changes of each point in space.
diff --git a/src/routes/(content)/reference/melbands/+page.svx b/src/routes/(content)/reference/melbands/+page.svx
index 44f4dc7f..413b538b 100644
--- a/src/routes/(content)/reference/melbands/+page.svx
+++ b/src/routes/(content)/reference/melbands/+page.svx
@@ -26,7 +26,7 @@ caption='An illustration of the different steps involved in computing Melbands,
 In order to create the Mel-Frequency Spectrum, first an FFT is computed on a window of the signal. Next, a series of overlapping triangle filters are applied to the magnitudes of the FFT spectrum: each magnitude in the FFT spectrum is multiplied by its corresponding value in each triangle filter. These products are then added up to produce a weighted sum of FFT magnitudes for each triangle filter, which is the magnitude of the corresponding MelBand. Since most of the values in these triangle filters are zero (everywhere except where the triangle rises up) each filter only sums the magnitudes in a certain frequency band, so each MelBand magnitude is a representation of the overall energy in that frequency band. The MelBands are perceived as being equally spaced because as they get higher in frequency, they cover more FFT bins (more hertz), mirroring how humans perceive pitch relationships (higher frequency pitches have more hertz between musical intervals).

-One can choose whether to `normalize` triangle filters used to compute the MelBands. By default (`normalize` = 1), normalization will account for how many FFT magnitudes each triangle filter sums together by dividing the sum by the number of contributing FFT magnitudes, essentially averaging their contributions. When the triangle filters are not normalized (`normalize` = 0), averaging does not occur and the magnitudes of the higher MelBands end up being larger values (disproportionately so) becuase they're summing together more FFT magnitudes. In the charts below, notice how different the unnormalized and normalized triangle filters are and when normalized, how much larger the magnitudes in the higher MelBands become.
+One can choose whether to `normalize` the triangle filters used to compute the MelBands. By default (`normalize` = 1), normalization will account for how many FFT magnitudes each triangle filter sums together by dividing the sum by the number of contributing FFT magnitudes, essentially averaging their contributions. When the triangle filters are not normalized (`normalize` = 0), averaging does not occur and the magnitudes of the higher MelBands end up being larger values (disproportionately so) because they're summing together more FFT magnitudes. In the charts below, notice how different the unnormalized and normalized triangle filters are and, when unnormalized, how much larger the magnitudes in the higher MelBands become.

 ### Normalized Triangle Filters (the default)
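Here is a compressed Python sketch of the triangle-filtering step described above, including the divide-by-number-of-contributing-magnitudes normalization. The band count, frequency range and random stand-in magnitudes are arbitrary choices of ours, not MelBands' defaults:

```python
# Each mel band = weighted sum of FFT magnitudes under one triangle filter;
# normalization divides that sum by the number of contributing magnitudes.
import numpy as np

def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)

sr, n_fft, n_bands = 44100, 1024, 10
freqs = np.fft.rfftfreq(n_fft, 1 / sr)  # centre frequency of each FFT bin
edges = mel_to_hz(np.linspace(hz_to_mel(20), hz_to_mel(sr / 2), n_bands + 2))

mags = np.abs(np.random.default_rng(0).standard_normal(freqs.size))  # stand-in FFT magnitudes
bands = np.zeros(n_bands)
for i in range(n_bands):
    lo, centre, hi = edges[i], edges[i + 1], edges[i + 2]
    tri = np.minimum(np.clip((freqs - lo) / (centre - lo), 0, 1),
                     np.clip((hi - freqs) / (hi - centre), 0, 1))  # one triangle filter
    bands[i] = (tri * mags).sum() / np.count_nonzero(tri)  # normalized weighted sum
print(bands)
```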
diff --git a/src/routes/(content)/reference/noveltyslice/+page.svx b/src/routes/(content)/reference/noveltyslice/+page.svx
index 980f9db3..419457da 100644
--- a/src/routes/(content)/reference/noveltyslice/+page.svx
+++ b/src/routes/(content)/reference/noveltyslice/+page.svx
@@ -18,7 +18,7 @@ category: Slice Sound
 NoveltySlice provides a broad concept for thinking about how we might be able to say _this_ bit of a sound is different from _that_ bit. It's useful for slicing when we want a more general basis for distinguishing between segments than looking for the start of well defined events with [onsets](/reference/onsetslice), [transients](/reference/transientslice) or changes in the envelope. It can be especially useful when we're interested in making longer slices than you might get with these typically more finely-grained methods.

-There could be many ways of coming up with a measure of novelty. The one that we use in the Fluid Corpus Manipulation Toolkit works by constructing a map of how different each chunk of a signal is to every other chunk. To do this, we transform the sound into the spectral domain using an STFT, meaning that each chunk can now represented by the magnitudes in each bin. How similar each chunk is to another can then be estimated using a distance measure, and we end up with a grid or [self-similarity matrix (SSM)](https://en.wikipedia.org/wiki/Self-similarity_matrix) that maps the difference between each point in time to every other point in time. A SSM looks like this when we visualise it. Reading it might be somewhat overwhelming, but the basic gist is that for any time point on either axis, we can "lookup" another point in time to see if it is similar. In this particular example, we can see that the structure of the audio is very repetitive, denoted by the almost unbroken diagonal lines.
+There could be many ways of coming up with a measure of novelty. The one that we use in the Fluid Corpus Manipulation Toolkit works by constructing a map of how different each chunk of a signal is to every other chunk. To do this, we transform the sound into the spectral domain using an STFT, meaning that each chunk can now be represented by the magnitudes in each bin. How similar each chunk is to another can then be estimated using a distance measure, and we end up with a grid or [self-similarity matrix (SSM)](https://en.wikipedia.org/wiki/Self-similarity_matrix) that maps the difference between each point in time to every other point in time. An SSM looks like this when we visualise it. Reading it might be somewhat overwhelming, but the basic gist is that for any time point on either axis, we can "lookup" another point in time to see if it is similar. In this particular example, we can see that the structure of the audio is very repetitive, denoted by the almost unbroken diagonal lines.
@@ -28,7 +28,7 @@ For an interactive example of an SSM see [this website](https://colinmorris.gith
-What we're interested in is how much "novelty" appears to be present from one moment to the next, in other words, over a given window of time how similar is "now" to the past. We can find this out from our SSM by adding together all the differences in a window around a given moment, and making a "novelty curve". Then, we can estimate likely places to make slices by looking for peaks in this curve. If we're interested in longer slices, one thing we can do is to make the time window that we sum together larger (the _kernel size_). Additionally, we can apply smoothing to the novelty curve to suppresses smaller / shorter peaks and focus instead on larger / longer ones. Below is a depicition of a novelty curve, computed on the self-similarity matrix you can see above with a kernel size of 41. The detected peaks (read slice points) that come out of this are marked in red.
+What we're interested in is how much "novelty" appears to be present from one moment to the next, in other words, over a given window of time how similar is "now" to the past. We can find this out from our SSM by adding together all the differences in a window around a given moment, and making a "novelty curve". Then, we can estimate likely places to make slices by looking for peaks in this curve. If we're interested in longer slices, one thing we can do is to make the time window that we sum together larger (the _kernel size_). Additionally, we can apply smoothing to the novelty curve to suppress smaller / shorter peaks and focus instead on larger / longer ones. Below is a depiction of a novelty curve, computed on the self-similarity matrix you can see above with a kernel size of 41. The detected peaks (read: slice points) that come out of this are marked in red.
@@ -46,4 +46,4 @@ blurb={'The original paper describing the novelty algorithm'}
 title={'Novelty-Based Segmentation'}
 blurb={'A technically oriented exploration of novelty-based segmentation using Python code'}
 url={'https://www.audiolabs-erlangen.de/resources/MIR/FMP/C4/C4S4_NoveltySegmentation.html'}
-/>
\ No newline at end of file
+/>
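The SSM-plus-novelty-curve recipe above can be sketched in a few lines of Python. This is a deliberately simplified stand-in (a plain summed window along the diagonal and naive peak picking, rather than the exact kernel and smoothing the toolkit uses), with random frames in place of real STFT magnitudes:

```python
# Self-similarity matrix -> novelty curve -> candidate slice points.
import numpy as np
from scipy.spatial.distance import cdist

frames = np.abs(np.random.default_rng(0).standard_normal((200, 513)))  # stand-in STFT magnitudes

ssm = cdist(frames, frames)  # distance between every pair of frames (Euclidean)

kernel = 41                  # bigger kernel -> longer-scale structure, longer slices
half = kernel // 2
novelty = np.array([ssm[t - half:t + half, t - half:t + half].sum()  # differences around "now"
                    for t in range(half, len(frames) - half)])

peaks = [t for t in range(1, len(novelty) - 1)
         if novelty[t] > novelty[t - 1] and novelty[t] > novelty[t + 1]]  # local maxima
print(peaks[:10])
```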
diff --git a/src/routes/(content)/reference/pca/+page.svx b/src/routes/(content)/reference/pca/+page.svx
index e6f0ef6b..05933d37 100644
--- a/src/routes/(content)/reference/pca/+page.svx
+++ b/src/routes/(content)/reference/pca/+page.svx
@@ -39,7 +39,7 @@ As this is often how PCA is used, FluidPCA allows the user to specify the number
 PCA "whitening" can be turned on by setting the `whiten` parameter to 1. "Whitening" ensures that not only will the principal components be uncorrelated, but also they will all have unit variance (they'll all have a standard deviation of 1). "Whitening" is sometimes also referred to as `sphering`, meaning that it's making the data "spherical" in the sense that each principal component (each new dimension) has roughly the same spread, just as a sphere has the same width in each of its three dimensions.

-Whitening looses some information during the transformation process (essentially, the relative scales of the principal components), however, whitening may improve the accuracy of further computations such as classification. If there is noise in the data, whitening might amplify this noise by scaling it to be the same size as other other information, making it as prominent as the relevant information, therefore whitening will not _always_ improve data.
+Whitening loses some information during the transformation process (essentially, the relative scales of the principal components); however, whitening may improve the accuracy of further computations such as classification. If there is noise in the data, whitening might amplify this noise by scaling it to be the same size as other information, making it as prominent as the relevant information; therefore, whitening will not _always_ improve the data.

 ## Related Resources
diff --git a/src/routes/(content)/reference/skmeans/+page.svx b/src/routes/(content)/reference/skmeans/+page.svx
index 14a38cb8..e948ddf3 100644
--- a/src/routes/(content)/reference/skmeans/+page.svx
+++ b/src/routes/(content)/reference/skmeans/+page.svx
@@ -20,7 +20,7 @@ category: Analyse Data
 1. Instead of taking the Euclidean distance to each center of each cluster like [KMeans](/reference/kmeans) does, SKMeans takes the angular distance of the normalised vector in high dimension. This is also named cosine similarity.

-2. Once the cluster are found, the distance are encoded with an 'alpha' function, in effect promoting a sparser representations where smaller similarities are penalised.
+2. Once the clusters are found, the distances are encoded with an 'alpha' function, in effect promoting sparser representations where smaller similarities are penalised.

 In effect, SKMeans is mostly used to learn features in a higher-dimensional space than the original data, with the assumption that it would help untangle near clusters.
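The distance swap in point 1 above is easy to see numerically: cosine similarity compares direction only, while Euclidean distance also sees magnitude. A small sketch (our own toy vectors, not SKMeans' internals):

```python
# Cosine similarity vs Euclidean distance on toy vectors.
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a, twice the magnitude
c = np.array([3.0, -1.0, 0.5])  # a different direction

print(cosine_similarity(a, b))  # 1.0: identical angle despite different lengths
print(cosine_similarity(a, c))  # ~0.21: pointing somewhere else
print(np.linalg.norm(a - b))    # Euclidean distance sees a and b as far apart
```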
diff --git a/src/routes/(content)/reference/standardize/+page.svx b/src/routes/(content)/reference/standardize/+page.svx
index b011d1e4..5db20790 100644
--- a/src/routes/(content)/reference/standardize/+page.svx
+++ b/src/routes/(content)/reference/standardize/+page.svx
@@ -14,9 +14,9 @@ category: Analyse Data
 import Image from '$lib/components/Image.svelte';

-Standardized data means that every dimension in the data set has a _mean_ of 0 and a [standard deviation](/reference/bufstats#standard-deviation) of 1. This scaling can be useful for many machine learning algorithms, but since most data in the wild does not fit this criteria, standardizing data is often employed. Standardizing data (and scaling data generally) can also be important for transforming a [DataSet](/reference/dataset) so that all of the dimensions have similar ranges. With similar ranges, the distance metrics (such as euclidian distance) used by many machine learning algorithms will similarly weight each of the dimensions when calculating now near or far (similar or dissimilar) two data points are from each other.
+Standardized data means that every dimension in the data set has a _mean_ of 0 and a [standard deviation](/reference/bufstats#standard-deviation) of 1. This scaling can be useful for many machine learning algorithms, but since most data in the wild does not fit this criterion, standardizing data is often employed. Standardizing data (and scaling data generally) can also be important for transforming a [DataSet](/reference/dataset) so that all of the dimensions have similar ranges. With similar ranges, the distance metrics (such as Euclidean distance) used by many machine learning algorithms will similarly weight each of the dimensions when calculating how near or far (similar or dissimilar) two data points are from each other.

-Using standardization to scale data implies the assumption that the data is generally [normally distributed](/learn/distribution). Small data sets derived from audio analyses are often not normally distributed and therefore standarization might not be the best choice for scaling. It maybe useful to test other scalers ([Normalize](/reference/normalize) or [RobustScale](/reference/robustscale)) and see which provides the best musical results.
+Using standardization to scale data implies the assumption that the data is generally [normally distributed](/learn/distribution). Small data sets derived from audio analyses are often not normally distributed and therefore standardization might not be the best choice for scaling. It may be useful to test other scalers ([Normalize](/reference/normalize) or [RobustScale](/reference/robustscale)) and see which provides the best musical results.
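Standardization itself is one operation per dimension: subtract the mean, divide by the standard deviation. A quick Python sketch (toy data of ours) confirming the mean-0 / std-1 property across dimensions with wildly different ranges:

```python
# Standardize each column of a data set: (x - mean) / std, column by column.
import numpy as np

data = np.random.default_rng(0).random((100, 3)) * [1.0, 100.0, 0.01]  # very different ranges

standardized = (data - data.mean(axis=0)) / data.std(axis=0)

print(standardized.mean(axis=0).round(6))  # ~0 in every dimension
print(standardized.std(axis=0).round(6))   # 1 in every dimension
```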