README/docs update

maum-ai · Jul 27, 2021 · 12f855c
1 parent e79cc5e
commit 12f855c

Showing 3 changed files with 14 additions and 9 deletions.
diff --git a/.gitignore b/.gitignore
@@ -4,6 +4,7 @@ docs/.DS_Store
 */*/*/.DS_Store
 */*/*/*/.DS_Store
 __pycache__/*
+*.swp
 
 
 
diff --git a/README.md b/README.md
@@ -3,8 +3,7 @@
 **NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling**<br>
 Junhyeok Lee, Seungu Han @ [MINDsLab Inc.](https://github.com/mindslab-ai), SNU
 
-Paper(arXiv): https://arxiv.org/abs/2104.02321 (Accepted to INTERSPEECH 2021)<br>
-Audio Samples: https://mindslab-ai.github.io/nuwave<br>
+[![arXiv](https://img.shields.io/badge/arXiv-2104.02321-brightgreen.svg?style=flat-square)](https://arxiv.org/abs/2104.02321) [![GitHub Repo stars](https://img.shields.io/github/stars/mindslab-ai/nuwave?color=yellow&label=NU-Wave&logo=github&style=flat-square)](https://github.com/mindslab-ai/nuwave) [![githubio](https://img.shields.io/badge/GitHub.io-audio_samples-blue?logo=Github&style=flat-square)](https://mindslab-ai.github.io/nuwave/)
 
 Official Pytorch+[Lightning](https://github.com/PyTorchLightning/pytorch-lightning) Implementation for NU-Wave.<br>
 
@@ -22,14 +21,14 @@ Update: torch.log --> torch.log10 on lsd, value and lsd formula in the paper is
 Before running our project, you need to download and preprocess dataset to `.pt` files
 1. Download [VCTK dataset](https://datashare.ed.ac.uk/handle/10283/3443)
 2. Remove speaker `p280` and `p315`
-3. Modify path of downloaded dataset `data:dir` in `hparameters.yaml`
+3. Modify path of downloaded dataset `data:dir` in `hparameter.yaml`
 4. run `utils/wav2pt.py`
 ```shell script
 $ python utils/wav2pt.py
 ```
 
 ## Training
-1. Adjust `hparameters.yaml`, especially `train` section.
+1. Adjust `hparameter.yaml`, especially `train` section.
 ```yaml
 train:
   batch_size: 18 # Dependent on GPU memory size

diff --git a/docs/index.html b/docs/index.html
@@ -1,16 +1,20 @@
 <html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
 
+<div class="container" style="max-width:1500px;">
+
   <title>Audio samples for "NU-Wave: A Diffusion model for Neural Audio Upsampling"</title>
 
   </head>
   <h2>Audio samples for "NU-Wave: A Diffusion model for Neural Audio Upsampling"</h2>
 
     </article>
-    <div><p><b>Paper:</b> <a href="https://arxiv.org/abs/2104.02321">arXiv:2104.02321</a> (Accepted to INTERSPEECH 2021)</p></div>
-    <div><p><b>Code(available soon):</b> <a href="https://github.com/mindslab-ai/nuwave">mindslab-ai/nuwave @ GitHub</a>
-      <iframe src="https://ghbtns.com/github-btn.html?user=mindslab-ai&repo=nuwave&type=star&count=true" frameborder="0" scrolling="0" width="150" height="20" title="GitHub"></iframe>
-     </p></div>
-    <div><p><b>Authors:</b> Junhyeok Lee, Seungu Han @<a href="https://mindslab.ai">MINDsLab Inc.</a>, SNU</p></div>
+    <p>
+    <a href="https://arxiv.org/abs/2104.02321" rel="nofollow"><img src="https://img.shields.io/badge/arXiv-2104.02321-brightgreen.svg?style=flat-square" style="max-width:100%;"></a>
+    <a href="https://github.com/mindslab-ai/nuwave"><img src="https://img.shields.io/github/stars/mindslab-ai/nuwave?color=yellow&amp;label=NU-Wave&amp;logo=github&amp;style=flat-square" style="max-width:100%;"></a> 
+    <a href="https://www.interspeech2021.org/" rel="nofollow"><img src="https://img.shields.io/badge/Accepted-INTERSPEECH%202021-blue?style=flat-square" style="max-width:100%;"></a></p>
+
+
+    <div><p><b>Authors:</b> <a href="mailto:[email protected]">Junhyeok Lee</a>, <a href="mailto:[email protected]">Seungu Han</a> @<a href="https://mindslab.ai">MINDsLab Inc.</a>, SNU</p></div>
     <div><p><b>Abstract:</b>
         In this work, we introduce NU-Wave, the first neural audio upsampling model to produce waveforms of sampling rate 48kHz from coarse 16kHz or 24kHz inputs, while prior works could generate only up to 16kHz. NU-Wave is the first diffusion probabilistic model for audio super-resolution which is engineered based on neural vocoders. NU-Wave generates high-quality audio that achieves high performance in terms of signal-to-noise ratio (SNR), log-spectral distance (LSD), and accuracy of the ABX test. In all cases, NU-Wave outperforms the baseline models despite the substantially smaller model capacity (3.0M parameters) than baselines (5.4-21%). The audio samples of our model are available at https://mindslab-ai.github.io/nuwave, and the code will be made available soon.
     </p></div>
@@ -287,4 +291,5 @@ <h3> Section &#8547;: Examples for multi speaker (unseen speaker during training
     </table>
     <br> </br>
   </body>
+</div>
 </html>