This repository has been archived by the owner on Jan 3, 2023. It is now read-only.

Evaluating Deep Speech 2 on Mac OSX #56

Open
karllab41 opened this issue Sep 22, 2017 · 35 comments


@karllab41

karllab41 commented Sep 22, 2017

Hello! Thanks for posting this. I'm excited to run speech recognition on files! I've been trying to use Deep Speech 2 for evaluating my denoising pipeline. However, I'm having some trouble with the installation, and most of it is from aeon and the data loading.

When I run:

python evaluate.py --manifest val:$TOPDIR/librispeech/test-clean/test-manifest.csv --model_file $TOPDIR/model/librispeech_16_epochs.prm

I get:

Traceback (most recent call last):
  File "evaluate.py", line 21, in <module>
    from aeon.dataloader import DataLoader
ModuleNotFoundError: No module named 'aeon.dataloader'

Here's how I did my installation. I installed Neon from scratch via the original github page, which I assumed installed aeon. It did, but dataloader apparently was not in that installation. So, I went to the aeon page. The instructions told me to install aeon via:

git clone https://github.com/NervanaSystems/private-aeon.git aeon

That seemed incorrect (private-aeon.git no longer seems to be private), so I cloned the public repository instead:

git clone https://github.com/NervanaSystems/aeon.git aeon

I ran into some C++ problems, so I followed Aeon Issue 48, which got it installed. However, even after installing aeon, I still couldn't import aeon.datasetloader.

import aeon.datasetloader
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-2-2f90034cee08> in <module>()
----> 1 import aeon.datasetloader

ImportError: No module named datasetloader
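
Rather than guessing at module names, one way to see which submodules an installed package actually ships is to enumerate them. This is a minimal sketch, assuming Python 3 and that the top-level package imports at all; `list_submodules` is a hypothetical helper and not part of aeon.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted names of the direct submodules of an importable package."""
    package = importlib.import_module(package_name)
    # Plain modules (no __path__) have no submodules to enumerate.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(info.name for info in pkgutil.iter_modules(search_path))
```

Calling `list_submodules("aeon")` in the same environment that runs evaluate.py would show whether a dataloader-like module is present under some name; it raises ImportError if aeon itself is missing.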

karllab41 changed the title from "Deep Speech 2 on Mac OSX" to "Evaluating Deep Speech 2 on Mac OSX" on Sep 22, 2017
@karllab41
Author

karllab41 commented Sep 23, 2017

Update: I reinstalled aeon from the 0.2.7 release (https://github.com/NervanaSystems/aeon/releases/tag/v0.2.7), and it has datasetloader. aeon still appears to be the problem, because when I run:

python evalrun.py --manifest val:$TOPDIR/librispeech/test-clean/test-manifest.csv --model_file $TOPDIR/model/librispeech_16_epochs.prm

(The paths are correct; I checked.) The error message looks like:

DISPLAY:neon:mklEngine.so not found; falling back to cpu backend
DISPLAY:neon:mklEngine.so not found; falling back to cpu backend
2017-09-22 19:42:54,373 - neon.backends.nervanacpu - WARNING - Problems inferring BLAS info, CPU performance may be suboptimal
2017-09-22 19:42:54,374 - neon.backends - WARNING - deterministic_update and deterministic args are deprecated in favor of specifying random seed
2017-09-22 19:42:54,379 - neon.backends.nervanacpu - WARNING - Problems inferring BLAS info, CPU performance may be suboptimal
Loading model file: /Users/l41admin/Magnolia/deepspeech/model/librispeech_16_epochs.prm
formats: formats: formats: can't open input file `': No such file or directoryformats: can't open input file `': No such file or directorycan't open input file `': No such file or directory
Unable to readdecode_thread_pool exception: number of frames is negative
can't open input file `': No such file or directory


Unable to readUnable to readUnable to readdecode_thread_pool exception: number of frames is negative
decode_thread_pool exception: number of frames is negative
decode_thread_pool exception: number of frames is negative

The data that I'm using comes from running:

python data/ingest_librispeech.py $TOPDIR/librispeech/test-clean $TOPDIR/librispeech/test-clean/transcripts_dir $TOPDIR/librispeech/test-clean/test-manifest.csv

I don't know what I could be doing wrong?
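
The empty file name in "can't open input file `'" may indicate that the loader was handed an empty path, so it can help to check what the manifest really contains. This is a minimal sketch under stated assumptions: each record's first field is the audio path, fields are comma-separated, and relative paths are relative to the manifest's directory. None of this is confirmed for any particular aeon version, and `missing_manifest_paths` is a hypothetical helper.

```python
import os


def missing_manifest_paths(manifest_path, delimiter=","):
    """Return (line_number, path) pairs whose first field is empty or not a file."""
    base_dir = os.path.dirname(os.path.abspath(manifest_path))
    problems = []
    with open(manifest_path) as manifest:
        for line_number, line in enumerate(manifest, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # ASSUMPTION: the first field of each record is the audio file path.
            audio_path = line.split(delimiter)[0].strip()
            # ASSUMPTION: relative paths are resolved against the manifest's directory.
            if os.path.isabs(audio_path):
                resolved = audio_path
            else:
                resolved = os.path.join(base_dir, audio_path)
            if not audio_path or not os.path.isfile(resolved):
                problems.append((line_number, audio_path))
    return problems
```

An empty result means every first field points at an existing file. Header or comment lines, if the manifest format has them, would be reported as problems too, and a tab-separated manifest would need `delimiter="\t"`.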

@wei-v-wang

@karllab41 You are doing everything right. PR #57 addresses the initial problem.
It stems from aeon being bumped to version 1.0.

@karllab41
Author

Thanks, @wei-v-wang.

On the Mac, I haven't gotten evaluation on Librispeech to work with either aeon 0.2.7 or 1.0. Linux has been okay with aeon-0.2.7 (release), neon 2.1, and python 2.7 (though I'd like to try 3...).

Not sure if that helps, but just FYI

@SkyKingCoversGroundTiger

Report: not working on Mac for either training or evaluation (with aeon 1.0).
Will test on Linux later :(

@SkyKingCoversGroundTiger

Report update: tested on Linux (Ubuntu 16.04) with aeon-0.2.7 (release), neon 2.2.0, python 2.7. So far so good.

@wei-v-wang

Thanks @yangroupaomo. neon 2.2.0 should automatically install aeon 1.0. If you activate neon's virtual environment (. .venv/bin/activate) and run "pip list | grep aeon", it should show aeon-1.0, right?
Please feel free to let us know of issues.
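
The same check can be done from inside the interpreter that runs the scripts, which avoids inspecting a different environment by accident. This is a minimal sketch, assuming Python 3.8 or newer for `importlib.metadata`; `installed_version` is a hypothetical helper, and "nervana-aeon" is the distribution name that appears in the install logs in this thread.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None
```

For example, `installed_version("nervana-aeon")` returns None when aeon is not installed in the active environment.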

@karllab41 We released neon 2.2.0, which features our first improvement of DS2 on IA (more improvements to come in future releases). Feel free to try the latest neon as well (git checkout latest, from the neon directory). :)

@saikishor

I am getting many errors while installing aeon. They start after the step "Running setup.py install for nervana-aeon ...":

......
In file included from /home/saikishor/Deepspeech/neon/aeon/src/block_loader_source.hpp:22:

In file included from /home/saikishor/Deepspeech/neon/aeon/src/buffer_batch.hpp:23:

In file included from /usr/include/opencv2/core/core.hpp:58:

/usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/cstddef:51:11: error: no member named 'max_align_t' in the global namespace

using ::max_align_t;

    ~~^

In file included from /home/saikishor/Deepspeech/neon/aeon/src/box.cpp:16:

In file included from /home/saikishor/Deepspeech/neon/aeon/src/box.hpp:19:

In file included from /usr/include/opencv2/core/core.hpp:58:

/usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/cstddef:51:11: error: no member named 'max_align_t' in the global namespace

using ::max_align_t;

    ~~^

In file included from /home/saikishor/Deepspeech/neon/aeon/src/block_manager.cpp:18:

In file included from /home/saikishor/Deepspeech/neon/aeon/src/block_manager.hpp:21:

In file included from /home/saikishor/Deepspeech/neon/aeon/src/buffer_batch.hpp:23:

In file included from /usr/include/opencv2/core/core.hpp:58:

/usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/cstddef:51:11: error: no member named 'max_align_t' in the global namespace

using ::max_align_t;

    ~~^

1 error generated.

clang++ -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -fstack-protector -I/usr/include/opencv -I/usr/include/python2.7 -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c /home/saikishor/Deepspeech/neon/aeon/src/boundingbox.cpp -o build/temp.linux-x86_64-2.7/home/saikishor/Deepspeech/neon/aeon/src/boundingbox.o -O3 -std=c++11 -Werror=return-type -Werror=inconsistent-missing-override -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-padded -Wno-weak-vtables -Wno-global-constructors -Wno-switch-enum -Wno-gnu-zero-variadic-macro-arguments -Wno-undef -Wno-exit-time-destructors -Wno-missing-prototypes -Wno-disabled-macro-expansion -Wno-pedantic -Wno-documentation -Wno-covered-switch-default -Wno-old-style-cast -Wno-unknown-warning-option -Wno-sign-compare -Wno-unused-parameter -Wno-conversion -Wno-float-equal -Wno-duplicate-enum -Wno-used-but-marked-unused -Wno-c++11-compat-deprecated-writable-strings -Wno-deprecated -Wno-double-promotion -DPYTHON_FOUND

error: command 'clang++' failed with exit status 1

1 error generated.

1 error generated.

1 error generated.

1 error generated.

In file included from /home/saikishor/Deepspeech/neon/aeon/src/boundingbox.cpp:17:

In file included from /home/saikishor/Deepspeech/neon/aeon/src/etl_boundingbox.hpp:21:

In file included from /home/saikishor/Deepspeech/neon/aeon/src/interface.hpp:27:

In file included from /home/saikishor/Deepspeech/neon/aeon/src/typemap.hpp:19:

In file included from /usr/include/opencv2/core/core.hpp:58:

/usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/cstddef:51:11: error: no member named 'max_align_t' in the global namespace

using ::max_align_t;

    ~~^

1 error generated.

1 error generated.

1 error generated.

1 error generated.


Cleaning up...
Command /usr/bin/python -c "import setuptools, tokenize;file='/tmp/pip-ybSgIX-build/setup.py';exec(compile(getattr(tokenize, 'open', open)(file).read().replace('\r\n', '\n'), file, 'exec'))" install --record /tmp/pip-lUO1rj-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip-ybSgIX-build
Storing debug log for failure in /home/saikishor/.pip/pip.log

@wei-v-wang

It is easy to miss, but "http://neon.nervanasys.com/index.html/installation.html" has a suggestion for aeon-related issues:
"If you have encountered error messages about failing to install aeon while building neon, please visit aeon page for how to install prerequisites for aeon to enable neon with aeon data loader."

To be clearer: the errors above seem to be related to clang. Could you follow the aeon README to install all prerequisites? aeon is part of neon, but neon does not list (or automatically install) all of aeon's prerequisites.

https://github.com/NervanaSystems/aeon/blob/master/README.md

@saikishor

Yes, the problem is definitely with clang, but it's very hard to resolve. I am using Ubuntu, so the prerequisite installation is only:

apt-get install git clang cmake python-dev python-pip libcurl4-openssl-dev libopencv-dev libsox-dev

This is followed by the normal cmake-based installation of aeon.

Do you have any idea how to solve it?
clang is using the gcc/g++ 4.9 headers.

@saikishor

The clang version is 3.4, and it uses gcc and g++ 4.9 by default.

Is there any possible way to resolve it?

@wei-v-wang

Is there any possibility of upgrading clang from 3.4 to 3.5? A Google search for "max_align_t" turned up suggestions along these lines.

@saikishor

I tried upgrading clang to 3.6 after removing 3.4 and ran the commands again, but ended up with the same errors. I searched for max_align_t on Google and found many clang solutions, but they only cover compiling individual files and don't explain how to apply them to a CMake build.

@wei-v-wang

FYI: NervanaSystems/neon#375 had a suggestion involving libc++.

Can you try the fix from issue 375? Note that the fix there was for a Mac system.

@saikishor

Thanks, I tried, but I have a question: I couldn't find env.sh in the aeon package folder, so I couldn't proceed any further.

@saikishor

I am using Ubuntu 14.04 system

@wei-v-wang

Oh, sorry, issue 375 must have been for the old aeon. Let me take a closer look.
(I am not asking you to upgrade to Ubuntu 16.04.) I will let you know which versions of gcc and clang I am using on Ubuntu 16.04, and we can see whether we can arrive at a solution.

@saikishor

Thank you for your effort, @wei-v-wang.

@wei-v-wang

First, sorry to others that we are discussing neon installation issues on the deepspeech tracker :)

@saikishor I am using a similar system (Ubuntu 16.04) with clang 3.8 and gcc 5.4.
The following shows what a working aeon installation looks like (the aeon-related part of the output log after typing 'make' under neon):

HEAD is now at 7e1af03... Merge for v1.0.0 release. The number changed from v1.0.1 to v1.0.0 because v1.0.0 has never been released.
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is Clang 3.8.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/clang++
-- Check for working CXX compiler: /usr/bin/clang++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Checking for module 'sox'
-- Found sox, version 14.4.1
-- Found CURL: /usr/lib/x86_64-linux-gnu/libcurl.so (found version "7.47.0")
-- Found PythonLibs: /usr/lib/x86_64-linux-gnu/libpython3.5m.so (found version "3.5.2")
-- Found PythonInterp: /usr/bin/python3.5 (found version "3.5.2")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Could NOT find LATEX (missing: LATEX_COMPILER)
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
-- Found Sphinx: /home/weiwang/git/neon/.venv2/bin/sphinx-build
-- Failed to locate breathe executable (missing: BREATHE_EXECUTABLE)
Doxygen not found, skipping documentation
Breathe not found, skipping documentation
Without COVERAGE flag coverage raport is unavailable
-- Configuring done
-- Generating done
-- Build files have been written to: /home/weiwang/git/neon/aeon/build
Processing /home/weiwang/git/neon/aeon/build
Installing collected packages: nervana-aeon
Running setup.py install for nervana-aeon ... done
Successfully installed nervana-aeon-1.0.0

@saikishor

Sorry to others from my side as well.
Installing collected packages: nervana-aeon
Running setup.py install for nervana-aeon ...
My build fails right after this step, and I don't know why. The only difference between our setups is the default gcc version used by clang. I couldn't find clear instructions online on how to set the gcc version that clang uses, so I feel totally cornered at this point.
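
To see which GCC installation clang has picked up, `clang++ -v` on Linux normally prints a "Selected GCC installation:" line. This is a minimal sketch that extracts that line, assuming Python 3.7 or newer and that the installed clang prints it in this form; both function names are hypothetical helpers.

```python
import re
import subprocess


def selected_gcc_installation(clang_verbose_output):
    """Extract the path from clang's 'Selected GCC installation:' line, or None."""
    match = re.search(
        r"^Selected GCC installation:\s*(.+)$", clang_verbose_output, re.MULTILINE
    )
    return match.group(1).strip() if match else None


def query_clang(clang="clang++"):
    """Run `<clang> -v` and return the GCC installation it reports, if any."""
    # clang writes its version banner and toolchain details to stderr.
    result = subprocess.run([clang, "-v"], capture_output=True, text=True)
    return selected_gcc_installation(result.stderr + result.stdout)
```

A result ending in `/4.9` would match the header paths in the error log above, where the failing `cstddef` comes from the GCC 4.9 include directory.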

@wei-v-wang

OK, sorry for the frustrating experience. Here is hopefully a better suggestion.
I just tested on an Ubuntu 14.04 system (VERSION="14.04.5 LTS, Trusty Tahr"), and the following combination also works:

-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is Clang 3.4.0

So can you completely remove gcc-4.9 and install gcc-4.8? Afterwards, can you try installing clang 3.4?
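
Comparing a local build against the combinations reported as working in this thread can be done by reading the compiler identification lines from the CMake output. This is a minimal sketch; `KNOWN_GOOD`, `compiler_ids`, and `matches_known_good` are hypothetical names, and the list only reflects the two pairs mentioned here, not an official compatibility matrix.

```python
import re

# Compiler pairs reported as working in this thread: (C compiler, C++ compiler).
KNOWN_GOOD = {
    (("GNU", "5.4.0"), ("Clang", "3.8.0")),  # reported on Ubuntu 16.04
    (("GNU", "4.8.4"), ("Clang", "3.4.0")),  # reported on Ubuntu 14.04
}

_ID_LINE = re.compile(
    r"^-- The (C|CXX) compiler identification is (\S+) (\S+)\s*$", re.MULTILINE
)


def compiler_ids(cmake_output):
    """Map 'C' and 'CXX' to (vendor, version) from CMake's identification lines."""
    return {
        lang: (vendor, version)
        for lang, vendor, version in _ID_LINE.findall(cmake_output)
    }


def matches_known_good(cmake_output):
    """Return True if the detected compiler pair is one of the reported-good pairs."""
    ids = compiler_ids(cmake_output)
    return (ids.get("C"), ids.get("CXX")) in KNOWN_GOOD
```

A False result does not prove that a build will fail; it only means the pair differs from the two reported here.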

@saikishor

Sure @wei-v-wang. I will try and keep you updated.
I have one more question. I have many versions of gcc installed; should I uninstall all of them? The problem is that when I tried to uninstall gcc-4.9, I was warned that cuda and some GPU components would be uninstalled as well, since they depend on it.

Do you think that replacing all gcc versions with gcc 4.8 will cause no issues for my cuda installation? I will surely give it a try.

@wei-v-wang

I am not sure whether changes to gcc will affect cuda.
Can you try keeping all GCC versions and make gcc4.8 the default in the PATH and LD_LIBRARY_PATH?

@saikishor

Yes sure. I will try to do that and keep you posted.

@saikishor

I gave up on neon. I tried many suggestions from Stack Overflow and other places, but couldn't resolve my issue. Thanks for all your help, @wei-v-wang.

@wei-v-wang

Hi @saikishor I understand your frustration regarding gcc versions and clang version. Did "GNU 4.8.4 and Clang 3.4.0" not help?
Have you tried Docker? Are you willing to hear about setting up neon in docker so it will be an isolated environment?

Also, neon is evolving, and we encourage you to try it again later as we improve the installation experience. However, there will likely remain difficult corner cases to handle, e.g. gcc/clang issues tied to the operating system.

@saikishor

@wei-v-wang Yes, Docker is an option! Where can I find information about setting up neon in Docker?

@wei-v-wang

Hi @saikishor I will contact my team for the instructions on the docker option. Please stay tuned.

@saikishor

saikishor commented Oct 19, 2017

Wow! @wei-v-wang Thanks for your generous help.
I would like to mention that Docker is one of the best options for anyone, as it lets the user train and test on multiple GPUs. The main issue is the heavy CPU load during evaluation, which takes a lot of time to process and makes it hard to use in real-time applications.

Thank you once again for an initiative on docker option.

@wei-v-wang

@saikishor You are welcome.
Have you considered trying training on Intel Xeon Scalable Processor Family (Skylake)? Please keep this in mind as an option/alternative to multi GPU -- you may be surprised to find what multi-Skylake could give you in terms of training processing :)

@saikishor

@wei-v-wang I will surely consider the Intel Xeon Scalable Processor Family (Skylake) for training, but many researchers are very interested in its evaluation performance on CPUs, as this is what will drive neon into real-time applications.

@wei-v-wang

@saikishor Good point! I should have asked you to consider Skylake for both training and inference, :)

@wsokolow

wsokolow commented Oct 19, 2017

Hi @saikishor, below are instructions for setting up and running Neon + Deepspeech inside a Docker container:

Build the Docker image using the dockerfile below:

Ubuntu-14.04.txt

docker build --rm -f=Ubuntu-14.04.txt -t=neon:test .

Run your docker container:

docker run -it --name neon_test neon:test /bin/bash


To run Deepspeech training on Neon inside the Docker container, follow the steps below:

1. Install Neon 2.2

git clone https://github.com/NervanaSystems/neon.git
cd neon
make -j
. ./.venv2/bin/activate

2. Install Deepspeech

git clone https://github.com/NervanaSystems/deepspeech.git
cd deepspeech
pip install -r requirements.txt
make -j

3. Prepare Deepspeech datasets and run the training

Please follow the instructions in https://github.com/NervanaSystems/deepspeech/blob/master/README.md , section "Training a model".
You will need to download the train and val datasets, ingest them to generate the .csv manifest files, and run the training.

IMPORTANT: You might also need to install the scipy package (pip install scipy).

Example training command line using the GPU backend:

python train.py --manifest train:/root/output/train-clean-100/train-manifest.csv \
    --manifest val:/root/output/dev-clean/val-manifest.csv -e 2 -z 8 -b gpu -s model_output.pkl

This runs 2 epochs on the GPU backend with a batch size of 8 and saves the model to the "model_output.pkl" file.
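
The mapping from settings to flags can be made explicit by assembling the command from named values. This is a minimal sketch that reads the flags the way the description above explains them: `-e` for epochs, `-z` for batch size, `-b` for the backend, and `-s` for the save path. `build_train_command` is a hypothetical helper, not part of neon or deepspeech.

```python
import shlex


def build_train_command(train_manifest, val_manifest, epochs, batch_size, backend, save_path):
    """Assemble the train.py argument list from named settings."""
    return [
        "python", "train.py",
        "--manifest", "train:" + train_manifest,
        "--manifest", "val:" + val_manifest,
        "-e", str(epochs),      # number of epochs
        "-z", str(batch_size),  # batch size
        "-b", backend,          # backend name, e.g. "gpu"
        "-s", save_path,        # file the trained model is saved to
    ]


command = build_train_command(
    "/root/output/train-clean-100/train-manifest.csv",
    "/root/output/dev-clean/val-manifest.csv",
    epochs=2,
    batch_size=8,
    backend="gpu",
    save_path="model_output.pkl",
)
# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in command))
```

The list form can be passed directly to `subprocess.run` without shell quoting concerns.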

Let me know if you encounter any issues, I will help.

@saikishor

Thank you @wsokolow, that was a fast response. I need some time to try this; after that I will let you know.

@pzelazko-intel

@saikishor I recently investigated exactly the same problem as yours.
What I found is that the root cause is a clang bug that was fixed in version 3.4-2:
https://reviews.llvm.org/rL201729
https://stackoverflow.com/questions/23462950/clang-only-compiles-c11-program-using-boostformat-when-std-c11-option-i

The problem does not reproduce with gcc 4.8, but it does with gcc 4.9. AFAIK gcc 4.8 is the default version for Ubuntu 14.04, so I suppose you must have updated it.

I see you wrote that installing clang 3.5 didn't help, which is strange. Maybe try upgrading clang to 3.8 or downgrading gcc to 4.8.
