Statistics #59
Building on top of what @SANDR00 did. Relationship Statistics: currently running here just for testing: http://104.198.227.113/ The first two IPs belong to the relationship algorithms (scaled 2 times). The scaling is not high at the moment; the more we scale, the more parallel requests can be served! It is running on the 765 wiki pages scraped by team 2 here ( MusicConnectionMachine/UnstructuredData#65 (comment) ). The resulting data can be checked at 35.187.17.177, with username and database set to the Postgres defaults. A snapshot after some elapsed time:
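For anyone who wants to poke at the resulting data, here is a minimal sketch using the `pg` client. Only the host and the default username/database come from the comment above; the password and the `relationships` table name are placeholders, not the actual schema.

```typescript
import { Client } from "pg";

// Connect with the defaults mentioned above and count the result rows.
async function checkResults(): Promise<void> {
  const client = new Client({
    host: "35.187.17.177",
    user: "postgres",     // default username, as stated above
    database: "postgres", // default database, as stated above
    password: "postgres", // assumption for illustration
  });
  await client.connect();
  // "relationships" is a placeholder table name for the algorithm output.
  const res = await client.query("SELECT COUNT(*) AS n FROM relationships");
  console.log(`rows so far: ${res.rows[0].n}`);
  await client.end();
}

checkResults().catch(console.error);
```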
@ansjin this looks goodish? What is the status column? Is that the number of finished requests of a total of 1707 after 1022 seconds?
@kordianbruck yes, the status says how many are done. The total counts all requests (incl. the remaining ones). The "design" was just to get started; it's not very intuitive for now.
Great. Breaking it down: so those take roughly 30 seconds to process one request. That's a lot.
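For reference, the rough arithmetic behind that estimate, assuming the two scaled instances process requests in parallel. The elapsed time and instance count are from the snapshot discussion above; the completed count is a hypothetical value for illustration.

```typescript
// With N instances working in parallel, the average per-request time is
// elapsedSeconds * instances / completedRequests.
function secondsPerRequest(elapsedSeconds: number, instances: number, completed: number): number {
  return (elapsedSeconds * instances) / completed;
}

// Hypothetical: if ~68 of the 1707 requests were done after 1022 s on 2 instances,
// that works out to roughly 30 s per request.
console.log(secondsPerRequest(1022, 2, 68)); // ≈ 30.1
```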
@kordianbruck it's on the free tier of Google, so it's not that powerful. If we give it 100 machines instead of 2, we'll get it done faster.
And one more thing: the size of a request has to be taken into account before drawing a conclusion about the speed. Currently, one request is about half a website, as far as I know.
@kordianbruck And yes, the other relationship algorithms take around 30 seconds to 1 minute to process a request, but if we had multiple machines running, those requests could be processed in parallel. We could also use the Kubernetes feature that allows creating multiple pods on the same machine (each pod acting as a separate machine) to fully utilize the compute power of a VM; see the manifest sketch below.
Status: Inferences
For the later deployment, we should use VMs from different zones in our cluster, so that if there is an issue with one zone, the service is still available from the VMs in another zone.
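A minimal Kubernetes manifest sketching both ideas: several pods per VM (via replicas plus small CPU requests) and spreading pods across zones for availability. All names and the image are placeholders, not the project's actual configuration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: relationship-algorithm      # placeholder name
spec:
  replicas: 6                       # e.g. 3 pods on each of 2 VMs
  selector:
    matchLabels:
      app: relationship-algorithm
  template:
    metadata:
      labels:
        app: relationship-algorithm
    spec:
      containers:
        - name: worker
          image: mcm/relationship-algorithm:latest  # placeholder image
          resources:
            requests:
              cpu: 500m             # sized so several pods fit on one VM
      topologySpreadConstraints:    # keep pods spread over multiple zones
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: relationship-algorithm
```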
Thanks for the updates / explanation! What does "completes too fast" mean? Is it not working? Why is too fast a problem? A 1-2% error rate is fine; anything above that we should investigate.
@kordianbruck about investigation: in the case of the date event extraction we once had a lot of errors and just tweaked some parameters on the Google side (number of pods per machine) and on our side (number of parallel requests). Since they were the same errors, we hope to be able to reduce those for the relationship algorithms to virtually zero as well with the same approach.
@kordianbruck Completing so fast is a good thing for us. It simply doesn't need as much processing as the other algorithms, which is why it finishes quickly. @simonzachau Those errors were mostly socket timeouts, and they occurred because we were sending 5 parallel requests at a time when there were only 2 machines in the back-end to serve them. So out of those 5, one or two requests were timing out, and our error count kept increasing; the sketch below shows capping the client-side concurrency to match the back-end.
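A small sketch of that fix: cap the number of in-flight requests at the number of back-end machines, so no request sits in a queue long enough to hit a socket timeout. The endpoint is the test deployment mentioned above; the payload shape is an assumption.

```typescript
// Process all pages with a fixed concurrency limit. With 2 back-end
// machines, call processAll(pages, 2) instead of firing 5 at once.
async function processAll(pages: string[], concurrency: number): Promise<void> {
  let next = 0;
  const workers = Array.from({ length: concurrency }, async () => {
    while (next < pages.length) {
      const page = pages[next++];
      try {
        await fetch("http://104.198.227.113/", {   // test deployment from above
          method: "POST",
          body: JSON.stringify({ page }),          // placeholder payload shape
          headers: { "Content-Type": "application/json" },
        });
      } catch (err) {
        // With matched concurrency these timeouts should mostly disappear.
        console.error(`request for ${page} failed:`, err);
      }
    }
  });
  await Promise.all(workers);
}
```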
Gather statistics about the stability of our algorithms in order to find bottlenecks in the whole pipeline.
additional: