Preferred way of importing? #140

Closed
phdowling opened this issue Sep 11, 2016 · 14 comments

Comments

@phdowling
Contributor

Hey guys,

I make use of the Twitter stream you open in your project. Currently I import it by cloning your repo to a certain location and doing an import of the file the stream is opened in:

var PokemonTwitter = require("../PokeData/app/controllers/filler/twitter");

However, for production, this of course needs to change - is your project on npm already? How should I expect to import the module I need? Also pinging @PokemonGoers/catch-em-all for this.

@samitsv
Collaborator

samitsv commented Sep 12, 2016

We do not have a plan to ship an npm package. If you want to get the data generated by Twitter, you can do so using http://pokedata.c4e3f8c7.svc.dockerapp.io:65014/doc/#api-PokemonSighting-GetSightingBySource or http://pokedata.c4e3f8c7.svc.dockerapp.io:65014/api/pokemon/sighting/search?source=twitter
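
For reference, the second URL is the search endpoint with a `source` query parameter. A minimal sketch of building it in Node, assuming only the host and path quoted in this comment (the helper name `sightingSearchUrl` is made up for illustration):

```javascript
// Base URL and path are taken from the comment above; nothing else about the API is assumed.
const POKEDATA_BASE = "http://pokedata.c4e3f8c7.svc.dockerapp.io:65014";

// Hypothetical helper: builds the sighting search URL for a given source.
function sightingSearchUrl(source) {
  const url = new URL("/api/pokemon/sighting/search", POKEDATA_BASE);
  url.searchParams.set("source", source);
  return url.toString();
}

console.log(sightingSearchUrl("twitter"));
```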

@phdowling
Contributor Author

The REST API won't do in this case, we need live tweets, i.e. the raw feed. I guess we can just start a separate feed in our own module and be fairly independent from yours there. Another way to go would be for us to PR the code we wrote into your repo, then we could access all data sources freely - not sure what the best way to go is here. @gyachdav or @sacdallago, any suggestions here?

@gyachdav

I recommend you stick with analyzing the Streaming API on your own, separated from project A.

@sacdallago
Member

@phdowling yup, I would suggest you create an npm package that the guys from A can use on the tweets to perform the sentiment analysis. They are listening to the tweets anyway; I imagine it to be something like adding a function (from your package) which calculates the score, deferring a write to a dedicated collection ({tweetId: xyz, sentiment: +1.2}), and that's it.
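
A minimal sketch of that idea, not the actual package: `scoreSentiment` is a toy word-count scorer standing in for the real analysis, `toSentimentRecord` is a hypothetical name, and the `id_str` and `text` fields are assumed to come from the standard Twitter API tweet object. The write to the collection is left to the caller.

```javascript
// Toy word lists for illustration only; a real package would use its own model.
const POSITIVE = new Set(["love", "great", "awesome"]);
const NEGATIVE = new Set(["hate", "bad", "awful"]);

// Hypothetical scorer: +1 per positive word, -1 per negative word.
function scoreSentiment(text) {
  let score = 0;
  for (const word of text.toLowerCase().split(/\W+/)) {
    if (POSITIVE.has(word)) score += 1;
    if (NEGATIVE.has(word)) score -= 1;
  }
  return score;
}

// Builds the {tweetId, sentiment} record suggested in the comment above.
// The scorer is injectable so project A could plug in the packaged function.
function toSentimentRecord(tweet, scorer = scoreSentiment) {
  return { tweetId: tweet.id_str, sentiment: scorer(tweet.text) };
}

// Made-up tweet for illustration.
const record = toSentimentRecord({ id_str: "1", text: "I love this awesome Pikachu" });
console.log(record); // { tweetId: '1', sentiment: 2 }
```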

Or, eventually, the guys from A can implement an npm runner to perform the score analysis for all the tweets.

@samitsv are you storing the RAW tweets somewhere? I can't remember

@samitsv
Collaborator

samitsv commented Sep 13, 2016

@sacdallago raw tweets are not being stored

@sacdallago
Member

@samitsv MH. It might make sense that they are :) @gyachdav we did this last semester, but I'm not entirely sure it makes sense.

Taking the idea from PokemonGoers/HashPokemonGo#12 (comment) maybe extend that object ({twitterId: xyz, sentiment: 1.23}) with:

  • the original text of the tweet
  • lat/lng info if available
  • timestamp of the tweet

and save that?
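
A sketch of what that extended object could look like when built from a streamed tweet. It assumes the standard Twitter API tweet fields (`id_str`, `text`, `created_at`, and a GeoJSON `coordinates` point in `[lng, lat]` order); `toStoredTweet` is a hypothetical name, and the sentiment value is passed in from elsewhere.

```javascript
// Builds the extended record: id, sentiment, original text, timestamp,
// and lat/lng only when the tweet actually carries coordinates.
function toStoredTweet(tweet, sentiment) {
  const stored = {
    twitterId: tweet.id_str,
    sentiment: sentiment,
    text: tweet.text,
    timestamp: tweet.created_at, // kept as the string Twitter sends
  };
  const point = tweet.coordinates; // GeoJSON Point or null (assumed)
  if (point && Array.isArray(point.coordinates)) {
    const [lng, lat] = point.coordinates;
    stored.location = { lat, lng };
  }
  return stored;
}

// Made-up tweets for illustration.
const withGeo = toStoredTweet({
  id_str: "42",
  text: "Squirtle near the river!",
  created_at: "Fri Sep 02 08:20:37 +0000 2016",
  coordinates: { type: "Point", coordinates: [11.5755, 48.1372] },
}, 1.23);
const withoutGeo = toStoredTweet({
  id_str: "43",
  text: "Squirtle!",
  created_at: "Fri Sep 02 08:21:00 +0000 2016",
  coordinates: null,
}, 0.5);
console.log(withGeo, withoutGeo);
```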

Also @samitsv, checking out the data coming from Twitter: why store null lat/lng values? Aka:

{"_id":"57c936554e3bd9e1024717fc","source":"TWITTER","appearedOn":"2016-09-02T08:20:37.452Z","__v":0,"pokemonId":7,"location":null}

It makes sense in the collection mentioned above, but no sense in the sightings collection... just knowing that they spawned somewhere in the globe seems like the least informative feature ever to me 😆 @goldbergtatyana do you agree?

@samitsv
Collaborator

samitsv commented Sep 14, 2016

@sacdallago about null lat/lng: they may be important if someone wants to see the number of appearances of a pokemon, or to know its appearance times, e.g. whether a pokemon appears mostly during the day and not at night?
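
A rough sketch of the kind of statistic meant here, assuming sighting documents shaped like the example quoted earlier in this thread (`pokemonId`, `appearedOn`, `location`). The function name `countByHour` is hypothetical, and hours are taken in UTC for simplicity.

```javascript
// Counts sightings per pokemonId and UTC hour; location is not needed for this.
function countByHour(sightings) {
  const counts = {};
  for (const s of sightings) {
    const hour = new Date(s.appearedOn).getUTCHours();
    const key = `${s.pokemonId}@${hour}`;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const sample = [
  // First entry mirrors the document quoted in this thread; the second is made up.
  { source: "TWITTER", appearedOn: "2016-09-02T08:20:37.452Z", pokemonId: 7, location: null },
  { source: "TWITTER", appearedOn: "2016-09-02T08:45:00.000Z", pokemonId: 7, location: null },
];
console.log(countByHour(sample)); // { '7@8': 2 }
```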

@sacdallago
Member

hmm.. @goldbergtatyana @juanmirocks @gyachdav opinions?

@goldbergtatyana

valid points from both of you. null values of long/lat

  • are not useful for prediction, since we can neither infer exact location, nor weather or topological features (e.g. proximity to water)
  • could be useful for general statistics on the reportings of pokemons

If we have an issue with storage space, then I would recommend to not store sightings with empty locations. If there is no issue, then yes store them 😄

@swathi-ssunder
Collaborator

@goldbergtatyana @sacdallago To add to the discussion: we had to store null values for location even when there is no data for it (rather than skipping the location key altogether), since we have indexed the data on the location field for geospatial queries.
So if we would rather not have these entries, we could skip/ignore the record altogether.
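
The "skip the record" option could look roughly like this, assuming the check happens in application code before the insert. `hasLocation` and `keepLocated` are hypothetical names, and the GeoJSON-style `location` in the second sample is an assumption about the stored shape.

```javascript
// True only when the sighting carries a usable location value.
function hasLocation(sighting) {
  return sighting.location !== null && sighting.location !== undefined;
}

// Drops sightings without a location so they never reach the collection.
function keepLocated(sightings) {
  return sightings.filter(hasLocation);
}

const incoming = [
  // Mirrors the null-location document quoted in this thread.
  { source: "TWITTER", appearedOn: "2016-09-02T08:20:37.452Z", pokemonId: 7, location: null },
  // Made-up sighting with an assumed GeoJSON point.
  { source: "TWITTER", appearedOn: "2016-09-02T09:00:00.000Z", pokemonId: 25,
    location: { type: "Point", coordinates: [11.5755, 48.1372] } },
];
console.log(keepLocated(incoming).length); // 1
```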

@goldbergtatyana

thanks @swathi-ssunder ! Again, for predictions these records are useless. For statistics on sightings they are nice. However, for doing the statistics these data will be most likely used just once. Therefore, I would suggest to get rid of entries with empty location altogether.

@samitsv
Collaborator

samitsv commented Sep 17, 2016

@phdowling @sacdallago I think I forgot about the part with Twitter texts being stored. So do we store them or not? I see some sentiment API is already implemented, @phdowling could you let me know how I could access it?

@sacdallago
Member

@samitsv I would store it. Better to have this data than not.. @gyachdav @goldbergtatyana @juanmirocks opinions?

@samitsv samitsv mentioned this issue Sep 19, 2016
@samitsv
Collaborator

samitsv commented Sep 22, 2016

Being worked on in #162

8 participants