Heatmap Stress Test
Heatmaps are working well with small numbers of items (about 30k). We need to verify that they also work well with millions. For stress testing, we hope to use tweets from Ben.
I created a simple PHP script to convert Ben's tweet file into something that Zeega can ingest via the command line. Once in Zeega, the data can be added to Solr via Solr's data import.
The script is run with:
php tweetsToJson.php geo_tweets_hour_2015_12_03_22.csv
This creates the file geo_tweets_hour_2015_12_03_22.json. This JSON file can be ingested by running the following from /var/www/zeega:
sudo app/console zeega:persist --file_path=/home/spacemansteve/tmp/geo_tweets_hour_2015_12_03_22.json --ingestor=console --user=468 --replace_duplicates
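In src/Zeega/IngestionBundle/Command/PersistCommand.php, duplicates are recognized by:
findOneBy(array("uri"=>$item["uri"], "user"=>$user));
So, as the tweets are converted to JSON, we need to add a fake URI to each one, presumably a different one for each tweet, since items with the same URI and user would be treated as duplicates.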
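The fake-URI step can be sketched as follows. This is a minimal Python illustration, not the tweetsToJson.php script itself. The column names in the sample, the "fake://stress-test/tweet/" prefix, and the top-level "items" wrapper are all assumptions made for the example; the JSON layout that zeega:persist actually expects is not shown here.

```python
import csv
import io
import json


def rows_to_items(rows, uri_prefix="fake://stress-test/tweet/"):
    """Copy each row and attach a distinct placeholder URI.

    The row index is used so the URIs stay unique even when the CSV
    has no id column. The prefix is hypothetical.
    """
    items = []
    for index, row in enumerate(rows):
        item = dict(row)
        item["uri"] = f"{uri_prefix}{index}"
        items.append(item)
    return items


def csv_text_to_json(csv_text):
    """Parse CSV text with a header row and return a JSON string.

    The {"items": [...]} wrapper is an assumed layout, not Zeega's
    documented format.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    return json.dumps({"items": rows_to_items(rows)}, indent=2)
```

For example, csv_text_to_json("text,lat,lon\nhello,42.3,-71.1\n") returns one item whose "uri" is "fake://stress-test/tweet/0". Using the row index keeps the URIs distinct within one file; ingesting several files would need a per-file prefix to avoid collisions.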
The zeega:persist command seems to add items to the production database. Running zeega:persist from /var/www/spacemansteve generates an error:
[InvalidArgumentException]
There are no commands defined in the "zeega" namespace
So, the PHP code has been modified to also generate XML files that Solr can ingest directly. They are sent to Solr with:
curl "http://dev.jdarchive.org:8983/solr/jda/update?commit=true" --data-binary @testTweets.xml -H "Content-Type:application/xml;charset=utf-8"
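A file like testTweets.xml uses Solr's XML update format: an add element containing one doc per item, each with named field elements. The Python sketch below builds such a document from a list of dictionaries. It is an illustration only, not the modified PHP code, and the field names in the example ("id", "text") are placeholders; the real names must match the schema of the jda core.

```python
import xml.etree.ElementTree as ET


def rows_to_solr_xml(rows):
    """Build a Solr <add> update message from a list of dicts.

    Each dict becomes one <doc>; each key/value pair becomes a
    <field name="key">value</field>. ElementTree escapes special
    characters such as < and & in the field text.
    """
    add = ET.Element("add")
    for row in rows:
        doc = ET.SubElement(add, "doc")
        for name, value in row.items():
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")
```

Writing the returned string to a file produces input that can be posted with the curl command above. For millions of tweets, it would probably be safer to write several smaller files than one very large one.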