diff --git a/docs/topics/advanced/rediskeys.rst b/docs/topics/advanced/rediskeys.rst
index 77f74fa6..e5ca5d0a 100644
--- a/docs/topics/advanced/rediskeys.rst
+++ b/docs/topics/advanced/rediskeys.rst
@@ -69,6 +69,6 @@ If you run the integration tests, there may be temporary Redis keys created that
 
 - **cluster:test** - Used when testing the Kafka Monitor can act and set a key in Redis
 
-- **test-spider:istresearch.com:queue** - Used when testing the crawler installation can interact with Redis and Kafka
+- **test-spider:dmoztools.net:queue** - Used when testing the crawler installation can interact with Redis and Kafka
 
 - **stats:crawler::test-spider:** - Automatically created and destoryed during crawler testing by the stats collection mechanism settings.
diff --git a/docs/topics/introduction/quickstart.rst b/docs/topics/introduction/quickstart.rst
index cf63c8ea..652e4882 100644
--- a/docs/topics/introduction/quickstart.rst
+++ b/docs/topics/introduction/quickstart.rst
@@ -431,7 +431,7 @@ Which ever setup you chose, every process within should stay running for the rem
 
 ::
 
-    python kafka_monitor.py feed '{"url": "http://istresearch.com", "appid":"testapp", "crawlid":"abc123"}'
+    python kafka_monitor.py feed '{"url": "http://dmoztools.net", "appid":"testapp", "crawlid":"abc123"}'
 
 You will see the following output on the command line for that successful request:
 
@@ -439,7 +439,7 @@ You will see the following output on the command line for that successful reques
 
     2015-12-22 15:45:37,457 [kafka-monitor] INFO: Feeding JSON into demo.incoming
     {
-        "url": "http://istresearch.com",
+        "url": "http://dmoztools.net",
         "crawlid": "abc123",
         "appid": "testapp"
     }
@@ -460,7 +460,7 @@ Crawl Request:
 
 ::
 
-    python kafka_monitor.py feed '{"url": "http://dmoz.org", "appid":"testapp", "crawlid":"abc1234", "maxdepth":1}'
+    python kafka_monitor.py feed '{"url": "http://dmoztools.net", "appid":"testapp", "crawlid":"abc1234", "maxdepth":1}'
 
 Now send an ``info`` action request to see what is going on with the crawl:
 
diff --git a/docs/topics/kafka-monitor/quickstart.rst b/docs/topics/kafka-monitor/quickstart.rst
index 1392b60c..9708bf83 100644
--- a/docs/topics/kafka-monitor/quickstart.rst
+++ b/docs/topics/kafka-monitor/quickstart.rst
@@ -33,7 +33,7 @@ JSON Object feeder into your desired Kafka Topic. This takes a valid JSON object
 
 ::
 
-    $ python kafka_monitor.py feed '{"url": "http://istresearch.com", "appid":"testapp", "crawlid":"ABC123"}'
+    $ python kafka_monitor.py feed '{"url": "http://dmoztools.net", "appid":"testapp", "crawlid":"ABC123"}'
 
 The command line feed is very slow and should not be used in production. Instead, you should write your own continuously running application to feed Kafka the desired API requests that you require.
 
@@ -89,10 +89,10 @@ Feed an item
 
 ::
 
-    $ python kafka_monitor.py feed '{"url": "http://istresearch.com", "appid":"testapp", "crawlid":"ABC123"}'
+    $ python kafka_monitor.py feed '{"url": "http://dmoztools.net", "appid":"testapp", "crawlid":"ABC123"}'
     2016-01-05 15:14:44,829 [kafka-monitor] INFO: Feeding JSON into demo.incoming
     {
-        "url": "http://istresearch.com",
+        "url": "http://dmoztools.net",
         "crawlid": "ABC123",
         "appid": "testapp"
     }
@@ -116,8 +116,8 @@ If you have a :ref:`Crawler ` running, you should see the html come thr
         "response_headers": {
 
         },
-        "response_url": "http://istresearch.com",
-        "url": "http://istresearch.com",
+        "response_url": "http://dmoztools.net",
+        "url": "http://dmoztools.net",
         "status_code": 200,
         "status_msg": "OK",
         "appid": "testapp",
diff --git a/docs/topics/rest/api.rst b/docs/topics/rest/api.rst
index cb2aab4b..524b17a6 100644
--- a/docs/topics/rest/api.rst
+++ b/docs/topics/rest/api.rst
@@ -156,7 +156,7 @@ Feed a crawl request
 
 ::
 
-    $ curl scdev:5343/feed -H "Content-Type: application/json" -d '{"url":"istresearch.com", "appid":"madisonTest", "crawlid":"abc123"}'
+    $ curl scdev:5343/feed -H "Content-Type: application/json" -d '{"url":"http://dmoztools.net", "appid":"madisonTest", "crawlid":"abc123"}'
 
 Feed a Stats request