Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/wayback machine #286

Open
wants to merge 28 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
9d61a6f
Added wayback-machine action
adaveinthelife Jun 4, 2018
35704a0
Merge branch 'master' of https://github.com/internetarchive/internet-…
adaveinthelife Jun 10, 2018
d10b1ea
Further modifications to Wayback code. Fixed slots
adaveinthelife Jun 13, 2018
92c7c8e
Run eslint and local testing on wayback code
adaveinthelife Jun 14, 2018
c2b2cae
Rewrote wayback machine to return promise
adaveinthelife Jun 16, 2018
9bc0f69
Created fallback for speech if country rank missing from alexa rankin…
adaveinthelife Jun 19, 2018
efc09fe
Reverted strings.js back to master to pass tests
adaveinthelife Jun 19, 2018
5514439
Added Wayback dialog to strings.js
adaveinthelife Jun 19, 2018
d4184a2
Wayback machine revisions for code review + changes to config.js file
adaveinthelife Jun 25, 2018
565491b
Changes to string.js with additional dialog
adaveinthelife Jun 26, 2018
734ec81
Changes to wayback spec file
adaveinthelife Jun 27, 2018
4ace6fc
Break traverse off into separate util function and created unit test …
adaveinthelife Jul 4, 2018
3cbd152
Reworked XML parser promise to include resolve and reject
adaveinthelife Jul 6, 2018
22c6015
Reworked all functions to return promises and then created promises p…
adaveinthelife Jul 10, 2018
0c7c183
Added unit tests for axios requests in wayback feature
adaveinthelife Jul 10, 2018
bcb2da7
Added more unit tests for axios requests in wayback feature
adaveinthelife Jul 10, 2018
ced199e
Took some debug messages out of WB code and traverse util
adaveinthelife Jul 10, 2018
d7b796c
Added unit tests for archiveEngine and alexaEngine
adaveinthelife Jul 11, 2018
167d287
Added unit test for xml converter in wb machine
adaveinthelife Jul 11, 2018
abb91d7
Took out axios calls for wayback.spec and added them into fixtures
adaveinthelife Jul 13, 2018
fa153ac
Changed promise pipeline to avoid callback
adaveinthelife Jul 14, 2018
698830e
Took out missing axios variable in wayback spec
adaveinthelife Jul 14, 2018
74f93e5
Rewrote archive and alexa engines to be pure functions
adaveinthelife Jul 14, 2018
a3364d7
Rewrote pipeline responses to use obj assign
adaveinthelife Jul 16, 2018
d916d1e
Took unneeded promises out of pipeline, used unpacking to make code c…
adaveinthelife Jul 16, 2018
f53d4f3
Added catch statement for WB handler
adaveinthelife Jul 16, 2018
d76ca18
Simplified pipeline for WB machine
adaveinthelife Jul 16, 2018
8e3ec2f
Fixed typo in catch statement
adaveinthelife Jul 16, 2018
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion functions/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@
"object.entries": "^1.0.4",
"raven": "^2.6.0",
"replaceall": "^0.1.6",
"supports-color": "^5.4.0"
"xml2js": "^0.4.19"
},
"devDependencies": {
"axios-mock-adapter": "^1.15.0",
Expand Down
129 changes: 129 additions & 0 deletions functions/src/actions/wayback-machine.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,129 @@
// Third party imports
const axios = require('axios');
const mustache = require('mustache');
const xml2js = require('xml2js');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all new imports should be installed by:

npm install --save xml2js

when you would need new lib for testing you should use instead:

npm install --save-dev xml2js

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should install xml2js with:

npm install --save xml2js

to add it to package.json


// Local imports
const config = require('../config');
const {debug} = require('../utils/logger')('ia:actions:wayback-machine');
const dialog = require('../dialog');
const endpointProcessor = require('../network/endpoint-processor');
const traverse = require('../utils/traverse');
const waybackStrings = require('../strings').intents.wayback;

/**
* Handle wayback query action
* - fill slots of wayback query
* - perform data requests to archive and alexa rankings
* - construct response speech for action
*
* @param app
*/
function handler (app) {
// Create wayback object
const waybackObject = {
url: '',
earliestYear: 0,
latestYear: 0,
totalUniqueURLs: 0,
alexaWorldRank: 0,
alexaUSRank: 0,
speech: waybackStrings.default,
};

// Check to see that both parameters have content
if (!app.params.getByName('wayback') && !app.params.getByName('url')) {
debug('wayback action called by mistake');
dialog.ask(app, waybackObject);
}

// Get url parameter and make url queries
waybackObject.url = app.params.getByName('url');
const archiveQueryURL = endpointProcessor.preprocess(
config.wayback.ARCHIVE, app, waybackObject
);
const alexaQueryURL = endpointProcessor.preprocess(
config.wayback.ALEXA, app, waybackObject
);

return Promise.all([axios.get(archiveQueryURL), axios.get(alexaQueryURL)])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use callback nesting here. One of the main feature of Promises is avoiding callback hell. It should be something like:

action.then(a => {
  return Promise.all(/*....*/);
})
.then(b => {
  return Promise.all(/*....*/);
})
.then(c => {
});

.then(function (requestData) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw you can make this part of code more readable by using arguments unpacking:

.then(([archiveRes, alexaRes]) => {
     return Promise.all([archiveEngine(archiveRes.data), xmlConverter(alexaRes.data)])
})

return Promise.all([archiveEngine(requestData[0].data, waybackObject), xmlConverter(requestData[1].data)])
.then(function (response) {
return Promise.all([alexaEngine(response[1], waybackObject)])
.then(function () {
// Construct response dialog for action
if (waybackObject.alexaUSRank !== 0) {
waybackObject.speech = mustache.render(waybackStrings.speech, waybackObject);
waybackObject.speech += mustache.render(waybackStrings.additionalSpeech, waybackObject);
} else {
waybackObject.speech = mustache.render(waybackStrings.speech, waybackObject);
waybackObject.speech += '.';
}
dialog.close(app, waybackObject);
});
});
}); // End of REQUEST promise
} // End of handler

function archiveEngine (archiveJSON, waybackObject) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function also should be pure-function and it doesn't need to be async

return new Promise(function (resolve, reject) {
// debug('Inside archiveEngine promise...');
// Create array of capture years and then find earliest year
// and most recent year.
let yearsArray = Object.keys(archiveJSON.captures);
Copy link
Member

@hyzhak hyzhak Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, it is better to sort years, to be sure that they are in the order

waybackObject.earliestYear = yearsArray[0];
waybackObject.latestYear = yearsArray[yearsArray.length - 1];

// Traverse URL category

// Find baseline of URL count
waybackObject.totalUniqueURLs += traverse(archiveJSON.urls[waybackObject.earliestYear]);
// debug('Baseline url count: ' + waybackObject.totalUniqueURLs);

waybackObject.totalUniqueURLs += traverse(archiveJSON.new_urls);
// debug('Final url count: ' + waybackObject.totalUniqueURLs);

if (waybackObject.totalUniqueURLs > 0) {
// debug('archiveEngine promise successful!');
resolve();
}
});
}

function alexaEngine (alexaJSON, waybackObject) {
return new Promise(function (resolve, reject) {
// debug('Inside alexaEngine promise...');
waybackObject.alexaWorldRank = alexaJSON['ALEXA']['SD'][0]['POPULARITY'][0]['$']['TEXT'];
try {
waybackObject.alexaUSRank = alexaJSON['ALEXA']['SD'][0]['COUNTRY'][0]['$']['RANK'];
} catch (e) {
debug('Country not found');
debug(e);
}
if (waybackObject.alexaWorldRank > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to check that alexaWorldRank > 0 here, and what should be in case when alexaWorldRank <= 0?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend that function alexaEngine would be in the pure-function way: https://en.wikipedia.org/wiki/Pure_function

You don't need to mutate waybackObject here, just return rands:

function getAlexaRanks (alexaJSON) {
  //.... some in logic to extract ranks from alexa json
  return {
    alexaWorldRank: 123456,
    alexsUSRank: 1345,
  };
}

and then you can merge the result inside of handler:

res = {}
// it merges res to empty object and then merge getAlexaRanks
res = Object.assign({}, res, getAlexaRanks (alexaJSON));

// debug('alexaEngine promise successful!');
resolve();
}
});
}

function xmlConverter (data) {
return new Promise(function (resolve, reject) {
// debug('Inside xmlConverter promise...');
let XMLparser = new xml2js.Parser();
XMLparser.parseString(data, function (err, result) {
if (err) {
let error = new Error('The XML parser didn\'t work.');
reject(error);
} else {
// debug('xmlConverter promise successful!');
resolve(JSON.parse(JSON.stringify(result)));
}
});
});
}

module.exports = {
handler,
};
5 changes: 5 additions & 0 deletions functions/src/config.js
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,11 @@ module.exports = {
DEFAULT_SONG_IMAGE: 'http://archive.org/images/notfound.png',
},

wayback: {
ARCHIVE: 'http://web.archive.org/__wb/search/metadata?q={{url}}',
ALEXA: 'http://data.alexa.com/data?cli=10&url={{url}}',
},

/**
* settings specific for supported platforms
*/
Expand Down
8 changes: 8 additions & 0 deletions functions/src/strings.js
Original file line number Diff line number Diff line change
Expand Up @@ -444,6 +444,14 @@ module.exports = {
speech: 'Version is {{version}}.',
},

wayback: {
speech: '{{url}} was first captured by the Internet Archive in {{earliestYear}} and most recently in {{latestYear}}. The archive has {{totalUniqueURLs}} unique url\'s for this website. <break></break>{{url}} is ranked <say-as interpret-as="ordinal">{{alexaWorldRank}}</say-as> in the world in popularity',
additionalSpeech: ' and <say-as interpret-as="ordinal">{{alexaUSRank}}</say-as> in the United States.',
default: 'Would you like to use the wayback machine to hear the history of a website? Simply say Wayback Machine and the name of the website you\'d like to hear.',
error: "I'm sorry I'm having trouble here. Maybe we should try this again later.",
suggestions: ['wayback machine google.com', 'wayback machine archive.org']
},

welcome: {
acknowledges: [
'Welcome to music at the Internet Archive.'
Expand Down
31 changes: 31 additions & 0 deletions functions/src/utils/traverse.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
const _ = require('lodash');
// const {debug} = require('../utils/logger')('ia:actions:utils:traverse');

/**
* Traverse a given object
*
* @param {Object} obj
*/
module.exports = function (obj) {
let results = [];
function traverse (obj) {
_.forOwn(obj, (val, key) => {
if (_.isArray(val)) {
val.forEach(el => {
traverse(el);
});
} else if (_.isObject(val)) {
traverse(val);
} else {
results.push(val);
}
});
}
traverse(obj);
let count = 0;
while (results.length !== 0) {
count += results.pop();
}
// debug('final count inside traverse = ' + count);
return count;
};
64 changes: 64 additions & 0 deletions functions/tests/actions/wayback-machine.spec.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
const axios = require('axios');
const {expect} = require('chai');
const rewire = require('rewire');

const action = rewire('../../src/actions/wayback-machine');

const mockApp = require('../_utils/mocking/platforms/app');
const mockDialog = require('../_utils/mocking/dialog');

describe('actions', () => {
describe('wayback machine', () => {
let app;
let dialog;
let archiveRequest = axios.get('http://web.archive.org/__wb/search/metadata?q=cnn.com');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't request data for a test from an ourside. If you would need any data you should create fixture data.

Examples: https://github.com/internetarchive/internet-archive-google-action/tree/master/functions/tests/fixtures

let alexaRequest = axios.get('http://data.alexa.com/data?cli=10&url=cnn.com');

beforeEach(() => {
app = mockApp();
dialog = mockDialog();
action.__set__('dialog', dialog);
});

it('check to see that overall action eventually returns a promise', () => {
action.handler(app);
expect(Promise.resolve()).to.be.a('promise');
});

it('check to see that axios request promises are working', () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't test axios library.

expect(archiveRequest).to.not.be.undefined;
expect(alexaRequest).to.not.be.undefined;
});

it('check to see that archiveEngine is working', () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adaveinthelife before merging this PR please fix those tests or note that they are not working and should be fixed

let obj = { earliestYear: 0, latestYear: 0, totalUniqueURLs: 0 };
let result;

action.__set__('archiveEngine', function (a, b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems this test has never come inside of this mock. You never call handler in this test on others. Actually, it is not clear what: check to see that archiveEngine is working means here, could you be more specific?

result = action.archiveEngine(archiveRequest, obj);
expect(result).to.change(obj, 'earliestYear');
expect(result).to.change(obj, 'latestYear');
expect(result).to.change(obj, 'totalUniqueURLs');
});
});

it('check to see that alexaEngine is working', () => {
let obj = { alexaWorldRank: 0 };
let result;

action.__set__('alexaEngine', function (a, b) {
result = action.alexaEngine(alexaRequest, obj);
expect(result).to.change(obj, 'alexaWorldRank');
});
});

it('check to see that xmlConverter is working', () => {
let result;

action.__set__('xmlConverter', function (a) {
result = action.xmlConverter(alexaRequest);
expect(result).to.not.throw('The XML parser didn\'t work.');
});
});
});
});
35 changes: 35 additions & 0 deletions functions/tests/utils/traverse.spec.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
const {expect} = require('chai');
const traverse = require('../../src/utils/traverse');

let testJSON = {'captures': {
'1999': {
'text/html': 18360
},
'2000': {
'application/x-director': 19,
'video/quicktime': 1584,
'application/x-troff-man': 1,
'x-world/x-vrml': 1,
'audio/x-pn-realaudio': 176,
'audio/mpeg': 195,
'audio/x-wav': 3098,
'image/png': 97,
'text/html': 901401,
'video/x-ms-asf': 142,
'image/gif': 17388,
'text/plain': 394428,
'image/jpeg': 82903,
'application/x-shockwave-flash': 39,
'application/zip': 108,
'audio/x-aiff': 2767,
'text/css': 55,
'application/pdf': 291
}}};

describe('utils', () => {
describe('traverse', () => {
it('should traverse a given object to return the sum of it\'s leaf nodes', () => {
expect(traverse(testJSON)).to.be.equal(1423053);
});
});
});