simplify dist by merging sql-asm-memory-growth.js into sql-asm.js #431

Open · kaizhu256 wants to merge 3 commits into master
Conversation

kaizhu256 (Member)
Co-authored-by: Yan Li [email protected]
Co-authored-by: kai zhu [email protected]

This implements the suggestion in #239 (comment).

Removed or edited all references to memory growth, found with the command git grep -i 'memory.*growth'.

lovasoa (Member) commented Oct 21, 2020

What is the performance impact?

kaizhu256 (Member, Author) commented Oct 22, 2020

Good question. The following benchmark indicates asm2 is 1-6% slower running queries on a 60 MB database:

// asm2 = LEGACY_VM_SUPPORT + ALLOW_MEMORY_GROWTH
// used build-artifacts from https://github.com/sql-js/sql.js/actions/runs/320372611 for asm2 binaries
// used build-artifacts from https://github.com/sql-js/sql.js/actions/runs/316181408 for asm, wasm binaries

loading 60mb database in chrome
sampled 10 times
asm2:   20.6 +/- 2.2 ms   ( 5% slower than asm)
asm :   19.7 +/- 1.9 ms
wasm:   15.4 +/- 0.3 ms   (22% faster than asm)

copy all tables in 60mb database in chrome
sampled 5 times
asm2:   15798 +/-  47 ms  (  1% slower than asm)
asm :   15610 +/- 152 ms
wasm:    3384 +/-  47 ms  (360% faster than asm)

fts-query with joins in chrome
sampled 5 times
asm2:   40464 +/- 115 ms  (  6% slower than asm)
asm :   38171 +/- 220 ms
wasm:   13942 +/- 216 ms  (170% faster than asm)

You can reproduce the results above with the following steps:


1. Download the 60 MB enron-email database from https://github.com/kaizhu256/sqlite-enron.

2. Load the database into the modified sql.js demos.

3. Run the following SQL queries used in the benchmarks (a rough browser-console sketch of steps 2-3 follows the queries):

Copy all tables in the 60 MB database:

-- create database-schema
CREATE TABLE `employeelist2` (
  `eid` integer  NOT NULL PRIMARY KEY AUTOINCREMENT
,  `firstName` varchar(31) NOT NULL DEFAULT ''
,  `lastName` varchar(31) NOT NULL DEFAULT ''
,  `Email_id` varchar(31) NOT NULL DEFAULT ''
,  `Email2` varchar(31) DEFAULT NULL
,  `Email3` varchar(31) DEFAULT NULL
,  `EMail4` varchar(31) DEFAULT NULL
,  `folder` varchar(31) NOT NULL DEFAULT ''
,  `status` varchar(50) DEFAULT NULL
,  UNIQUE (`Email_id`)
);
CREATE TABLE `message2` (
  `mid` integer NOT NULL DEFAULT '0'
,  `sender` varchar(127) NOT NULL DEFAULT ''
,  `date` datetime DEFAULT NULL
,  `message_id` varchar(127) DEFAULT NULL
-- ,  `subject` text
-- ,  `body` text
,  `folder` varchar(127) NOT NULL DEFAULT ''
,  PRIMARY KEY (`mid`)
);
CREATE TABLE `recipientinfo2` (
  `rid` integer NOT NULL DEFAULT '0'
,  `mid` integer  NOT NULL DEFAULT '0'
,  `rtype` text  DEFAULT NULL
,  `rvalue` varchar(127) DEFAULT NULL
,  `dater` datetime DEFAULT NULL
,  PRIMARY KEY (`rid`)
,  FOREIGN KEY (mid) REFERENCES message2(mid)
);
CREATE TABLE `referenceinfo2` (
  `rfid` integer NOT NULL DEFAULT '0'
,  `mid` integer NOT NULL DEFAULT '0'
,  `reference` text
,  PRIMARY KEY (`rfid`)
,  FOREIGN KEY (mid) REFERENCES message2(mid)
);
CREATE INDEX "idx_message_sender2" ON "message2" (`sender`);
CREATE VIRTUAL TABLE messages_ft2 USING fts4(subject, body);

-- copy tables: employeelist2, message2, recipientinfo2, referenceinfo2, messages_ft2
-- row counts of the source tables
SELECT 'employeelist', COUNT(*) FROM employeelist;
SELECT 'message', COUNT(*) FROM message;
SELECT 'recipientinfo', COUNT(*) FROM recipientinfo;
SELECT 'referenceinfo', COUNT(*) FROM referenceinfo;
SELECT 'messages_ft', COUNT(*) FROM messages_ft;

INSERT INTO message2 SELECT * FROM
    (
    SELECT
        message.mid,
        message.sender,
        message.date,
        message.message_id,
        -- message.subject,
        -- message.body,
        message.folder
    FROM
    (
    SELECT DISTINCT message.mid
    FROM message
    INNER JOIN recipientinfo ON recipientinfo.mid = message.mid
    INNER JOIN referenceinfo ON referenceinfo.mid = message.mid
    ) AS tmp1
    INNER JOIN message ON message.mid = tmp1.mid
    ORDER BY tmp1.mid
    LIMIT 10000
    ) AS tmp1;

INSERT INTO employeelist2 SELECT * FROM employeelist;
INSERT INTO recipientinfo2 SELECT
        tmp1.rid,
        tmp1.mid,
        tmp1.rtype,
        tmp1.rvalue,
        tmp1.dater
    FROM message2
    LEFT JOIN recipientinfo AS tmp1 ON tmp1.mid = message2.mid
    ORDER BY tmp1.rid;
INSERT INTO referenceinfo2 SELECT
        tmp1.rfid,
        tmp1.mid,
        tmp1.reference
    FROM message2
    LEFT JOIN referenceinfo AS tmp1 ON tmp1.mid = message2.mid
    ORDER BY tmp1.rfid;
INSERT INTO messages_ft2 (
    rowid, subject, body
)
SELECT
        tmp1.rowid, tmp1.subject, tmp1.body
    FROM message2
    INNER JOIN messages_ft AS tmp1 ON tmp1.rowid = message2.mid
    ORDER BY tmp1.rowid;
VACUUM;
SELECT 'employeelist2', COUNT(*) FROM employeelist2;
SELECT 'message2', COUNT(*) FROM message2;
SELECT 'recipientinfo2', COUNT(*) FROM recipientinfo2;
SELECT 'referenceinfo2', COUNT(*) FROM referenceinfo2;
SELECT 'messages_ft2', COUNT(*) FROM messages_ft2;

FTS query with joins:

SELECT * FROM messages_ft2
INNER JOIN recipientinfo2 ON recipientinfo2.mid = messages_ft2.rowid
INNER JOIN referenceinfo2 ON referenceinfo2.mid = messages_ft2.rowid
WHERE body MATCH 'energy NEAR oil';
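
The steps above can be scripted roughly as follows in the browser console. This is a minimal sketch assuming the standard sql.js API (initSqlJs, SQL.Database, db.exec) with the dist script already loaded via a script tag; the database URL "enron.sqlite" and the copyAllTablesSql string are placeholders for the downloaded database and the copy-all-tables script above.

// minimal timing sketch; swap in dist/sql-asm.js, the asm2 build, or
// dist/sql-wasm.js via the script tag to compare the three builds
const copyAllTablesSql = "..."; // placeholder: paste the "copy all tables" script above here

async function runBenchmark() {
    const SQL = await initSqlJs();
    const buf = new Uint8Array(await (await fetch("enron.sqlite")).arrayBuffer());

    // 1. time loading the 60 MB database, sampled 10 times
    const loadTimes = [];
    for (let ii = 0; ii < 10; ii += 1) {
        const t0 = performance.now();
        const db = new SQL.Database(buf);
        loadTimes.push(performance.now() - t0);
        db.close();
    }
    console.log("load database (ms):", loadTimes);

    // 2. time the "copy all tables" script and the fts-query above
    const db = new SQL.Database(buf);
    let t0 = performance.now();
    db.exec(copyAllTablesSql);
    console.log("copy all tables (ms):", performance.now() - t0);

    t0 = performance.now();
    db.exec(
        "SELECT * FROM messages_ft2"
        + " INNER JOIN recipientinfo2 ON recipientinfo2.mid = messages_ft2.rowid"
        + " INNER JOIN referenceinfo2 ON referenceinfo2.mid = messages_ft2.rowid"
        + " WHERE body MATCH 'energy NEAR oil';"
    );
    console.log("fts-query (ms):", performance.now() - t0);
    db.close();
}
runBenchmark();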

kaizhu256 (Member, Author)

Also, the difference in size is < 1%:

                                 asm        asm2
sql-asm-debug.js          10,865,349  10,896,696  bytes
worker.sql-asm-debug.js   10,867,832  10,899,179  bytes
sql-asm.js                 2,383,575   2,389,992  bytes
worker.sql-asm.js          2,386,058   2,392,475  bytes
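
The percentages can be double-checked with a quick Node one-off; the dist-asm/ and dist-asm2/ paths are placeholders for wherever the two sets of build-artifacts are unpacked.

// compare asm vs asm2 build sizes (paths are placeholders)
const fs = require("fs");
[
    "sql-asm-debug.js",
    "worker.sql-asm-debug.js",
    "sql-asm.js",
    "worker.sql-asm.js"
].forEach(function (file) {
    const asm = fs.statSync("dist-asm/" + file).size;
    const asm2 = fs.statSync("dist-asm2/" + file).size;
    console.log(file, asm, asm2, ((asm2 / asm - 1) * 100).toFixed(2) + "%");
});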

lovasoa (Member) commented Oct 26, 2020

If we want to be able to merge this without breaking backwards compatibility, we can first add -s LEGACY_VM_SUPPORT=1 and -s ALLOW_MEMORY_GROWTH=1 into the asmjs version, and add a deprecation warning to the memory growth version.

On the next major version bump, we can delete the memory growth version.

kaizhu256 (Member, Author)

OK. Would an updated patch with the following be OK?

[screenshots of the proposed changes]
kaizhu256 (Member, Author)

The latest patch removes the redundant npm tests on sql-asm-memory-growth.js, since it is now essentially a carbon copy of sql-asm.js (plus a deprecation-warning header).

This simplifies sunsetting sql-asm-memory-growth.js in a future release: just delete the following lines from /Makefile:

--- a/Makefile
+++ b/Makefile
@@ -101,9 +101,6 @@ dist/sql-wasm.js: $(BITCODE_FILES) $(OUTPUT_WRAPPER_FILES) $(SOURCE_API_FILES) $
        mv $@ out/tmp-raw.js
        cat src/shell-pre.js out/tmp-raw.js src/shell-post.js > $@
        rm out/tmp-raw.js
-       # TODO - remove code below in future releases to sunset sql-asm-memory-growth.js
-       printf 'console.error(\n    "\\n\\nDEPRECATION WARNING.\\n"\n    + "sql-asm-memory-growth.js will be removed in future releases.\\n"\n    + "Use sql-asm.js instead, which now includes memory-growth support.\\n\\n"\n);\n' > dist/sql-asm-memory-growth.js
-       cat $@ >> dist/sql-asm-memory-growth.js

 # Web worker API
 .PHONY: worker
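
For reference, the printf above expands to this banner at the top of dist/sql-asm-memory-growth.js (the cat line then appends a full copy of the non-memory-growth build):

// deprecation banner prepended to dist/sql-asm-memory-growth.js by the Makefile rule above
console.error(
    "\n\nDEPRECATION WARNING.\n"
    + "sql-asm-memory-growth.js will be removed in future releases.\n"
    + "Use sql-asm.js instead, which now includes memory-growth support.\n\n"
);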

If you still want npm tests on sql-asm-memory-growth.js, I can revert this patch.
