Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is dead code elimination really working? #37

Open
aisamu opened this issue Apr 11, 2017 · 11 comments
Open

Is dead code elimination really working? #37

aisamu opened this issue Apr 11, 2017 · 11 comments

Comments

@aisamu
Copy link

aisamu commented Apr 11, 2017

I can't seem to bring the code size down with a barely empty project through dead-code elimination alone, even when following the installation instructions to the letter.

The template used was reagent-figwheel with only devtools and reagent as dependencies. I've fiddled with the goog.DEBUG flag, and then removed devtools from the :require and :dependencies vectors.

Type Size
:dependencies + :require + goog.DEBUG true 1.9 MB
:dependencies + :require + goog.DEBUG false 1.5 MB
:dependencies + :require + no mention at all 1.6 MB
:dependencies + no :require + no mention at all 763 KB
no :dependencies + no :require + no mention at all 763 KB

The first two rows followed the instructions from the release notes. There's a single mention of (devtools/install!) within a (when ^boolean js/goog.DEBUG ...) block. The other three had this line removed manually.

Google Closure's shaved 400kb of the build, but that's still a 800kb increase for a :require without a single mention of devtools! Is this the best I can expect from dead code elimination?

Here's the code used for the builds (e.g. lein new reagent-figwheel +devtools):

;##################
;### profile.cljs

(defproject devtools-dce "0.1.0-SNAPSHOT"
 :dependencies [[org.clojure/clojure "1.8.0"]
                [org.clojure/clojurescript "1.9.229"]
                [reagent "0.6.0"]
                [binaryage/devtools "0.8.2"]]

  [...]

 :cljsbuild
 {:builds
  [{:id           "min"
    :source-paths ["src/cljs"]
    :compiler     {:main            devtools-dce.core
                   :optimizations   :advanced
                   :output-to       "resources/public/js/compiled/app.js"
                   :output-dir      "resources/public/js/compiled/min"
                   :closure-defines {goog.DEBUG false}
                   :pseudo-names    true   ;; added to exacerbate size difference
                   :pretty-print    false}}

   ]})

;##############
;## core.cljs

(ns devtools-dce.core
 (:require
  [reagent.core :as reagent]
  [devtools.core :as devtools]
  ))

[...]

(defn dev-setup []        ;; inlining into `main` made no difference
 (when ^boolean js/goog.DEBUG
   (enable-console-print!)
   (println "dev mode")
   (devtools/install!)
   ))

(defn ^:export main []
 (dev-setup)
 (reload))
@darwin
Copy link
Member

darwin commented Apr 11, 2017

Interesting observations. But I'm sorry I'm not going to investigate it.

Currently each release of cljs-devtools performs following dead-code elimination sanity check:

  1. compiles a trivial empty project with :dependencies, :require and goog.DEBUG set to false under :advanced mode with :pseudo-names:
    :dead-code
    {:source-paths ["src/lib"
    "test/src/dead-code"]
    :compiler {:output-to "test/resources/.compiled/dead-code/build.js"
    :output-dir "test/resources/.compiled/dead-code"
    :asset-path ".compiled/dead-code"
    :main devtools.main
    :closure-defines {"goog.DEBUG" false}
    :pseudo-names true
    :optimizations :advanced}}
  2. uses this script to check that no traces of devtools namespace are present: https://github.com/binaryage/cljs-devtools/blob/master/test/scripts/dead-code-check.sh

It seems to work in this case. But I don't check files sizes or how it affects non-devtools code. One possible theory could be that although devtools namespaces are not present, still requiring them triggers Google Closure to compile-in more of the standard library. For example whole cljs printing machinery (println and friends) which would be normally dead-code eliminated.

Please investigate it further and let me know. You might also want to test more recent ClojureScript versions because this might be dependent on Google Closure compiler version.

@aisamu
Copy link
Author

aisamu commented Apr 11, 2017

Nice catch! Unfortunately, updating Clojurescript to 1.9.494 made things a bit worse.

There seems to be no difference between goog.DEBUG true or false versions. (I re-checked it three times over). (Typo on the scrips!)

Here are the updated sizes:

Type clojurescript "1.9.494" Size
:dependencies + :require + goog.DEBUG true 2.1 MB
:dependencies + :require + goog.DEBUG false 1.7 MB
:dependencies + :require + no mention at all 1.7 MB
:dependencies + no :require + no mention at all 860 KB
no :dependencies + no :require + no mention at all 860 KB

This is way over my league, but I'll gladly follow your lead if you have more ideas.

Regarding your DCE sanity check:

  1. That's where I got the :pseudo-names from!
  2. I tried to use the scrips on the generated files, but it never outputs anything, even on the builds with explicit mentions of devtools and no DCE.
$> cat devtools-dce-pseudo-name-inline-debug-debug-true/resources/public/js/compiled/app.js | perl -pe 's/(\\$|\\d+)\\$/\\1\\$\\n/g' | grep -o 'devtools\\$.*'
(no output)

$> sw_vers
ProductName:	Mac OS X
ProductVersion:	10.10.5
BuildVersion:	14F2315

$> perl --version
This is perl 5, version 18, subversion 2 (v5.18.2) built for darwin-thread-multi-2level

$> grep --version
grep (BSD grep) 2.5.1-FreeBSD

@darwin
Copy link
Member

darwin commented Apr 11, 2017

I was curious. I have just added some helper scripts to compare DCE in scenarios you described.

You can run lein compare-dead-code or lein compare-dead-code-with-pseudo-names.

My output looks like this:

> lein compare-dead-code
Compiling ClojureScript...
Compiling "test/resources/.compiled/dce-no-debug/build.js" from ["src/lib" "test/src/dead-code"]...
Successfully compiled "test/resources/.compiled/dce-no-debug/build.js" in 15.833 seconds.
Compiling ClojureScript...
Compiling "test/resources/.compiled/dce-with-debug/build.js" from ["src/lib" "test/src/dead-code"]...
Successfully compiled "test/resources/.compiled/dce-with-debug/build.js" in 15.853 seconds.
Compiling ClojureScript...
Compiling "test/resources/.compiled/dce-no-mention/build.js" from ["src/lib" "test/src/dead-code-no-mention"]...
Successfully compiled "test/resources/.compiled/dce-no-mention/build.js" in 15.609 seconds.
Compiling ClojureScript...
Compiling "test/resources/.compiled/dce-no-require/build.js" from ["src/lib" "test/src/dead-code-no-require"]...
Successfully compiled "test/resources/.compiled/dce-no-require/build.js" in 8.852 seconds.
Compiling ClojureScript...
Compiling "test/resources/.compiled/dce-no-sources/build.js" from ["test/src/dead-code-no-require"]...
Successfully compiled "test/resources/.compiled/dce-no-sources/build.js" in 9.004 seconds.

stats:
WITH_DEBUG: 368048 bytes
NO_DEBUG:   297414 bytes
NO_MENTION: 297414 bytes
NO_REQUIRE: 4778 bytes
NO_SOURCES: 4778 bytes

beautified with-debug.js
beautified no-debug.js
beautified no-mention.js
beautified no-require.js
beautified no-sources.js

beautified sources in test/resources/.compiled/dead-code-compare

see https://github.com/binaryage/cljs-devtools/issues/37

These results are consistent with my theory:

  1. WITH_DEBUG must be largest
  2. NO_DEBUG and NO_MENTION should be the same if :closure-defines {goog.DEBUG false} works
  3. NO_REQUIRE and NO_SOURCES should be the same and should be smallest
  4. difference between NO_DEBUG and NO_SOURCES is probably triggered by some static code in cljs-devtools library which gets DCE-ed but triggers inclusion of large parts of cljs.core library for some reason (needs investigation)

During writing these scripts I got completely puzzled by behaviour of lein/cljsbuild/cljs-compiler. It looks like cljs compiler incorrectly reuses caches between individual cljsbuild builds. That caused that my results were illogical and depended on the order of compilation of individual cljsbuild builds and cljs compiler cache state. I had to brute-force lein clean between individual dce builds just to be sure I get consistent results. It seems to me that you hit similar issue with your measurements because you should be able to reproduce same file size relationships.

@aisamu
Copy link
Author

aisamu commented Apr 12, 2017

Oh, there was a typo on the scripts that I managed to overlook three times in a row... I've corrected the second row above. The results went back to the previous distribution, just a bit larger.

Yup, I had already been bitten by this cljsbuild behavior, it's almost a "standard".
As a matter of fact, I ran each of those tests on a brand-new, separate project.

I've just tried the whole procedure with reagent and got similar results. It looks like I was expecting too much from dead code elimination, and the way to go for complete elision is indeed different :source-paths.

Thanks for the attention (and for cljs-devtools!)

The results for reagent:

[clojurescript "1.9.494"], [reagent "0.6.0"] Size
:dependencies + :require + goog.DEBUG true 844 KB
:dependencies + :require + goog.DEBUG false 838 KB
:dependencies + :require + no mention at all 839 KB
:dependencies + no :require + no mention at all 425 KB
no :dependencies + no :require + no mention at all 425 KB

@darwin
Copy link
Member

darwin commented Apr 12, 2017

I dug deeper into this and I have some bad news.

So far I identified following problems:

  1. I used to think that def-ing static config maps[1] is DCE-friendly. Unfortunately it turns out that if the map has more than 8 keys, the cljs compiler emits cljs.core.PersistentHashMap.fromArrays call which ruins DCE. [my theory] Probably because internally it uses transients which use protocols which are too dynamic. Closure Compiler then gets convinced to compile-in large part of cljs.core library.
  2. requiring cljs.pprint without ever using any of its functionality also triggers huge compilation output alone. I briefly investigated its generated javascript before advanced mode optimization and it definitely suffers from 1) as well (specifically cljs.pprint.*code-table* and cljs.pprint.directive-table)
  3. just requiring cljs.stacktrace also triggers huge compilation output alone. There are no static cljs.core.PersistentHashMap.fromArrays calls. But it declares multi-methods which is probably the cause of code bloat.

Please note that cljs.pprint also uses multi-methods so both issues might trigger large compilation output there.

I don't have a good solution for this right now. I was able to work-around cljs.core.PersistentHashMap.fromArrays issue by wrapping static def-s in dynamic functions which get called lazily on first-usage. This fixed the issue in cljs-devtools code. But I'm not able to easily fix cljs.pprintand cljs.stacktrace.

[1]

@darwin
Copy link
Member

darwin commented Apr 12, 2017

Ok, I was able to shave off quite some dead weight:
https://travis-ci.org/binaryage/cljs-devtools/builds/221500541#L333-L337

The small difference between NO_DEBUG and NO_REQUIRE is caused by requiring goog.userAgent. So even google closure library authors do not always produce code which could be fully understood by DCE in closure compiler. I could eventually apply the same trick as with cljs.pprint and cljs.stacktrace, but IMO it is not worth it ATM.

@darwin darwin closed this as completed Apr 12, 2017
@darwin darwin reopened this Apr 13, 2017
@aisamu
Copy link
Author

aisamu commented Apr 13, 2017

Wow, that's really great! It effectively brought DCE back as a viable option.

Following today's Clojurescript release (with the accompanying Google Closure version bump), here are the new results:

[clojurescript "1.9.518"], [binaryage/devtools "0.9.3-SNAPSHOT"]* Size
:dependencies + :require + goog.DEBUG true 1.3 MB
:dependencies + :require + goog.DEBUG false 844 KB
:dependencies + :require + no mention at all 844 KB
:dependencies + no :require + no mention at all 841 KB
no :dependencies + no :require + no mention at all 841 KB

*5270726

Your version of "I'm sorry I'm not going to investigate it" is putting our's "I'm 110% on it" to shame.

Thank you!

@darwin
Copy link
Member

darwin commented Apr 13, 2017

Wait, I was too fast. It didn't work as expected.

darwin added a commit that referenced this issue Apr 13, 2017
The solution with devtools.optional namespace didn't work in scenario
when cljs-devtools was required as a library via maven dependency.
In that case we cannot (goog/require "devtools.optional") because the \
file is not compiled/present.

There does not seem to be a way how to conditionally require something
based on compiler options. So I decided just emit a compile-time warning
to warn users and leave it as is.

People should not use cljs-devtools in :advanced mode. And ideally they
should include it via :preloads which has no effect in :advanced mode.
@darwin
Copy link
Member

darwin commented Apr 13, 2017

Unfortunately when the library is packaged as maven dependency the optional requires didn't work.

At least I implemented a compile-time warning. The solution is to use :preloads compiler option or different sets of :source-paths in advanced mode to exclude traces of cljs-devtools from the build completely.

@darwin
Copy link
Member

darwin commented Apr 13, 2017

@aisamu Just to add... Your sizes of compiled files are consistent and what I would expect with 5270726. You would just see a runtime error when running a dev build of your app. It would complain that "goog.require could not find: devtools.optional".

@aisamu
Copy link
Author

aisamu commented Apr 19, 2017

Oh, sad to hear.

Perhaps it'd be useful to other library/tool writers if you post something on Clojurescript's mailing list. At the very least you'll have more experienced eyes on the matter.

I've collected some threads below related to DCE, but I'm not really qualified to assess whether you've stumbled on something new/different or not, so I'll leave them here for your appraisal:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants