Compute and display statistics of git repositories
Requires a scientific Python distribution such as Anaconda.
e.g. Install Anaconda, then run
conda update conda
conda update anaconda
conda install Pygments
Tested with
python : 2.7.10 and 3.5.1
numpy : 1.10.2
matplotlib: 1.5.0
pandas : 0.17.1
pygments : 2.0.2
You can check your versions with version.py.
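The contents of version.py are not reproduced here. As a minimal sketch of such a check, assuming only that each package exposes a `__version__` attribute, something like the following would do. The function name `installed_versions` is hypothetical and is not taken from version.py.

```python
import importlib
import platform


def installed_versions(packages=("numpy", "matplotlib", "pandas", "pygments")):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing; record that instead of failing.
            versions[name] = None
            continue
        # Assumption: the package exposes __version__ (true for the four above).
        versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Printing the items of `installed_versions()` gives a listing comparable to the "Tested with" list above.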
git-stats currently contains only one analysis script, code-age.py.
- Copy code-age.py to your computer
- Open a shell and cd to the root of the git repository you want to report
python code-age.py
NOTE: This can take hours to run on a big repository, as it blames every file in the repository.
- The location of the reports directory will be written to stdout
- Optionally try some patterns e.g.
python code-age.py '*.py'
python code-age.py docs
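The note above says the script blames every file, which is why large repositories are slow. As a rough illustration of what that involves (this is not the code in code-age.py), the sketch below counts surviving lines per author from the output of `git blame --line-porcelain`, which repeats the commit headers for every blamed line. The helper names `loc_by_author` and `blame_file` are hypothetical.

```python
import subprocess
from collections import Counter


def loc_by_author(porcelain_text):
    """Count blamed lines per author in `git blame --line-porcelain` output."""
    counts = Counter()
    for line in porcelain_text.split("\n"):
        # Header lines look like "author Jane Doe". File content lines
        # always start with a tab, so they can never match this prefix.
        if line.startswith("author "):
            counts[line[len("author "):]] += 1
    return counts


def blame_file(path):
    """Blame one file in the current repository (needs git on the PATH)."""
    out = subprocess.check_output(
        ["git", "blame", "--line-porcelain", "--", path])
    return loc_by_author(out.decode("utf-8", errors="replace"))
```

A full report needs one such blame per tracked file, so the run time grows with both the number of files and the depth of their history.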
code-age.py analyzes the age of files in a git repository, then writes reports and draws graphs about them. The reports are written in the directory structure given in git.stats.tree.txt
NOTE: LoC is short for Lines of Code.
e.g. For repository git, which is a GitHub mirror of the git source code:
[root]                                        Defaults to ~/git.stats
└── git                                       Directory for https://github.com/git/git.git
    └── reports
        └── 2015-12-29.28274d02.master        Revision 28274d02, which was created on 2015-12-29
            │                                 on branch "master".
            └── [all-files]                   Report on all files in this revision
                ├── README                    Summary of files in [all-files]
                ├── author_ext_files.csv      Number of files of given extension in which author has code
                ├── author_ext_loc.csv        Number of LoC by author in files of given extension
                ├── [all-authors]             Sub-report on all authors
                │   ├── README                Summary of files in [all-authors]
                │   ├── code-age.png          Graph of code age. LoC / day vs date
                │   ├── code-age.txt          List of commits in the peaks in the code-age.png graph
                │   ├── details.csv           LoC in each directory for these files and authors
                │   ├── newest-commits.txt    List of newest commits for these files and authors
                │   └── oldest-commits.txt    List of oldest commits for these files and authors
                ....
                ├── Alex_Henrie               Sub-report on author Alex Henrie
                │   ├── README
                │   ├── code-age.png
                │   ├── code-age.txt
                │   ├── details.csv
                │   ├── newest.txt
                │   └── oldest.txt
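The revision directory name in the tree above looks like the date, the first 8 characters of the SHA-1 hash and the branch name joined by dots. That convention is inferred from this one example, not from the script itself. Under that assumption, a report path could be assembled as sketched below. The function name `report_dir` is hypothetical.

```python
import posixpath


def report_dir(root, repo, date, sha, branch,
               files="[all-files]", authors="[all-authors]"):
    """Build the path of one sub-report, following the example tree."""
    # Assumed naming convention: <date>.<first 8 hex digits of SHA-1>.<branch>
    revision = "%s.%s.%s" % (date, sha[:8], branch)
    return posixpath.join(root, repo, "reports", revision, files, authors)


print(report_dir("~/git.stats", "git", "2015-12-29",
                 "28274d02c489f4c7e68153056e9061a46f62d7a0", "master"))
```

Passing `authors="Alex_Henrie"` would give the per-author sub-report directory instead.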
Top-level files in 2015-12-29.28274d02.master/[all-files]
1) README
This file contains summary information about this report.
Summary of File Counts and LoC by Author and File Extension
===========================================================
Totals: 2806 files, 23743 commits, 764802 LoC
Revision Details
----------------
Repository: git (https://github.com/git/git.git)
Date: 2015-12-29
Description: master
SHA-1 hash 28274d02c489f4c7e68153056e9061a46f62d7a0
This shows the number of files in which each author has one or more lines of code in the revision being reported, by extension and author. (The table shown on this page is truncated. author_ext_files.csv has the full table.)
NOTE: The numbers of files in the Total row and column are not the number of files in the repository. They are the total numbers of files in which each author has one or more lines of code.
Author | Total | .c | .sh | .txt | .h |
---|---|---|---|---|---|
Total | 40742.0 | 14352.0 | 9457.0 | 7276.0 | 2710.0 |
Junio C Hamano | 7747.0 | 2779 | 1713 | 1973 | 504 |
Jeff King | 3070.0 | 1479 | 853 | 296 | 291 |
Nguyễn Thái Ngọc Duy | 1680.0 | 994 | 226 | 174 | 210 |
Shawn O. Pearce | 1254.0 | 384 | 352 | 99 | 55 |
Jonathan Nieder | 1159.0 | 300 | 373 | 305 | 62 |
Linus Torvalds | 1088.0 | 770 | 78 | 16 | 136 |
Johannes Schindelin | 1022.0 | 458 | 298 | 103 | 92 |
René Scharfe | 761.0 | 514 | 104 | 60 | 73 |
Michael Haggerty | 707.0 | 437 | 106 | 56 | 70 |
Thomas Rast | 696.0 | 171 | 151 | 193 | 34 |
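The NOTE above explains why the Total row (40742) is far larger than the number of files in the repository (2806): a file is counted once for every author who has code in it. A toy illustration of that counting, with made-up file and author names and a hypothetical function `author_ext_files` (not the script's own code):

```python
import os
from collections import defaultdict


def author_ext_files(file_authors):
    """file_authors: {path: set of authors with at least one surviving line}.

    Returns {author: {extension: number of files}}.
    """
    table = defaultdict(lambda: defaultdict(int))
    for path, authors in file_authors.items():
        ext = os.path.splitext(path)[1]
        for author in authors:
            table[author][ext] += 1
    return table


# Made-up data: three files, two authors.
file_authors = {
    "a.c": {"alice", "bob"},
    "b.c": {"alice"},
    "run.sh": {"bob"},
}
table = author_ext_files(file_authors)
total = sum(n for row in table.values() for n in row.values())
print(total)  # 4, although there are only 3 files: a.c is counted per author
```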
This shows the lines of code in the revision being reported by extension and author. (The table shown on this page is truncated. author_ext_loc.csv has the full table.)
Author | Total | .c | .sh | .po | .txt |
---|---|---|---|---|---|
Total | 764802.0 | 198828.0 | 172727.0 | 159684.0 | 81591.0 |
Junio C Hamano | 115080.0 | 37433 | 27753 | 6220 | 28929 |
Jeff King | 31776.0 | 13134 | 11724 | 0 | 3175 |
Jiang Xin | 24649.0 | 1170 | 718 | 11256 | 81 |
Shawn O. Pearce | 24636.0 | 5392 | 4748 | 1519 | 2353 |
Nguyễn Thái Ngọc Duy | 20908.0 | 13226 | 5499 | 0 | 1233 |
Peter Krefting | 16243.0 | 4 | 11 | 15718 | 0 |
Alexander Shopov | 16182.0 | 0 | 0 | 16149 | 29 |
Johannes Schindelin | 15963.0 | 7531 | 4996 | 0 | 1345 |
Jonathan Nieder | 15266.0 | 3111 | 6625 | 0 | 1914 |
Ævar Arnfjörð Bjarmason | 14688.0 | 11093 | 1306 | 93 | 107 |
A closer look at 2015-12-29.28274d02.master/[all-files]/[all-authors]
This directory contains files that report on the age of all authors' code for all files (i.e. every file) in revision 28274d02, the git repository master branch on 2015-12-29.
1) code-age.png is a graph showing the age of the code in question.
The horizontal axis is date and the vertical axis is LoC / day. This means the area under the curve between two dates is the LoC surviving from the period bounded by those dates.
You can see that some code from 2006 survives in the current git master branch.
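To make the "area under the curve" reading concrete, here is a minimal sketch using raw daily counts. It assumes each surviving line carries the date of the commit that last touched it. The real graph may be smoothed or binned differently, and the names `loc_per_day` and `surviving_loc` are hypothetical.

```python
from collections import Counter
from datetime import date


def loc_per_day(line_dates):
    """line_dates: iterable of datetime.date, one entry per surviving line."""
    return Counter(line_dates)


def surviving_loc(per_day, start, end):
    """Sum LoC / day over start <= day < end, i.e. the area under the curve."""
    return sum(n for day, n in per_day.items() if start <= day < end)


# Made-up data: five lines last touched in 2006, one in 2015.
line_dates = ([date(2006, 5, 1)] * 3 + [date(2006, 5, 2)] * 2
              + [date(2015, 12, 29)])
per_day = loc_per_day(line_dates)
print(surviving_loc(per_day, date(2006, 1, 1), date(2007, 1, 1)))  # 5
```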
2) code-age.txt lists the commits in the peaks in code-age.png
================================================================================
[all-authors]: 10 peaks 117654 LoC
................................................................................
5) 187 commits 12253 LoC around 2007-07-18
1025 LoC, 2007-07-21 90a7149 German translation for git-gui
1006 LoC, 2007-07-22 e79bbfe Add po/git-gui.pot
992 LoC, 2007-07-22 4fe7626 Italian translation of git-gui
1095 LoC, 2007-07-25 2340a74 Japanese translation of git-gui
1150 LoC, 2007-07-27 f6b7de2 Hungarian translation of git-gui
................................................................................
10) 130 commits 9705 LoC around 2009-06-03
135 LoC, 2009-05-26 3902985 t5500: Modernize test style
7040 LoC, 2009-06-01 f0ed822 Add custom memory allocator to MinGW and MacOS builds
124 LoC, 2009-06-04 195643f Add 'git svn reset' to unwind 'git svn fetch'
127 LoC, 2009-06-06 2264dfa http*: add helper methods for fetching packs
288 LoC, 2009-06-06 5424bc5 http*: add helper methods for fetching objects (loose)
................................................................................
...
3) oldest-commits.txt lists the oldest commits in the code in question.
================================================================================
[all-authors]: 23743 commits 764802 LoC
................................................................................
111 LoC, 2005-04-08 e83c516 Initial revision of "git", the information manager from hell
38 LoC, cache.h
22 LoC, read-cache.c
11 LoC, README
................................................................................
40 LoC, 2005-04-08 8bc9a0c Add copyright notices.
5 LoC, builtin/cat-file.c
5 LoC, builtin/commit-tree.c
5 LoC, builtin/diff-files.c
...
4) newest-commits.txt lists the newest commits in the code in question.
================================================================================
[all-authors]: 23743 commits 764802 LoC
................................................................................
12 LoC, 2015-12-29 28274d0 Git 2.7-rc3
11 LoC, Documentation/RelNotes/2.7.0.txt
1 LoC, GIT-VERSION-GEN
................................................................................
119 LoC, 2015-12-28 c5e5e68 l10n: Updated Bulgarian translation of git (2477t,0f,0u)
119 LoC, po/bg.po
................................................................................
...
5) details.csv attempts to show where the code is distributed through the source tree.
dir | LoC | frac |
---|---|---|
 | 764802 | 1 |
t | 167955 | 0.21960585877128982 |
po | 120787 | 0.15793237988394382 |
Documentation | 82641 | 0.1080554182651196 |
builtin | 55958 | 0.07316664966880317 |
git-gui | 53934 | 0.0705202130747566 |
git-gui/po | 37087 | 0.6876367412022101 |
contrib | 35890 | 0.04692717853771303 |
gitk-git | 29385 | 0.03842170914825013 |
compat | 25884 | 0.03384405375508955 |
Documentation/RelNotes | 19709 | 0.2384893696833291 |
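The frac column is not defined in the text, but the numbers suggest it is relative to the parent directory, not to the whole repository: git-gui/po has frac 0.69 even though it holds only about 5% of all LoC, and 37087 / 53934 = 0.6876. A small sketch under that assumption, using rows from the table above (the function name `frac_of_parent` is hypothetical):

```python
import posixpath


def frac_of_parent(loc_by_dir):
    """loc_by_dir: {directory: LoC}, with "" standing for the repository root.

    Returns {directory: LoC / LoC of its parent directory}.
    """
    frac = {}
    for d, loc in loc_by_dir.items():
        if d == "":
            frac[d] = 1.0  # the root is the whole report
        else:
            parent = posixpath.dirname(d)  # "" for top-level directories
            frac[d] = loc / float(loc_by_dir[parent])
    return frac


loc = {"": 764802, "t": 167955, "git-gui": 53934, "git-gui/po": 37087}
frac = frac_of_parent(loc)
print(round(frac["t"], 4), round(frac["git-gui/po"], 4))
```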