Generating git repository stats

I added some bash scripts to a repository I currently maintain for a project at my university. We wanted scripts that generate git repository statistics and write them out as CSV, so that we can easily plot them with gnuplot later on. The idea is that we can simply invoke make to build the whole thing (it is a TeX document) with up-to-date git statistics. Here are some of these scripts…

The full history, please, James!

First, let us generate a CSV-like file which contains the full history: (short) commit hashes, author names, dates and the subject line of each commit message, separated by tabs:

git log --date=local --pretty=format:"%h%x09%an%x09%ad%x09%s"

The script is as simple as git itself. It looks a bit cryptic because of %x09, which prints the byte with hex code 09 (a tab character), but it works perfectly! You can also change --date=local to --date=iso to get ISO 8601-like dates!
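Strictly speaking, the output above is tab-separated rather than comma-separated. If a real CSV is needed, a small filter could do the conversion. This is only a sketch: the function name tsv_to_csv and the file name history.csv are made up for this post, and it assumes that no field contains a tab or a line break.

```shell
# Hypothetical helper: turn tab-separated lines on stdin into quoted CSV.
# Every field is wrapped in double quotes; embedded quotes are doubled.
tsv_to_csv() {
    awk -F '\t' '{
        for (i = 1; i <= NF; i++) {
            gsub(/"/, "\"\"", $i)
            printf "\"%s\"%s", $i, (i < NF ? "," : "\n")
        }
    }'
}

# Possible usage inside a repository (history.csv is an assumed name):
# git log --date=iso --pretty=format:"%h%x09%an%x09%ad%x09%s" | tsv_to_csv > history.csv
```

Quoting every field is a bit heavy-handed, but it keeps commas in commit subjects from being read as column separators.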

Weekend work, or what?

With this neat piece of bash, we can check how much work was done on weekends:

# Count commits per weekday; git's default date format starts with the weekday name.
for i in Mon Tue Wed Thu Fri Sat Sun; do
    printf '%s\t%s\n' "$i" "$(git log --format='%ad' | grep -c "^$i ")"
done

Example output:

Mon     53
Tue     58
Wed     85
Thu     69
Fri     49
Sat     16
Sun     32

Neat, isn’t it?
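The loop above asks git for the whole history once per weekday. On a big repository, a single pass might be nicer. Here is a rough sketch of that idea: count_by_weekday is a name I made up, and it expects one abbreviated English weekday name per line on stdin.

```shell
# Hypothetical helper: count weekday names read from stdin in one pass
# and print all seven days in order, including days with zero commits.
count_by_weekday() {
    awk '
        { count[$1]++ }
        END {
            n = split("Mon Tue Wed Thu Fri Sat Sun", days, " ")
            for (i = 1; i <= n; i++)
                printf "%s\t%d\n", days[i], count[days[i]]
        }'
}

# Possible usage inside a repository; LC_ALL=C is there to keep the
# weekday names in English, and --date=format: needs a reasonably recent git:
# LC_ALL=C git log --date=format:'%a' --format='%ad' | count_by_weekday
```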

Do we work at night?

Well, the stuff about the weekdays is really neat, but we can also check whether the committers are morning grouches:

# Count commits per hour of day; " HH:" only matches the hour field of the date.
for i in $(seq -w 0 23); do
    printf '%s\t%s\n' "$i" "$(git log --format='%ad' | grep -c " $i:")"
done

Example output for this:

00      14
01      9
02      0
03      0
04      0
05      0
06      0
07      0
08      0
09      1
10      5
11      10
12      6
13      15
14      69
15      13
16      21
17      39
18      44
19      21
20      17
21      23
22      36
23      19

As you can see, we work the whole day, from 9 AM until almost 2 AM! But I think as the project goes on, we will also fill the gap between 2 AM and 9 AM! I hope you found this stuff interesting! I don’t know much about gnuplot, so the gnuplot part will (hopefully) be done by another project member, but I guess this output should be easy for gnuplot to read!
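For the gnuplot side, a file with exactly one row per hour, zeros included, is probably the easiest thing to read. A single-pass sketch for that could look like this. The names count_by_hour and commits_per_hour.tsv are assumptions of mine, and the function expects one two-digit hour per line on stdin.

```shell
# Hypothetical helper: count hours (00..23) read from stdin in one pass
# and print a row for every hour of the day, even if it has no commits.
count_by_hour() {
    awk '
        { count[$1 + 0]++ }   # "09" + 0 becomes 9, so keys are plain numbers
        END {
            for (h = 0; h < 24; h++)
                printf "%02d\t%d\n", h, count[h]
        }'
}

# Possible usage inside a repository (the output file name is an assumption);
# --date=format: keeps the time zone recorded in each commit:
# git log --date=format:'%H' --format='%ad' | count_by_hour > commits_per_hour.tsv
```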