name: inverse layout: true class: center, middle, inverse --- # Archaeology with Git ## Radovan Bast Licensed under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/). Code examples: [OSI](http://opensource.org)-approved [MIT license](http://opensource.org/licenses/mit-license.html). --- layout: false ## Archaeology with Git - Inspecting commits - Finding out who introduced a modification - Finding out when a modification has been introduced - Grepping code - Finding removed code - Finding a developer to talk to (large projects) - Going back in time - Finding out when something broke - Finding commits not merged upstream --- ## Inspecting commits - At any moment we can inspect individual commits ```shell $ git show 49dc419 commit 49dc419c8a44051cfe7826b85ee0a23e5faf3975 Author: Radovan Bast
Date: Sat Nov 22 15:33:15 2014 +0100 do not recompute powers diff --git a/triangle.py b/triangle.py index cc52fe2..fa35eab 100644 --- a/triangle.py +++ b/triangle.py @@ -6,7 +6,10 @@ m = int(sys.argv[1]) # loop over all a < b < c <= m for c in xrange(1, m+1): + cp = c*c for b in xrange(1, c): + bp = b*b for a in xrange(1, b): - if a*a + b*b == c*c: + ap = a*a + if ap + bp == cp: print("(%i, %i, %i)" % (a, b, c)) ``` - We see that the start of the hash is enough if it is unique --- ## Finding out who introduced a modification and when - `git blame` is a fun and useful command to find out when a specific line got introduced and by whom - This is important to judge the impact of a bug (When was it introduced? For how long has this bug been around? Has this bug been released?) ```shell $ git blame
faa4617a (Radovan Bast 2012-07-02 11:04:45 +0200) ! get the one electron Hamiltonian e564cfe8 (A. J. Thorvaldsen 2012-12-25 00:59:21 +0100) H1 = 0*S e564cfe8 (A. J. Thorvaldsen 2012-12-25 00:59:21 +0100) call mat_ensure_alloc(H1, only_alloc=.true.) faa4617a (Radovan Bast 2012-07-02 11:04:45 +0200) call interface_scf_get_h1(H1) faa4617a (Radovan Bast 2012-07-02 11:04:45 +0200) e6cfa2cf (Radovan Bast 2012-03-01 10:06:39 +0100) ! get the two electron contribution (G) to Fock matrix e564cfe8 (A. J. Thorvaldsen 2012-12-25 00:59:21 +0100) G = 0*S e564cfe8 (A. J. Thorvaldsen 2012-12-25 00:59:21 +0100) call mat_ensure_alloc(G, only_alloc=.true.) 810e14d8 (Radovan Bast 2012-07-01 17:19:56 +0200) call interface_scf_get_g(D, G) 761d5c27 (Radovan Bast 2012-05-18 16:28:34 +0200) e6cfa2cf (Radovan Bast 2012-03-01 10:06:39 +0100) ! Fock matrix F = H1 + G 761d5c27 (Radovan Bast 2012-05-18 16:28:34 +0200) F = H1 + G ``` - "Who the %&!@!!! wrote this?!?" - In 90% of cases you yourself - It is useful for projects with 20+ people where you do not have complete overview --- ## Grepping code ```shell $ git grep fixme densfit/df_vxc.F:!fixme call XC integrator dirac/dirgrd.F:!fixme: there is some issue with one-step - investigate later dirac/dirrdn.F:! fixme num grid report should be independent of DFT dirac/dirscf.F: !fixme: michal for InteRest dirac/dirscf.F: !fixme: michal for InteRest dirac/dirscf.F: !fixme: michal for InteRest embedding/get_vemb_mat.F:!fixme irep can be < 0 in NONREL runs eri/eri2out.F:!fixme eri/eri2out.F:!fixme eri/eri2out.F:!fixme gp/gpkrmc.F:C fixme: dette ser ikke ud til at virke for complex groups. interest/module_interest_hrr.f90: !-- fixme --! interest/module_interest_interface.F90: !> fixme: contracted basis sets ``` - Greps entire repository below current directory - Super fast --- ## Finding removed code - There used to be a line containing "break" - Now it is gone - We can figure out when it disappeared ```shell $ git log -c -S'break' commit 41997dffb4c5dd47c81adb1fd1b616e72415e6e5 Author: Radovan Bast
Date: Sun Nov 23 22:44:31 2014 +0100 remove line commit bf39f9d3f2b2867e1c66d58d3c4a32842c7f80d5 Author: Radovan Bast
Date: Sat Nov 22 15:33:37 2014 +0100 break loop a if triple found ``` --- ## Finding removed code ```shell $ git show 41997df commit 41997dffb4c5dd47c81adb1fd1b616e72415e6e5 Author: Radovan Bast
Date: Sun Nov 23 22:44:31 2014 +0100 remove line diff --git a/triangle.py b/triangle.py index c693105..fa35eab 100644 --- a/triangle.py +++ b/triangle.py @@ -13,4 +13,3 @@ for c in xrange(1, m+1): ap = a*a if ap + bp == cp: print("(%i, %i, %i)" % (a, b, c)) - break # no need to continue loop a ``` --- ## Finding a developer to talk to (large projects) - Example: you want to know who implemented a specific keyword/functionality - `git grep` for the keyword - Then with `git blame` find the person who introduced it - With `git show` check the commit that introduced the change (it could be a file rename done by someone else) - In the latter case check out a version prior to the move and `git grep` and `git blame` there --- ## Going back in time - We do not have to branch from the tip of the branch (HEAD) - We can branch from arbitrary (earlier) hash ```shell $ git checkout -b
``` - This is the recommended mechanism to inspect old code (hash abc123): ```shell $ git checkout -b museum abc123 # create branch called "museum" from hash abc123 # do some archaeology ... # "what was I thinking back then!?" $ git checkout master # after you are done switch back to "master" $ git branch -d museum ``` - Branches help us to keep the repository organized --- ## Finding out when something broke - Sometimes you realize that something broke - You know that it used to work - You do not know when it broke - First find out a commit in past when it worked - Then bisect your way until you find the commit that broke it ```shell $ git bisect start $ git bisect good 75c737cf5c9 # this is a commit that worked $ git bisect bad HEAD # last commit is broken # now compile and test # after that decide whether $ git bisect good # or $ git bisect bad # now compile and test # after that decide whether $ git bisect good # or $ git bisect bad # iterate until commit is found ``` - This can even be automatized with `git bisect run