Split a commit in two with Git

2014-04-14  |   |  tool   git  

Ever wished a commit was actually made of two? Read on.

There are several reasons why you could wish a commit was actually made of several distinct ones:

  • because it makes the history more readable
  • because you are trying to reorder some commits and it creates nasty conflicts
  • just because

Merging two commits into one is easy: search for "squashing" for more info. While I am relatively well versed in Git, I never knew how to efficiently do the opposite - splitting commits - until today.

Split a commit in two for the busy ones

Let's see the sequence first before explaining it

git rebase -i <oldsha1>
# mark the commit to split as `edit` (replace `pick` at the start of its line), save and close
git reset HEAD^
git add ...
git commit -m "First part"
git add ...
git commit -m "Second part"
git rebase --continue

What did we do?

A detailed explanation

Interactive rebase

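git rebase -i <oldsha1> opens a list of the commits that come after oldsha1, up to the latest commit in the branch (oldsha1 itself is not listed, so pick a commit older than the one you want to split). You can: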

  • reorder them,
  • change the commit message of some,
  • squash (merge) two commits together,
  • and edit a commit.

We use edit in our case as we want to change the commit. Simply replace the pick word with edit on the line of the commit you want to split. When you save and close this "file", you will be placed at that commit in the command line.

Undo the actual commit

If you do a git status or a git diff, you will see that git places you right after the commit. What we want is to undo the commit and place the changes in our working area.

This is what git reset HEAD^ does: it moves back to the second-to-last commit and leaves the changes of the last commit, unstaged, in the working area. HEAD^ means the parent of HEAD.

Create the two commits

Next is simple gittery where you add changes and commit them the way you wish you had.

Finish the interactive rebasing

Make sure to finish the rebase by calling git rebase --continue. Hopefully, there won't be any conflicts and your history will contain the new commits.
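
To see the whole sequence run end to end, here is a minimal sketch in a throwaway repository. The file names and commit messages are made up for the demo. Using GIT_SEQUENCE_EDITOR with sed to mark the commit as edit is only a stand-in for editing the list by hand in your editor.

```shell
#!/bin/sh
set -e

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"

# One commit that really contains two unrelated changes.
echo one > first.txt
echo two > second.txt
git add first.txt second.txt
git commit -q -m "Two things at once"

echo later > later.txt
git add later.txt
git commit -q -m "Later work"

# Stand-in for editing the todo list by hand: turn the first "pick"
# (the oldest listed commit, "Two things at once") into "edit".
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i HEAD~2

# The rebase is now stopped right after "Two things at once".
git reset -q HEAD^              # undo the commit, keep its changes on disk
git add first.txt
git commit -q -m "First part"
git add second.txt
git commit -q -m "Second part"
GIT_EDITOR=true git rebase --continue   # GIT_EDITOR=true: never open an editor

git log --oneline
```

The final log should list "Later work", "Second part", "First part" and "Base": the original commit is gone and the later commit has been replayed on top of the two new ones.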

A few more tips

This tip becomes much more powerful when you know how to stage only part of a file's changes, instead of all of them.

The magic tool for that is git add -p myfile but it is quite arid. I recommend you use either GitX (Mac OS X, GUI) or tig (CLI). They offer a friendlier interactive way to add chunks of changes (down to line-by-line additions).

Another interesting tip for people who work on topic branches forked off master: you can do git rebase -i master, which will list the commits between master and your branch. See my previous post on the subject for more info.


Unable to update git from homebrew

2012-08-04  |   |  tool   git  

I have had problems upgrading Git with Homebrew on one machine. Let me first tell you how to fix the problem and then what Homebrew is about.

The problem

The problem appeared when I tried to upgrade Git:

brew upgrade git

Finding the cause turned out to be much more complicated than I anticipated. The exact error message was:

Error: Failed executing: make prefix=/usr/local/Cellar/git/1.7.11.3 CC=/usr/bin/clang CFLAGS=-Os\ -w\ -pipe\ -march=native\ -Qunused-arguments\ -mmacosx-version-min=10.7 LDFLAGS=-L/usr/local/lib install (git.rb:49)
These existing issues may help you:
    https://github.com/mxcl/homebrew/issues/8643
    https://github.com/mxcl/homebrew/issues/10544
    https://github.com/mxcl/homebrew/issues/11481
    https://github.com/mxcl/homebrew/issues/12344
    https://github.com/mxcl/homebrew/issues/12814
    https://github.com/mxcl/homebrew/issues/13850
Otherwise, this may help you fix or report the issue:
    https://github.com/mxcl/homebrew/wiki/bug-fixing-checklist

My environment was listed as:

==> Build Environment
HOMEBREW_VERSION: 0.9.2
HEAD: 53d5bfb44e8644eff1693b2a734f079d10b53043
CPU: dual-core 64-bit penryn
OS X: 10.7.4-x86_64
Xcode: 4.3.3
CLT: 4.3.0.0.1.1249367152
X11: 2.6.4 @ /usr/X11
CC: /usr/bin/clang
CXX: /usr/bin/clang++ => /usr/bin/clang
LD: /usr/bin/clang
CFLAGS: -Os -w -pipe -march=native -Qunused-arguments -mmacosx-version-min=10.7
CXXFLAGS: -Os -w -pipe -march=native -Qunused-arguments -mmacosx-version-min=10.7
CPPFLAGS: -isystem /usr/local/include
LDFLAGS: -L/usr/local/lib
MACOSX_DEPLOYMENT_TARGET: 10.7
MAKEFLAGS: -j2

And the last lines of output before the error were:

/usr/bin/clang -isystem /usr/local/include -Os -w -pipe -march=native -Qunused-arguments -mmacosx-version-min=10.7 -I. -DUSE_ST_TIMESPEC -DNO_GETTEXT  -DHAVE_DEV_TTY -DXDL_FAST_HASH -DSHA1_HEADER='<openssl/sha.h>'  -DNO_MEMMEM -DSHELL_PATH='"/bin/sh"' -o git-daemon -L/usr/local/lib  daemon.o libgit.a xdiff/lib.a  -lz  -liconv  -lcrypto -lssl 
Undefined symbols for architecture x86_64:
  "_iconv_open", referenced from:
Undefined symbols for architecture x86_64:
  "_iconv_open", referenced from:
      _reencode_string in libgit.a(utf8.o)
      _reencode_string in libgit.a(utf8.o)
  "_iconv", referenced from:
  "_iconv", referenced from:
      _reencode_string in libgit.a(utf8.o)
      _reencode_string in libgit.a(utf8.o)
  "_iconv_close", referenced from:
  "_iconv_close", referenced from:
      _reencode_string in libgit.a(utf8.o)
      _reencode_string in libgit.a(utf8.o)
ld: symbol(s) not found for architecture x86_64
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [git-daemon] Error 1
make: *** Waiting for unfinished jobs....
make: *** [git-credential-store] Error 1

It turned out that libiconv was the culprit. Simply uninstall it:

brew remove libiconv
brew prune
brew cleanup

Then run brew upgrade git again and things should work now.

I found the inspiration here.

What is homebrew anyways

Homebrew is a very easy to use and maintain package manager for Mac OS X environments. Anytime you want to install one of those unix-y tools, Homebrew is your friend.

Past the initial installation, Homebrew is as simple to use as

brew install *something*

and you are good to go. Keeping versions up-to-date is very easy too

# update brew itself
brew update
# update tools installed with brew    
brew upgrade

For example here are a few things I have installed and maintain with Homebrew (aka brew for the friends).

  • git
  • keychain
  • mvim
  • mongodb
  • postgresql
  • rsync
  • unison
  • wget

That also includes some Java tools:

  • gradle
  • maven
  • jboss-as
  • ceylon

Homebrew does not install as a privileged user - it is actually discouraged. That makes it a bit picky when permissions are not right.

When that happens, I have been using this trick quite regularly with success.

ruby -e "$(curl -fsSL https://gist.github.com/raw/768518/fix_homebrew.rb)"

Enjoy


Pro tip on git rebase -i

2012-05-15  |   |  tool   git  

Here is a small tip to improve your efficiency when using interactive rebasing in Git.

I do my work on topic branches that are forked off master. Before I push my work for review via a GitHub pull request, I like to clean it up a bit by:

  • reordering some commits
  • squashing some commits together
  • rewriting commit messages

Nothing fancy but it helps improve history readability.

You can of course do that by using git rebase -i and most examples show how you can go back in time a couple of commits.

git rebase -i HEAD~4 #list the last 4 commits

There is a nicer and more efficient way to do that when you work on topic branches

git rebase -i master

That's it. Pretty stupid but, since you can put any Git object reference, why not use the object where you started to fork off? The rebase will show you all commits between master and your branch.
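
If you want to check what the rebase is going to offer before launching it, the range master..HEAD can be listed with git log. The sketch below builds a throwaway repository; the branch name topic, the file name and the commit messages are assumptions for the demo.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "Base on master"

# Topic branch with three commits on top of master.
git checkout -q -b topic
for n in 1 2 3; do
    echo "$n" >> file.txt
    git commit -q -am "Topic commit $n"
done

# The commits reachable from the branch but not from master:
# these are the ones `git rebase -i master` should put in its list.
git log --oneline master..HEAD
```

No counting of commits is needed: the list stays right as the branch grows.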

If you are on a Mac, I highly recommend using GitX or one of its forks. In particular, you can amend the last commit and graphically select what should be staged and unstaged.


Awestruct: building dynamic static web sites

2011-07-08  |   |  git   tool   website  

I've been wanting to update my personal site for a while as it was done in iWeb and Apple is basically killing the product. Through discussions at Red Hat, I tried Awestruct, a tool to generate static web sites while still benefiting from templating, blog support and other kinds of neat automation.

Awestruct is a project started and led by Bob McWhirter, a JBoss fellow known amongst other things for the awesomeness of his project names. Let me tell you this tool is fantastic. It is a Ruby-based tool that generates a fully static website (.html, .css and resource files) based on:

  • content written in a few available markup languages (Markdown is one, haml is another)
  • layouts letting you template the structure of your website
  • style via Sass, a superset of CSS that ends up generating .css files

I am not a web savvy guy. I know HTML a bit and CSS half a bit. Actually, I know that the best resource on the web is w3Schools.com and that's it. I was able to redo my website in about half a day, from downloading Awestruct to deploying the generated site. The beauty is that I can now add pages very easily with an extremely consistent look.

Another useful part is Awestruct extensions and helpers. While the site is a static website, any content you can generate based on some structured data can end up as a part of your website:

  • a blog (ie take elements in a directory and render them as blog entries including the rss feed)
  • display a tag cloud and generate the list of pages per tag
  • display the list of your project releases
  • your own extension (if you know some Ruby, you should be good)

Now add a sprinkle of JavaScript and you can add things like

  • Google Analytics integration with one line of config
  • Comments support on your static website thanks to the JavaScript integration with IntenseDebate.

What I like about Awestruct is that the good libraries are picked for you (Haml, Compass-style etc) but the killer features are two-fold:

  • you can tell it to deploy and it will rsync the new website for you in one simple command (it has profiles too like dev, staging and production)
  • you can store everything in Git

Now you have a Git stored, simply deployable, easily customizable and templated website. With a bit of scripting you could get people pushing content in Git and get the website automatically generated and published. Heck you can even generate content automatically as part of a project release and script that (my dream!).

Awestruct sites sit in the middle between fully dynamic sites reading stuff from a datasource and purely static pages edited manually. It's kind of a dynamic static website.

The only feature I miss is a search engine but one could imagine:

  • generating index pages during the website construction (/index/emmanuel.html, /index/hibernate.html ...)
  • getting a piece of JavaScript that reads queries and does an intersection of the content stored in all matching indexes

Bob, got some free time?


Git: how my life has improved since last month when I used SVN

2010-05-31  |   |  git   svn   tool  

I switched from SVN to Git (more git-svn actually) close to a month ago and that had to be a leap of faith. Rationally convincing someone that a DVCS is better is pretty hard because overall life in SVN land is not that bad, or does not appear to be. Note that I am using git-svn so I don't benefit from all the power of DVCSes. While nothing replaces actually trying it, I thought it was worth the time to explain what I like about this new tool to help people jump too.

This is not a post on why merging is superior in Git compared to SVN (this is something you need to experience), it's a post on how Git is making my life easier.

This post is split into a few sections:

  • some intro
  • how to import a SVN repo into Git (feel free to skip this tutorial if you are interested in what I liked in Git compared to SVN)
  • use case: multitasking in isolation
  • use case: backporting bug fixes
  • use case: writing better commits and commit histories
  • resources

General

If I had to summarize, Git gives me more freedom than SVN. I am not constrained by the tool in any way:

  • it can follow whatever workflow I want
  • it is fast
  • as a net result my commits are clearer

It's hard for me to say that but since the move I do enjoy committing stuff (I know, that's pretty scary).

The bootstrap

Here is a small tutorial section for people willing to import a project from SVN to Git and keep a bridge between the two. Due to a bug in git-svn for https imports, I am using Git 1.6.5.1 and not the 1.7.1 version.

mkdir project; cd project;
git svn init --trunk=my/svn/repo/project/trunk/ \
             --tags=my/svn/repo/project/tags \
             --branches=my/svn/repo/project/branches \
             my/svn/repo/project

You can optionally create a file containing a conversion between SVN logins and the committers' names and email addresses:

jdoe = John Doe <jdoe@foo.com>
agaulois = Asterix <asterix@gaule.fr>

In the logs, every time git-svn finds agaulois, it converts it to Asterix <asterix@gaule.fr>. If you want to do that, you need to create this file. Don't be afraid of missing a couple of logins: if you do, git-svn will stop and ask you to add them. In your Git repository directory, run

git config svn.authorsfile ~/dir/myauthors.txt

The next step is to fetch all the information. This is long, very long. The good news is that you can stop it and restart later.

git svn fetch

Once that is done, you are good to go. To update your Git repo with the new commits from SVN do

git svn rebase

To commit your set of local commits to SVN, do

git svn dcommit

Many people, especially in the open source community, consider DVCS as a bad thing because it encourages committers to keep their work locally and not share with others. In reality, it does not. People who share frequently will continue to do so, people who don't still don't and should be fired. Same as usual. In practice for me, I dcommit every 4 to 6 hours.

I do recommend importing one SVN project per Git repository. You will typically get several Git repos per SVN repo. The rule is to import the biggest unit that you tag / branch in isolation in SVN. For example, for Hibernate, I have several Git repos:

  • Hibernate Core (which contains all the modules)
  • Hibernate Validator
  • Hibernate Search
  • JPA API
  • Bean Validation API
  • Bean Validation TCK

All of these have generally independent release cycles and version numbers. Apparently it is possible to aggregate Git repositories via the notion of superproject but I have not tried.

One golden rule: you cannot share a Git repository and pull changes back with someone else AND use it to commit in a SVN repository. That will be a mess because git-svn rewrites the commit unique identifiers. Forget sharing repos when you use git-svn unless you are abandoning SVN and are doing a one time import.

Multitasking in Isolation

The absolute coolest feature is the ability to work in total isolation on a given topic for very cheap. I am not necessarily talking about the ability to work offline on an island (though that's nice). I am talking about the ability to work on several subjects in parallel without complex settings.

Let's take an example. I was working on a new feature for Hibernate Search's query DSL. I branched master to dsl and started to work, including committing small chunks of work (more on this later). While working on it, I found a bug in the existing query engine. No problem: I literally stopped working on the new feature, put stuff aside (git stash), created a new branch off master named bug123 and fixed the bug. When I was done with the bug fix, I applied it on master and the dsl branch and resumed my work there. Here is the workflow:

git checkout -b dsl #create the dsl branch and move to it
#work work commit work commit work
git stash #put not yet committed stuff aside
git checkout master

git checkout -b bug123 #create bug fix branch
#work work #fix bug 123
git commit
git rebase master #replay the bug123 commits on top of master (not necessary in this case as I did nothing in master)

git checkout master
git merge bug123 #merge bug123 and master
git branch -d bug123 #delete the useless branch

git checkout dsl
git rebase master #replay the dsl commits on top of master, which now contains the bug123 fix
git stash pop #reapply uncommitted changes
#work

It looks like a lot of operations but it's very fluid and very fast!

What's the benefit? I've fixed a bug in isolation from my new feature even though the same files were impacted. I've committed the bug fix in isolation: I can easily reapply it to maintenance branches (see below). Had I used SVN, I would have fixed the bug and committed "new feature + bug fix 123". I would not have backported the fix to our maintenance branch, nor would my co-workers have, because of the complexity of separating the new feature from the bug fix. In Git the process is so smooth that I even use it to fix typos in comments in isolation from my main work.

I should point out that switching branches is super fast and done in the same directory. Your IDE quickly refreshes and you are ready to work in the same IDE window. For me that's a big plus over having to check out a maintenance branch in a separate directory, set up my IDE and open a second IDE window to work in a different context of the same project: I work on five different projects on average, I can't afford a proliferation of IDE windows. With Git, the context switching comes with much less friction and saves me a lot of time.

Backporting bug fixes

In SVN land, to backport a bug fix, I either:

  • generate the patch off of the SVN commit, and apply it on a checkout of the maintenance branch
  • manually read the commit diff and select which changes I want to apply (generally because somebody has committed the fix alongside a new feature or has committed the feature in 7 isolated commits)

In the first scenario (the easiest), it involves

  • generating the patch
  • saving it as a file
  • optionally checking out the maintenance branch (ie get a coffee)
  • opening my new IDE window
  • applying the patch
  • committing the change with a log message

In Git, you:

  • checkout the maintenance branch (2s)
  • run the cherry-pick command (git cherry-pick sha1) over the commit or commits you want to copy from the main branch (logs are copied automatically though you can change them if needed)
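
Here is a minimal sketch of such a backport in a throwaway repository. The branch name maint, the file names and the commit messages are assumptions for the demo.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"

# Maintenance branch forked at the base commit.
git branch maint

# On master: a new feature, then an isolated bug fix.
echo feature > feature.txt
git add feature.txt
git commit -q -m "New feature"
echo fix > fix.txt
git add fix.txt
git commit -q -m "Fix bug 123"
fix_sha=$(git rev-parse HEAD)

# Backport only the bug fix to the maintenance branch.
git checkout -q maint
git cherry-pick "$fix_sha"

git log --oneline
ls
```

The maintenance branch gets fix.txt with the original log message, and feature.txt stays out of it.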

So easy you actually do it :)

Writing better commits and commit histories

A feature I do like is the ability to uncommit things and rewrite / rearrange them. This is something you only do on the commits you have not yet shared (in my case not yet pushed to SVN). That looks like a stupid and useless feature but it turns out I use it all the time:

  • I can commit unstable work, explore a couple of approaches and come back if needed
  • I can fix a typo or bad log message
  • I can simplify the commit history by merging two or more commits (I typically merge commits I used as unstable checkpoints)
  • These operations typically require 5 seconds or less

The net effect of being able to do that is:

  • I write better log messages
  • I commit more often / in smaller pieces, making my changes more readable
  • If the pieces happen to be too small I merge them before synching with SVN
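
As an illustration, here is one possible way to fix a log message and to merge two checkpoint commits. It uses git commit --amend and git reset --soft, which is an assumption about the commands involved: an interactive rebase with squash gets to the same result. The file name and commit messages are made up for the demo.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo a > work.txt
git add work.txt
git commit -q -m "Base"

# Two small, unstable checkpoints (the second one has a typo in its message).
echo b >> work.txt
git commit -q -am "checkpoint 1"
echo c >> work.txt
git commit -q -am "checkpont 2"

# Fix the bad log message of the last commit.
git commit -q --amend -m "checkpoint 2"

# Merge the two checkpoints: move the branch back two commits while
# keeping their combined changes staged, then commit once.
git reset -q --soft HEAD~2
git commit -q -m "Add b and c"

git log --oneline
```

Only do this on commits that have not been shared yet, since both operations rewrite history.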

You can also do some more micro surgery. If you are changing code and realize that these are really two or three sets of changes (potentially in the same file), you can commit them separately: you can literally select which file / which line to commit. The tool GitX lets you do that very easily.

Git can do that because it does not track files, it tracks changes. You can stage some changes for commit (two new files and changes in three files), continue working on the same set of files and commit the state as defined when you initially staged it. Your subsequent changes can then be committed later. This is a subtle difference of approach (content management vs file management) but now that I have used it, I like it better. As a consequence, if you change a file, these changes won't be committed automatically next time you commit. You need to include them (that's what the -a option is for when you run git commit).
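
The staging behaviour described above can be checked with a small sketch in a throwaway repository. The file name and its contents are assumptions for the demo.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo v1 > notes.txt
git add notes.txt
git commit -q -m "First version"

echo v2 > notes.txt
git add notes.txt          # stage the file as it is right now (v2)
echo v3 > notes.txt        # keep working on the same file

# Commits what was staged, not what is on disk.
git commit -q -m "Second version"

echo "committed: $(git show HEAD:notes.txt)"
echo "on disk:   $(cat notes.txt)"
git status --short
```

The commit contains v2, while v3 is still reported as a pending modification. Running git commit -a instead would have picked up v3 as well.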

Resources

I absolutely recommend that you read Pro Git:

  • that's a top quality book
  • use case oriented
  • and it's also available for free online http://progit.org (though go buy it too, it's well deserved)

Aside from that, I do use

git help command

very often; the documentation is pretty good. Otherwise, Google is your friend, there are many resources out there.

I don't need / miss additional tools to work with Git. The command line is good enough and often less confusing (IntelliJ's integration confused me, so I don't use it unless I need to compare files). I do like to use GitX, a graphical tool, for two purposes:

  • it displays branches and commits graphically
  • it lets you easily stage specific lines of a file for later commit (the micro surgery tool)

That's it folks,

I hope you enjoyed the read and that I've encouraged you to give it a try. git-svn really made trying it a no-brainer. I've probably spent 16 hours learning, trying and understanding Git. I'm confident I will get them back within the next three to four months (my ROI is covered :) ). You can also try it on any directory: I am now using Git to keep a revision history of all my presentations. Remember, no need to set up a server or anything complex. Run git init and you are good to go.

Disclaimer: this is not a thorough comparison, just the feedback of a one month old user.


How to install Git and git-svn on Mac OS X

2009-01-16  |   |  git   svn  

It's hard to find good Google links for installing git-svn on Mac OS X. Here is my piece.

  • Install port: download it at http://darwinports.com and run the installer
  • run sudo port install subversion-perlbindings (it takes a while as the installer downloads the internet)
  • run sudo port install git-core +svn (don't forget +svn or you will have to uninstall git-core and reinstall it)

You are ready to use the git svn command. git-svn does not work but git svn does: that's because git-svn is not in your PATH. If you want to make git-svn work, add /opt/local/libexec/git-core to it (thanks Randall for the tip).
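
A possible way to set that up is sketched below. Rather than hard-coding the MacPorts directory, it asks Git for its own helper directory with git --exec-path. This is a swapped-in variant of the tip above, and whether git-svn is found afterwards depends on git-core having been installed with +svn.

```shell
#!/bin/sh
# Directory where this Git installation keeps its helper programs
# (git-svn lives there when Git was built with SVN support).
exec_path=$(git --exec-path)

# Prepend it to PATH for the current shell session.
PATH="$exec_path:$PATH"
export PATH

echo "Git helpers are looked up in: $exec_path"
```

To make the change permanent, the same lines can go in your shell startup file (for example ~/.profile).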

From there, have a look at http://viget.com/extend/effectively-using-git-with-subversion for a quick tutorial.


Name: Emmanuel Bernard
Bio tags: French, Open Source actor, Hibernate, (No)SQL, JCP, JBoss, Snowboard, Economy
Employer: JBoss by Red Hat
Resume: LinkedIn
Team blog: in.relation.to
Personal blog: No relation to
Microblog: Twitter, Google+
Geoloc: Paris, France
