Category Archives: git

The neatest feature of #git I didn’t know was there

I have been using git for several years now. But like any other tool, you don’t REALLY know all its power without using it in the craziest situations. I’ve just discovered a core aspect of git that totally drives me nuts given what it can do!!!

Git is a distributed version control system, right? Those of us using it with github are probably well aware of pull requests, making local commits, and doing various day-to-day operations. But something I didn’t understand until I had to process a pull request from the Spring community was exactly WHAT a git commit was.

Git commits are hashed. They don’t use counters, because commit identifiers must be unique across every machine, and you can’t depend on counters staying in step between independent repositories. Instead, the idea is that people can develop separately, in another repository, and then submit pull requests to make a contribution. I as the developer can then merge your pull request. Sounds great, right? But did you just skim over that phrase, “merge your pull request”?

What are we doing when we “merge your pull request”? It typically means we are pulling changes from your forked clone to mine, but once you realize that there is zero requirement for a relationship between your fork and mine, you discover that commits can be pulled into ANY repo and branch.

For Spring’s getting started guides, we have a central repo where people can write entirely new guides, and submit them as pull requests. But we don’t merge them there. Instead, we create a whole new repo and pull in the commits there. And we do that after you have made your contribution. If you look at those commits, it will appear as if everything was created, developed, and published there. But it wasn’t.

For any guide, the prospective author first creates a draft. Authors are encouraged to fork the central repo and start making their edits there. At that point, everything can be pushed up to a branch, and the author crafts a pull request. From there, we could merge back to the original master, but we don’t want to. Instead, we want each guide in its own repo so it can be managed separately. This lets us take all of the author’s handiwork and put it in that separate repo.

Basically, commits are self-contained. If you develop something on one branch, and I develop on another branch, it’s possible to merge both of these efforts into a third branch in a totally separate repository. And it doesn’t matter if one is a forked clone of the other. Clean things up, and we can turn this all into a pull request back to the original master. Or it can go elsewhere.
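
To make that concrete, here is a minimal sketch of pulling commits from one repository into a completely unrelated one. Everything in it is made up for illustration: the directory names, the file names, and the identities. It is not the actual process used for the Spring guides. It also assumes Git 2.9 or newer, where merging two histories with no common ancestor has to be requested explicitly with --allow-unrelated-histories.

```shell
#!/bin/sh
# Sketch only: repository names, file names, and identities are hypothetical.
# Assumes Git 2.9+ for --allow-unrelated-histories.
set -e
export GIT_AUTHOR_NAME="Maintainer" GIT_AUTHOR_EMAIL="maintainer@example.com"
export GIT_COMMITTER_NAME="Maintainer" GIT_COMMITTER_EMAIL="maintainer@example.com"

work=$(mktemp -d)

# The contributor's repository, standing in for a fork that holds a draft guide.
git init -q "$work/contributor"
git -C "$work/contributor" symbolic-ref HEAD refs/heads/master
echo "draft guide" > "$work/contributor/README.txt"
git -C "$work/contributor" add README.txt
git -C "$work/contributor" commit -q --author="Guide Author <author@example.com>" -m "Write draft guide"

# A separate repository with its own, unrelated history.
git init -q "$work/guide"
git -C "$work/guide" symbolic-ref HEAD refs/heads/master
echo "license text" > "$work/guide/LICENSE.txt"
git -C "$work/guide" add LICENSE.txt
git -C "$work/guide" commit -q -m "Add license"

# Fetch the contributor's commits and merge them; no fork relationship is needed.
git -C "$work/guide" fetch -q "$work/contributor" master
git -C "$work/guide" merge -q --allow-unrelated-histories -m "Merge draft guide" FETCH_HEAD

# The contributor's commit arrives with the same hash and the same author.
contributor_commit=$(git -C "$work/contributor" rev-parse HEAD)
merged_author=$(git -C "$work/guide" log -1 --format=%an "$contributor_commit")
echo "$contributor_commit was written by $merged_author"
```

Because the hash is computed from the commit itself, the commit keeps its identity and its author in the new repository, which is why the history looks as if it had been written there.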

The point is that by breaking up work into commits and sharing them with others on branches, whether on your fork or not, you can keep things nicely segmented. Then it becomes easy to hammer out issues and fold things together. And the side effect is that ALL commits retain their history, status, and who wrote what. You can squash things if you wish, whereupon some of the history is flattened. But this type of flexibility is incomprehensible when using something like Subversion.

A tale of two git workflows

My company has gone all in on github as the place to host our code. It’s great! I love git and I love github. And in the past couple of months, I have taken to using hub, a git + hub tool.

So what’s the difference in workflow? The standard approach suggested by github is fork-a-repo. In a nutshell, you fork the repo, clone it to your box, and add a remote named “upstream” that points to the “official” project. The annoyance I kept running into was that I had to keep updating my own master branch to keep it in sync with the official project. Sometimes I would forget that my own master branch wasn’t up-to-date and have to back up to do that extra bookkeeping.
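
For reference, that bookkeeping looks roughly like the sketch below. It uses local directories as stand-ins for the github repositories, so the paths, file names, and identities are all assumptions; only the remote names “origin” and “upstream” come from the fork-a-repo convention.

```shell
#!/bin/sh
# Sketch only: local directories stand in for GitHub repositories.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)

# The "official" project with one commit.
git init -q "$work/official"
git -C "$work/official" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/official/app.txt"
git -C "$work/official" add app.txt
git -C "$work/official" commit -q -m "Initial commit"

# My fork (a bare copy), cloned to my box, with the official project as "upstream".
git clone -q --bare "$work/official" "$work/myfork.git"
git clone -q "$work/myfork.git" "$work/box"
git -C "$work/box" remote add upstream "$work/official"

# Meanwhile the official project moves on.
echo "v2" >> "$work/official/app.txt"
git -C "$work/official" commit -q -am "Second commit"

# The bookkeeping: bring my master, and my fork's master, back in sync.
git -C "$work/box" fetch -q upstream
git -C "$work/box" checkout -q master
git -C "$work/box" merge -q --ff-only upstream/master
git -C "$work/box" push -q origin master

official_tip=$(git -C "$work/official" rev-parse master)
fork_tip=$(git -C "$work/myfork.git" rev-parse master)
echo "official: $official_tip"
echo "fork:     $fork_tip"
```

The last four git commands are the step that is easy to forget before starting a new branch.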

But then came hub. That tool has a different workflow that is superb. You create a fork of the official project, but that is not the one you clone to your work machine. Instead, you clone the official project and then add a remote to your own fork. Create a new branch, push it to your fork, and submit a pull request back to the official repo.

By taking away the need to sync my own repo, it has lifted this burden of remembering that step. Instead, I simply perform periodic “git pull”s against the official repo, and things stay up-to-date.
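
Here is a sketch of that second flow using plain git commands, again with local directories standing in for github. The remote name “fork”, the branch name, and the file names are my own placeholders rather than anything hub prescribes, and the pull request itself is left out because it happens on the hosting side.

```shell
#!/bin/sh
# Sketch only: local bare repositories stand in for GitHub; names are hypothetical.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)

# Seed the "official" repository (bare, as a hosting service would keep it).
git init -q --bare "$work/official.git"
git -C "$work/official.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/seed/app.txt"
git -C "$work/seed" add app.txt
git -C "$work/seed" commit -q -m "Initial commit"
git -C "$work/seed" push -q "$work/official.git" master

# My fork exists on the server, but it is not what I clone.
git clone -q --bare "$work/official.git" "$work/myfork.git"

# Clone the official project and add my fork as a second remote.
git clone -q "$work/official.git" "$work/box"
git -C "$work/box" remote add fork "$work/myfork.git"

# Work on a branch and push it to the fork, ready for a pull request.
git -C "$work/box" checkout -q -b new-feature
echo "feature" > "$work/box/feature.txt"
git -C "$work/box" add feature.txt
git -C "$work/box" commit -q -m "Add feature"
git -C "$work/box" push -q fork new-feature

# Someone else lands a change in the official project.
echo "v2" >> "$work/seed/app.txt"
git -C "$work/seed" commit -q -am "Second commit"
git -C "$work/seed" push -q "$work/official.git" master

# Staying current is a plain pull against the official repo.
git -C "$work/box" checkout -q master
git -C "$work/box" pull -q --ff-only origin master

official_tip=$(git -C "$work/official.git" rev-parse master)
local_tip=$(git -C "$work/box" rev-parse master)
fork_tip=$(git -C "$work/myfork.git" rev-parse master)
```

At the end, my local master matches the official one, while the master on my fork is left wherever it was when I forked.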

The tradeoff? Visit my own fork, and you won’t see an up-to-date master branch there. But so what? I don’t build anything against my own master branch. Instead, I’m more interested in all the contributions going into the official master branch.

The confusion? Why does github (the producer of both of these workflows) suggest two flows? It’s confusing. I have already run into a co-worker who uses the classic flow rather than hub’s. I had to talk him through an issue when it was time to deploy and I didn’t have the proper commit rights.

NOTE: It has come to my attention that “git pull-request” has been deprecated. That is simply because github might be deprecating the ability to convert issues into pull requests. But after I found the issue on the repo, it turns out that if github indeed removes that feature, hub will simply replace it with another command to issue a new pull request from the command line.

What git tools do you use?

A friend of mine who is getting warmed up to using git on some projects asked me recently what tools I would recommend for running some of his own, private repositories. This isn’t about github, but instead, just using git on the home network (which isn’t too hard to do).

To be honest, I don’t use many tools beyond the command line for git. Frankly, I used to have a bunch of aliases, but discovered that moving to a new machine (often in the form of a cleanly set up VM) required me to do a lot more setup. Instead, I prefer to use what comes out-of-the-box. The only exceptions are the following two git aliases I keep in ~/.gitconfig:

[alias]
lg = log --graph --pretty="format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr)%Creset'" --abbrev-commit --date=relative
m = merge --no-ff


git lg provides the “railroad diagram” for a repository. It’s a nice visual using colored ASCII art to see the commits and merge history. git m is just a shortcut to support our general policy of not using fast forward merges when merging branches into master. If you can commit the command pattern to your fingers, you don’t even need that.
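
To see what the no-fast-forward policy buys you, here is a small sketch in a throwaway repository. The branch name, file name, and identities are placeholders. It spells out the full merge command instead of relying on the alias, and uses an uncolored graph log in place of git lg.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with hypothetical names.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "feature" >> app.txt
git commit -q -am "Add feature"

# master has not moved, so a plain merge would fast-forward and leave no trace.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature'" feature

# The merge commit has two parents; the second is the tip of the feature branch.
second_parent=$(git rev-parse HEAD^2)
feature_tip=$(git rev-parse feature)

# Roughly what "git lg" draws, minus the colors.
git log --graph --oneline
```

With --no-ff the branch shows up as its own track in the railroad diagram, even though master never diverged from it.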

I have examined gitx and sourcetree, but neither provides a whole lot of leverage above what I can harness from the command line; not like the sort of leverage a good IDE with auto code completion provides. Stuff like the UNIX find command along with “git blame” and “git bisect” are some of the handiest tools I use every day in tracking down bugs, who submitted them, and what ticket they were issued against (which requires that people include bug numbers in commit log messages).
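
As an illustration of that bug hunting, the sketch below builds a tiny history in which one commit introduces a line reading BUG, then lets “git bisect run” find it and “git blame” confirm it. The file name, the EX-n ticket numbers, and the grep check are all invented for the example; a real check would usually be a test script.

```shell
#!/bin/sh
# Sketch only: a throwaway repository; file name and ticket ids are hypothetical.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master

# Five commits; the fourth one introduces the "bug".
for n in 1 2 3 4 5; do
  if [ "$n" -eq 4 ]; then echo "BUG" >> app.txt; else echo "line $n" >> app.txt; fi
  git add app.txt
  git commit -q -m "Change $n (ticket EX-$n)"
done

first_commit=$(git rev-list --max-parents=0 HEAD)

# HEAD is known bad, the first commit is known good.
git bisect start HEAD "$first_commit" >/dev/null
# The check exits 0 (good) when BUG is absent and 1 (bad) when it is present.
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null
culprit=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

# The commit message carries the ticket number; blame points at the same commit.
culprit_subject=$(git log -1 --format=%s "$culprit")
blamed=$(git blame -l -s -L 4,4 app.txt | cut -d' ' -f1)
echo "First bad commit: $culprit_subject"
```

This only pays off if commit messages mention the ticket, which is the convention described above.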

Thank you git rebase --interactive!

Today I had the good fortune to read http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html. The article perfectly served my needs.

I’ve been working on a new feature for several days. At first, I built some buttons and new pages, and had all the functionality working. I had also committed the changes. Before pushing them to origin, I sent a diff report to one of my co-workers, and asked him to inspect my handiwork. Based on feedback, it was suggested that I put the functionality on another part of the screen. It looked nicer there, and the feedback was built using Ajax instead of a new page.

This meant that some of my pages could be thrown away. After getting the new screen parts working, I removed the old ones. In fact, I removed an edit I had made to a global CSS file. After all this was working, I made some more commits.

What I wanted to avoid was having a commit history showing me adding and then immediately removing some new files as well as adding and removing lines from certain files. Basically, I wanted to compress all this work into a single commit.

As it turns out, this is a piece of cake. git rebase -i HEAD~3 allowed me to look at the three latest commits, keep the first, and squash the other two into it, forming one commit. I did that. Next I was prompted to rewrite the commit log entry. When I realized I had forgotten to delete two files that I had added, I simply removed them and created another commit. I repeated the process by typing git rebase -i HEAD~2, and squashed it all into one commit. On inspection, everything appears to be in order.

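
For anyone who wants to try the same squash without touching a real project, here is a sketch in a throwaway repository. The file names and commit messages are invented. Since an interactive editor can’t run inside a script, it sets GIT_SEQUENCE_EDITOR to a small helper (a hypothetical squash-editor.sh written by the script itself) that makes the same edit I made by hand: keep the first “pick” and change the rest to “squash”. GIT_EDITOR=true simply accepts the combined commit message.

```shell
#!/bin/sh
# Sketch only: a throwaway repository; names and messages are hypothetical.
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Three commits that add a page, restyle something, then throw the page away.
echo "new page" > page.html
git add page.html
git commit -q -m "Add new page"
echo "style" >> app.txt
git commit -q -am "Tweak styling"
git rm -q page.html
git commit -q -m "Remove page, use Ajax instead"

# Stand-in for the interactive editor: keep the first "pick", squash the rest.
cat > "$work/squash-editor.sh" <<'EOF'
#!/bin/sh
awk 'NR > 1 && $1 == "pick" { $1 = "squash" } { print }' "$1" > "$1.tmp"
mv "$1.tmp" "$1"
EOF
chmod +x "$work/squash-editor.sh"

GIT_SEQUENCE_EDITOR="$work/squash-editor.sh" GIT_EDITOR=true git rebase -q -i HEAD~3

commit_count=$(git rev-list --count HEAD)
echo "Commits after squashing: $commit_count"
```

The three commits collapse into one, so the history holds two commits and no longer shows page.html being added and then removed.
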
I can certainly testify that Subversion never offered me anything like this, nor did Rational ClearCase.