This page contains a very short, partial summary of chapters 1, 2 and 3 of the Pro Git book written by Scott Chacon (CC BY-NC-SA 3.0). This is not a standalone course. It should be considered as “speech notes” for the in-class lecture.
Diff: a command-line tool that shows, line by line, the differences between two text files.
Repository: more or less a whole database of files and their history.
Working copy/directory: a local set of files and folders (a project) that contains, among other things, files tracked by a version control system. Working copies are used to create new snapshots (new versions) in the version control system.
Snapshot: instantaneous state of the files in a filesystem (project / folder / set of files).
Commit: a set of changes applied to the repository that constitutes an atomic new version in the versioning system.
Branch: a set of commits that diverges from the original flow of commits.
About Version Control
Different types of version control systems have evolved.
Local Version Control System (VCS)
- Manual local copies of individual files.
+Easy to use.
-No team work.
- tools: rcs
Centralized Version Control System (CVCS)
- A single central server contains all the versioned files (the repository). Users only hold working copies.
+Allows collaboration and team work.
-One single point of failure.
-Most of the operations require a network connection (slow).
-Branches are a pain to deal with.
- tools: CVS, Subversion, …
Distributed Version Control System (DVCS)
- Each collaborator/client contains the whole set of versioned files (the whole repository) with history, branches and all.
- After a server crash, any user’s repository can be copied back to the server.
+No single point of failure.
+Most operations are local. No network connection needed (fast).
+Thousands of branches can be gracefully handled.
- tools: Git, Mercurial, Bazaar or Darcs
Snapshot, not differences
Most version control systems store their data as file-based changes: each revision is composed of a list of differences for each modified file.
In Git, revisions are snapshots. Each one is like a miniature filesystem: only modified files are stored again, the unchanged ones are links to the version already stored.
Mostly Local Operations
- Getting a previous version of a file, or doing a diff between two versions, can be done locally without any connection to the server.
- Clients can commit while offline/off-VPN because the whole repository is available locally.
- This means most operations are fast!
- Since there is no single point of synchronization (no central server) revisions can’t be numbered sequentially.
- Revisions are identified with a check-sum based on the snapshot contents.
- Git uses the SHA-1 hash algorithm to produce (most certainly) unique ids.
- A SHA-1 hash is a string of 40 hexadecimal characters (16^40, that is 2^160, possible combinations).
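As a minimal sketch of how such an identifier can be derived from contents, the commands below hash a small text the way Git does for a file: a header `blob <size>`, a NUL byte, then the content, all passed through SHA-1. It assumes the `sha1sum` utility (GNU coreutils) is installed; the sample text is arbitrary.

```shell
# Content to identify: 13 bytes ("test content" plus a newline).
content='test content'

# For a file, Git hashes "blob <size>\0<content>"; sha1sum stands in for Git's own SHA-1 here.
id=$(printf 'blob 13\0%s\n' "$content" | sha1sum | cut -d' ' -f1)

echo "$id"       # a hexadecimal string
echo "${#id}"    # its length: 40
```

If Git is installed, `printf 'test content\n' | git hash-object --stdin` should print the same identifier.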
Git Generally Only Adds Data
By default we never remove anything from git.
+ Nothing ever gets lost. Even if a file is removed in a new revision, it still exists in the history.
- Users need to be careful about what they commit. We don’t version binaries, compiled files, logs…
The Three States
3 possible states for a file :
- committed : the file in its current version is stored in the repository.
- modified : some modifications have occurred on the file and are not yet committed.
- staged : a modified file has been selected to be part of the next commit snapshot. This state is important: it lets you choose which files to commit.
As a consequence there are 3 sections (locations) in a git project :
- the working directory;
- the staging area
- the repository (git directory)
- On Ubuntu
sudo apt-get install git
- On Windows
- Many possibilities…
One tool to configure git :
git config --list
3 configuration levels:
- the system level :
- the user level : --global option, modifies ‘~/.gitconfig’
- the repository level : no option, modifies
$ git config --global user.name "John Doe"
$ git config --global user.email email@example.com
git config --global core.editor emacs
Any of kdiff3, tkdiff, meld, xxdiff, emerge, vimdiff, gvimdiff, ecmerge, and opendiff
$ git config --global merge.tool vimdiff
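The three configuration levels can be sketched as follows. In Git, `--system` targets the system-wide file (typically `/etc/gitconfig`), `--global` targets `~/.gitconfig`, and no option (or `--local`) targets `.git/config` inside the repository, which takes precedence over the other two. To leave your real settings untouched, this sketch only writes at the repository level, in a throwaway repository; the name "Jane Roe" is a made-up example.

```shell
# Throwaway repository so that the user's real configuration is not modified.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Repository level: no option needed, the value is written to .git/config.
git config user.name "Jane Roe"

# Reading a key returns the most specific value, here the repository-level one.
name=$(git config --get user.name)
echo "$name"

# The setting is stored in the repository's own configuration file.
grep 'name = Jane Roe' .git/config
```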
$ git help <verb>
$ git <verb> --help
$ man git-<verb>
Initializing a new Repository
Usually you already have a folder with a project in it: source files and so on.
git init initializes a new repository.
cd myProject
git init
git add "*.java"
git commit -m "Initial commit for myProject"
Clone an existing project
git clone copies a complete repository into a local directory.
Warning : other VCS like Subversion use checkout instead.
- With checkout you only get a branch (a working directory).
- clone gets the whole repository (history, branches, tags…)
$ git clone https://github.com/pigne/CountDownWebApp.git
Various transfer protocols are possible, depending on the server.
Life cycle of a file
A file can be tracked or untracked. If it is tracked, it is either unmodified, modified, or staged.
Checking the Status of Files
The main tool for checking the status of files is git status.
$ git status
# On branch master
nothing to commit (working directory clean)
More details with git help status.
Start tracking a file
- Start editing a file… Say AUTHORS
- Check the status with git status
$ git status
On branch master
Initial commit
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        AUTHORS
nothing added to commit but untracked files present (use "git add" to track)
- Start tracking with git add
$ git add AUTHORS
$ git status
On branch master
Initial commit
Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   AUTHORS
Staging a modified file
When an already tracked file is modified we see it with git status.
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)
        modified:   src/main/java/org/pigne/CountDown.java
no changes added to commit (use "git add" and/or "git commit -a")
Staging a file for commit is also done with git add.
$ git add src/main/java/org/pigne/CountDown.java
$ git status
# ...
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)
        modified:   src/main/java/org/pigne/CountDown.java
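The life cycle described above (untracked, staged, unmodified, modified, staged again) can be replayed in a throwaway repository. This is only a sketch: it uses `git status --porcelain`, the short machine-readable form of `git status`, and a made-up identity configured at the repository level so that the commit works on any machine.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Roe"            # hypothetical identity, this repository only
git config user.email jane@example.com

echo "Jane Roe" > AUTHORS
s_untracked=$(git status --porcelain)      # "?? AUTHORS": untracked

git add AUTHORS
s_staged=$(git status --porcelain)         # "A  AUTHORS": new file, staged

git commit -q -m "Add AUTHORS"
s_clean=$(git status --porcelain)          # empty: tracked and unmodified

echo "John Doe" >> AUTHORS
s_modified=$(git status --porcelain)       # " M AUTHORS": modified, not staged

git add AUTHORS
s_restaged=$(git status --porcelain)       # "M  AUTHORS": modified, staged

printf '%s\n' "$s_untracked" "$s_staged" "$s_modified" "$s_restaged"
```

In the two-letter code, the first column describes the staging area and the second one the working directory.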
Some files we don’t want to store in the repository:
- compiled files (*.class, *.o, *.so, Eclipse’s
- dependencies and external libraries,
- archive/deployment files (*.tar, *.jar, *.zip),
- log and backup files (*.log, *~),
- local IDE / Editor configuration (Eclipse’s
The .gitignore file contains glob patterns (not regular expressions), one per line, that match the files to ignore :
target/
.project
.settings/
.classpath
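A quick way to check what a .gitignore file does is `git check-ignore`, which succeeds (exit status 0) only when the given path is ignored. The sketch below uses two patterns and made-up file names in a throwaway repository.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Glob patterns, one per line: a whole directory and every *.class file.
printf 'target/\n*.class\n' > .gitignore

mkdir target
touch target/app.jar Main.class Main.java    # hypothetical project files

git check-ignore -q Main.class     && echo "Main.class is ignored"
git check-ignore -q target/app.jar && echo "target/app.jar is ignored"
git check-ignore -q Main.java      || echo "Main.java is not ignored"

# Ignored files no longer show up as untracked.
git status --porcelain
```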
Staged and Unstaged changes
git diff : line-by-line diff between files.
git diff with no arguments to see what is changed but not yet staged.
git diff --staged to see what is staged.
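The difference between the two forms can be sketched with two files, one staged and one not. The file names and the identity are made up for the example; `--name-only` just shortens the output to file names.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Roe"          # hypothetical identity, this repository only
git config user.email jane@example.com

printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

printf 'two\n' >> a.txt
git add a.txt                            # a.txt: modified and staged
printf 'two\n' >> b.txt                  # b.txt: modified, not staged

unstaged=$(git diff --name-only)             # b.txt
staged=$(git diff --staged --name-only)      # a.txt
echo "changed but not staged: $unstaged"
echo "staged for next commit: $staged"
```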
Committing Your Changes
git commit takes a snapshot of the set of staged files and adds it to the repository.
git commit with no argument will open the default text editor, waiting for a commit message.
git commit -m "My commit message that details what happens here..." to give an inline commit message.
Removing a file
- Removing a file from both the repository and the filesystem
git rm readme.txt
- Removing a file from the repository but keeping it in the filesystem (stop tracking the file):
git rm --cached readme.txt
Like file removals, file moves also need to be committed.
$ git mv file_from file_to
file_to is staged for the next commit.
Viewing the Commits History
$ git log
$ git help log
Graphical tools give a better view of the Git history:
- Linux : git-cola, gitg, git gui, qgit, SmartGit
- Mac / Windows : GitHub GUI
- IDEs usually have good Git support.
“You did not commit but you messed around with the working copy and the stage!”
- Unstage a file that was not already in the repository:
$ git rm --cached <file>
- Unstage a file that already exists in the repository:
$ git reset HEAD <file>
- Forget modifications and revert the file to its last committed version:
$ git checkout HEAD -- <file>
Moving through files states
“So you committed but it does not compile? Or maybe you committed a “.class” file? This is bad…”
- Undo last commit but keep your modifications in the working directory :
$ git reset HEAD~1
- You are sure you really don’t want to keep anything from this commit:
$ git reset --hard HEAD~1
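The two ways of undoing a commit can be compared in a throwaway repository. This is a sketch with a made-up file and identity: the first reset keeps the modification in the working directory, the second one discards it.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Roe"          # hypothetical identity, this repository only
git config user.email jane@example.com

echo "first version" > notes.txt
git add notes.txt
git commit -q -m "First version"
echo "second version, a bit longer" > notes.txt
git commit -q -a -m "Second version"

# Undo the last commit but keep the modification in the working directory.
git reset -q HEAD~1
after_mixed=$(cat notes.txt)                 # still the second version
commits_left=$(git rev-list --count HEAD)    # 1

# Commit again, then drop the commit and its changes for good.
git commit -q -a -m "Second version, again"
git reset -q --hard HEAD~1
after_hard=$(cat notes.txt)                  # back to the first version

echo "$after_mixed / $after_hard"
```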
In order to collaborate, common repositories have to be accessible from servers (remotes). Clients push and pull data between their own repository and other repositories.
The git remote command helps manage those repositories.
$ git clone https://github.com/pigne/CountDownWebApp.git
$ cd CountDownWebApp
$ git remote -v
origin  https://github.com/pigne/CountDownWebApp.git (fetch)
origin  https://github.com/pigne/CountDownWebApp.git (push)
Add a remote with
git remote add [shortname] [url]
Fetch and Pull from Remotes
git fetch [remote-name] will retrieve any new material from the given remote into the local repository.
Fetching a remote only affects the local repository. It does not change the working directory.
In order to mix the changes from the remote into the local branch, we need to merge them, which is what git pull does.
By default (after a git clone), the local “master” branch is set up to track the remote’s “master” branch.
git pull will do several steps :
- fetch from a remote
- merge the remote’s master into the local master. This might affect the working directory.
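The difference between fetching and merging can be observed without any network, using a bare repository on the local disk as a stand-in for the server. Everything here is an assumption made for the sketch: the collaborators "alice" and "bob", the paths, and the README file. The branch name is read from Git rather than hard-coded, since the default branch may not be called master on every installation.

```shell
base=$(mktemp -d)

# Alice starts the project.
git init -q "$base/alice"
cd "$base/alice"
git config user.name "Alice"                 # hypothetical identity, this repository only
git config user.email alice@example.com
echo "first line" > README
git add README
git commit -q -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)      # "master" in these notes

# A bare copy plays the role of the server; Alice registers it as "origin".
git clone -q --bare "$base/alice" "$base/server.git"
git remote add origin "$base/server.git"

# Bob clones the server, then Alice publishes a new commit.
git clone -q "$base/server.git" "$base/bob"
echo "second line" >> README
git commit -q -a -m "Second commit"
git push -q origin "$branch"

# Bob fetches: his repository knows the new commit, his files are unchanged.
cd "$base/bob"
git fetch -q origin
lines_after_fetch=$(grep -c . README)        # 1

# Merging the remote branch updates the working directory (git pull does both steps).
git merge -q "origin/$branch"
lines_after_merge=$(grep -c . README)        # 2
echo "$lines_after_fetch line after fetch, $lines_after_merge lines after merge"
```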
Push to the Remote
Once local changes have been committed to the local repository, we publish them to the remote server with
git push [remote-name] [branch-name]
Usually the remote is called “origin” and the default branch is “master”
$ git push origin master
- Inspect a remote with git remote show [remote-name].
- Rename a remote with git remote rename [old-name] [new-name].
- Remove a remote with git remote remove [remote-name].
Tags attach a name and, optionally, textual information to a commit. Mostly useful for software releases.
- List all tags with git tag.
- Annotated tag: a full tag object (tagger, date, message) is created and stored in the repository.
$ git tag -a v1.4 -m 'my version 1.4'
- Lightweight tag: no other information is kept, only a file with the hash of the tagged commit is stored.
$ git tag v1.4-lw
Tagging an older revision
$ git tag -a v1.2 -m 'version 1.2' 9fceb02
By default git push does not propagate tags; we have to ask for it explicitly:
$ git push origin v1.5
$ # or
$ git push origin --tags
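What separates a tag created with `-a` from one created without it can be checked with `git cat-file -t`, which prints the type of the object a name refers to. The sketch below runs in a throwaway repository with a made-up identity and file.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Roe"              # hypothetical identity, this repository only
git config user.email jane@example.com
echo "hello" > README
git add README
git commit -q -m "Initial commit"

git tag -a v1.4 -m 'my version 1.4'          # with -a: a tag object is created
git tag v1.4-lw                              # without -a: just a name for the commit

type_annotated=$(git cat-file -t v1.4)       # tag
type_lightweight=$(git cat-file -t v1.4-lw)  # commit
echo "$type_annotated / $type_lightweight"

# Both names lead to the same commit in the end.
c1=$(git rev-parse 'v1.4^{commit}')
c2=$(git rev-parse 'v1.4-lw^{commit}')
```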
Branching is the best advantage of Git over the other VCS and DVCS: branches are fast to create and easy to merge.
Commits’ Network Representation
Commits in Git are identified by a unique checksum and hold links to their parent commits: none for the first commit, one for an ordinary commit, two or more for a merge.
Branches are Pointers on Commits
A branch is a simple pointer to one of the commits in the repository.
- On the first commit of a repository, a default branch is created : master.
- Each time a new branch is created (git branch cli_branch), a new pointer is created.
HEAD is a special pointer that indicates which branch the working directory is on.
Working in Branches
git checkout <branch-name> moves HEAD to the given branch and puts the snapshot of that commit into the working directory.
- A commit on a branch moves that branch to the new commit.
- A commit on a branch does not change the other branches.
- The special pointer HEAD moves with the current branch when commits occur on it.
$ git checkout cli_branch   # moves HEAD
$ # modify some_file
$ git add some_file
$ git commit                # moves cli_branch & HEAD
Committing from various branches makes the history of the repository diverge.
$ git checkout master       # moves HEAD
$ # modify some_file
$ git add some_file
$ git commit                # moves master & HEAD
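That branches are plain pointers can be verified by printing the commit each one refers to. The sketch below is one way to do it in a throwaway repository: the identity is made up, and the name of the initial branch is read from Git because it may not be master on every installation.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Roe"                  # hypothetical identity, this repository only
git config user.email jane@example.com
echo "v1" > some_file
git add some_file
git commit -q -m "Initial commit"
main_branch=$(git symbolic-ref --short HEAD)     # "master" in these notes
start=$(git rev-parse HEAD)

git branch cli_branch                            # new pointer on the same commit
[ "$(git rev-parse cli_branch)" = "$start" ] && echo "cli_branch points to the same commit"

git checkout -q cli_branch                       # moves HEAD
echo "cli work" >> some_file
git commit -q -a -m "Work on cli_branch"         # moves cli_branch & HEAD

head_branch=$(git symbolic-ref --short HEAD)                 # cli_branch
ahead=$(git rev-list --count "$main_branch..cli_branch")     # 1 commit ahead
main_now=$(git rev-parse "$main_branch")                     # unchanged
echo "HEAD is on $head_branch, $ahead commit ahead of $main_branch"
```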
As soon as concurrent modifications are finished, we want to reintegrate branches with git merge.
- To merge branch B into branch A, one has to call merge from branch A.
- Merging 2 diverged branches will create a new commit with 2 (or more) parents.
- Conflicts may happen.
$ git checkout master       ## be sure we are in destination branch
$ git merge cli_branch
$ git branch -d cli_branch  ## remove cli_branch as it is useless
In case of conflict, the merge commit is not created and the problematic files are marked as unmerged.
$ git status
index.html: needs merge
# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       unmerged:   index.html
#
- Use git mergetool to call one of the merging tools available (kdiff3, tkdiff, xxdiff, meld, gvimdiff, opendiff, emerge, vimdiff).
- Or edit the unmerged files.
<<<<<<< HEAD:index.html
<div id="footer">contact : firstname.lastname@example.org</div>
=======
<div id="footer">
  please contact us at email@example.com
</div>
>>>>>>> iss53:index.html
- At the end re-stage files and commit.
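The whole conflict scenario can be reproduced in a throwaway repository. This is only a sketch with made-up file contents and identity: both branches change the same line, the merge stops, the file is fixed by hand (here simply overwritten), then re-staged and committed.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Roe"                  # hypothetical identity, this repository only
git config user.email jane@example.com
echo "contact: nobody" > index.html
git add index.html
git commit -q -m "Initial commit"
main_branch=$(git symbolic-ref --short HEAD)     # "master" in these notes

git checkout -q -b cli_branch
echo "contact: support" > index.html
git commit -q -a -m "Change on cli_branch"

git checkout -q "$main_branch"
echo "contact: sales team" > index.html
git commit -q -a -m "Change on the main branch"

# The merge stops with a non-zero exit status and no commit is created.
git merge cli_branch || echo "conflict detected"
unmerged=$(git diff --name-only --diff-filter=U)    # index.html

# Resolve by editing the file, then re-stage and commit.
echo "contact: support and sales" > index.html
git add index.html
git commit -q -m "Merge cli_branch"

# The merge commit has two parents; HEAD^2 is the tip of the merged branch.
second_parent=$(git rev-parse HEAD^2)
```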
Remote branches are references to the state of branches on your remote repositories. They’re local branches that you can’t move yourself; they are moved automatically whenever you communicate with the remote.
- A remote branch is denoted as
git clone will set local
- When you commit on master but do not “push”, then you are “ahead”.
- We can push a local branch to a remote:
$ git push origin cli_branch
- We can create a new local branch that “tracks” a remote one:
$ git fetch origin
$ git checkout -b bogus123 origin/fix_bug_123
git push from the branch bogus123 will refer to remote