Git: All you need to know
In this article I will discuss Git: its features, how it works, a list of common Git commands and what each one means, and some fundamental concepts you need to understand.
I will also cover how to handle some scenarios you might face in any project, and close with the Git best practices you should follow.
What is Git?
Git is a free and open source distributed version control system.
Git is the most commonly used version control system (VCS).
Git’s purpose is to keep track of projects and files as they change over time, including changes made by different users.
Many websites host projects that use Git, such as GitHub, GitLab, and Bitbucket.
Git stores information about the project’s progress in a repository. A repository contains the commits to the project and a set of references to those commits, called heads. All this information is stored in the same folder as the project, in a sub-folder called .git.
There is also a .gitignore file, which lists the files Git should ignore. Its entries can use wildcard characters.
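To make the wildcard behaviour concrete, here is a minimal sketch in a throwaway repository. The file names (app.py, debug.log, build/output.bin) and the two patterns are assumptions chosen for illustration, not part of any real project.

```shell
# Throwaway repository; every file name below is made up for illustration.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# One wildcard pattern and one directory pattern.
printf '%s\n' '*.log' 'build/' > .gitignore

mkdir build
touch app.py debug.log build/output.bin

# check-ignore echoes back only the paths that match an ignore rule.
ignored=$(git check-ignore app.py debug.log build/output.bin)
echo "$ignored"
```

Here git check-ignore should list debug.log and build/output.bin but not app.py, which is a quick way to test a pattern before committing the .gitignore file.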
How does Git work?
The heart of Git is a repository used to contain a project. A repository can be stored locally or on a website.
Throughout development, the project has several save points, called commits.
Git uses SHA-1 hashes to refer to the commits. Each unique hash points to a particular commit in the repository.
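As a small illustration, the sketch below makes one commit in a temporary repository and prints its hash in full and abbreviated form. The file name, commit message, and user identity are placeholders, and the 40-character length assumes Git's default SHA-1 object format.

```shell
# Temporary repository with a placeholder identity.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

full_hash=$(git rev-parse HEAD)           # full commit hash
short_hash=$(git rev-parse --short HEAD)  # abbreviated prefix of the same hash
echo "$full_hash"
echo "$short_hash"
```

Most commands that take a commit accept either form, as long as the abbreviated prefix is unambiguous within the repository.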
There are four fundamental elements in the Git workflow.
- workspace (working directory): This is your local directory.
- index (staging area): Holds the modified files you have added for the next commit.
- local repository (HEAD): Snapshots of files from the staging area, saved in the commit history.
- remote repository: The copy of the repository hosted on a server or website, which you push to and fetch from.
A file in your workspace can be in one of three states:
- committed: the changes to the file are safely stored in the local repository.
- modified: the file has been changed locally, but none of the changes are stored in the local repository yet.
- staged: the changes are marked to be included in the next commit.
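The three states can be observed with git status --short. The sketch below is only an illustration in a temporary repository; the file names are invented so that each one ends up in a different state.

```shell
# Temporary repository with a placeholder identity.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > committed.txt
echo one > modified.txt
git add .
git commit -q -m "Initial commit"

echo two >> modified.txt   # modified: changed in the workspace, not staged
echo new > staged.txt
git add staged.txt         # staged: marked for the next commit

# committed.txt is unchanged since the last commit, so it is not listed.
status=$(git status --short)
echo "$status"
```

In the short format the first column describes the index and the second the workspace, so a modified but unstaged file shows " M" and a newly staged file shows "A ".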
Commit files:
The first command we normally run as soon as we get access to a remote repository is git clone <repository url>. This creates a local copy of the repository in your workspace. If you are creating the repository yourself, you don’t need this step.
The second command is git add <directory path>. This moves the files from the workspace to the index, but does not commit them yet.
The third command is git commit -m "commit message". This commits all the staged files to the local repository.
The fourth command is git push. This pushes the changes from the local repository to the remote repository.
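The four steps above can be tried end to end without a hosting service. This sketch assumes a bare repository on the local disk standing in for the remote; the paths, file name, and identity are placeholders.

```shell
# A local bare repository plays the role of the remote (an assumption for the demo).
set -eu
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"
git -C "$sandbox/remote.git" symbolic-ref HEAD refs/heads/master

# 1. clone (Git warns that the repository is empty, which is expected here)
git clone -q "$sandbox/remote.git" "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is master
git config user.name "Example User"
git config user.email "user@example.com"

# 2. add: workspace -> index
echo "hello" > README.txt
git add README.txt

# 3. commit: index -> local repository
git commit -q -m "Add README"

# 4. push: local repository -> remote repository
git push -q origin master

local_head=$(git rev-parse HEAD)
remote_head=$(git -C "$sandbox/remote.git" rev-parse master)
echo "local:  $local_head"
echo "remote: $remote_head"
```

After the push, both hashes should be identical, which shows the remote now holds the same commit as the local repository.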
Fetch files:
The first command is git fetch. This downloads the changes from the remote into the local repository, but you won’t see them in your local workspace yet.
For that we need to run git merge. This brings the changes into your local workspace.
There is a shortcut: git pull fetches the remote changes into the local repository and then merges them into the local workspace in one step.
The reason to run git fetch and git merge separately is that it lets us compare the latest changes before merging them into the local workspace.
After fetching (git fetch), we can use git diff HEAD to see the difference between the files in the workspace and the last commit in the local repository. To compare against what was just fetched, name the remote-tracking branch as well, for example git diff HEAD origin/master.
Plain git diff shows only the workspace changes that are not yet staged. In other words, it tells you what you could still add to the stage for the next commit.
You can use this as a sanity check before committing, to confirm that all the local changes you want to commit are actually staged.
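Here is a rough sketch of the fetch, inspect, merge sequence. It assumes a local bare repository as the remote and a second working copy playing a teammate; all paths, file contents, and the identity are illustrative.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)

# The "teammate" repository creates the first version and publishes it.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo v1 > file.txt
git add file.txt
git commit -q -m "v1"
git clone -q --bare "$sandbox/seed" "$sandbox/remote.git"

# Our own working copy.
git clone -q "$sandbox/remote.git" "$sandbox/mine"

# The teammate pushes a newer version to the remote.
echo v2 > file.txt
git commit -q -am "v2"
git push -q "$sandbox/remote.git" master

cd "$sandbox/mine"
git fetch -q origin                 # remote -> local repository only
before=$(cat file.txt)              # the workspace still has the old content
incoming=$(git diff --name-only HEAD origin/master)  # files the merge will touch
git merge -q origin/master          # local repository -> workspace
after=$(cat file.txt)
echo "before=$before incoming=$incoming after=$after"
```

Because the local branch had no commits of its own, the merge here is a simple fast-forward; git pull would have performed the same fetch and merge in one step, without the chance to inspect the incoming changes first.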
Getting started
To get started with Git, you need to download it to your machine. Head over to https://git-scm.com/ and download the version most compatible with your system.
During installation, make sure you choose the option to run Git from the normal console window as well. This lets you run Git from your command prompt using the git command.
Why Branching is important
Branching allows developers to work on a copy of the original code to fix bugs or develop new features. By working on a branch, developers don’t affect the master branch until they are ready to merge their changes.
Branching creates an isolated environment to try out the new features, and if you like them, you can merge them into the master branch. If something goes wrong, you can delete the branch, and the master branch remains untouched.
Branching allows everyone to work on their part of the code simultaneously.
Git Rebase vs Merge
Consider what happens when you start working on a new feature in a dedicated branch, then another team member updates the master branch with new commits.
This results in a forked history.
To incorporate the new commits into your feature branch, you have two options: merging or rebasing.
merge:
This creates a new “merge commit” in the feature branch that ties together the histories of both branches.
MERGE preserves history.
rebase:
This moves the entire feature branch to begin on the tip of the master branch, effectively incorporating all the new commits in master.
But, instead of using a merge commit, rebasing re-writes the project history by creating brand new commits for each commit in the original branch.
REBASE rewrites the history.
The benefit of rebasing is that you get a much cleaner, linear project history.
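The difference is easier to see side by side. The following sketch builds a small forked history in a temporary repository, then merges on one copy of the feature branch and rebases on another. The branch names feature-merged and feature-rebased, the file names, and the identity are invented for the demonstration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Forked history: one commit on feature, one on master.
echo base > base.txt;       git add .; git commit -q -m "base"
git checkout -q -b feature
echo f > feature.txt;       git add .; git commit -q -m "feature work"
git checkout -q master
echo m > master.txt;        git add .; git commit -q -m "master work"

# Option 1: merge master into a copy of the feature branch.
git checkout -q -b feature-merged feature
git merge -q --no-edit master
merges_after_merge=$(git rev-list --merges --count master..feature-merged)

# Option 2: rebase another copy of the feature branch onto master.
git checkout -q -b feature-rebased feature
git rebase -q master
merges_after_rebase=$(git rev-list --merges --count master..feature-rebased)

echo "merge commits after merge:  $merges_after_merge"
echo "merge commits after rebase: $merges_after_rebase"
git log --oneline --graph feature-merged feature-rebased
```

The merged copy gains one merge commit, while the rebased copy stays linear but its "feature work" commit has a different hash from the original on feature, which is what "rewrites the history" means in practice.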
When to use Rebase and Merge
merge:
Let’s say you have created a branch for the purpose of developing a single feature.
When you want to bring those changes back to master, you probably want merge (you don’t care about maintaining all the interim commits).
rebase:
A second scenario would be if you started doing some development and then another developer made an unrelated change.
You probably want to pull and then rebase, so that your changes are based on the current version in the repo.
Example:
- make new changes in the code
git checkout <my_feature_branch>
git add <changed_files>
git commit -m "message"
- update your local master from the remote master
git checkout master
git pull
- rewind HEAD and replay your work on top of master (this also shows whether you have any conflicts)
git checkout <my_feature_branch>
git rebase master
- place the commits of my_feature_branch on the master branch (fast-forward merge)
git checkout master
git merge <my_feature_branch>
git push
Git stash and its importance
git stash saves the uncommitted changes on your branch in the local repository and gives you back a clean working directory. You can reapply those changes later.
A stash can be carried over from branch to branch.
Let’s say you have changed the 2nd line of a file in your local branch, and at the same time someone else has changed the same line of the same file in the master branch.
You can stash your changes and then run git rebase to get the master branch changes.
To get your local changes back, run git stash pop <stash_name_from_stash_list> (for example stash@{0}). This will report a merge conflict, so keep the changes you want and commit the result. Note that when pop hits a conflict, the entry stays in the stash list until you drop it.
stash commands:
git stash save: saves the changes in the stash (LIFO order) and removes them from the working directory. Newer Git versions deprecate this form in favor of git stash push.
git stash list: shows the list of stashed changes.
git stash apply stash@{n}: applies the saved changes and keeps them in the stash list.
git stash pop stash@{n}: applies the saved changes and removes them from the stash list.
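A short sketch of the stash round trip follows. It uses git stash push, the newer spelling of git stash save, so it assumes a reasonably recent Git; the file content, stash message, and identity are placeholders.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "line 1" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Uncommitted work that we want out of the way for a moment.
echo "work in progress" >> file.txt
git stash push -q -m "wip: second line"

clean=$(git status --short)   # empty: the working directory is clean again
git stash list                # one entry, stash@{0}

# Bring the work back and remove it from the stash list.
git stash pop -q
restored=$(tail -n 1 file.txt)
remaining=$(git stash list)
echo "restored: $restored"
```

Using git stash apply instead of pop in the last step would restore the same change but leave the entry in the list.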
Git Commands
Create:
- Clone an existing repository
git clone <http url/ssh link>
- Create a new local repository
git init
Local Changes:
- Changed files in your working directory
git status
- List the commit history of the current branch
git log
- Changes to tracked files
git diff
- See the difference between the last commit and the working directory
git diff HEAD
- Add all current changes to the next commit
git add .
- Add some changes in <file> to the next commit
git add -p <file>
- Commit all local changes in tracked files
git commit -a
- Commit previously staged changes
git commit
- Change the last commit (Don’t amend published commits!)
git commit --amend
Branches and Tags:
- List all existing branches
git branch -av
- Switch HEAD branch
git checkout <branch>
- Create a new branch based on your current HEAD
git branch <new-branch>
- Create a new tracking branch based on a remote branch
git checkout --track <remote/branch>
- Delete a local branch
git branch -d <branch>
- Rename a branch you are currently working in
git branch -m <new-branch-name>
- Mark the current commit with a tag
git tag <tag-name>
Update and Publish:
- List all currently configured remotes
git remote -v
- Show information about a remote
git remote show <remote>
- Add new remote repository, named <shortname>
git remote add <shortname> <url>
- Download all changes from <remote>, but don’t integrate into HEAD
git fetch <remote>
- Download changes and directly merge/integrate into HEAD
git pull <remote> <branch>
- Publish local changes on a remote
git push <remote> <branch>
- Delete a branch on the remote
git push <remote> --delete <branch>
- Delete only the local remote-tracking branch
git branch -dr <remote/branch>
- Publish your tags
git push --tags
Merge and Rebase:
- Merge <branch> into your current HEAD
git merge <branch>
- Rebase your current HEAD onto <branch> (Don’t rebase published commits!)
git rebase <branch>
- Abort a rebase
git rebase --abort
- Continue a rebase after resolving conflicts
git rebase --continue
- Use your configured merge tool to solve conflicts
git mergetool
- Use your editor to manually solve conflicts and (after resolving) mark file as resolved
git add <resolved-file>
git rm <resolved-file>
Undo:
- Discard all local changes in your working directory
git reset --hard HEAD
- Discard local changes in a specific file
git checkout HEAD <file>
- Revert a commit (by producing a new commit with contrary changes)
git revert <commit>
- Reset your HEAD pointer to a previous commit
…and discard all changes since then
git reset --hard <commit>
…and preserve all changes as unstaged changes
git reset <commit>
…and preserve uncommitted local changes
git reset --keep <commit>
Git Configuration
- Attach an author name to all commits that will appear in the version history
git config --global user.name "name"
- Attach an email address to all commits by the current user
git config --global user.email "email-address"
- Apply Git’s automatic command line coloring which helps you keep track and revise repository changes
git config --global color.ui auto
- Create a shortcut (alias) for a Git command
git config --global alias.<alias-name> <git-command>
Making Changes
- Stage changes for the next commit
git add <file/directory>
- Stage everything in the directory for an initial commit
git add .
- Commit staged snapshots in the version history with a descriptive message included in the command
git commit -m "commit message"
Rewriting History:
- Replace the last commit with a combination of the staged changes and the last commit
git commit --amend
- Rebase the current branch with the specified base (it can be a branch name, tag, reference to a HEAD, or a commit ID)
git rebase <base>
- List changes made to the HEAD of the local repository
git reflog
Handle Scenarios
Below are some scenarios you may face while working on a project in a medium-sized team, and how to resolve them.
- The scenario where you changed a file and want to undo the change -
** This works for a file you have already added and committed **
git checkout <FILE_NAME>
git status (Working directory is clean.)
git diff (No difference.)
- The scenario where you want to modify the commit message after making a commit -
git commit --amend -m "NEW_MESSAGE" (This replaces the last commit with a new one, so the commit hash changes.)
git log (This will show the modified commit.)
- The scenario where, after making a commit (let’s say ‘A’), you want to add a new file and make it part of that latest commit -
git add <FILE_NAME>
git commit --amend
git log --stat (This will show the new file in that commit. The commit hash changes here too, because amending always creates a new commit.)
- The scenario where you want to undo a commit that other team members have already pulled -
git revert <HASHCODE> (This creates a new commit that reverses the changes of the given commit. <HASHCODE> is the hash of the commit you want to undo.)
- The scenario where you made a commit on the wrong branch (let’s say ‘master’) and want to move it to the right branch -
git checkout master
git log (To grab the hash of the last commit)
git checkout <BRANCH> (Switch to the desired branch)
git cherry-pick <HASHCODE> (This brings that commit onto the desired branch)
git checkout master
** Now remove the commit from master. Choose one of the commands below, depending on whether you want to keep the changes. In each, <HASHCODE> is the commit on master just before the wrong one. **
git reset --soft <HASHCODE> (--soft keeps the changes in the staging area.)
git reset --hard <HASHCODE> (--hard discards the changes to tracked files.)
git reset <HASHCODE> (The default mode keeps the changes in the working directory.)
cherry-pick: creates a new commit on the current branch based on the given commit.
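The wrong-branch scenario can be rehearsed safely in a temporary repository. In this sketch the branch name feature, the file names, and the identity are placeholders, and the --hard variant is used to drop the commit from master.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"
git branch feature

# Oops: this commit was meant for feature but lands on master.
echo fix > fix.txt
git add fix.txt
git commit -q -m "Fix meant for feature"
wrong=$(git rev-parse HEAD)

# Copy the commit onto the right branch.
git checkout -q feature
git cherry-pick "$wrong" > /dev/null

# Remove it from master, discarding the changes there.
git checkout -q master
git reset -q --hard HEAD~1

git log --oneline feature
git log --oneline master
```

Afterwards the fix exists only on feature, and master is back to the commit before the mistake.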
- The scenario where you ran git reset --hard but want those changes back -
git reflog
** Grab the hash of the commit you want to recover **
git checkout <HASHCODE>
** You are now in a detached HEAD state. The commit is not on any branch and will be garbage collected at some point, so to save those changes you need to make a branch from it. **
git branch <BACKUP_BRANCH>
git branch (This shows the current state, which will look something like "HEAD detached at <HASHCODE>".)
git checkout <MASTER_BRANCH>
git checkout <BACKUP_BRANCH>
git reflog: shows the commits in the order you last referenced them.
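As a compact variant of the recovery above, this sketch creates the backup branch directly from a reflog entry instead of checking out the hash first. The branch name backup, the file names, and the identity are assumptions made for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo a > a.txt; git add a.txt; git commit -q -m "first"
echo b > b.txt; git add b.txt; git commit -q -m "second"
lost=$(git rev-parse HEAD)

# The accident: "second" is no longer reachable from any branch.
git reset -q --hard HEAD~1

# The reflog still remembers where HEAD pointed before the reset.
git reflog

# HEAD@{1} is the previous position of HEAD, i.e. the lost commit.
git branch backup 'HEAD@{1}'
recovered=$(git rev-parse backup)
echo "lost:      $lost"
echo "recovered: $recovered"
```

Once a branch points at the commit, it is reachable again and is no longer a candidate for garbage collection.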
Git Best Practices
Commit related changes -
A commit should be a wrapper for related changes. For example, fixing two different bugs should produce two separate commits. Small commits make it easier for other developers to understand the changes and roll them back if something went wrong.
Commit often -
Committing often keeps your commits small and, again, helps you commit only related changes. Moreover, it allows you to share your code more frequently with others. That way it’s easier for everyone to integrate changes regularly and avoid having merge conflicts.
Having few large commits and sharing them rarely, in contrast, makes it hard to solve conflicts.
Don’t commit half-done work -
You should only commit code when it’s completed. This doesn’t mean you have to complete a whole, large feature before committing. Split the feature’s implementation into logical chunks and remember to commit early and often.
If you’re tempted to commit just because you need a clean working copy (to check out a branch, pull in changes, etc.) consider using Git’s "Stash" feature instead.
Test code before you commit -
Test your code thoroughly to make sure it really is completed and has no side effects.
Committing half-baked things to your local repository is bad enough, but having your code tested is even more important when it comes to pushing or sharing your code with others.
Write good commit messages -
Begin your message with a short summary of your changes. The body of your message should provide detailed answers to the following questions:
- What was the motivation for the change?
- How does it differ from the previous implementation?
Use the imperative, present tense ("change", not "changed" or "changes") to be consistent with generated messages from commands like git merge.
Don’t use VCS as backup -
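Having your files backed up on a remote server is a nice side effect of having a version control system, but you should not use your VCS as if it were a backup system.
When doing version control, you should pay attention to committing semantically: each commit should capture a related set of changes rather than a dump of files.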
Use Branches -
Branches are the perfect tool to help you avoid mixing up different lines of development.
You should use branches extensively in your development workflows: for new features, bug fixes, ideas…
Prevent Merge Conflicts -
Use a new file instead of an existing one whenever possible.
Avoid adding changes at the end of the file.
Push and pull changes as often as possible.
Do not beautify code or organize imports on your own.
Avoid the solo programmer mindset by keeping in mind the other people who are working on the same code.
Git commands are useful tools in a developer’s day-to-day work.
Git has become part of the toolbox for every software developer. Its success is largely due to its power, flexibility, distributed mode of operation and ability to manage extremely complex projects with multiple developers collaborating on the same code-base.