Posted 19 Jul 2018
Licenced CPOL

A Note on GIT & Miscellaneous

This is a note on GIT & miscellaneous subjects

Introduction

This is a note on GIT & miscellaneous subjects.

Background

GIT was originally written by Linus Torvalds. GIT is a source control system, and it is also more than a source control system. The best GIT reference that I have found is this book, which I strongly recommend reading. In this note, I limit the scope to the GIT capabilities most commonly used in day-to-day work.

Install the Latest GIT

The computer that I use for this note is a "Linux Mint 18.3 Cinnamon 64-bit" VM. The latest GIT is not available in the default Ubuntu repositories that Linux Mint uses. To install the latest GIT, I need to add the "Ubuntu Git Maintainers" PPA to the repository list.

sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update

At the time this note was prepared, the latest GIT version was "2.18.0".

We can then install the latest GIT by the following command.

sudo apt-get install git

We can check the version of the GIT being used by the following command.

git --version

GIT is available on all major operating systems, and it has been pretty stable in recent versions. You should not see much difference if you use a different version or another operating system.

The First GIT Repository & ".gitignore" & ".git"

Initiate a GIT Repository

A GIT working directory is simply a directory, optionally with a file named ".gitignore" that lists the files GIT should leave untracked. In my example, I added the simplest ".gitignore", which tells GIT not to ignore any files, in particular the ".gitignore" file itself.

# Do not ignore anything

!.gitignore

To make the directory (named "git-test" in my example) into a GIT repository, we can simply issue the following command in the directory.

git init

The "git init" command turned the "git-test" directory into a GIT repository and added the ".git" directory to it.

Add Files and Commit to the Repository

After the repository is initiated, we can add some files to it.

touch A B C

In a GIT repository, we can use the "git status" command to check the status of the files.

GIT lists the new files, including the ".gitignore" file, as untracked and suggests using "git add" to track them. In order to commit the files into the repository, we need to first stage them.

For example, if we want to stage the file "A", we can issue the following command.

git add -- A

If we want to stage all the files, we can issue the following command.

git add .

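We can also use the "-A" option to stage all the files. When run from the top directory of the repository, it is equivalent to ".", which stages the new, modified, and deleted files under the current directory. If we want to stage only the modified and deleted files that GIT already tracks, and leave new files alone, we can use the "-u" option.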

git add -u

After staging the files, we can make a commit.

git commit -m "First commit"

If we check the status of the repository after the commit, GIT will tell us that there is nothing to commit and the working tree is clean.

If this is the first time you make a commit on the machine, GIT will ask you to provide your identity for book-keeping purposes. You can set your name and email in GIT with the following commands.

git config --global user.email "song.li@example.com"
git config --global user.name "Song Li"

If you want your name and email to apply only to the current repository, you can omit the "--global" option.

git config user.email "song.li@example.com"
git config user.name "Song Li"

The ".git" Directory is Everything

GIT keeps all the configurations and commits in the ".git" directory. To get a taste of what this means, we can delete all the files except the ".git" directory.

rm -f A
rm -f B
rm -f C
rm -f .gitignore

GIT can tell us that these files have been deleted.

We can recover all the files to the state of their last commit with the "git checkout" command.

git checkout -- .

If we want to take the "git-test" directory out from the GIT control, we can simply delete the ".git" directory and the "git-test" directory is no longer a GIT repository.
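The delete-and-recover experiment above can be replayed end to end with a short script. This is a sketch under the assumption that it runs in a fresh temporary directory rather than in the "git-test" directory of this note.

```shell
set -e
repo="$(mktemp -d)"                     # throwaway directory (illustrative)
cd "$repo"
git init -q
git config user.email "song.li@example.com"
git config user.name "Song Li"
printf '# Do not ignore anything\n!.gitignore\n' > .gitignore
touch A B C
git add .
git commit -q -m "First commit"

rm -f A B C .gitignore                  # remove everything except the .git directory
git status --short                      # every file is reported as deleted
git checkout -- .                       # restore the files from what GIT has recorded
ls -A                                   # the four files are back next to .git
```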

GIT Branches & "git checkout"

Create a New Branch

As with any source control system, people typically work with GIT in branches. We can find the branches in a repository with the "git branch" command.

By default, GIT names the first branch "master" when we initiate the repository (newer GIT versions allow a different default name to be configured). When we are on a branch, we can create a new branch based upon the current working branch.

git branch branch-1

Because the active branch is "master", the newly created "branch-1" points to the same commit as "master", so its initial state is exactly the same as the last commit of the "master" branch. As a shortcut, we can also create a branch with the "git checkout" command.

git checkout -b branch-2

The "git checkout -b" command creates a new branch and switches to it immediately.

Now we have created two new branches and the active working branch is the "branch-2".

The "git checkout" a Branch

In GIT terminology, switching to a branch is called checking out a branch. If we want to switch to the "branch-1", we can issue the following command.

git checkout branch-1

Delete a Branch

You may want to delete a branch sometimes. Let us first create a branch named "to-be-deleted".

git checkout -b to-be-deleted

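You cannot delete a branch while it is checked out, even with the force option. We need to switch to another branch first and then use the "-d" option to delete it. The "-d" option refuses to delete a branch whose commits have not been merged; the "-D" option forces the deletion in that case.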

git checkout master
git branch -d to-be-deleted
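As a quick check of the rule about the checked-out branch, the sketch below tries the forced deletion first and records whether GIT refused it. The temporary repository and the "refused" variable are assumptions for illustration only.

```shell
set -e
repo="$(mktemp -d)"                         # throwaway directory (illustrative)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # pin the first branch name to "master"
git config user.email "song.li@example.com"
git config user.name "Song Li"
touch A
git add .
git commit -q -m "First commit"

git checkout -q -b to-be-deleted
# Deleting the branch that is currently checked out is refused, even with -D.
if git branch -D to-be-deleted 2>/dev/null; then
  refused="no"
else
  refused="yes"
fi
echo "deletion of the checked-out branch refused: $refused"

git checkout -q master
git branch -q -d to-be-deleted              # allowed: nothing on the branch is unmerged
git branch --list
```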

Working in a GIT Branch

Although we constantly switch branches, most of our time is spent working in a particular branch. In this section, instead of focusing on committing changes in the branch, I will spend more time on how to revert them.

git checkout -b working-branch

Because we have made only one commit in the whole repository so far, the "working-branch" matches the "master" branch. We have three files, "A", "B", and "C", in it, besides the ".gitignore" file.

Now let us add one file, delete one file and modify one file in the branch.

rm -f C
touch D
echo "Modified in working-branch" >> A

In order to continue with the rest of the examples, let us stage the changes now.

git add .

Un-stage the Changes Before a Commit

After staging, GIT is ready to commit the staged changes. In this example, we deleted "C" and added "D". Because both files are empty and have identical content, GIT reports the deletion and the addition as a rename of "C" to "D", which is fine.

If we do not want to commit the deletion of "C", we can issue the following command to un-stage it.

git reset -- C

If we want to take out all the changes from staging, we can issue the following command.

git reset

After calling "git reset", all the changes are removed from staging, but the files in the working directory are left as they are.

Recover the Files to the Last Commit

For an un-staged file, we can recover it to the last committed state by the following command.

git checkout -- A

If we want to recover all the files, we can issue the following command.

git checkout -- .

We can find that "git checkout" does not delete the newly added file "D". A file that has never been staged or committed is called an untracked file in GIT. To delete the untracked files, we can use the following command.

git clean -fd

The "-f" option forces the clean, which GIT refuses to perform without it by default. The "-d" option tells GIT to clean the untracked directories as well as the untracked files. If we want to clean even the ignored files and directories, we can add the "-x" option.

git clean -fdx

Regardless of whether a change is staged or not, if you want to set all the tracked files in the branch back to the last commit, you can use the following command. It leaves untracked files in place.

git reset --hard
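The un-stage, recover, and clean steps of this section can be chained in one script to see what each of them leaves behind. This is a minimal sketch in a temporary repository; the "before_clean" and "after_clean" variables are illustrative names.

```shell
set -e
repo="$(mktemp -d)"                         # throwaway directory (illustrative)
cd "$repo"
git init -q
git config user.email "song.li@example.com"
git config user.name "Song Li"
touch A B C
git add .
git commit -q -m "First commit"

rm -f C
touch D
echo "Modified in working-branch" >> A
git add .

git reset -q                                # un-stage everything; files stay as they are
git checkout -- .                           # restore the tracked files A and C
before_clean="$(git status --porcelain)"    # only the untracked file D is left
git clean -q -fd                            # remove untracked files and directories
after_clean="$(git status --porcelain)"

echo "before clean: $before_clean"
echo "after clean:  $after_clean"
```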

Discard a Commit

If we already made a commit, GIT still gives us the opportunity to discard it.

rm -f C
touch D
echo "Modified in working-branch" >> A
git add .
git commit -m "A commit to regret"

After the commit, we can use "git log" to find all the commits on the branch.

git log

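To discard the "A commit to regret", we can reset the branch to the commit before it. The hash code in the following command is the one that "git log" shows for that previous commit.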

git reset --hard c3b2c91f00ff4e4c97ba4484592c5c0284ae198e

Instead of the hash code, we can also use "HEAD~" to refer to the previous commit.

git reset --hard HEAD~

The "HEAD" represents the most recent commit on the current branch, and "HEAD~" represents its parent commit. In the case of a merge, a commit has two parents. The "HEAD^1" is the first parent, which is the same commit as "HEAD~", and the "HEAD^2" is the second parent.
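To confirm that "HEAD~" and "HEAD^1" name the same commit, and that the hard reset really moves the branch back, a small sketch such as the following can be used. The repository is a temporary one and the variable names are assumptions for illustration.

```shell
set -e
repo="$(mktemp -d)"                         # throwaway directory (illustrative)
cd "$repo"
git init -q
git config user.email "song.li@example.com"
git config user.name "Song Li"
touch A
git add .
git commit -q -m "First commit"
first="$(git rev-parse HEAD)"

echo "Modified in working-branch" >> A
git commit -q -am "A commit to regret"

parent_tilde="$(git rev-parse HEAD~)"       # parent of the current commit
parent_caret="$(git rev-parse HEAD^1)"      # first parent, the same commit here

git reset -q --hard HEAD~                   # discard "A commit to regret"
git --no-pager log --oneline                # only "First commit" remains
```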

The "git diff"

GIT provides a nice tool, "git diff", to compare a file in the working directory with its staged state. If no change has been staged, this is the same as comparing the file with its last committed state.

git checkout working-branch
echo "AAA" > A
git commit -am "Commit No.1"
    
echo "BBB" > A
git diff -- A

If we want to compare all the files in the branch, we can use the following command.

git diff

If we want to compare with another branch, such as the "master" branch, we can use the following command.

git diff master
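The sketch below illustrates what "git diff" compares before and after staging. It assumes a temporary repository, and it uses the "--cached" option, which is not covered in this note, to compare the staged state with the last commit.

```shell
set -e
repo="$(mktemp -d)"                         # throwaway directory (illustrative)
cd "$repo"
git init -q
git config user.email "song.li@example.com"
git config user.name "Song Li"
echo "AAA" > A
git add .
git commit -q -m "Commit No.1"

echo "BBB" > A
git --no-pager diff -- A                    # working directory vs. staged state
unstaged="$(git diff --name-only)"

git add -- A
after_stage="$(git diff --name-only)"       # nothing: working directory matches staging
staged="$(git diff --cached --name-only)"   # staged state vs. last commit

echo "un-staged before add: $unstaged"
echo "staged after add:     $staged"
```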

The "git checkout" vs. "git reset"

GIT has a lot of heavily overloaded commands, which can easily confuse us. The "git checkout" and "git reset" commands can be particularly confusing because they are internally related. Without going into the internal implementation, I will only talk about how they are most commonly used.

The "git checkout"

The typical use case of "git checkout" is to check out a branch.

git checkout branch-1

If you want to create a branch and switch to it at the same time, you can use the "-b" option.

git checkout -b a-new-branch

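The "git checkout" command can also be used to undo un-staged changes. For example, the following command discards the un-staged changes to the file "A" and restores it from the staged state, which is the last committed state if nothing has been staged for the file.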

git checkout -- A

If you want to discard all the un-staged changes, you can use the following command.

git checkout -- .

The "git reset"

One of the most common use cases of "git reset" is to un-stage a file. For example, we can un-stage the file "A" by the following command.

git reset -- A

If we want to un-stage all the changes, we can issue the following command.

git reset

If we want to completely discard the most recent commit from a branch, we can issue the following command.

git reset --hard HEAD^1

According to the documentation, the "git reset" has three commonly used modes.

  • The "--soft" mode - Does not touch the index file or the working tree at all (but resets the head to <commit>, just like all modes do). This leaves all your changed files "Changes to be committed", as git status would put it;
  • The "--mixed" mode - Resets the index but not the working tree (i.e., the changed files are preserved but not marked for commit) and reports what has not been updated. This is the default action;
  • The "--hard" mode - Resets the index and working tree. Any changes to tracked files in the working tree since <commit> are discarded.

The "--mixed" mode is the default mode, which explains why we can use "git reset" to un-stage all the changes that we do not want to commit.
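The three modes can be compared by undoing the same commit in stages and checking "git status --porcelain" after each step. This is a sketch in a temporary repository; the file content "v1" and "v2" and the variable names are assumptions for illustration.

```shell
set -e
repo="$(mktemp -d)"                         # throwaway directory (illustrative)
cd "$repo"
git init -q
git config user.email "song.li@example.com"
git config user.name "Song Li"
echo "v1" > A
git add A
git commit -q -m "v1"
echo "v2" > A
git commit -q -am "v2"

git reset -q --soft HEAD~                   # head moves back; index and files keep v2
soft="$(git status --porcelain)"            # A is shown as staged

git reset -q --mixed                        # index is reset; the file still holds v2
mixed="$(git status --porcelain)"           # A is shown as modified but not staged

git reset -q --hard                         # index and file are both reset
hard="$(git status --porcelain)"            # nothing left to report

echo "soft:  [$soft]"
echo "mixed: [$mixed]"
echo "hard:  [$hard]"
cat A
```

In the porcelain format, the first status column describes the index and the second describes the working tree, which is why the "M" moves from the first column to the second between the soft and the mixed step.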

A Little Bit in the ".git" Directory

As we already know, GIT keeps all the information in the ".git" directory. Without making the effort to understand every detail of how GIT works, it is still beneficial to take a quick look into the ".git" directory.

The "config" & the "description" Files

In a GIT repository, the "config" file keeps the basic configuration information related to the repository.

For example, in our repository, it has the user name and the email that we set with "git config". The "description" file holds a free-text description of the repository, which tools such as GitWeb display. When we created the repository, we did not describe it, but we can do so by modifying the "description" file from inside the ".git" directory.

echo "git-test" > description

The "HEAD" File & the "refs" Directory

We may be curious to know how GIT knows about all the branches. Let us take a look at the "HEAD" file.

The "HEAD" file is a small file that holds a reference to the current working branch, such as "ref: refs/heads/working-branch". The branch name "working-branch" corresponds to a file in the "refs/heads" directory.

For each branch in the repository, we can find a file of the same name in the "refs/heads" directory. When we switch branches, GIT modifies the "HEAD" file to point to the new working branch. Now let us take a look at the branch file "master".

The file "master" is a small file that contains only the hash code of the commit at the tip of the branch. When we check out a branch, GIT finds the commit through the branch name and reconstructs the working directory from it. It is important to know that different branches can point to the same commit.
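The relation between the "HEAD" file and the branch files can be checked by reading them directly. The sketch below assumes a freshly created repository, where each branch is still stored as a plain file under "refs/heads"; GIT may later pack these files, so this is an illustration rather than something to rely on.

```shell
set -e
repo="$(mktemp -d)"                         # throwaway directory (illustrative)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # pin the first branch name to "master"
git config user.email "song.li@example.com"
git config user.name "Song Li"
touch A
git add .
git commit -q -m "First commit"

cat .git/HEAD                               # a reference to refs/heads/master
git checkout -q -b working-branch
head_content="$(cat .git/HEAD)"             # now a reference to the new branch
branch_hash="$(cat .git/refs/heads/working-branch)"
master_hash="$(cat .git/refs/heads/master)"

echo "$head_content"
echo "working-branch: $branch_hash"
echo "master:         $master_hash"         # same commit as working-branch
```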

The "objects" Directory & the "git cat-file"

Every commit in the GIT repository and all the versions of the files for all the commits are kept in the "objects" directory. For easy management and retrieval, GIT files newly written objects into subdirectories named after the first two characters of their hash codes.

We can use the "git cat-file" to look into the data saved in the "objects" directory. For example, we can explore the data for the commit "c3b2c91f00ff4e4c97ba4484592c5c0284ae198e", which is the HEAD of the "master" branch. 

git cat-file -p c3b2c91f00

You may notice that I did not use the full hash code, but only its first few characters. In GIT, a prefix is sufficient to identify the whole hash code as long as it is unique in the repository. The above command tells us that the commit has a tree structure represented by another hash code. Let us take a further look at this tree.

We can see all the files that we have committed in the tree. We can further take a look at the version of the ".gitignore" file in this branch.

git cat-file -p b4e54723341

It is exactly the content of the ".gitignore" file when we initiated the GIT repository. GIT keeps it in the "objects" directory for us.
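The same walk from a commit to its tree and then to a file can be scripted without copying hash codes by hand. This is a sketch in a temporary repository, so the hash codes differ from the ones shown in this note; the variable names are illustrative.

```shell
set -e
repo="$(mktemp -d)"                             # throwaway directory (illustrative)
cd "$repo"
git init -q
git config user.email "song.li@example.com"
git config user.name "Song Li"
printf '# Do not ignore anything\n!.gitignore\n' > .gitignore
touch A B C
git add .
git commit -q -m "First commit"

commit="$(git rev-parse HEAD)"
git cat-file -p "$commit"                       # shows the tree, author, and message
tree="$(git rev-parse "$commit^{tree}")"        # hash code of the tree of the commit
git cat-file -p "$tree"                         # one line per committed file
blob="$(git rev-parse "$commit:.gitignore")"    # hash code of the committed .gitignore
content="$(git cat-file -p "$blob")"
echo "$content"
```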

The "index" File & the "git ls-files"

The index file serves as the pivotal point between the commits and the working directory.

  • When we check out a branch, the index is updated to match the commit that the branch points to, and the working directory is updated to match the content;
  • When we stage a file, the index is updated for the file, so GIT knows that we have staged the file but have not committed it yet;
  • When we make a commit, a new commit is created from the information in the index and stored in the "objects" directory.

The "index" file is a binary file, so we cannot read its content directly. But we can use the "git ls-files" command to look at it.

git checkout working-branch
git ls-files --stage

We can see all the files have the same hash code except the ".gitignore" because all the files are empty at this time.

echo "Add some content" >> A
git add -- .

If we modify the file "A" and stage it, we can see that the index is updated.

The "git cat-file" can tell us the exact content that we have staged.
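The effect of staging on the index can also be observed in a script. The following sketch, in a temporary repository, compares the hash code recorded for "A" with the hash code of an empty file before and after staging; the variable names are assumptions for illustration.

```shell
set -e
repo="$(mktemp -d)"                                 # throwaway directory (illustrative)
cd "$repo"
git init -q
git config user.email "song.li@example.com"
git config user.name "Song Li"
touch A B
git add .
git commit -q -m "First commit"

empty="$(printf '' | git hash-object --stdin)"      # hash code of empty content
git ls-files --stage                                # A and B share that hash code

echo "Add some content" >> A
git add -- .
staged="$(git ls-files --stage -- A | cut -d' ' -f2)"   # hash code now recorded for A
git ls-files --stage
git cat-file -p "$staged"                           # the content that has been staged
```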

The GIT Merge & Conflicts & "--continue"

With any source control system, you will face merges sooner or later, and GIT is no exception. One of the simplest but constantly overlooked questions is "Who is whom".

Who is Whom?

Thanks to this note, the answer to the "Who is whom" question is explicit and clear. In GIT, performing a merge requires two steps.

  • Check out the branch that should receive the changes;
  • Call the "git merge" command with the name of the branch that contains the desired changes.

This clearly answers the question: it is your current working branch that is updated after a successful merge.

The Fast Forward Merge

Before making a merge, let us first take a look at the state of our branches.

git show-ref

All my branches point to the same commit. It means that all my branches have exactly the same content. Now let us make some changes in the "branch-1" and commit it.

git checkout branch-1
echo "Modified in branch-1" >> A
git add -- .
git commit -m "branch-1 is updated"

Now let us look at the information on the HEAD of the "branch-1".

git rev-parse HEAD
git cat-file -p HEAD

Whenever GIT makes a commit, it keeps a record of its parent commit. If we now want to merge "branch-1" into the "master" branch, GIT has sufficient information to complete the merge without comparing the content of the two branches.

  • If the HEAD of the receiving branch is the parent, or a more distant ancestor along the parent chain, of the HEAD of the desired branch, GIT recognizes that no change has been made in the receiving branch since the two diverged;
  • In such a case, GIT simply moves the receiving branch forward to the commit of the desired branch. In GIT terminology, it is called a "fast-forward" merge.

Now let us merge the "branch-1" into the "master" branch.

git checkout master
git merge branch-1

In cases of a fast forward merge, GIT will simply update the HEAD of the "master" branch with the HEAD of the "branch-1".

The result of the "master" branch is equivalent to the result when we issue the following command.

git reset --hard f5e3b0658afeb194340620d912e389d9c06f2cd0
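A fast-forward merge can be verified by comparing the two branch tips afterwards. The sketch below uses a temporary repository and adds the "--ff-only" option, which is not used in this note, so that the merge would fail instead of creating a merge commit if a fast-forward were not possible.

```shell
set -e
repo="$(mktemp -d)"                         # throwaway directory (illustrative)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # pin the first branch name to "master"
git config user.email "song.li@example.com"
git config user.name "Song Li"
touch A
git add .
git commit -q -m "First commit"

git checkout -q -b branch-1
echo "Modified in branch-1" >> A
git commit -q -am "branch-1 is updated"

git checkout -q master
# master is an ancestor of branch-1, so the merge only moves master forward.
git merge -q --ff-only branch-1
git show-ref                                # both branches show the same hash code
```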

The Three Way Merge

In case changes have been made in both branches, GIT needs to look into the content of each branch. Most commonly, GIT will perform a three-way merge. GIT compares the HEAD commit of each branch with their most recent common ancestor to decide whether a file has been added, deleted, or modified. GIT will declare a conflict in the following situations and leave the decisions to us.

  • If a file has been modified in both branches and the content on the same line or on adjacent lines is different;
  • If a file has been deleted in one branch but modified in the other branch;
  • If a file has been added in both branches but the content is different on the same line or on adjacent lines.

Now let us commit some changes to the "branch-1" and "branch-2" to complete a three-way merge.

git checkout branch-1
echo "Modified in branch-1" > A
rm -f B
echo "Added in branch-1" >> D
git add -- .
git commit -m "Prepare for merge with branch-2"

In "branch-1", we modified the file "A", deleted the file "B", and added the file "D" with content.

git checkout branch-2
echo "Modified in branch-2" > A
echo "B is modified" >> B
echo "Added in branch-2" >> D
git add -- .
git commit -m "Prepare for merge with branch-1"

In "branch-2", we modified the file "A", modified the file "B", and added the file "D" with different content. Now let us merge "branch-2" into "branch-1".

git checkout branch-1
git merge --no-commit branch-2

We have conflicts, and we can look at them further with "git status".

In case of a conflict, GIT changes the content of the conflicted file to help us make the decision. For example, in the file "D", GIT places the content from both branches in the file, separated by conflict markers.

Due to the conflicts, GIT is unable to make the decisions for us. We need to examine every file to decide what we want to do. If we want to keep the file "B" as modified in "branch-2" and manually resolve the conflicts on "A" and "D", we can issue the following commands.

git checkout --theirs B
echo "Actual merge result for A" > A
echo "Actual merge result for D" > D

Of course, this is an oversimplified conflict resolution. In real situations, you need to use your favorite editor to look at each file closely and make your final decision on it. After resolving the conflicts, you can commit the merge.

git add -- .
git commit -m "Merge branch 2"

Alternatively, after staging the resolved files, you can use "git merge --continue" instead of "git commit" to complete the merge.

git merge --continue

If you now examine the HEAD of the branch, you can find that the commit has two parents.

The two parents are the two commits that the new commit was merged from.
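A reduced version of the conflicting merge, with only the file "A", is enough to see the conflict markers and the two parents. This is a sketch in a temporary repository; the "conflicted" variable is an illustrative name, and the resolution is as oversimplified as the one above.

```shell
set -e
repo="$(mktemp -d)"                         # throwaway directory (illustrative)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # pin the first branch name to "master"
git config user.email "song.li@example.com"
git config user.name "Song Li"
touch A
git add .
git commit -q -m "First commit"
git branch branch-1
git branch branch-2

git checkout -q branch-1
echo "Modified in branch-1" > A
git commit -q -am "Prepare for merge with branch-2"

git checkout -q branch-2
echo "Modified in branch-2" > A
git commit -q -am "Prepare for merge with branch-1"

git checkout -q branch-1
# The merge stops on the conflict, so a failing exit status is expected here.
git merge branch-2 >/dev/null 2>&1 || true
conflicted="$(cat A)"                       # both versions, separated by markers
echo "$conflicted"

echo "Actual merge result for A" > A        # oversimplified manual resolution
git add -- A
git commit -q -m "Merge branch-2"
git cat-file -p HEAD                        # lists two "parent" lines
```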

The Repository Size & "reflog" & "git gc"

GIT does a good job of minimizing the size of the repository. In most cases, you do not need to worry about it. But if you are curious, you can take a look at the GIT garbage collection. This is the best note that I have found on GIT garbage collection, and I recommend taking a look at it. In order to see how garbage collection works in GIT, let us make a commit and then discard it.

git checkout working-branch
echo "Whatever modification" >> B
git add -- .
git commit -m "Commit to be discarded"

We can find the hash code of this commit by the following command.

git rev-parse HEAD

bffe4e1bfcbbf026e3e9b34dacc66cf18dcb501c

We can issue the following command to discard this commit and revert the branch to the previous commit.

git reset --hard HEAD~

After discarding the commit, the commit "bffe4e1..." is no longer associated with any branch. Ideally, we should be able to garbage collect it.

git gc --prune=now

But if we take a further look, we can find that the commit is still in the repository.

git cat-file -p bffe4e1

The reason why the garbage collection did not remove this commit is that the commit is still referenced by a log. Whenever a branch or the HEAD moves, for example when we make a commit, GIT records the move in a log that we can reference later. It is called the "reflog" in GIT. We need to clear the log so that we can collect this commit.

git reflog expire --expire=now --all

After clearing the log, we can run "git gc --prune=now" again. We should see the dangling commit is cleared from the repository.

If you want to find all the dangling commits and objects, you can use the following command.

git fsck --full

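I could not find the default expiration time of the "reflog" on the GIT website. But from this link, the default time is 90 days and it is configurable through the "gc.reflogExpire" setting. Entries that are no longer reachable from the current tip of the branch expire after 30 days by default ("gc.reflogExpireUnreachable").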
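The first half of this experiment, where the discarded commit survives the garbage collection because the "reflog" still references it, can be reproduced as follows. This is a sketch in a temporary repository, and the "discarded" and "kept" variables are illustrative names.

```shell
set -e
repo="$(mktemp -d)"                         # throwaway directory (illustrative)
cd "$repo"
git init -q
git config user.email "song.li@example.com"
git config user.name "Song Li"
touch B
git add .
git commit -q -m "First commit"

echo "Whatever modification" >> B
git commit -q -am "Commit to be discarded"
discarded="$(git rev-parse HEAD)"

git reset -q --hard HEAD~                   # no branch points to the commit any more
git gc -q --prune=now

# The reflog entry written before the reset still names the discarded commit.
kept="$(git rev-parse 'HEAD@{1}')"
git --no-pager reflog
git cat-file -t "$discarded"                # still answers "commit"
```

Running "git reflog expire --expire=now --all" followed by another "git gc --prune=now", as described above, is the step that lets GIT remove the commit.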

The Remote Repository

Working with local repositories gives us the opportunity to learn most of GIT, but without a remote repository, you are unable to share work with your team. Learning to work with a remote repository takes far less effort than what we have spent on the local repository, so I will not spend a lot of time on it. I will just list the most commonly used commands here for completeness. For most people, the first thing to do with a remote repository is the "git clone".

git clone https://github.com/BigMountainTiger/lu-decomposition.git

I created a repository on GitHub named "lu-decomposition"; please feel free to clone it. If you want to find all the remote branches, you can use the "-r" option.

git branch -r

If you have the permission, you can also push your changes to the remote repository. If you created a local branch that the remote repository is not aware of, you can use the following command to publish it to the remote.

git checkout -b new-branch
git push --set-upstream origin new-branch

After publishing the branch, you can simply use "git push" to send your local commits to the remote.

git push

You can use "git pull" to get the commits that other people pushed to the remote.

git pull

If you want to check whether there are any new updates on the remote branch, but do not want to merge them into your local branch, you can use the fetch command.

git fetch

If you want to update your view of all the configured remote repositories, not limited to your working branch, you can use the following command.

git remote update

You can delete the remote branch.

git push origin --delete new-branch

Of course, you can always delete your local branch.

git branch -d new-branch
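The publish and delete commands can be tried without network access or push permission by letting a local bare repository play the role of the remote. This is only a sketch: the bare repository under a temporary directory is a hypothetical stand-in for a hosted repository such as the one on GitHub.

```shell
set -e
base="$(mktemp -d)"                             # throwaway directory (illustrative)
git init -q --bare "$base/remote.git"           # stand-in for the remote repository
git clone -q "$base/remote.git" "$base/work" 2>/dev/null
cd "$base/work"
git symbolic-ref HEAD refs/heads/master         # pin the first branch name to "master"
git config user.email "song.li@example.com"
git config user.name "Song Li"
touch A
git add .
git commit -q -m "First commit"
git push -q --set-upstream origin master

git checkout -q -b new-branch
git push -q --set-upstream origin new-branch    # publish the local branch
published="$(git branch -r)"

git checkout -q master
git push -q origin --delete new-branch          # delete the remote branch
git branch -q -d new-branch                     # delete the local branch
remaining="$(git branch -r)"

echo "published: $published"
echo "remaining: $remaining"
```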

Points of Interest

  • This is a note on GIT & miscellaneous subjects;
  • The best GIT reference that I have found is this book, which I strongly recommend reading. This note just puts the most commonly used GIT commands together;
  • I hope you like my postings and I hope this note can help you one way or the other.

History

First Revision - 6/27/2018.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Dr. Song Li
United States
I have been working in the IT industry for some time. It is still exciting and I am still learning. I am a happy and honest person, and I want to be your friend.

Article Copyright 2018 by Dr. Song Li