Git guide I wish I had when I was a student

Photo by William Bout / Unsplash

When I was a student, we learned nothing about Git in college. Instead, we uploaded our coding exercises to Moodle.

Back in the days of my first job as a junior software engineer, I didn't know a single Git command or how it worked. I had to learn it all by myself and from others. Nowadays Git has become so important for a software engineer's day-to-day work that everyone assumes that you know Git basics.

This article aims to get you familiar with the Git basics. If you want something more advanced, for example understanding how Git works under the hood, you should read books about Git.

What is Git?

Imagine you work as a furniture seller and designer. You draw some schemas for a TV stand for a customer:

The customer says it's ok, but wants something with 6 drawers. You edit the schema to have 6 drawers and send it to the customer:

Then the customer says he prefers the first version and wants to order that one.

If you didn't save the first version in a separate file before you made the new changes, you have lost that work.

In this case, it's not a big deal, but imagine if it was something more complex with many changes. You would waste hours drawing it again.

In software engineering, code in projects is similar to drawings in the schemas above. You change the code and your app behaves differently. And in the projects, changes are added/removed very frequently.

Due to that, there is a need for an elegant code management system. Sometimes, you must go back to a version of your code from a couple of days ago.

That's what Git does. Git is a version control system: a tool for managing code and its history.

It allows you to manage different versions of your code, so you always have a safe point to fall back to. This way you have no fear of adding new changes, experimenting with new code, or destroying everything.

Core Git concepts

Now that we know what Git is, I will explain core Git concepts.

Branch

In most situations, you will work with a team on a project. Branches are there to ensure everyone can work independently on their tasks, without interrupting others.

Usually, in projects, there is a branch called main (or master) where the most up-to-date code is stored. Let's say your team member John and you got some new tasks.

You take the implementation of the user profile page, and John tackles the user's posts page.

You can both branch out of the main branch, create new feature branches, and then code without interrupting each other. Later, when you finish your work, you can merge your code into the main branch. All the changes you added in the feature branch will be added to the main branch.

You can think about branching this way: you copy a Word file and make text changes inside the copy, without worrying about the original. Later, when you are sure your text is good, you bring the changes from the copy back into the original document.

So branches are streams of work in progress that make teamwork easier.
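As a rough illustration of the idea, here is a minimal sketch in a throwaway repository. The file names, branch name, and the "Demo User" identity are made up for this example, and the commands it uses (git add, git commit, git checkout) are explained later in this article.

```shell
# Throwaway repository; all names below are placeholders for illustration.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "home page" > home.txt
git add home.txt
git commit -q -m "Add home page"
git branch -M main                  # make sure the default branch is called main

# Branch out of main and work on the feature in isolation
git checkout -q -b user_profile_page
echo "profile page" > profile.txt
git add profile.txt
git commit -q -m "Add user profile page"

# main is untouched until the feature branch is merged back
git checkout -q main
ls                                  # only home.txt
git merge -q user_profile_page
ls                                  # home.txt and profile.txt
```

Until the merge, nothing John does on main or on his own branch is affected by your profile page work.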

Commit

Commits are packages of code. They can contain a couple of lines of code or thousands.

As a software engineer, you should always strive to make your commits small and not commit huge amounts of code, so your team members can review it easily.

You should commit the code when you finish the implementation of a small logical unit. For example, if you are building a user profile page, you can commit the code change that implements uploading a profile picture, then commit again when you add the user's bio, etc.

Each commit contains code changes since the last commit that was created. With all commits together, you have a nice history of how your project was built:

You can add as many commits as you want.

HEAD is a pointer to the commit you currently have checked out. Normally it points to the tip of the current branch, that is, the last commit you added on it. When you check out a specific commit instead of a branch, HEAD points directly at that commit. This is called a detached HEAD.

Each commit has a unique identifier called the commit hash. You can pass it to this command:

git show f5366ccb21588c0d7a5f7d9fa1d3f85e9f9d1ffe

And it will display the commit information:

commit f5366ccb21588c0d7a5f7d9fa1d3f85e9f9d1ffe
Author: Mensur Durakovic
Date:   Sun Apr 26 16:24:33 2024 +0000

    Project setup
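To see HEAD and a detached HEAD in practice, here is a small sketch in a throwaway repository. The file name, commit messages, and identity are assumptions made up for the example.

```shell
# Throwaway repository; names are placeholders for illustration.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "First commit"
git branch -M main
echo "v2" > notes.txt
git commit -q -am "Second commit"

git rev-parse HEAD                  # full hash of the last commit on main
git rev-parse --short HEAD          # abbreviated hash, also accepted by git show
first="$(git rev-parse HEAD~1)"     # hash of the commit before HEAD

# Checking out a commit directly (instead of a branch) detaches HEAD
git checkout -q "$first"
git rev-parse --abbrev-ref HEAD     # prints "HEAD" because no branch is checked out
cat notes.txt                       # v1

git checkout -q main                # back on the branch, HEAD is attached again
```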

Stage

Git stages are crucial for understanding what your Git workflow looks like.

Code goes through various flows to finally be stored in a specific repository. Let's take a look at this image:

It all starts with the working directory. This is the folder/directory where all your project files are located. You can edit them, add new ones, delete existing ones, etc.

The staging area is a warming-up area for the code. Changes wait here before you commit them to the local repository.

The local repository is the place on your machine where Git permanently saves committed code changes. You can view your project's history, commits, and changes, you can "jump" to previously committed versions of your code, etc.

The remote repository is usually on some server (GitLab, GitHub, Bitbucket, etc.). When you push your changes, they are stored on that server and you can collaborate with other team members who work on the same project as you.
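The first three stages can be observed with git status. This is only a sketch in a throwaway repository with a made-up file name and identity; the last step, pushing to a remote repository, is left out because it needs a server.

```shell
# Throwaway repository; names are placeholders for illustration.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "draft" > index.md
git status --short        # "?? index.md": exists only in the working directory

git add index.md
git status --short        # "A  index.md": now in the staging area

git commit -q -m "Add index page"
git status --short        # prints nothing: the change is in the local repository
git log --oneline         # the history now contains one commit
```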

Tag

Git tags are useful for marking specific events in your project. For example, if you finished the MVP and pushed it to production, you can mark that event in your Git history with a specific tag.

After a couple of months of work, you push v2 of your app. You can add another tag.

You can list, add, delete, view the versions of files a tag is pointing to, etc.
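Here is a minimal sketch of those tag operations in a throwaway repository. The tag names v1.0 and v2.0, the file name, and the messages are assumptions for illustration, and annotated tags (-a) are just one way to create them.

```shell
# Throwaway repository; names are placeholders for illustration.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "mvp" > app.txt
git add app.txt
git commit -q -m "Finish MVP"
git tag -a v1.0 -m "MVP pushed to production"   # add a tag on the current commit

echo "v2" > app.txt
git commit -q -am "Ship version 2"
git tag -a v2.0 -m "Version 2 release"

git tag                     # list tags: v1.0 and v2.0
git show v1.0:app.txt       # view app.txt as it was at tag v1.0 (prints "mvp")
git tag -d v2.0             # delete the local tag v2.0
```

Tags created this way are local. To share one with your team, you would push it explicitly, for example with git push origin v1.0.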

The most used Git commands

Now that we know some core Git concepts, let's explore some of the most used Git commands.

git config

This is usually the first thing you do on a new machine or project: you tell Git who you are.

git config sets your name and email address, and Git records them as the author of every commit you create. Note that this is not authentication. To push code to GitLab, GitHub, or any other code hosting service, you still need a user account there, and you authenticate separately, for example with an SSH key or with an access token stored by a credential helper.

To set your identity for all repositories on your machine, you can run these 2 commands:

git config --global user.name "Mensur Durakovic"
git config --global user.email mdurakovic@example.com

To list all stored Git configuration values you can run:

git config --list
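Settings can also be stored per repository by leaving out --global. The sketch below uses a throwaway repository and a made-up email address so that your real global configuration is not touched.

```shell
# Throwaway repository; the email address is a placeholder.
repo="$(mktemp -d)" && cd "$repo"
git init -q

# Without --global the value is written to .git/config of this repository only
git config user.email "work@example.com"

git config user.email        # print the value that applies in this repository
git config --local --list    # list only this repository's settings
```

A repository-level value overrides the global one, which is handy if you use a different email address for work projects.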

git clone

You can think of this as the command that downloads a repository:

git clone https://github.com/facebook/react.git

The repository will be downloaded into a new folder (here, react) inside the directory where you ran the command.

git checkout

To switch from one branch to another you can do:

git checkout my_branch_name

or if you want to create a new branch and switch to it:

git checkout -b my_new_branch_name

git add

You finished coding your task, and now you want to add the changed files to the staging area. To add all changed files in the current directory and its subdirectories, you can run this command:

git add .

or if you want to add specific files:

git add Documentation.txt

git commit

You added your changes to the staging area, and now you want to create your commit. You can do that with this command:

git commit -m "your commit message"

The commit message should be precise, short, and meaningful. Write what you did in that commit. If you mess up the message of your last commit and want to update it, you can do that with:

git commit --amend

It's better to commit more often than to have one big fat commit. It's easier to squash all those commits later than to have huge changes in a single commit.
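A short sketch of fixing a commit message with --amend, in a throwaway repository with made-up names. Passing -m sets the new message directly instead of opening an editor.

```shell
# Throwaway repository; names are placeholders for illustration.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "bio" > profile.txt
git add profile.txt
git commit -q -m "Add user boi"            # typo in the message

git commit -q --amend -m "Add user bio"    # replace the last commit with a corrected one
git log --oneline                          # still one commit, now with the fixed message
```

Amending replaces the last commit with a new one, so if that commit was already pushed, the next push needs the --force flag described later in this article.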

git status

Sometimes you want to see the status of your current work on the branch, so you can execute:

git status

This will give you output similar to this:

# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#   modified:   index.md
#   deleted:    intro.md
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   history.md
#   intro.md

Similarly, if you want to see the history of your commits, from the newest one to the oldest one, you can use this command:

git log

It will get you output like this:

commit 9f7ff8f8b484df56c26427c72dcab7a165c1041d (HEAD -> TEST-2532_new_timeline_updates, origin/TEST-2532_new_timeline_updates)
Author: Mensur Durakovic <mdurakovic@example.com>
Date:   Sun Apr 26 16:24:33 2024 +0000

    TEST-2532 Timeline component add counter

commit b1c0ebfd6f06be96fca24cc42333783f8372695f (development)
Merge: 8801b82be 8e9d23155
Author: Mensur Durakovic <mdurakovic@example.com>
Date:   Sun Apr 26 13:24:33 2024 +0000

    Merge branch 'TEST-3016_user_profile_updates' into 'development'
    
    TEST-3016 User profile image updates
    
    Closes TEST-3016

git push

After you have committed your code, you can push it to the remote repository with this command:

git push origin your_branch_name

If a branch with that name doesn't exist on the remote repository yet, Git creates it there as part of the push.

git pull

If you want to pull the latest changes from a remote branch into your current branch, you can execute this command:

git pull origin your_remote_branch_name
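Push and pull can be tried without any server by using a local bare repository as the "remote". This is only a sketch: the directory names (remote.git, you, john), the file, and the identities are assumptions for illustration.

```shell
# A local bare repository stands in for GitHub/GitLab; all names are placeholders.
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"

# Your copy of the project
git init -q "$work/you"
cd "$work/you"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Add readme"
git branch -M main
git push -q origin main             # main is created on the remote

# John clones the project, commits, and pushes
git clone -q -b main "$work/remote.git" "$work/john"
cd "$work/john"
git config user.name "John"
git config user.email "john@example.com"
echo "from John" >> readme.txt
git commit -q -am "Update readme"
git push -q origin main

# You pull John's commit into your current branch
cd "$work/you"
git pull -q origin main
cat readme.txt                      # now contains John's line as well
```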

An important thing to note is that you can use each of the commands above with specific flags, e.g. --force or --amend. Look at the Git documentation online to see the full list of flags supported by each command.

Also, there are GUI Git clients that you can use, instead of typing the commands. But in the background, it's the same thing. For example, Visual Studio Code has integrated Git source control where you can check your commits, branches, etc.

There are also popular ones like GitKraken, SourceTree, Fork, etc. Some of them are free to use.

I prefer console commands in combination with VS Code.

Conflicts

When you pull, merge, or rebase, you will sometimes run into Git conflicts.

This means there are conflicting changes between different versions of the same file (or multiple files) in your repository. Those conflicts can be easily resolved in most cases.

Git will insert conflict markers at the affected places in the files, so you won't have trouble locating them. They are marked with rows of <, = and > signs. Here is how it looks:

<<<<<<< HEAD
hello
=======
world
>>>>>>> cb1abc6bd98cfc84317f8aa95a7662815417802d

Here is what this means:

  • The content above the = signs displays the content from the branch you are merging into.
  • The content below the = signs displays the content from the commit you are trying to merge in.

It's up to you to decide what content you want to remove, modify or add.

Delete the conflict markers and update the code. After that, always check that it works correctly. If everything is ok, you can stage the resolved files, commit the changes, and continue with your push or pull.
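To try this safely, the sketch below provokes a conflict in a throwaway repository and then resolves it. The branch name, file name, and final text are made up for the example; in a real project you would edit the file by hand instead of overwriting it with echo.

```shell
# Throwaway repository; names are placeholders for illustration.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"
git branch -M main

# The same line is changed differently on two branches
git checkout -q -b feature
echo "world" > greeting.txt
git commit -q -am "Change greeting on feature"
git checkout -q main
echo "hello there" > greeting.txt
git commit -q -am "Change greeting on main"

git merge feature || true     # the merge stops and reports a conflict in greeting.txt
cat greeting.txt              # shows the <<<<<<<, ======= and >>>>>>> markers

# Resolve: write the final content, stage the file, and commit
echo "hello world" > greeting.txt
git add greeting.txt
git commit -q -m "Merge feature and resolve greeting conflict"
```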

If you need more examples on this topic, check this video here.

Squash, cherry-pick, merge, and rebase

There are a couple of techniques when merging your branch into a main branch. Some of them are more popular than others. For example, people are still debating on the right approach, merge or rebase. But let's start with simpler ones.

Squashing

As the name suggests, squashing means combining multiple commits into a single commit. It's not hard to understand, as the image suggests:

Be careful with squashing because those commits will be gone and you will end up with a single commit that contains all the changes.

You can do squashing with these commands:

git reset --soft HEAD~X
git commit -m "Feature upload profile image"
git push origin branch_name --force

The first command moves the HEAD pointer X commits back while keeping all the code changes staged. So you only need to count how many commits you want to squash and put that number in the command.

In the last command, you see the --force flag. Be careful with it, as it overwrites the branch history on the remote repository with the history from your local repository. A safer variant is --force-with-lease, which refuses to push if someone else has updated the remote branch in the meantime.
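Here is the same squash run end to end in a throwaway repository, with X = 3. The branch name, file, and "WIP" messages are assumptions for the example, and the push step is left out because there is no remote.

```shell
# Throwaway repository; names are placeholders for illustration.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Project setup"
git branch -M main

git checkout -q -b upload_profile_image
for step in 1 2 3; do
  echo "step $step" >> app.txt
  git commit -q -am "WIP step $step"
done
git log --oneline              # 4 commits: setup plus 3 WIP commits

git reset --soft HEAD~3        # X = 3: the WIP commits are dropped, their changes stay staged
git commit -q -m "Feature upload profile image"
git log --oneline              # 2 commits: setup plus the squashed commit
```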

Cherry-picking

In certain situations, you want to copy the commit from one branch to another branch. You can do that with cherry-picking.

To cherry-pick a commit from one branch to another, you need a commit hash. You can get info on that with the git log command, as explained above. Then you can execute these commands:

git checkout my_target_branch
git cherry-pick cbf1b9a1be984a9f61b79a05f23b19f66d533537
git push origin my_target_branch

If you want to cherry-pick multiple commits, execute the second command multiple times with the targeted commit hashes.
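A sketch of cherry-picking in a throwaway repository: only one of the two commits from the feature branch is copied to main. The branch, file names, and messages are made up, and the push is omitted because there is no remote.

```shell
# Throwaway repository; names are placeholders for illustration.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Project setup"
git branch -M main

git checkout -q -b feature
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix typo in footer"
fix_hash="$(git rev-parse HEAD)"   # the commit hash we want to copy
echo "wip" > wip.txt
git add wip.txt
git commit -q -m "Unfinished work"

git checkout -q main
git cherry-pick "$fix_hash"        # copy only that one commit onto main
ls                                 # app.txt and fix.txt, but no wip.txt
```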

Merge and rebase

Merge and rebase have the same goal: they integrate the changes from one branch into another, for example the latest main branch changes into your feature branch.

The difference is that rebase rewrites the branch history by replaying your commits on top of the target branch. Merging, on the other hand, keeps the history of both branches and creates a new merge commit. Here is the visual:

It's up to you to choose which way you prefer, but my preference is rebase. I usually do rebase like this:

git checkout your_feature_branch
git fetch origin
git rebase origin/main_branch
git push origin your_feature_branch --force

Here is the breakdown for each command:

  1. make sure you are on the branch you want to rebase (eg. your feature branch)
  2. fetch the latest changes from the remote repository
  3. rebase on the up-to-date remote version of the targeted branch (eg. origin/main_branch); git fetch updates origin/main_branch, not your local main_branch
  4. push your changes

Sometimes you will have to use the --force flag to push the changes, for example, if you already pushed the branch to a remote repository before.
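To compare the two histories side by side, here is a sketch in a throwaway repository without a remote. The branch named merged exists only so that both options can be tried on the same starting point; all names are made up for illustration.

```shell
# Throwaway repository; names are placeholders for illustration.
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Project setup"
git branch -M main

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q main
echo "hotfix" > hotfix.txt
git add hotfix.txt
git commit -q -m "Add hotfix"

# Option 1: merge (done on a copy of main so main itself stays as it is)
git checkout -q -b merged main
git merge -q --no-edit feature
git log --oneline --graph      # history forks and joins again in a merge commit

# Option 2: rebase the feature branch on top of main
git checkout -q feature
git rebase -q main
git log --oneline --graph      # one straight line: setup, hotfix, feature
```

Both branches end up with the same files. Only the shape of the history differs.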

There is also something called interactive rebase, which is a bit advanced. It allows you to execute multiple actions through interactive prompts. You can play with commits and use these key characters:

  • d as drop - to remove commit
  • e as edit - to use commit, but stop for amending
  • p as pick - to use commit
  • r as reword - to use commit, but edit the commit message
  • s as squash - to use commit, but squash with previous commit

There are many other options but these are the most used ones. Those allow you to delete, rearrange, edit, or squash commits before you apply them to a specific branch.

Most used Git collaboration workflows

Depending on the project's complexity, the number of team members, company policies, etc., people work with different Git collaboration workflows.

I will explain some of the most popular ones.

Feature branch workflow

For each new feature or bug fix you open a new branch. You implement the feature on that feature branch and once done, you merge it to the main branch.

Pull request workflow

Similar to the feature branch workflow, except here you don't merge your branch into the main branch yourself.

Instead, you create a pull request (or merge request). Then the maintainer or team members review your code changes. If needed, they can comment or request changes to your code.

Later, when your pull request is approved, the code is merged into the main branch.

Trunk-based workflow

In this workflow, small code changes are frequently pushed into the main branch (the trunk). Long-lived feature branches are avoided; if branches are used at all, they are short-lived.

This workflow requires a good setup for automated tests and deployment pipelines. It is mostly used in situations when you want to deploy small, frequent changes to the production environment.

Gitflow workflow

Here you have predefined long-living branches that are used for specific types of tasks or environments. For example, you have branches like main, development, UAT, staging, release, hotfix, etc.

Conclusion

In this article, we learned:

  • what Git is
  • what the core Git concepts are, like branches, commits, stages, and tags
  • what some of the most used Git commands are
  • what Git conflicts are
  • what squash, cherry-pick, merge, and rebase are
  • what some of the most popular Git collaboration workflows are

Here is my final Git tip for you:

On your first day at a new job, when you finally get your hands on the project, squash every commit into a single commit with the message "Legacy code" and force push to the main branch 😉