Introduction to Git
January 08, 2020
Given that we strive to ready our students for the workplace from day one of the program, students at Rithm begin working with Git almost immediately. Our students have varying familiarity with Git, so one of our early lectures serves as a Git refresher.
What is Version Control?
To understand what purpose Git serves, we need to first understand the role of version control in software.
For a moment, let’s think outside of software development: consider the case of making changes to a text document or even to an image within a photo editor.
Once a change is made, we gain the ability to ‘undo’ that change (as well as any changes that came before our most recent edit). In order to do this, we have to keep a record of the changes that have been made, and the order that they were made in.
This is an example of version control: keeping track of the changes that are made so that we can revert to, or compare against, a previous state.
What is Git?
Git is a version control system used for code. Like the example above, Git keeps track of the changes we make to the files within our project, including such information as when each change was made and who it was made by. With a change history, we can then compare or jump back to previous versions of our project.
How does this work?
Every time we commit, or save changes to our project in Git, it takes a snapshot of what our files look like at that moment in time, stored with a unique identifier. If a file hasn’t changed between commits, Git is smart enough not to store it again: the new snapshot simply links to the copy it already has.
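To make the idea of snapshots and identifiers a little more concrete, here is a minimal sketch. It assumes Git is installed and works in a throwaway temporary directory; the file names and the "Demo Student" identity are placeholders invented for the example.

```shell
# Work in a throwaway directory so nothing on your machine is affected.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
# Placeholder identity, set only for this demo repository.
git config user.name "Demo Student"
git config user.email "student@example.com"

# First commit: two files.
echo "hello" > notes.txt
echo "v1" > app.txt
git add notes.txt app.txt
git commit --quiet -m "Add notes and app"
first_commit="$(git rev-parse HEAD)"
notes_before="$(git rev-parse HEAD:notes.txt)"

# Second commit: only app.txt changes.
echo "v2" > app.txt
git add app.txt
git commit --quiet -m "Update app"
second_commit="$(git rev-parse HEAD)"
notes_after="$(git rev-parse HEAD:notes.txt)"

# Each commit gets its own identifier...
echo "first commit:  $first_commit"
echo "second commit: $second_commit"
# ...while the unchanged file is stored under the same ID in both snapshots.
echo "notes.txt in first commit:  $notes_before"
echo "notes.txt in second commit: $notes_after"
```

The two commit identifiers differ, but the identifier for notes.txt is the same in both commits, since its contents never changed.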
Git Workflow
Git has three main states that your files can be in: modified, staged and committed.
- A modified file is one that we’ve made changes to, but have not yet staged or committed.
- A staged file is a modified file that we’ve specified we want to include in our next commit.
- A committed file means that the changes made and previously staged are now stored in your local repository -- a snapshot has been taken and recorded in the project’s history.
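One way to watch a file move through these three states is with the short form of git status. This is only a sketch, assuming Git is installed; the temporary directory, the file essay.txt and the demo identity are made up for illustration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
# Placeholder identity, set only for this demo repository.
git config user.name "Demo Student"
git config user.email "student@example.com"

# Start from a committed file.
echo "first draft" > essay.txt
git add essay.txt
git commit --quiet -m "Add essay"

# Modified: the file has been edited, but not staged.
echo "second draft" > essay.txt
modified_state="$(git status --short)"
echo "$modified_state"    # " M essay.txt" (M in the second column)

# Staged: the change is marked for the next commit.
git add essay.txt
staged_state="$(git status --short)"
echo "$staged_state"      # "M  essay.txt" (M in the first column)

# Committed: the snapshot is recorded and there is nothing left to report.
git commit --quiet -m "Revise essay"
committed_state="$(git status --short)"
echo "after commit: '$committed_state'"
```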
Git Basics
In order to use Git to track our changes, our project needs to be set up to do so.
Initializing a Local Repository
The first step is to create a local repository where our change history can be stored. To initialize a new repository locally in a project, we use the command git init. This creates a new folder in our project named .git, though you won’t see it -- it’s hidden!
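If you would like to see the hidden folder for yourself, a quick sketch like the following should do it. It assumes Git is installed and uses a temporary directory as a stand-in for your project.

```shell
# A temporary directory stands in for a real project folder.
project="$(mktemp -d)"
cd "$project"
git init --quiet

# A plain ls hides names that begin with a dot, so it shows nothing here.
ls
# Adding -a reveals the hidden .git folder.
ls -a
# Git can also report where the repository data lives.
git rev-parse --git-dir    # prints ".git"
```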
Committing Changes with Git
Once we’ve made changes to our files that we’re happy with, we need to do a few things.
- Check the state of the repository. We first want to check the status of our files. Running git status will list the files that have changes, as well as the files that have been staged.
- Review the changes made. Sometimes we’ve made enough changes that we may have forgotten everything that’s been changed. To see the differences between our files' current state and what was last staged or committed, we run git diff.
- Add the file(s) to staging. If we’re happy with the changes that have been made, we add the changed file to staging with the command git add <FILE_NAME_HERE>. If you want to add every file that’s been changed, you can use the shortcut git add * (note that the shell's * skips hidden files whose names begin with a dot; git add -A stages every change in the repository).
- (optional) Re-check the status of the repository. I always recommend that students get in the habit of doing this before they commit. Are the files you thought would be added included? Did you accidentally include files you didn’t intend to?
- Commit the staged files. If everything looks good in staging, it’s time to make it official by running git commit -m "YOUR MESSAGE HERE". The message you include with your commit should be a brief, straight-to-the-point summary of the changes you’ve made.
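The steps above can be sketched as one short terminal session. This is an illustration rather than a prescribed routine: it assumes Git is installed, and the temporary directory, README.md, the commit messages and the demo identity are all placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
# Placeholder identity, set only for this demo repository.
git config user.name "Demo Student"
git config user.email "student@example.com"

# Set-up: one committed file to start from.
echo "Hello" > README.md
git add README.md
git commit --quiet -m "Add README"

# Make a change we want to commit.
echo "Hello, Git!" > README.md

git status               # 1. Which files have changed?
git --no-pager diff      # 2. What exactly changed? (--no-pager prints directly)
git add README.md        # 3. Stage the file.
git status               # 4. Re-check: is the right file staged?
git commit -m "Expand README greeting"   # 5. Commit with a brief message.
```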
Reviewing Changes
To check the history of your repository, you can run the command git log. This will show you each commit made to the repository, including the message, timestamp and author information.
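As a small sketch of what that looks like, the session below makes two commits and then prints the history in both the full and the one-line form. It assumes Git is installed; the file name, messages and demo identity are invented for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
# Placeholder identity, set only for this demo repository.
git config user.name "Demo Student"
git config user.email "student@example.com"

echo "one" > file.txt
git add file.txt
git commit --quiet -m "Add file"

echo "two" >> file.txt
git add file.txt
git commit --quiet -m "Extend file"

# Full history: identifier, author, date and message for each commit.
git --no-pager log
# Compact history: one line per commit, newest first.
git --no-pager log --oneline

# Count the commits reachable from the current one.
commit_count="$(git rev-list --count HEAD)"
echo "commits so far: $commit_count"
```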
Storing Changes Remotely
At this point, we have a nice, neat record of the changes that we’re making to our project. The problem is that these changes are only local to our machine. If our machine fails us, we will lose all of our work. We call this having a single point of failure.
To make sure this doesn’t happen, we need to make sure that our code exists in another repository -- one that isn’t on our machine. This is where the remote repository comes in.
Git + Github = <3
Github is a remote repository hosting service. It’s a place where we can store copies of our local repositories for safekeeping, collaborating, and sharing our work with the world.
If something goes wrong with the code we have locally, as long as we’ve kept our remote repository up to date, we’ll have a copy of the latest code to pull down to our local machine so we can begin where we left off.
Creating a Remote Repository on Github
To create a repository on Github, you’ll first need an account. After that, it’s as simple as following the instructions at https://www.github.com/new.
Once the new repository is created, you’ll be taken to a page that provides different code snippets for different cases. Are you starting with a blank repository, or do you have a local repository already? It’s up to you to select which setup is right for you.
Verifying the Remote Repository
To make sure that the remote repository was not only created but is also linked to your current project, run the command git remote -v from your project’s directory in the Terminal. This will list the remote repositories that are currently linked to the local one. You should see something like:
origin git@github.com:misscoded/misscoded.git (fetch)
origin git@github.com:misscoded/misscoded.git (push)
If you do not see anything, re-read the instructions on Github and try it again!
The origin you see (if you’ve followed Github’s instructions) is a nickname for the repository address that follows. When using Git, we make our lives easier by using the nickname to issue commands instead of the full repository address.
Fetch and push indicate what repositories you will be pulling changes from and pushing changes to, respectively. Most of the time this will be the same address.
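If you want to experiment with remotes without touching a real hosted repository, one option is to let a bare repository on your own disk play the part of the remote. The sketch below does that; it assumes Git is installed, and the workspace layout and directory names are hypothetical stand-ins, not something Github's instructions ask for.

```shell
workspace="$(mktemp -d)"

# A bare repository on disk stands in for the hosted remote.
git init --quiet --bare "$workspace/remote.git"

# Our local project repository.
git init --quiet "$workspace/project"
cd "$workspace/project"

# Link the local repository to the stand-in remote under the nickname "origin".
git remote add origin "$workspace/remote.git"

# List the linked remotes: one (fetch) line and one (push) line for origin.
git remote -v

# Look up the address behind the nickname.
origin_url="$(git remote get-url origin)"
echo "origin points to: $origin_url"
```

With a real hosted repository, the address would be the one shown on its setup page rather than a local path.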
Working with Others
When working with others in a collaboration on a single repository, you have two options for your workflow.
Cloning a Repository
If you own the repository, or have been granted Collaborator access by the owner, you can clone the remote repository to your local machine and follow the workflow we’ve covered above.
When you’ve made commits and are ready to push your changes, run the command git push and those changes will be included in the source code that lives on Github. The next time whoever you’re working with on this repository pulls to get the latest code, they will also get the changes that you have made.
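The round trip of cloning, pushing and pulling can be tried locally as a rough sketch. Here a bare repository on disk stands in for the shared repository, and two clones play the roles of you and a teammate. It assumes Git is installed; every path, file name and identity below is a placeholder.

```shell
workspace="$(mktemp -d)"
# A bare repository stands in for the shared remote repository.
git init --quiet --bare "$workspace/remote.git"

# You clone the repository (Git warns that it is empty), commit, and push.
git clone --quiet "$workspace/remote.git" "$workspace/you"
cd "$workspace/you"
git config user.name "Demo Student"
git config user.email "student@example.com"
echo "shared notes" > notes.txt
git add notes.txt
git commit --quiet -m "Add shared notes"
# The first push names the remote and sets the upstream for the current branch.
git push --quiet -u origin HEAD

# A teammate clones the repository and receives your first commit.
git clone --quiet "$workspace/remote.git" "$workspace/teammate"

# You make and push another change.
echo "more notes" >> notes.txt
git add notes.txt
git commit --quiet -m "Extend shared notes"
git push --quiet

# The teammate pulls and now has your latest change as well.
cd "$workspace/teammate"
git pull --quiet
cat notes.txt
```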
Forking a Repository
In some cases, you will not have Collaborator access and cannot push your changes directly, but you still want to suggest changes to be made. This is often the case if you’re working on an Open Source project.
In this case, you will need to fork the repository (creating your own copy of it on Github), clone your fork, and then submit a Pull Request against the original remote repository.
The process of forking and creating Pull Requests is outside the scope of this article, but Github has a robust guide available in its documentation.