Version Control System on top of Git and GitHub – an Introduction



Git and GitHub


Version Control Systems - Introduction


Before jumping to Git or GitHub, we need to understand what a Version Control System is. Version Control Systems, known as VCS, are software tools that manage and track changes to files, programs, logs, and other information related to code development, deployment, and operation over time.

A Version Control System keeps track of every modification to the code in a special kind of database. It lets you save a snapshot of the complete project at any point in time. It enables multiple people to work on a single project simultaneously and integrates the work done by different team members. If a mistake is made, developers can roll the code back and compare earlier versions to help fix the mistake or bug promptly.

In the Version Control workflow, two terms play key roles –
  1. Repository 
  2. Working Copy


Repository and Working Copy


A Repository is a database of changes, that is, the historical versions or snapshots of the project. A Working Copy, sometimes called a Checkout, is your personal copy of all the files in the project, where you do your work. It lets you edit the files without affecting your coworkers. When you are done and satisfied with your edits, you commit your changes to the repository.

A Version Control System is also popularly known as –
  • Revision Control System
  • Configuration Management System
  • Source Control Management System


Varieties of Version Control Systems


In terms of how they work, there are two general varieties of Version Control –
  1. Centralized Version Control
  2. Distributed Version Control


Centralized Version Control works on a client-server model. There is only one repository, located in one place, which provides access to many clients, and each user gets his or her own working copy. Your team members can update and see your edits as soon as you Commit.

In Distributed Version Control, by contrast, there are multiple repositories: each user gets his or her own local repository and working copy. Once you Commit, others have no access to your changes until you Push them to the shared central repository. Similarly, when you update, you do not get others' changes unless you first Pull those changes into your repository.

In this way we can see the following activities under version control – 

Centralized Version Control System
  • Commit
  • Update

Distributed Version Control System
  • Commit
  • Update
  • Push
  • Pull
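
The difference between the two activity lists can be illustrated with a small toy model. This is only a sketch, not how any real VCS is implemented: the class names (Repository, CentralizedClient, DistributedClient) are hypothetical, and a "commit" is reduced to a plain message string.

```python
class Repository:
    """A toy repository: just an ordered list of commit messages."""

    def __init__(self):
        self.commits = []


class CentralizedClient:
    """Commits go straight to the single shared server repository."""

    def __init__(self, server):
        self.server = server

    def commit(self, message):
        self.server.commits.append(message)

    def update(self):
        # Returns everything currently stored on the server.
        return list(self.server.commits)


class DistributedClient:
    """Commits go to a local repository; push and pull sync with a remote."""

    def __init__(self, remote):
        self.remote = remote
        self.local = Repository()

    def commit(self, message):
        self.local.commits.append(message)

    def push(self):
        # Copy local commits that the remote does not have yet.
        for message in self.local.commits:
            if message not in self.remote.commits:
                self.remote.commits.append(message)

    def pull(self):
        # Copy remote commits that the local repository does not have yet.
        for message in self.remote.commits:
            if message not in self.local.commits:
                self.local.commits.append(message)


remote = Repository()
carol = DistributedClient(remote)
dave = DistributedClient(remote)
carol.commit("add feature")
dave.pull()   # dave sees nothing yet: carol has not pushed
carol.push()
dave.pull()   # now dave's local repository contains carol's commit
```

In the centralized model a commit is immediately visible to everyone who updates, while in the distributed model it stays local until it is pushed and then pulled.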


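Distributed Version Control is the more modern approach. Because each user has a full local repository, most operations (committing, browsing history, comparing versions) run locally without a network round trip, which typically makes them faster than in a Centralized Version Control System, and every local repository also serves as a copy of the project history.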


Centralized and Distributed Version Control System


With this background, we now have a fair idea of why a Version Control System matters. Some key points –
  • Only registered users can check files into version control.
  • Documents and files can be identified by file name, author, and modification date.
  • Each file check-in gets a new version identifier, which is usually a number.
  • Helps in distinguishing the latest versions and tracking recent updates.
  • The latest versions of all files are often referred to as heads.
  • Source control systems are organized into repositories.
  • Version control supports branching, where the head of the repository is split for parallel development.
  • Team members can maintain independent branches of code to track their respective changes.
  • Continuous integration is easier with modern version control systems.
  • Simple comparison across versions helps resolve conflicts in code during merging.
  • Changes made to the code can be reverted to any state from its history.
  • Makes it easier to code collaboratively and remotely.


Git - Introduction


We just covered an overview of Version Control Systems, which have a positive impact on software development. Today, many Version Control Systems and related tools are available; some of the best known are –
  • Git – It is a distributed version control system used to track changes.
  • CVS (Concurrent Versions System) – It is a free client-server revision control system.
  • SVN (Apache Subversion) – It is a centralized version control tool under the Apache license.
  • Assembla – It is a web-based platform for hosting version control repositories and managing source code.
  • Mercurial – It is a distributed revision control tool that supports Windows and Unix-like environments.
  • Bazaar – It is both a distributed and a client-server revision control system.


So far we have walked through the overview of Version Control Systems and looked at different VCS tools. Here we will talk about one of the most popular Version Control Systems, Git. Git is a free, open source Distributed Version Control System designed to handle all types of projects, small to large, with speed and efficiency.

It was originally developed in 2005 by Linus Torvalds, the creator of the Linux operating system kernel. Git was designed with the following objectives –
  • Speed
  • Simple design
  • Fully distributed
  • Excellent support for parallel development, support for hundreds of parallel branches.


Git is a Distributed Version Control System, which means your local copy of the code is itself a complete version control repository. Such fully functional local repositories make it easier to work offline or remotely. You commit your work locally, and then sync your copy of the repository with the copy on the server.

Git has three main states that your files can reside in -
  1. Committed – It means that the data is safely stored in your local database.
  2. Modified – It means that you have changed the file, but have not committed it to your database yet.
  3. Staged – It means that you have marked a modified file in its current version to go into your next commit snapshot.


Corresponding to these states, there are three main sections of a Git project -
  1. Git Directory – It is where Git stores the metadata and object database for your project. This is the most important part of Git, and it is what is copied when you clone a repository from another computer.
  2. Working Directory – It is a single checkout of one version of the project. These files are pulled out of the compressed database in the Git directory and placed on disk for you to use or modify.
  3. Staging Area – It is a file, generally contained in your Git directory, that stores information about what will go into your next commit. Its technical name in Git jargon is the “index”, but the phrase “staging area” works just as well.


Three Main Sections of a Git Project


Now we can draw the basic Git workflow something like this –
  1. You modify files in your working tree.
  2. You selectively stage just those changes you want to be part of your next commit, which adds only those changes to the staging area.
  3. You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory.


Every time you commit your work, Git records a snapshot. If a particular version of a file is in the Git directory, it is considered Committed. If it has been modified and added to the staging area, it is Staged. And if it has changed since it was checked out but has not been staged, it is Modified.
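
The three states and three sections described above can be sketched with a toy model. This is a deliberately simplified illustration, not Git's actual implementation: the ToyRepo class and its method names are hypothetical, each section is a plain dictionary, and real Git details such as untracked files, or a file being staged and modified at the same time, are ignored.

```python
class ToyRepo:
    """A toy model of the working directory, staging area, and Git directory."""

    def __init__(self):
        self.working = {}   # working directory: path -> content
        self.index = {}     # staging area ("index"): path -> content
        self.head = {}      # last committed snapshot: path -> content
        self.history = []   # all committed snapshots, oldest first

    def write(self, path, content):
        """Step 1: modify a file in the working directory."""
        self.working[path] = content

    def add(self, path):
        """Step 2: stage the current version of a file."""
        self.index[path] = self.working[path]

    def commit(self):
        """Step 3: store the staged files as a new snapshot."""
        self.head = dict(self.index)
        self.history.append(self.head)

    def state(self, path):
        """Report which of the three states a file is in."""
        content = self.working.get(path)
        if content != self.index.get(path):
            return "modified"
        if content != self.head.get(path):
            return "staged"
        return "committed"


repo = ToyRepo()
repo.write("readme.txt", "first draft")
print(repo.state("readme.txt"))  # modified
repo.add("readme.txt")
print(repo.state("readme.txt"))  # staged
repo.commit()
print(repo.state("readme.txt"))  # committed
```

Editing the file again after the commit moves it back to the modified state, because the working directory no longer matches the staging area.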

Git Workflow


Benefits of Git 


In addition to being free, open source, and distributed, Git has been designed with performance, security, and flexibility at its core. In a nutshell, the following are the key benefits of Git.

  • Performance – Committing new changes, branching, merging and comparing past versions are all optimized for performance. Along with this, being distributed enables significant performance benefits as well.
  • Flexibility – Git is flexible in several respects: in its support for various kinds of nonlinear development workflows, in its efficiency in both small and large projects, and in its compatibility with many existing systems and protocols.
  • Security – The content of the files as well as the relationships between files and directories, versions, tags and commits are all identified by a hash computed with the SHA-1 algorithm, so any accidental or malicious change to the history alters the hashes and can be detected. Note that SHA-1 is no longer considered collision-resistant, and newer Git versions include experimental support for SHA-256.
  • Parallel development – Because Git is distributed and branching is cheap, everyone can work simultaneously on their own branches.
  • Quick release – Branches enable flexible, simultaneous development. By separating your release branch from development in progress, you can manage your stable code better and ship updates more quickly.
  • Built-in integration – Due to its popularity, Git is integrated into most tools and products. Every major IDE has built-in Git support, as do many tools for continuous integration and continuous deployment.
  • Pull request – On hosting platforms such as GitHub, pull requests let you discuss code changes with your team before merging them into your main branch.
  • Branch management – Branching with Git is very simple; it takes only a few seconds to create, delete, and merge branches. Feature branches provide an isolated environment for every change to your codebase.
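
To make the Security point concrete, the sketch below reproduces how Git derives the identifier of a file's content (a "blob") in a SHA-1 repository: it hashes a short header, consisting of the word blob, the content length, and a null byte, followed by the content itself. The function name git_blob_id is our own; only Python's standard hashlib module is used.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Header format: "blob <size in bytes>\0", followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b"hello world\n"))
# 3b18e512dba79e4c8300dd08aeb37f8e728b8dad

# Changing even one character produces a completely different ID,
# which is how changes to stored content can be detected.
print(git_blob_id(b"hello world!\n") == git_blob_id(b"hello world\n"))
# False
```

The printed ID should match what the command git hash-object reports for a file containing the same bytes.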


GitHub – Introduction 


If you are in the software development world, you have undoubtedly heard of GitHub. In short, GitHub is a code hosting platform for collaboration and version control. It is a website and cloud-based service that helps developers store and manage their code, as well as track and control changes to their code.

It builds on two linked concepts, which we have just walked through in the previous sections –
  • Version Control
  • Git


GitHub is a Git repository hosting service, but it adds many of its own features. While Git is a command line tool, GitHub provides a web-based graphical interface. Besides simple code storage, it is an entire ecosystem complete with an elegant social networking site, allowing individual developers to contribute to multiple teams and projects.

GitHub


Next, let us go through some key GitHub terminology –
  • Repositories 
  • Branches
  • Commits
  • Pull Requests


Repository (repo)

A GitHub repository (repo) is a location where all the files for a development project are stored. It can contain folders and any type of file, such as HTML, CSS, JavaScript, documents, images, etc. Each project has its own repo, and you can access it with a unique URL.

When you create a repository on GitHub, you choose whether it is public or private. Public repos are visible to everyone, while private repos are restricted to the people you grant access to. Private repos are available on GitHub's free plan as well.

Branch and Fork

A branch is a parallel line of development within a repository. It is used to work on different versions of a repository at the same time. A fork is a related but different concept: it is a copy of an entire repository made from one member's account to another member's account.

By default a repository has one default branch (named main on newly created GitHub repositories, and master on older ones). Any other branch starts as a copy of the branch it was created from.

Commits

Changes on GitHub are recorded as commits, which are somewhat like saving a file. Each commit has a message that explains why a change was made in the repository. Depending on how a repository is set up, you might also be able to create your own branch and make your own commits there.

Pull Requests

A pull request is a way to submit your code changes back to a branch once you have made commits. By opening a pull request you are proposing that your code changes be merged (pulled in) into the target branch, typically the default branch.

Commit, Branch and Pull Request

In brief, the basic GitHub flow is creating a repository, managing branches, making changes, and merging those changes via a pull request. Forks and branches allow a developer to make modifications without affecting the original code. A developer who would like to share the modifications can send a pull request to the owner of the original repository.

If, after reviewing the modifications, the owner would like to pull them into the repository, they can accept the pull request and merge the changes into the original repository.
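
The flow described above (branch, commit, pull request, merge) can be sketched as a toy model. This is only an illustration and does not use or imitate GitHub's API: the ToyHostedRepo class, its methods, and the default branch name "main" are all assumptions made for the example, and a merge is reduced to copying over the missing commit messages.

```python
class ToyHostedRepo:
    """A toy hosted repository: each branch is a list of commit messages."""

    def __init__(self, default_branch="main"):
        self.default_branch = default_branch
        self.branches = {default_branch: []}

    def create_branch(self, name, source=None):
        # A new branch starts as a copy of the branch it was created from.
        source = source or self.default_branch
        self.branches[name] = list(self.branches[source])

    def commit(self, branch, message):
        self.branches[branch].append(message)

    def open_pull_request(self, head, base=None):
        # A pull request only proposes a merge; nothing changes yet.
        return {"head": head, "base": base or self.default_branch, "merged": False}

    def merge(self, pull_request):
        # Accepting the pull request brings the new commits into the base branch.
        base = self.branches[pull_request["base"]]
        for message in self.branches[pull_request["head"]]:
            if message not in base:
                base.append(message)
        pull_request["merged"] = True


repo = ToyHostedRepo()
repo.commit("main", "initial commit")
repo.create_branch("feature")
repo.commit("feature", "add login page")

pr = repo.open_pull_request("feature")
print(repo.branches["main"])  # ['initial commit']
repo.merge(pr)
print(repo.branches["main"])  # ['initial commit', 'add login page']
```

Until the pull request is merged, work on the feature branch has no effect on the default branch, which is what makes review before merging possible.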

In the next article we will do some hands-on activities, such as deploying files to a GitHub repository using Git. If you are keen to get started with Git and GitHub, keep visiting for further posts.


