Forking - generic
=================

What is an Forking Workflow?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Prior to discussing what a "fork" is, it is worth laying out what a "remote"
repository is compared to a local one. For most of us, the repos we
typically interact with are on-disk or "local" repositories which we can
navigate to and manually modify. In contrast, a remote repository (hereafter
referred to as a "remote"), as laid out in the ``git``
`documentation <https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes>`_, 
is a version of your repo that exists *somewhere else* which (often) has stricter read/write
permissions. Typically, these remotes are hosted on an internet server, such as:

- `gitlab.com <https://gitlab.com>`_
- `github.com <https://github.com>`_
- `gitlab.science.gc.ca <https://gitlab.science.gc.ca>`_ - *internal ECCC employees only*
- `bitbucket.org <https://bitbucket.org>`_

but they can also be stored on local servers. Regardless of where they are
located, developers typically work with their own local repos and then
push/pull changes to and from remotes in order to share changes. 
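
The relationship between a local repo and a remote can be tried out without
any server at all. The following is a minimal sketch, assuming Git 2.28 or
newer; a temporary directory stands in for the hosted remote, and the paths,
file name, and developer identity are hypothetical placeholders.

.. code-block:: bash

    #!/usr/bin/env bash
    # Sketch only: a local bare repository stands in for a hosted remote.
    # All paths and the identity below are hypothetical.
    set -euo pipefail
    export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
    export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

    work="$(mktemp -d)"

    # Build a bare repository with one commit to play the role of the remote.
    git init --quiet --initial-branch=main "$work/seed"
    git -C "$work/seed" commit --quiet --allow-empty -m "Initial commit"
    git clone --quiet --bare "$work/seed" "$work/remote.git"

    # Clone the remote to get a local repo; git names the remote "origin".
    git clone --quiet "$work/remote.git" "$work/local"
    cd "$work/local"
    git remote -v

    # Commit locally, then push to share the change through the remote.
    echo "hello" > notes.txt
    git add notes.txt
    git commit --quiet -m "Add notes"
    git push --quiet origin main

    # Pull brings in whatever others have pushed in the meantime.
    git pull --quiet --ff-only origin main

With a hosted remote, only the clone URL changes; the ``push`` and ``pull``
commands stay the same.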

Now, **in a non-forking workflow there is typically only one remote
repository** which developers clone from and push/pull changes to and from.
This type of workflow is illustrated below:

.. figure:: /images/Single_Remote_Workflow.png
    :width: 600
    :align: center
    :alt: Non-Forking Workflow Schematic

    Single remote (non-forking) workflow

Under this non-forking workflow, it is easy for a team of developers to share
their code through one remote. However, this model can break down when:

- the number of development branches and developers gets large or
- external developers wish to contribute to the code-base.

The first condition can result in a large number of orphaned branches in the
central repo, or worse, lead to one developer overwriting or deleting a
colleague's changes. The second condition causes problems because it requires
explicitly giving external developers access to your central repository, but
depending on the code base and location of the remote, that may be neither
advisable nor possible.

To avoid these problems, we turn to the forking workflow. In summary, instead
of relying on a single remote, **each developer "forks"
the central repo, creating a "server-side" clone of it**, which results in
many developer-specific remotes. Under this workflow, developers push changes
to their fork, and when they want their changes integrated into the central
repository, they make a merge request across remotes (from their remote into
the central one). A schematic of this workflow is shown below:

.. figure:: /images/Forking_Workflow.png
    :width: 700
    :align: center
    :alt: Forking Workflow Schematic

    Forking workflow 

Visually, this information flow looks notably more complex than the single
remote workflow. In practice, however, the differences experienced by the
developer are minor. The main ones are that developers need to:

- be aware of which remote they are interacting with, and
- know how to update their fork when the central repo is updated.

On a day-to-day basis, developers will typically still only interact with **one**
remote (their personal fork), and only need to interact with the central repo 
when making merge requests or pulling in changes.
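
The cycle described above can be sketched end to end with local stand-ins.
This is an illustrative sketch, assuming Git 2.28 or newer; the bare
repositories ``central.git`` and ``fork.git``, the branch ``my-feature``, and
the developer identity are hypothetical. Naming the central remote
``upstream`` is a common convention, not a requirement. On a real hosting
service the fork is created through the web interface rather than with ``git``
itself.

.. code-block:: bash

    #!/usr/bin/env bash
    # Sketch only: local bare repositories stand in for the hosted central
    # repo and the developer's fork. Paths, names and branches are hypothetical.
    set -euo pipefail
    export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
    export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

    work="$(mktemp -d)"

    # Central repository with one commit on "main".
    git init --quiet --initial-branch=main "$work/seed"
    git -C "$work/seed" commit --quiet --allow-empty -m "Initial commit"
    git clone --quiet --bare "$work/seed" "$work/central.git"

    # The fork: a server-side clone of the central repo.
    git clone --quiet --bare "$work/central.git" "$work/fork.git"

    # Clone the fork, so "origin" is the fork, then add the central repo
    # as a second remote called "upstream".
    git clone --quiet "$work/fork.git" "$work/local"
    cd "$work/local"
    git remote add upstream "$work/central.git"
    git remote -v

    # Day-to-day work: commit on a feature branch and push it to the fork.
    git switch --quiet -c my-feature
    echo "new feature" > feature.txt
    git add feature.txt
    git commit --quiet -m "Add feature"
    git push --quiet origin my-feature
    # A merge request from the fork's my-feature into central's main
    # would be opened on the hosting service at this point.

    # Meanwhile, someone else updates the central repo.
    git -C "$work/seed" commit --quiet --allow-empty -m "Upstream change"
    git -C "$work/seed" push --quiet "$work/central.git" main

    # Update the fork: fetch from upstream, fast-forward the local main,
    # then push the result to the fork.
    git switch --quiet main
    git fetch --quiet upstream
    git merge --quiet --ff-only upstream/main
    git push --quiet origin main

Note that the feature branch only ever reaches the fork; the central repo
changes only when a merge request is accepted there.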