Forking - generic
=================

What is a Forking Workflow?
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Before discussing what a "fork" is, it is worth laying out what a "remote" repository is compared to a local one. For most of us, the repos we typically interact with are on-disk or "local" repositories which we can navigate to and manually modify. In contrast, a remote repository (hereafter referred to as a "remote"), as laid out in the ``git`` documentation, is a version of your repo that exists *somewhere else* and (often) has stricter read/write permissions. Typically, these remotes are hosted on an internet server, such as:

- gitlab.com
- github.com
- gitlab.science.gc.ca - *internal ECCC employees only*
- bitbucket.com

but they can also be stored on local servers. Regardless of where they are located, developers typically work in their own local repos and then push to and pull from remotes in order to share changes.

Now, **in a non-forking workflow there is typically only one remote repository**, which developers clone from, push changes to, and pull changes from. This type of workflow is illustrated below:

.. figure:: /images/Single_Remote_Workflow.png
   :width: 600
   :align: center
   :alt: Non-Forking Workflow Schematic

   Single remote (non-forking) workflow

Under this non-forking workflow, it is easy for a team of developers to share their code through one remote. However, this model can break down when:

- the number of development branches and developers gets large, or
- external developers wish to contribute to the code base.

The first condition can result in a large number of orphaned branches in the central repo or, worse, lead to one developer overwriting or deleting a colleague's changes. The second condition causes problems because it requires explicitly giving external developers access to your central repository, which, depending on the code base and the location of the remote, may be neither advisable nor possible.

To avoid these problems, we turn to the forking workflow. In summary, instead of relying on a single remote, **each developer "forks" the central repo, creating a "server-side" clone of it**, which results in many developer-specific remotes. Under this workflow, developers push changes to their own fork, and when they want those changes integrated into the central repository, they make a merge request across remotes (from their remote into the central one). A schematic of this workflow is shown below:

.. figure:: /images/Forking_Workflow.png
   :width: 700
   :align: center
   :alt: Forking Workflow Schematic

   Forking workflow

Visually, this information flow looks notably more complex than the single remote workflow. In practice, however, the differences experienced by the developer are minor. The main ones are that developers need to:

- be aware of which remote they are interacting with, and
- know how to update their fork when the central repo is updated.

On a day-to-day basis, developers will typically still interact with only **one** remote (their personal fork), and only need to interact with the central repo when making merge requests or pulling in changes.
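
The two points above (knowing which remote you are talking to, and updating a fork after the central repo moves ahead) can be sketched with plain ``git`` commands. This is a minimal, self-contained illustration rather than a prescribed procedure. It assumes that local bare repositories stand in for the hosted central repo and fork, that the branch is called ``main``, and that the remotes are named ``origin`` (the fork) and ``upstream`` (the central repo). Those names, the directory names, and the example identity are conventions or placeholders chosen for this sketch. The merge request itself is a feature of the hosting platform and is not shown.

```shell
#!/bin/sh
set -eu

# Placeholder identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Throwaway directory; bare repos below stand in for hosted remotes.
work=$(mktemp -d)

# Seed a central repo with one commit on "main" (assumed branch name).
git init --quiet "$work/maintainer"
git -C "$work/maintainer" symbolic-ref HEAD refs/heads/main
echo "v1" > "$work/maintainer/README"
git -C "$work/maintainer" add README
git -C "$work/maintainer" commit --quiet -m "Initial commit"
git clone --quiet --bare "$work/maintainer" "$work/central.git"
git -C "$work/maintainer" remote add origin "$work/central.git"

# "Forking": a server-side clone of the central repo.
git clone --quiet --bare "$work/central.git" "$work/fork.git"

# The developer clones their fork (it becomes "origin") and registers
# the central repo as a second remote, here called "upstream".
git clone --quiet "$work/fork.git" "$work/dev"
git -C "$work/dev" remote add upstream "$work/central.git"

# Meanwhile, the central repo moves ahead.
echo "v2" >> "$work/maintainer/README"
git -C "$work/maintainer" commit --quiet -am "Second commit"
git -C "$work/maintainer" push --quiet origin main

# Updating the fork: fetch from upstream, fast-forward the local branch,
# then push the result to the fork.
git -C "$work/dev" fetch --quiet upstream
git -C "$work/dev" merge --quiet --ff-only upstream/main
git -C "$work/dev" push --quiet origin main

# List the remotes the developer's local repo knows about.
git -C "$work/dev" remote
```

In day-to-day use only the last block (fetch, merge, push) is repeated. Running ``git remote -v`` inside the local repo is a quick way to check which URL each remote name points to before pushing.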