A detailed explanation of AGit-Flow, a new-generation, efficient Git collaboration model

Posted May 25, 2020 (11 min read)

[The following is a record of the talk, abridged.]

Git workflow overview and the advantages of AGit-Flow

Git has become the standard and the infrastructure for source code management. "Why is Git so successful?" In an interview marking Git's 10th anniversary, Linus Torvalds, the creator of Git, gave the answer:

The big thing about distributed source control is that it makes one of the main issues with SCMs go away: the politics around who can make changes.

He believes the key to Git's success is not that it is faster or safer, nor that it is distributed, but that it solves the problem of "who can contribute code". A traditional centralized version control system can grant write access only to core users, which hinders a project's long-term growth. Git cured the chronic inability of traditional version control systems to accept code from many developers, so that "read-only users" can also take part in project development through "code review".

The two most commonly used Git workflows in the industry today are GitHub and Gerrit. Both have a simple authorization model for the repository, and both let read-only users contribute code.

As shown in the figure above, I compared the advantages and disadvantages of these two Git workflows.

Code review model: GitHub's code review is called a "pull request", and each feature change produces one code review. Gerrit's code review is called a "Change", and each commit produces one "change", that is, one code review.

Workflow type: GitHub's workflow is distributed. A developer who wants to take part in a project but has no "write" permission can create a personal repository (a derived repository) with "Fork", then create branches and pull requests in that derived repository. Under the hood, GitHub uses native Git (that is, CGit).

Gerrit's workflow is centralized: all users work in one centralized repository under unified management and control. Gerrit requires users to install a "commit-msg" hook in the local clone so that a unique "Change-Id" is inserted into each new commit, and to push to the server with a special git push command. Gerrit uses JGit (a Java implementation of Git).

Respective advantages: GitHub is simple and easy to use, and code can be contributed with standard Git commands. The contributor has complete control over the derived repository and is not affected by the upstream project. Open source communities can form across projects, with developers collaborating worldwide, which is one of the reasons GitHub has become the world's largest open source community.

Because Gerrit uses a centralized workflow, the administrator can control the project strictly, deciding who can access the repository and who can contribute to it. Another advantage of Gerrit is multi-repository project management. The "Android" project has more than 1,000 repositories, all managed by Gerrit. It is hard to imagine managing Android's 1,000-plus repositories on GitHub.

Disadvantages: As mentioned above, it is difficult to manage Android-like multi-repository projects on GitHub. In addition, because GitHub works through derived repositories, data is stored redundantly on the server.

Gerrit needs centralized management and control: the administrator is responsible for creating projects, and ordinary users cannot create them. As a result, a Gerrit instance usually manages only one project or one organization's projects. It is difficult to reuse code between projects, and equally difficult to bring developers together across projects to form a developer community.

After studying Git workflows such as GitHub and Gerrit, we took the strengths of each and avoided their weaknesses to create Alibaba's own Git workflow, AGit-Flow.

At Alibaba, we like pull requests and CGit, we like a centralized workflow in which code reviews are created directly from the command line, and we like an open developer community. We do not like the "commit-msg" hook that ties a commit to a code review, and we do not like a decentralized code platform.

We have also developed a companion client tool, "git-repo", which works in a single repository and also supports Android-style collaboration across multiple repositories.

How we use AGit-Flow at Alibaba

Let me show you how we use AGit-Flow inside Alibaba.

We first clone the repository with standard Git commands, develop in the local repository, and create a commit. Running the git pr command in the workspace pushes the local commit to the server, and the server automatically creates a new code review, for example pull request #123. The team's code reviewers open code review number 123 and submit review comments. The developer continues working, adding or amending commits in the local workspace according to the review comments, then runs git pr again to push the local commits to the server. The server finds that a pull request from the same user and the same local branch already exists for the target branch, so this time it does not create a new pull request but updates the existing one.
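
The create-or-update decision described here can be sketched in a few lines of Python. This is only an illustration of the rule stated in the text (same user, same local branch, same target branch), not the platform's actual code; the ReviewServer and PullRequest names and the in-memory storage are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    number: int
    author: str
    local_branch: str
    target_branch: str
    commits: list = field(default_factory=list)


class ReviewServer:
    """Hypothetical in-memory stand-in for the code platform."""

    def __init__(self):
        self._by_key = {}
        self._next_number = 1

    def receive_push(self, author, local_branch, target_branch, commit):
        """Return (pull_request, created) for one pushed commit."""
        # A pull request is identified by who pushed, from which local
        # branch, and against which target branch.
        key = (author, local_branch, target_branch)
        pr = self._by_key.get(key)
        created = pr is None
        if created:
            pr = PullRequest(self._next_number, author, local_branch, target_branch)
            self._next_number += 1
            self._by_key[key] = pr
        # A repeated push lands on the existing pull request.
        pr.commits.append(commit)
        return pr, created
```

Pushing twice as the same user from the same local branch yields one pull request with two commits; a push from another user opens a separate pull request.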

If the code is still not acceptable after several rounds of changes, a code reviewer can also modify the code under review directly to help the original developer update the pull request. The reviewer uses git download 123 to download pull request number 123 into the local repository. After modifying the code, the reviewer runs git pr --change 123 to push the local changes to the server. The server receives this special git push command from the reviewer and updates the pull request the developer created earlier. The project manager merges the pull request into the master branch by clicking the merge button on the pull request review page. The master branch is updated and the pull request is closed.

With AGit-Flow there is no need to create new branches on the server or to grant write permission to new team members. All developers can be given read-only permission, and the trunk is updated by creating code reviews and then merging them. This is also the most popular trunk-based development model.

At present, you can use the AGit-Flow workflow through the Yunxiao code management platform (Codeup).

AGit-Flow implementation principle

How is AGit-Flow implemented? First, the client sends a push request to the server with a special git push command, which triggers the AGit-Flow workflow. Why is this git push command special? Because its target is a reference with the special prefix "refs/for/", followed by a further name segment. This segment identifies the local branch: code reviews submitted by different developers carry different segments, so they do not overwrite each other.
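
As a rough sketch, the special push target can be assembled like this. The layout refs/for/<target-branch>/<local-branch> is an assumption made for the example (the text above does not spell out the full layout), and the helper names are hypothetical.

```python
def review_refspec(target_branch, local_branch, prefix="refs/for"):
    """Build a refspec that pushes HEAD to a code review reference.

    Assumed layout: <prefix>/<target-branch>/<local-branch>.
    """
    for name, value in (("target_branch", target_branch),
                        ("local_branch", local_branch)):
        if not value or value.startswith("/") or value.endswith("/"):
            raise ValueError(f"invalid {name}: {value!r}")
    return f"HEAD:{prefix}/{target_branch}/{local_branch}"


def push_command(remote, target_branch, local_branch):
    """Return the argument list of the corresponding git push command."""
    return ["git", "push", remote, review_refspec(target_branch, local_branch)]
```

For example, push_command("origin", "master", "my-topic") produces the arguments for git push origin HEAD:refs/for/master/my-topic.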

We can also use "-o" to pass parameters, for example to specify who should review the code or which "issue" the code review is associated with. All of this can be done with the "git push" command itself, but we later found the resulting command rather complicated, so we wrapped it in a command line tool, git-repo. git-repo is now open source and free for everyone to use.
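
The sketch below shows how such a command line might be put together. The -o flag of git push is real; the option keys "reviewers" and "issue" and their value formats are assumptions for illustration and may not match what the platform actually accepts.

```python
def push_with_options(remote, refspec, options):
    """Build a git push command line with one -o flag per option."""
    argv = ["git", "push"]
    for key, value in options.items():
        option = f"{key}={value}"
        # Git sends each push option as a single line, so a newline or
        # NUL byte inside one cannot be represented.
        if "\n" in option or "\0" in option:
            raise ValueError(f"push option must be a single line: {option!r}")
        argv += ["-o", option]
    return argv + [remote, refspec]


# Hypothetical option keys and values, for illustration only.
command = push_with_options(
    "origin",
    "HEAD:refs/for/master/my-topic",
    {"reviewers": "alice,bob", "issue": "123"},
)
```

Even with two options the command is already long, which is the motivation for wrapping it in a tool.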

The push then reaches the server, which starts a "git-receive-pack" process. (We modified the front-end authorization module on the server so that it recognizes this special git push command and allows read-only users to push as well.) In the figure above I marked "git-receive-pack" with a star, because it is a special "git-receive-pack": when it finds that the target of the push is a special reference, it does not follow Git's original internal workflow but calls an "external hook" instead. The external hook can then carry out some interesting operations, such as creating a code review.

In March of this year (2020), we contributed this modified git-core to the Git community, where it is currently under review. A future version of Git will include the new feature: proc-receive. The feature has gone through 15 iterations, from the initial server-side extension to a complete solution that combines a server-side extension with a client-side protocol upgrade. By open sourcing this technology we help the Git ecosystem prosper, so that more people benefit from Alibaba's technology. Alibaba benefits too: our contribution is supported by the Git client, Git has adapted to our new way of working, and Git users, including those at Alibaba, get a better experience.

Technical details of AGit-Flow implementation

To explain the technical details of the AGit-Flow implementation, let's first look at how the git push command was originally implemented.

![05.jpg](https://ucc.alicdn.com/pic/developer-ecology/8552743fa0b24934bd9a93146ffedef4.jpg)

As shown on the left side of the figure above, a git push carries two kinds of information: the packaged data, called the "packfile", and the push instructions, called "commands". After both are pushed to the server, the "packfile" takes the path on the left and first enters the quarantine area. Once the "commands" have been checked by the "pre-receive" hook (the user's permissions, the commit messages, and the modified files are all OK), the "packfile" is released from quarantine and moved into the object database (objects). If the "pre-receive" hook script fails, the quarantine area is deleted and an error message is returned, which terminates the push.

Next, the "commands" are passed to the built-in "execute_commands()" function, which creates, updates, and deletes branches. The result is reported to the client through the "report()" function, and finally the "post-receive" hook script runs to send event notifications.

New hook, new ecosystem. AGit-Flow changed the source code of "git-receive-pack". The new process is shown in the following figure:

When the client runs the new git push command, the path taken by the "packfile" is unchanged, but we changed "git-receive-pack" and added a "filter" (the funnel in the figure). The filter divides the commands into two groups: standard Git commands (group1) and AGit-Flow special commands (group2). After both groups have been checked by the "pre-receive" hook, the ordinary commands on the left (group1) go through Git's built-in execute_commands function, which creates and updates references and branches. The special commands on the right (group2) are handed to a new external hook, "proc-receive", which creates a special code review reference such as "refs/pull/123/head"; that reference can then be downloaded to the local repository with a special Git command.
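
The dispatch described in this paragraph can be sketched as follows. This is a simplified Python model, not Git's C implementation: every callable passed in (is_special, pre_receive, execute_commands, proc_receive) is a hypothetical stand-in, and the result strings are invented for the example.

```python
def receive(commands, is_special, pre_receive, execute_commands, proc_receive):
    """Route pushed commands through the two paths.

    commands is a list of (old_id, new_id, ref_name) triples. The return
    value maps each reference name to a status string.
    """
    # Both groups are checked by pre-receive before anything is updated.
    if not pre_receive(commands):
        return {ref: "rejected by pre-receive" for _, _, ref in commands}

    group1 = [c for c in commands if not is_special(c[2])]  # standard commands
    group2 = [c for c in commands if is_special(c[2])]      # AGit-Flow commands

    results = {}
    if group2:
        results.update(proc_receive(group2))      # external hook
    if group1:
        results.update(execute_commands(group1))  # Git's built-in path
    return results
```

Keeping the filter as a separate predicate mirrors the description above: the built-in path is untouched, and only matching commands are diverted to the hook.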

The new feature we contributed to Git consists of three parts. The first part is the "filter", implemented by adding a new configuration variable, "receive.procReceiveRefs", on the server. Once this variable is defined, Git matches the commands of each incoming git push against it, and any command that matches goes through the special process. The variable is multi-valued. For example, the settings on the Alibaba code platform are:

  • git config --add receive.procReceiveRefs refs/for
  • git config --add receive.procReceiveRefs refs/drafts
  • git config --add receive.procReceiveRefs refs/for-review

These three values correspond to the three push modes of git pr: creating a standard pull request, creating a pull request in draft mode, and a code reviewer pushing to a specified pull request.
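
A minimal sketch of how a pushed reference could be matched against these prefixes is shown below. The mode labels paraphrase the description above, and the sample reference names used with it (such as refs/for-review/123) are assumptions for illustration. Matching only on a whole path component is a deliberate choice here, so that "refs/for" does not also capture "refs/for-review/...".

```python
# Prefixes taken from the configuration above; the labels are paraphrases.
PUSH_MODES = {
    "refs/for": "pull request",
    "refs/drafts": "draft pull request",
    "refs/for-review": "update of a specified pull request",
}


def push_mode(ref_name, modes=PUSH_MODES):
    """Return the push mode for ref_name, or None for an ordinary reference."""
    for prefix, mode in modes.items():
        # Match the prefix itself or the prefix followed by "/".
        if ref_name == prefix or ref_name.startswith(prefix + "/"):
            return mode
    return None
```

With this rule, refs/for/master/my-topic is treated as a standard pull request, refs/for-review/123 as a reviewer's update, and refs/heads/master is left to the normal Git path.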

The second part is the proc-receive hook. It is arguably the most complicated hook in Git's history, because it communicates with the server (git-receive-pack) in both directions. First, the server and the hook perform "version negotiation": we expect the protocol to be upgraded later, so to keep it backward compatible a version must be agreed first. The server tells the hook which version it speaks, and the hook tells the server which version it supports; from then on Git talks to the hook using the agreed protocol version. The server encodes commands in "pkt-line" format. Each command has three fields: the old ID, the new ID, and the name of the reference to update. The server sends the commands and parameters to the hook. The hook processes them, which the hook's developer implements through API calls, and then reports the result back to the server in a special format.
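
To make the framing concrete, here is a small Python sketch of pkt-line encoding applied to the version message and the three-field commands. The pkt-line framing itself (four hex digits of total length, "0000" as a flush packet) is standard Git; the helper names are hypothetical, and details such as capability lists and exact line endings are left out, so treat it as an illustration rather than a complete implementation of the hook protocol.

```python
FLUSH_PKT = b"0000"   # a pkt-line of length 0000 ends a section
MAX_PKT_LEN = 65520   # largest total length a single pkt-line may have


def pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its total length as four hex digits."""
    length = len(payload) + 4  # the length field counts its own 4 bytes
    if length > MAX_PKT_LEN:
        raise ValueError("payload too long for one pkt-line")
    return f"{length:04x}".encode("ascii") + payload


def version_message(version: int = 1) -> bytes:
    """One side's half of the version negotiation."""
    return pkt_line(f"version={version}".encode("ascii")) + FLUSH_PKT


def command_message(commands) -> bytes:
    """Encode (old_id, new_id, ref_name) triples, one pkt-line each."""
    lines = [pkt_line(f"{old} {new} {ref}".encode("utf-8"))
             for old, new, ref in commands]
    return b"".join(lines) + FLUSH_PKT
```

For instance, pkt_line(b"version=1") yields b"000dversion=1": nine payload bytes plus the four length bytes give 13, or 000d in hexadecimal.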

The third part is the customizable status report to the client. In the right part of the figure above there is a dotted line from the "proc-receive" hook to the "execute_commands" function. It means that commands the hook cannot process are handed back to "execute_commands" for processing. Once "execute_commands" and the "proc-receive" hook have processed all commands, the "report()" step is carried out for all of them together. Our "report()" has an advantage over Gerrit's: it lets the client know that what was actually changed is not a "refs/for" reference but another reference. We also considered backward compatibility, so that an old version of Git can still receive the report when it talks to a new Git server. An old Git client will think you created "refs/for/master"; a new Git client understands the extended protocol and can tell that what you created is not "refs/for/master" but a special reference such as "refs/pull/123".
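
The backward-compatible report can be pictured with a short sketch. The line shapes "ok <ref>" and "option refname <ref>" are assumed here to follow Git's extended status report and should be checked against the Git documentation; the function name and the boolean flag are hypothetical.

```python
def format_report(pushed_ref, actual_ref, client_understands_extension):
    """Return the status lines for one successfully processed command."""
    # Every client is told that the reference it pushed to succeeded.
    lines = [f"ok {pushed_ref}"]
    # Only a client that understands the extension also learns which
    # reference was really created or updated.
    if client_understands_extension and actual_ref != pushed_ref:
        lines.append(f"option refname {actual_ref}")
    return lines
```

An old client therefore sees only the "refs/for/..." reference it pushed to, while a new client can additionally display the pull request reference.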

Introduction to Alibaba's open source client tool git-repo

git-repo is a command line tool for centralized workflows, open sourced by Alibaba. It wraps native Git commands and simplifies the slightly cumbersome Git commands needed in centralized workflows such as AGit-Flow. git-repo is written in Go and depends on no software other than Git at runtime, so it works out of the box.

git-repo has good compatibility: it supports AGit-Flow compatible code platforms as well as Gerrit. It is cross-platform and currently supports Linux, macOS, and Windows; the Windows version is in beta. Besides offering the multi-repository management capabilities of Android's repo tool, it can also operate on a single code repository.

How do you download and install git-repo? git-repo is open source on GitHub; visit the https://github.com/alibaba/git-repo-go/releases page to download the appropriate package. Then move the unpacked git-repo file to a directory on the executable path (such as /usr/local/bin on Linux) to complete the installation. Detailed usage is not repeated here; see the git-repo home page.

To sum up:

AGit-Flow is a centralized Git collaboration protocol developed by Alibaba. It has taken root within Alibaba Group and is available externally through the Yunxiao code management platform.

The core components of AGit-Flow have been open sourced and will become the core components of Git.

To make things easier for developers, we developed the git-repo command line tool. git-repo is written in Go, works out of the box, and is cross-platform, compatible, and extensible. git-repo is open source on GitHub; anyone can visit the project home page and download it for free.

The content above is compiled from the video talk "AGit-Flow: A New Generation of High-efficiency Git Collaboration Model". You are welcome to join the Yunxiao developer exchange group (DingTalk group number: 34532418) to watch the video replay and download the slides.

About Yunxiao

Yunxiao is an enterprise-level, one-stop DevOps platform that grew out of Alibaba's R&D concepts and engineering practices and aims to be the R&D efficiency engine for digital enterprises. Yunxiao provides end-to-end online collaboration services and R&D tools covering "requirements -> development -> testing -> release -> operation and maintenance -> operations", and applies artificial intelligence and cloud native technologies to help developers improve R&D efficiency and deliver effective value.

  • Yunxiao official website: https://www.aliyun.com/product/yunxiao?channel=zhibo
  • Public beta guide: https://developer.aliyun.com/article/756207
  • Apply for the public beta: https://devops.aliyun.com/
  • Learning path: https://help.aliyun.com/document_detail/153739.html
  • Developer community: https://developer.aliyun.com/group/yunxiao
  • Event: the Yunxiao public beta is recruiting "Product Experience Officers": https://www.aliyun.com/activity/yunxiao/Beta2020
