[ZEPPELIN-6089][INFRA] Improve the merge PR script #4831

Merged
merged 6 commits into from
Sep 21, 2024

Conversation

pan3793
Member

@pan3793 pan3793 commented Sep 13, 2024

What is this PR for?

Zeppelin has dev/merge_zeppelin_pr.py, which was borrowed from Spark. I recommend that committers use this script rather than the GitHub button to merge PRs, because it has several benefits:

  1. Simplify the backport process

After you merge a PR to master, the tool asks whether you want to backport the commit to older maintained branches. If there are no conflicts, all you need to do is type the name of the branch you want to backport to.

  2. Automatically update JIRA information

The script uses the Python jira client to update the JIRA ticket. For example, it automatically closes the ticket after the PR is merged and fills in the fixed versions, which helps users know which versions contain a feature or bug fix.

  3. Better PR title, body, and "Signed-off-by" info

Before
image

After
image

This PR syncs the script with the Spark upstream (around 4.0.0-preview2), which has gained several improvements recently, e.g. support for using tokens instead of passwords for GitHub and JIRA authentication. Additionally, this PR switches to the GitHub open API that @jongyoul suggested for merging the PR, which changes the status of a merged PR from "Closed" to "Merged".

What type of PR is it?

Improvement

Todos

  • verify this script by merging at least 3 PRs

What is the Jira issue?

ZEPPELIN-6089

How should this be tested?

Test manually. This currently does not work due to permission issues.

$ dev/merge_zeppelin_pr.py 
git rev-parse --abbrev-ref HEAD
Which pull request would you like to merge? (e.g. 34): 4837

=== Pull Request #4837 ===
title   [MINOR] Remove duplicate entry in .gitignore
source  MyLanPangzi/patch-2
target  master
url     https://api.github.com/repos/apache/zeppelin/pulls/4837
Proceed with merging pull request #4837? (y/N): y
git config --get user.name
git config --get user.email
git fetch apache master
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 2), reused 0 (delta 0), pack-reused 0 (from 0)
Unpacking objects: 100% (3/3), 1.53 KiB | 260.00 KiB/s, done.
From github.com:apache/zeppelin
 * branch                master     -> FETCH_HEAD
   ad79848a9..35e129912  master     -> apache/master
Pull request #4837 merged!
Merge hash: 35e12991

Would you like to pick 35e12991 into another branch? (y/N): y
Enter a branch name [branch-0.9]: branch-0.11
git fetch apache branch-0.11:PR_TOOL_PICK_PR_4837_BRANCH-0.11
From github.com:apache/zeppelin
 * [new branch]          branch-0.11 -> PR_TOOL_PICK_PR_4837_BRANCH-0.11
git checkout PR_TOOL_PICK_PR_4837_BRANCH-0.11
Switched to branch 'PR_TOOL_PICK_PR_4837_BRANCH-0.11'
git cherry-pick -sx 35e12991
Pick complete (local ref PR_TOOL_PICK_PR_4837_BRANCH-0.11). Push to apache? (y/N): y
git push apache PR_TOOL_PICK_PR_4837_BRANCH-0.11:branch-0.11
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 10 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.60 KiB | 1.60 MiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
remote: 
remote: GitHub found 199 vulnerabilities on apache/zeppelin's default branch (19 critical, 70 high, 87 moderate, 23 low). To find out more, visit:
remote:      https://github.com/apache/zeppelin/security/dependabot
remote: 
To github.com:apache/zeppelin.git
   7128f7da4..a04da2e09  PR_TOOL_PICK_PR_4837_BRANCH-0.11 -> branch-0.11
git rev-parse PR_TOOL_PICK_PR_4837_BRANCH-0.11
Restoring head pointer to ZEPPELIN-6089
git checkout ZEPPELIN-6089
Switched to branch 'ZEPPELIN-6089'
git branch
Deleting local branch PR_TOOL_PICK_PR_4837_BRANCH-0.11
git branch -D PR_TOOL_PICK_PR_4837_BRANCH-0.11
Pull request #4837 picked into branch-0.11!
Pick hash: a04da2e0

Would you like to pick 35e12991 into another branch? (y/N): n
Would you like to update an associated JIRA? (y/N): n
Okay, exiting
Restoring head pointer to ZEPPELIN-6089
git checkout ZEPPELIN-6089
Already on 'ZEPPELIN-6089'
git branch
Restoring head pointer to ZEPPELIN-6089
git checkout ZEPPELIN-6089
Already on 'ZEPPELIN-6089'
git branch

Screenshots (if appropriate)

Questions:

  • Do the license files need to be updated? No.
  • Are there breaking changes for older versions? No.
  • Does this need documentation? No.

@pan3793
Member Author

pan3793 commented Sep 13, 2024

to enable this capability, I need to revert #4452
@jongyoul @Reamer @zjffdu what do you think about it?

# will be unauthenticated. You should only need to configure this if you find yourself regularly
# exceeding your IP's unauthenticated request rate limit. You can create an OAuth key at
# https://github.com/settings/tokens. This script only requires the "public_repo" scope.
GITHUB_OAUTH_KEY = os.environ.get("GITHUB_OAUTH_KEY")
Member Author

both JIRA and GITHUB encourage using TOKEN instead of PASSWORD for security purposes.
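
A minimal sketch of token-based authentication for GitHub requests, assuming the token is read from the `GITHUB_OAUTH_KEY` environment variable shown in the excerpt above. The helper name `github_headers` is hypothetical and is not taken from the merge script.

```python
import os
from typing import Dict, Optional


def github_headers(token: Optional[str]) -> Dict[str, str]:
    """Build request headers, adding authentication only when a token is set.

    Hypothetical helper for illustration; not part of the merge script.
    """
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # GitHub accepts a personal access token in the Authorization header.
        headers["Authorization"] = f"token {token}"
    return headers


# Same environment variable name as in the excerpt above; requests stay
# unauthenticated when it is not set.
GITHUB_OAUTH_KEY = os.environ.get("GITHUB_OAUTH_KEY")
headers = github_headers(GITHUB_OAUTH_KEY)
```

Keeping the token in the environment means it never has to be typed at a prompt or stored in the script itself.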



def grant_contributor_role(user: str):
    role = asf_jira.project_role("ZEPPELIN", 10010)
Member Author

@pan3793 pan3793 Sep 13, 2024

not sure if this is correct, I don't have the Zeppelin project JIRA admin permission

@jongyoul
Member

IIRC, this script closes PRs instead of merging them. Can we use github API to merge the PR? It's the reason why I prefer to use github's merge button.

to enable this capability, I need to revert #4452

If the script works properly, I don't care to revert it :-)

@Reamer
Contributor

Reamer commented Sep 17, 2024

It is more important to me that we have a reasonable GitHub history than the history on the mirrored Apache Git instance.
The old merge script marked the pull request as closed, which doesn't reflect the history.

In my opinion, most developers are working on the Zeppelin code via GitHub. Where did you get the screenshots?

If we get both, that's fine with me, but I don't think the Squash&Merge feature should be disabled for that.

Working with a merge script is fine with me. As you wrote, this can simplify many things.

@pan3793
Member Author

pan3793 commented Sep 17, 2024

... this script closes PRs instead of merging them.
... merge script marked the pull request as closed, which doesn't reflect the history.

Yes, this is a drawback. If you look at Spark's PRs, the committer who merged the PR will clarify that by commenting something like "Merged to master/3.5".

Where did you get the screenshots?

A git GUI client, https://fork.dev/

... I don't think the Squash&Merge feature should be disabled for that.

Okay, and I'm not against using this approach, as long as the committer does not forget to do other things manually.

@jongyoul
Member

I have a new idea. How about using github action more? E.g. we can trigger it by approving it, then change the commit message with squash and amend the commit. WDYT?

@pan3793
Member Author

pan3793 commented Sep 19, 2024

@jongyoul GitHub Actions can not solve backport and update JIRA information issues.

@pan3793
Member Author

pan3793 commented Sep 19, 2024

Additionally, I think it's hard to write a rule to evaluate whether the PR is ready for merge.

Even if the PR receives sufficient approval, it may still have unaddressed comments or concerns from other reviewers. Committing code to the codebase is serious, and I think we should judge the check-in condition manually (either by clicking a button or by using a script).

@jongyoul
Member

jongyoul commented Sep 19, 2024

@jongyoul GitHub Actions can not solve backport and update JIRA information issues.

I mean using the logic implemented in our current script.

For cherry-picking, we can create a workflow like

name: Cherry-pick on Comment

on:
  issue_comment:
    types: [created]

jobs:
  cherry-pick-on-comment:
    runs-on: ubuntu-latest
    if: contains(github.event.comment.body, '/cherry-pick')
    steps:
      - name: Checkout the repository
        uses: actions/checkout@v3

      - name: Parse cherry-pick branch from comment
        id: parse_branch
        env:
          # Pass the untrusted comment text through the environment instead of
          # interpolating it into the script.
          COMMENT_BODY: ${{ github.event.comment.body }}
        run: |
          if [[ "$COMMENT_BODY" =~ /cherry-pick\ ([^\ ]+) ]]; then
            echo "branch=${BASH_REMATCH[1]}" >> "$GITHUB_OUTPUT"
          else
            echo "Cherry-pick command not found"
            exit 1
          fi

      - name: Set up Git
        run: |
          git config user.name "Your Name"
          git config user.email "[email protected]"

      - name: Fetch and cherry-pick the commit
        run: |
          TARGET_BRANCH=${{ steps.parse_branch.outputs.branch }}
          git fetch origin $TARGET_BRANCH
          git checkout $TARGET_BRANCH
          git cherry-pick ${{ github.event.issue.pull_request.merge_commit_sha }}
          git push origin $TARGET_BRANCH

It would be triggered by a comment such as /cherry-pick branch-0.11

For JIRA Update,

name: Close JIRA Ticket on Merge

on:
  pull_request:
    types:
      - closed

jobs:
  close-jira-ticket:
    runs-on: ubuntu-latest
    if: github.event.pull_request.merged == true
    steps:
      - name: Checkout the repository
        uses: actions/checkout@v3

      - name: Extract ticket number from PR title
        id: extract_ticket
        env:
          # Pass the untrusted PR title through the environment instead of
          # interpolating it into the script.
          PR_TITLE: ${{ github.event.pull_request.title }}
        run: |
          if [[ "$PR_TITLE" =~ ([A-Z]+-[0-9]+) ]]; then
            echo "ticket=${BASH_REMATCH[1]}" >> "$GITHUB_OUTPUT"
          else
            echo "No JIRA ticket number found in the title"
            exit 1
          fi

      - name: Update JIRA ticket status to Done
        run: |
          TICKET_NUMBER=${{ steps.extract_ticket.outputs.ticket }}
          # Issue transitions are applied with a POST request.
          curl -X POST -H "Authorization: Basic ${{ secrets.JIRA_AUTH }}" \
          -H "Content-Type: application/json" \
          --data '{"transition": {"id": "31"}}' \
          "https://your-jira-domain.atlassian.net/rest/api/3/issue/$TICKET_NUMBER/transitions"

It will update JIRA ticket as well.
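
For comparison, the ticket-extraction step can be sketched in Python, the language of the merge script. This is only an illustration using the same regular expression as the workflow above; the function name `extract_jira_id` is hypothetical.

```python
import re
from typing import Optional

# Same pattern as the workflow step: an upper-case project key, a hyphen,
# and a ticket number.
JIRA_ID_PATTERN = re.compile(r"[A-Z]+-[0-9]+")


def extract_jira_id(pr_title: str) -> Optional[str]:
    """Return the first JIRA-style ID in the PR title, or None if absent."""
    match = JIRA_ID_PATTERN.search(pr_title)
    return match.group(0) if match else None


print(extract_jira_id("[ZEPPELIN-6089][INFRA] Improve the merge PR script"))
print(extract_jira_id("[MINOR] Remove duplicate entry in .gitignore"))
```

Titles without a ticket, such as `[MINOR]` PRs, yield `None`, so the caller can skip the JIRA update instead of failing.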

FYI, this code was generated by GPT and would need some modification. I just want to give you an idea.

I use a similar approach to manage other code.

@jongyoul
Member

I also don't want commits to be merged automatically. After review, the commits would only be squashed and force-pushed to the same PR branch. (We should check whether this works with contributors' branches.)

@pan3793
Member Author

pan3793 commented Sep 19, 2024

@jongyoul the script is interactive. It asks the committer to enter the backport branch and performs a local cherry-pick before pushing to the repo. If there are conflicts, the committer can resolve them manually and continue the process. In some cases the cherry-picked code has no line-level conflicts but breaks compilation, and the committer still has a chance to compile locally before pushing. As for updating JIRA: if we use GitHub Actions, how do we specify the fixed version? By a comment with special keywords? That is another burden.
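
The backport flow can be sketched as a list of git commands, mirroring the ones visible in the test log in the PR description. This is a simplified illustration, not the script's actual code; the function name `backport_commands` and its signature are assumptions.

```python
from typing import List


def backport_commands(
    pr_num: int, merge_hash: str, branch: str, remote: str = "apache"
) -> List[List[str]]:
    """Return the git commands for one backport, in execution order.

    Hypothetical helper; the temporary ref name follows the pattern seen in
    the log (e.g. PR_TOOL_PICK_PR_4837_BRANCH-0.11).
    """
    pick_ref = f"PR_TOOL_PICK_PR_{pr_num}_{branch.upper()}"
    return [
        ["git", "fetch", remote, f"{branch}:{pick_ref}"],
        ["git", "checkout", pick_ref],
        # -s adds a Signed-off-by trailer, -x records the original commit hash.
        ["git", "cherry-pick", "-sx", merge_hash],
        # In the interactive script the committer confirms before this push,
        # which leaves room to resolve conflicts or compile locally first.
        ["git", "push", remote, f"{pick_ref}:{branch}"],
    ]


for cmd in backport_commands(4837, "35e12991", "branch-0.11"):
    print(" ".join(cmd))
```

The pause between the cherry-pick and the push is the point where manual judgment comes in, which is hard to reproduce in a fully automated workflow.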

@jongyoul
Member

jongyoul commented Sep 19, 2024

Yes, right. We would need another function to handle that. By the way, I agree with Reamer: we need to keep proper histories for Github because most interactions with contributors happen on GitHub. How a PR gets merged does not depend on contributors, but contributions are tied directly to contributors, so I feel we need to prioritize the contributors' experience. As long as that experience is preserved, whether we use the GitHub button or a script is not essential.

@pan3793
Member Author

pan3793 commented Sep 19, 2024

keep proper histories for Github

If you are talking about the PR status, yes, the status of a PR merged by the script is "Closed" instead of "Merged"; this can not be fixed due to the GitHub API limitation. I don't see contributors confused with that in the other projects that use a similar script for merging PRs, as long as we leave a proper note at the end of the PR, like "Merged to master/3.5".

PS: when I explore a PR with closed status, I usually scroll down to the end to find the close reason.

@jongyoul
Member

jongyoul commented Sep 19, 2024

this can not be fixed due to the GitHub API limitation.

We can fix it by squashing it locally in the script, pushing it to the PR, and calling Github API to merge the PR instead of committing and pushing the PR directly to the master.

I don't see contributors confused with that

When I was a contributor, I was really confused because I didn't know why my PR was closed instead of merged even though it existed in the master branch. That's why I insisted on using the GitHub merge button and tried to do the other work manually, such as closing the JIRA ticket with the proper fix version and cherry-picking commits myself.

@jongyoul
Member

jongyoul commented Sep 19, 2024

I found an API for merging PR with proper titles and messages. Can we investigate it?

We can set merge_method to squash and pass a title and message when calling it. That way we won't break GitHub's history and can still get the benefits the script gives us.
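
A minimal sketch of such a call, assuming GitHub's REST endpoint for merging a pull request (`PUT /repos/{owner}/{repo}/pulls/{pull_number}/merge`) and its `commit_title`, `commit_message`, and `merge_method` parameters. The function name `build_merge_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_merge_request(
    repo: str, pr_num: int, title: str, message: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a squash-merge request for a pull request.

    Hypothetical helper for illustration; not the script's actual code.
    """
    url = f"https://api.github.com/repos/{repo}/pulls/{pr_num}/merge"
    payload = {
        "commit_title": title,
        "commit_message": message,
        # Squash all PR commits into one commit on the target branch.
        "merge_method": "squash",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
        method="PUT",
    )


req = build_merge_request(
    "apache/zeppelin", 4837, "[MINOR] Example title", "Example body", "dummy-token"
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the merge, so a real script would do that only after the committer confirms.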

@pan3793
Member Author

pan3793 commented Sep 19, 2024

We can fix it by squashing it locally in the script, pushing it to the PR, and calling Github API to merge the PR instead of committing and pushing the PR directly to the master.

this involves another drawback https://github.com/orgs/community/discussions/32934

@jongyoul
Member

this involves another drawback https://github.com/orgs/community/discussions/32934

But it preserves the author. Could we allow it?

@pan3793
Member Author

pan3793 commented Sep 19, 2024

concerns are

it is the committing person who is responsible for the decision to merge. So it is them who should be recorded in Commit[ter], not GitHub.

I tend to agree with that.

@jongyoul
Member

Agreed. That's why we have the approve button. We can guarantee it by checking the PR's approval. If we find that a PR was merged without approval, we can revert it.

@pan3793
Member Author

pan3793 commented Sep 19, 2024

Though this violates the git design, the committer's name is recorded in the commit message body, so I'm OK with this approach. Let me try it.

Authored-by: Author Name <[email protected]>
Signed-off-by: Committer Name <[email protected]>
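
A small sketch of how a commit message with these trailers could be assembled. The function name `build_commit_message` and the example identities are hypothetical and only illustrate the format shown above.

```python
def build_commit_message(body: str, author: str, committer: str) -> str:
    """Append Authored-by and Signed-off-by trailers to a commit body.

    Hypothetical helper; `author` and `committer` are "Name <email>" strings.
    """
    trailers = [
        f"Authored-by: {author}",
        f"Signed-off-by: {committer}",
    ]
    # A blank line separates the body from the trailer block.
    return body.rstrip() + "\n\n" + "\n".join(trailers)


print(
    build_commit_message(
        "Remove a duplicate entry.\n",
        "Author Name <author@example.com>",
        "Committer Name <committer@example.com>",
    )
)
```

Because the GitHub merge API records GitHub as the git committer, the Signed-off-by trailer is what identifies the person who decided to merge.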

@pan3793
Member Author

pan3793 commented Sep 19, 2024

@jongyoul I have updated the script to use the GitHub open API you suggested to merge the PR, and have merged 4 PRs 876d1dc 81783dd ad79848 35e1299 to test the cases:

  • fail-fast if the PR has no approval
  • commit message contains PR description and "Signed-off-by" info
  • auto backport
  • auto update JIRA ticket
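
The fail-fast approval check can be sketched as follows. This assumes the input is the parsed JSON list returned by GitHub's "list reviews for a pull request" endpoint, in chronological order, where each entry has a `state` and a `user.login`. The function names are hypothetical and the logic is an illustration, not the script's actual implementation.

```python
from typing import Dict, Iterable, List


def approvers(reviews: Iterable[dict]) -> List[str]:
    """Return logins whose most recent decisive review is an approval."""
    latest: Dict[str, str] = {}
    for review in reviews:
        state = review["state"]
        # Plain comments do not change a reviewer's standing decision.
        if state in ("APPROVED", "CHANGES_REQUESTED", "DISMISSED"):
            latest[review["user"]["login"]] = state
    return sorted(login for login, state in latest.items() if state == "APPROVED")


def ensure_approved(reviews: Iterable[dict]) -> List[str]:
    """Raise before any merge step if nobody has approved the PR."""
    names = approvers(reviews)
    if not names:
        raise RuntimeError("PR has no approval; refusing to merge")
    return names


sample = [
    {"user": {"login": "reviewer-a"}, "state": "COMMENTED"},
    {"user": {"login": "reviewer-b"}, "state": "APPROVED"},
]
print(ensure_approved(sample))
```

Checking approval before any git or API call keeps a rejected merge from leaving partial state behind.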

@pan3793 pan3793 changed the title [WIP][ZEPPELIN-6089][INFRA] Improve the merge PR script [ZEPPELIN-6089][INFRA] Improve the merge PR script Sep 19, 2024
@pan3793
Member Author

pan3793 commented Sep 19, 2024

@Reamer your concerns are also addressed, now, the merged PR status is "Merged"

Member

@jongyoul jongyoul left a comment

LGTM

@pan3793 pan3793 merged commit e20cbbc into apache:master Sep 21, 2024
28 checks passed
@pan3793
Member Author

pan3793 commented Sep 21, 2024

Thanks, merged to master for 0.12

@pan3793 pan3793 deleted the ZEPPELIN-6089 branch September 29, 2024 06:46