How to clone all repos at once from GitHub?

Posted 2020-01-25 12:41

I have a company GitHub account and I want to back up all of the repositories in it, automatically picking up anything new that gets created. I was hoping something like this:

git clone git@github.com:company/*.git 

or similar would work, but it doesn't seem to like the wildcard there.

Is there a way in Git to clone and then pull everything assuming one has the appropriate permissions?

26 answers
小情绪 Triste *
#2 · 2020-01-25 12:51

You can use an open-source tool to clone a bunch of GitHub repositories: https://github.com/artiomn/git_cloner

Example:

git_cloner --type github --owner octocat --login user --password user https://my_bitbucket

Use the JSON API from api.github.com. You can see a code example in the GitHub documentation: https://developer.github.com/v3/

Or here:

https://github.com/artiomn/git_cloner/blob/master/src/git_cloner/github.py
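For reference, a minimal sketch of listing an organization's repositories through that JSON API (Python with the requests package; the organization name is a placeholder, and an unauthenticated request only sees public repos):

import requests

ORG = "company"  # placeholder organization name

# One page of up to 100 repos; private repos require authentication.
resp = requests.get(f"https://api.github.com/orgs/{ORG}/repos?per_page=100")
resp.raise_for_status()
for repo in resp.json():
    print(repo["clone_url"])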

祖国的老花朵
#3 · 2020-01-25 12:56

I found a comment in the gist @seancdavis provided to be very helpful, especially because, like the original poster, I wanted to sync all the repos for quick access, and the vast majority of them were private.

curl -u [[USERNAME]] -s https://api.github.com/orgs/[[ORGANIZATION]]/repos?per_page=200 |
  ruby -rubygems -e 'require "json"; JSON.load(STDIN.read).each { |repo| %x[git clone #{repo["ssh_url"]} ]}'

Replace [[USERNAME]] with your GitHub username and [[ORGANIZATION]] with your GitHub organization. (Note that the API caps per_page at 100, so organizations with more repos than that need pagination, as in the Python answer below.) The output (JSON repo metadata) is passed to a simple Ruby script:

# bring in the Ruby json library
require "json"

# read from STDIN, parse into ruby Hash and iterate over each repo
JSON.load(STDIN.read).each do |repo|
  # run a system command (re: "%x") of the style "git clone <ssh_url>"
  %x[git clone #{repo["ssh_url"]} ]
end
地球回转人心会变
#4 · 2020-01-25 12:56
curl -s https://api.github.com/orgs/[GITHUBORG_NAME]/repos \
  | grep clone_url \
  | awk -F '":' '{ print $2 }' \
  | sed 's/"//g' | sed 's/,//' \
  | while read line; do git clone "$line"; done
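The grep/sed parsing above is brittle if GitHub changes its JSON formatting; a sketch of the same idea with a real JSON parser (Python, assuming the requests package; [GITHUBORG_NAME] is still a placeholder):

import subprocess

import requests

resp = requests.get("https://api.github.com/orgs/[GITHUBORG_NAME]/repos")
resp.raise_for_status()
# Clone each repository from the parsed JSON instead of scraping the text.
for repo in resp.json():
    subprocess.run(["git", "clone", repo["clone_url"]], check=True)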
太酷不给撩
#5 · 2020-01-25 12:57

I tried a few of the commands and tools above, but decided they were too much of a hassle, so I wrote another command-line tool to do this, called github-dl.

To use it (assuming you have Node.js installed):

npx github-dl -d /tmp/test wires

This gets a list of all the repos for wires and writes the info into the test directory, using the authorisation details (user/pass) you provide on the CLI.

In detail, it:

  1. Asks for auth (supports 2FA)
  2. Gets the list of repos for the user/org through the GitHub API
  3. Paginates through the results, so more than 100 repos are supported

It does not actually clone the repos; instead it writes a .txt file that you can pass to xargs to do the cloning, for example:

cd /tmp/test
cat wires-repo-urls.txt | xargs -n2 git clone

# or to pull
cat /tmp/test/wires-repo-urls.txt | xargs -n2 git pull

Maybe this is useful for you; it's just a few lines of JS, so it should be easy to adjust to your needs.
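On the original question's clone-then-pull requirement, a minimal clone-or-update sketch (Python; assumes git is on your PATH, and repo_urls is a hypothetical list of clone URLs, e.g. read from the .txt file above):

import os
import subprocess

# Hypothetical list of clone URLs; adapt to however you read the .txt file.
repo_urls = ["git@github.com:wires/example.git"]

for url in repo_urls:
    # Derive the checkout directory name from the URL.
    name = url.rstrip("/").split("/")[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    if os.path.isdir(os.path.join(name, ".git")):
        subprocess.run(["git", "-C", name, "pull"], check=True)  # update existing clone
    else:
        subprocess.run(["git", "clone", url], check=True)  # fresh clone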

够拽才男人
#6 · 2020-01-25 12:58

So, I'll add my answer too. :) (I find it simple.)

Fetch the list (I've used the "magento" account as an example):

curl -si https://api.github.com/users/magento/repos | grep ssh_url | cut -d '"' -f4

Use clone_url instead of ssh_url for HTTPS access.

So, let's clone them all! :)

curl -si https://api.github.com/users/magento/repos | \
    grep ssh_url | cut -d '"' -f4 | xargs -i git clone {}

If you are going to fetch private repos, just add the GET parameter ?access_token=YOURTOKEN
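Note that GitHub has since deprecated access_token as a query parameter; the same token can be sent in the Authorization header instead. A sketch with Python's requests package (the token is a placeholder):

import requests

TOKEN = "YOURTOKEN"  # placeholder personal access token

resp = requests.get(
    "https://api.github.com/users/magento/repos",
    headers={"Authorization": f"token {TOKEN}"},
)
resp.raise_for_status()
for repo in resp.json():
    print(repo["ssh_url"])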

做个烂人
#7 · 2020-01-25 12:58

A Python 3 solution that includes exhaustive pagination via the Link header.

Pre-requisites: the requests and links_from_header packages.

import json
import requests
from requests.auth import HTTPBasicAuth
import links_from_header

# Authenticated GET helper; fill in your own GitHub username and token.
respget = lambda url: requests.get(url, auth=HTTPBasicAuth('githubusername', 'githubtoken'))

myorgname = 'abc'
nexturl = f"https://api.github.com/orgs/{myorgname}/repos?per_page=100"

# Follow the 'next' relation in the Link header until the last page.
while nexturl:
    print(nexturl)
    resp = respget(nexturl)

    linkheads = resp.headers.get('Link', None)
    if linkheads:
        linkheads_parsed = links_from_header.extract(linkheads)
        nexturl = linkheads_parsed.get('next', None)
    else:
        nexturl = None

    # Append each repo's full_name ("org/repo") to the repolist file.
    respcon = json.loads(resp.content)
    with open('repolist', 'a') as fh:
        fh.writelines([f'{respconi["full_name"]}\n' for respconi in respcon])

Then, you can use xargs or parallel: cat repolist | parallel -I% hub clone %
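If you don't have hub installed, a sketch that consumes the same repolist file and clones each full_name over SSH (assuming the file written by the loop above):

import subprocess

with open("repolist") as fh:
    for line in fh:
        full_name = line.strip()  # e.g. "org/repo"
        if full_name:
            subprocess.run(
                ["git", "clone", f"git@github.com:{full_name}.git"],
                check=True,
            )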
