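It often happens that you are working on a project and, after a while, it becomes clear that some component of it would be useful as a standalone component (a library, perhaps). If you had that in mind from early on, there is a fair chance most of that code already lives in its own folder.
Is there a way to convert one of the subdirectories in a Git project into a submodule?
Ideally, all of the code in that directory would be removed from the parent project and the submodule project added in its place, with all the appropriate history, such that every parent project commit points to the correct submodule commit.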
To isolate a subdirectory into its own repository, use filter-branch on a clone of the original repository:
git clone <your_project> <your_submodule>
cd <your_submodule>
git filter-branch --subdirectory-filter 'path/to/your/submodule' --prune-empty -- --all
Then it is just a matter of deleting the original directory and adding the submodule to your parent project.
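To see what the subdirectory filter does before running it on a real project, here is a minimal sketch on a throwaway repository. The names `project`, `lib-only`, `lib` and `app` are assumptions made up for the demo, and `FILTER_BRANCH_SQUELCH_WARNING` is set only to silence the deprecation notice that newer git versions print.

```shell
#!/bin/sh
# Throwaway demo: all repository and folder names here are hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"

# Two commits touch lib/, one touches only app/.
mkdir lib app
echo one > lib/a.txt
git add . && git commit -q -m 'lib: add a'
echo app > app/main.txt
git add . && git commit -q -m 'app: add main'
echo two > lib/b.txt
git add . && git commit -q -m 'lib: add b'

# Work on a clone so the original project stays untouched.
git clone -q "$work/project" "$work/lib-only"
cd "$work/lib-only"
git filter-branch --subdirectory-filter lib --prune-empty -- --all >/dev/null 2>&1

# The contents of lib/ now sit at the repository root,
# and the commit that only touched app/ is no longer in the history.
extracted_commits=$(git rev-list --count HEAD)
extracted_files=$(git ls-files | tr '\n' ' ')
echo "commits: $extracted_commits, files: $extracted_files"
```

Running the filter on a clone, as the answer suggests, means a mistake costs nothing more than deleting the clone.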
First change directory to the folder that will become the submodule. Then:
git init
git remote add origin <repourl>
git add .
git commit -am 'first commit in submodule'
git push -u origin master
cd ..
rm -rf <folder> # the folder which will be a submodule
git commit -am 'deleting folder'
git submodule add <repourl> <folder> # add the submodule
git commit -am 'adding submodule'
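The same sequence can be rehearsed end to end on a throwaway setup. This is only a sketch under assumptions: `parent`, `lib` and `lib.git` are made-up names, a local bare repository stands in for `<repourl>`, and `protocol.file.allow=always` is passed only because recent git versions refuse local-path submodule clones by default.

```shell
#!/bin/sh
# Throwaway demo: "parent", "lib" and "lib.git" are hypothetical names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# A bare repository stands in for <repourl>.
git init -q --bare "$work/lib.git"
target_branch=$(git -C "$work/lib.git" symbolic-ref --short HEAD)

# Parent project with a plain folder lib/.
git init -q "$work/parent"
cd "$work/parent"
mkdir lib
echo hello > lib/util.txt
git add . && git commit -q -m 'parent with plain lib folder'

# Turn lib/ into its own repository and push it.
cd lib
git init -q
git add . && git commit -q -m 'first commit in submodule'
git push -q "$work/lib.git" "HEAD:refs/heads/$target_branch"
cd ..

# Remove the folder, then add it back as a submodule.
rm -rf lib
git commit -q -am 'deleting folder'
# protocol.file.allow is needed only because this demo uses a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/lib.git" lib
git commit -q -m 'adding submodule'

# Mode 160000 marks a gitlink (submodule) entry in the index.
lib_mode=$(git ls-files -s lib | cut -d' ' -f1)
echo "lib is stored with mode $lib_mode"
```

Note that this recipe starts the submodule with a single fresh commit, so the folder's earlier history stays in the parent project only.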
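I know this is an old thread, but the answers here squash any related commits in other branches.
A simple way to clone and keep all those extra branches and commits:
1 - Make sure you have this git alias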
git config --global alias.clone-branches '! git branch -a | sed -n "/\/HEAD /d; /\/master$/d; /remotes/p;" | xargs -L1 git checkout -t'
2 - Clone the remote, pull all the branches, change the remote, filter your directory, push
git clone git@github.com:user/existing-repo.git new-repo
cd new-repo
git clone-branches
git remote rm origin
git remote add origin git@github.com:user/new-repo.git
git remote -v
git filter-branch --subdirectory-filter my_directory/ -- --all
git push --all
git push --tags
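If you would rather not install an alias, the branch-tracking step can also be written as a plain loop. The sketch below is an alternative to the alias, not a copy of it: it uses `git for-each-ref` instead of parsing `git branch -a`, and the names `upstream`, `copy` and `feature` are invented for the demo.

```shell
#!/bin/sh
# Throwaway demo: "upstream", "copy" and "feature" are hypothetical names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
echo base > file.txt
git add . && git commit -q -m 'base'
git branch feature   # a second branch that a plain clone does not check out

git clone -q "$work/upstream" "$work/copy"
cd "$work/copy"

# Create a local tracking branch for every remote branch that lacks one.
git for-each-ref --format='%(refname)' refs/remotes/origin |
while read -r ref; do
    name=${ref#refs/remotes/origin/}
    if [ "$name" = HEAD ]; then
        continue
    fi
    if ! git show-ref --verify --quiet "refs/heads/$name"; then
        git branch -q --track "$name" "origin/$name"
    fi
done

local_branches=$(git for-each-ref --format='%(refname:short)' refs/heads | tr '\n' ' ')
echo "local branches: $local_branches"
```

With local branches in place, `-- --all` in the filter step and `git push --all` afterwards cover every branch rather than just the default one.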
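Suppose we have a repository called repo-old which contains a subdirectory sub that we would like to convert into a submodule with its own repository, repo-sub.
It is further intended that the original repository repo-old be converted into a modified repository repo-new, in which every commit that touched the previously existing subdirectory sub now points to the corresponding commit of our extracted submodule repository repo-sub.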
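With the help of git filter-branch this can be achieved in a two-step process:
1 - Subdirectory extraction from repo-old to repo-sub (already mentioned in the accepted answer)
2 - Subdirectory replacement from repo-old to repo-new (with proper commit mapping)
Remarks: I know this question is old, and it has already been mentioned that git filter-branch is more or less deprecated and can be dangerous. On the other hand, this may still help others with personal repositories that are easy to validate after conversion. So be warned! And please let me know if there is another tool that does the same thing without being deprecated and is safe to use.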
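I will explain how I carried out both steps on Linux with git version 2.26.2. Older versions may work to some extent, but that needs to be tested.
For simplicity I restrict myself to the case where the original repository repo-old has just a master branch and an origin remote. Also be warned that I rely on temporary git tags with the prefix temp_, which are removed again during the process. So if tags with similar names already exist, you may need to adjust the prefix below. Finally, note that I have not tested this extensively and there may be corner cases where the recipe fails. So please back up everything before proceeding!
The following bash snippets can be concatenated into one big script, which should then be executed in the same folder where the repository repo-old lives. Copying and pasting everything directly into a command window is not recommended (even though I have tested that successfully)!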
# Root directory where repo-old lives
# and a temporary location for git filter-branch
root="$PWD"
temp='/dev/shm/tmp'
# The old repository and the subdirectory we'd like to extract
repo_old="$root/repo-old"
repo_old_directory='sub'
# The new submodule repository, its url
# and a hash map folder which will be populated
# and later used in the filter script below
repo_sub="$root/repo-sub"
repo_sub_url='https://github.com/somewhere/repo-sub.git'
repo_sub_hashmap="$root/repo-sub.map"
# The new modified repository, its url
# and a filter script which is created as heredoc below
repo_new="$root/repo-new"
repo_new_url='https://github.com/somewhere/repo-new.git'
repo_new_filter="$root/repo-new.sh"
# The index filter script which converts our subdirectory into a submodule
cat << EOF > "$repo_new_filter"
#!/bin/bash
# Submodule hash map function
sub ()
{
    local old_commit=\$(git rev-list -1 \$1 -- '$repo_old_directory')
    if [ ! -z "\$old_commit" ]
    then
        echo \$(cat "$repo_sub_hashmap/\$old_commit")
    fi
}
# Submodule config
SUB_COMMIT=\$(sub \$GIT_COMMIT)
SUB_DIR='$repo_old_directory'
SUB_URL='$repo_sub_url'
# Submodule replacement
if [ ! -z "\$SUB_COMMIT" ]
then
    touch '.gitmodules'
    git config --file='.gitmodules' "submodule.\$SUB_DIR.path" "\$SUB_DIR"
    git config --file='.gitmodules' "submodule.\$SUB_DIR.url" "\$SUB_URL"
    git config --file='.gitmodules' "submodule.\$SUB_DIR.branch" 'master'
    git add '.gitmodules'
    git rm --cached -qrf "\$SUB_DIR"
    git update-index --add --cacheinfo 160000 \$SUB_COMMIT "\$SUB_DIR"
fi
EOF
chmod +x "$repo_new_filter"
cd "$root"
# Create a new clone for our new submodule repo
git clone "$repo_old" "$repo_sub"
# Enter the new submodule repo
cd "$repo_sub"
# Remove the old origin remote
git remote remove origin
# Loop over all commits and create temporary tags
for commit in $(git rev-list --all)
do
    git tag "temp_$commit" $commit
done
# Extract the subdirectory and slice commits
mkdir -p "$temp"
git filter-branch --subdirectory-filter "$repo_old_directory" \
--tag-name-filter 'cat' \
--prune-empty --force -d "$temp" -- --all
# Populate hash map folder from our previously created tag names
mkdir -p "$repo_sub_hashmap"
for tag in $(git tag | grep "^temp_")
do
    old_commit=${tag#'temp_'}
    sub_commit=$(git rev-list -1 $tag)
    echo $sub_commit > "$repo_sub_hashmap/$old_commit"
done
# Delete the temporary tags again, discarding both stdout and stderr
git tag | grep "^temp_" | xargs -d '\n' git tag -d > /dev/null 2>&1
# Add the new url for this repository (and e.g. push)
git remote add origin "$repo_sub_url"
# git push -u origin master
cd "$root"
# Create a clone for our modified repo
git clone "$repo_old" "$repo_new"
# Enter the new modified repo
cd "$repo_new"
# Remove the old origin remote
git remote remove origin
# Replace the subdirectory and map all sliced submodule commits using
# the filter script from above
mkdir -p "$temp"
git filter-branch --index-filter "$repo_new_filter" \
--tag-name-filter 'cat' --force -d "$temp" -- --all
# Add the new url for this repository (and e.g. push)
git remote add origin "$repo_new_url"
# git push -u origin master
# Cleanup (commented for safety reasons)
# rm -rf "$repo_sub_hashmap"
# rm -f "$repo_new_filter"
Remark: If the newly created repository repo-new hangs during git submodule update --init, try re-cloning the repository recursively once instead:
cd "$root"
# Clone the new modified repo recursively
git clone --recursive "$repo_new" "$repo_new-tmp"
# Now use the newly cloned one
mv "$repo_new" "$repo_new-bak"
mv "$repo_new-tmp" "$repo_new"
# Cleanup (commented for safety reasons)
# rm -rf "$repo_new-bak"
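The core of the replacement step is the `git update-index --add --cacheinfo 160000 ...` call in the filter script, which records a commit id at a path instead of a tree. The following minimal sketch isolates that one mechanism on two throwaway repositories; `sub-repo`, `main-repo` and the path `sub` are hypothetical stand-ins, and no `.gitmodules` file is written here.

```shell
#!/bin/sh
# Throwaway demo: "sub-repo", "main-repo" and "sub" are hypothetical names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the extracted submodule repository.
git init -q "$work/sub-repo"
cd "$work/sub-repo"
echo lib > lib.txt
git add . && git commit -q -m 'sub: initial'
sub_commit=$(git rev-parse HEAD)

# Stand-in for the rewritten parent repository.
git init -q "$work/main-repo"
cd "$work/main-repo"
echo app > app.txt
git add . && git commit -q -m 'main: initial'

# Record the submodule commit at path "sub" without cloning anything.
# Mode 160000 tells git the entry is a commit (gitlink), not a blob or tree.
git update-index --add --cacheinfo 160000,"$sub_commit",sub
git commit -q -m 'main: pin sub to a commit'

# Read the entry back from the committed tree.
entry=$(git ls-tree HEAD sub)
entry_type=$(echo "$entry" | awk '{print $2}')
pinned=$(echo "$entry" | awk '{print $3}')
echo "sub is a $entry_type entry pointing at $pinned"
```

Because the gitlink only stores an id, the parent history can be rewritten without the submodule objects being present, which is why the script above can work from a plain hash map.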
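It can be done, but it is not simple. If you search for git filter-branch, subdirectory and submodule, there are some decent write-ups on the process. It essentially entails creating two clones of your project and using git filter-branch to remove everything except the one subdirectory in the first clone, and to remove only that subdirectory in the second. Then you can establish the first repository (the one holding only the subdirectory) as a submodule of the second.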
This does the conversion in place; you can back it out as you would any filter-branch (I use git fetch . +refs/original/*:*).
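For a single branch, one way to back out a rewrite is to reset to the backup ref that filter-branch leaves under `refs/original/`. The sketch below uses `git reset --hard` rather than the author's `git fetch` refspec, so treat it as an alternative under assumptions; the repository layout and the `utils` folder are made up for the demo.

```shell
#!/bin/sh
# Throwaway demo: repository layout and names are hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
mkdir utils
echo util > utils/u.txt
echo app > app.txt
git add . && git commit -q -m 'initial'

branch=$(git symbolic-ref --short HEAD)
before=$(git rev-parse HEAD)

# Any rewrite will do; filter-branch saves the old tip under refs/original/.
git filter-branch --subdirectory-filter utils -- "$branch" >/dev/null 2>&1
rewritten=$(git rev-parse HEAD)

# Back out: move the branch and working tree to the saved ref.
git reset -q --hard "refs/original/refs/heads/$branch"
after=$(git rev-parse HEAD)

# Drop the backup ref once it is no longer needed.
git update-ref -d "refs/original/refs/heads/$branch"

echo "before=$before rewritten=$rewritten after=$after"
```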
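I have a project with a utils library that has started to be useful in other projects, and I wanted to split its history off into a submodule. I didn't think to look on SO first, so I wrote my own. It builds the history locally, so it is a good bit faster; afterwards, if you want, you can set up the helper command's .gitmodules file and such, and push the submodule histories themselves anywhere you like.
The stripped command itself is here; the documentation is in the comments of the unstripped version that follows. Run it as its own command with subdir set, like subdir=utils git split-submodule if you are splitting the utils directory. It is hacky because it is a one-off, but I tested it on the Documentation subdirectory in the Git history.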
#!/bin/bash
# put this or the commented version below in e.g. ~/bin/git-split-submodule
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}
${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
| git cat-file --batch-check='%(objectname)' | uniq`)
[[ $pathcheck = *:* ]] || {
subfam=($( set -- ${fam[@]}; shift;
for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
git rev-parse -q --verify $tpar:"$subdir"
done
))
git rm -rq --cached --ignore-unmatch "$subdir"
if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
git update-index --add --cacheinfo 160000,$subfam,"$subdir"
else
subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
| git commit-tree $GIT_COMMIT:"$subdir" $(
${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
` &&
git update-index --add --cacheinfo 160000,$subnew,"$subdir"
fi
}
${debug+set +x}
#!/bin/bash
# Git filter-branch to split a subdirectory into a submodule history.
# In each commit, the subdirectory tree is replaced in the index with an
# appropriate submodule commit.
# * If the subdirectory tree has changed from any parent, or there are
# no parents, a new submodule commit is made for the subdirectory (with
# the current commit's message, which should presumably say something
# about the change). The new submodule commit's parents are the
# submodule commits in any rewrites of the current commit's parents.
# * Otherwise, the submodule commit is copied from a parent.
# Since the new history includes references to the new submodule
# history, the new submodule history isn't dangling, it's incorporated.
# Branches for any part of it can be made casually and pushed into any
# other repo as desired, so hooking up the `git submodule` helper
# command's conveniences is easy, e.g.
# subdir=utils git split-submodule master
# git branch utils $(git rev-parse master:utils)
# git clone -sb utils . ../utilsrepo
# and you can then submodule add from there in other repos, but really,
# for small utility libraries and such, just fetching the submodule
# histories into your own repo is easiest. Setup on cloning a
# project using "incorporated" submodules like this is:
# setup: utils/.git
#
# utils/.git:
# @if _=`git rev-parse -q --verify utils`; then \
# git config submodule.utils.active true \
# && git config submodule.utils.url "`pwd -P`" \
# && git clone -s . utils -nb utils \
# && git submodule absorbgitdirs utils \
# && git -C utils checkout $$(git rev-parse :utils); \
# fi
# with `git config -f .gitmodules submodule.utils.path utils` and
# `git config -f .gitmodules submodule.utils.url ./`; cloners don't
# have to do anything but `make setup`, and `setup` should be a prereq
# on most things anyway.
# You can test that a commit and its rewrite put the same tree in the
# same place with this function:
# testit ()
# {
# tree=($(git rev-parse `git rev-parse $1`: refs/original/refs/heads/$1));
# echo $tree `test $tree != ${tree[1]} && echo ${tree[1]}`
# }
# so e.g. `testit make~95^2:t` will print the `t` tree there and if
# the `t` tree at ~95^2 from the original differs it'll print that too.
# To run it, say `subdir=path/to/it git split-submodule` with whatever
# filter-branch args you want.
# $GIT_COMMIT is set if we're already in filter-branch, if not, get there:
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}
${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
| git cat-file --batch-check='%(objectname)' | uniq`)
[[ $pathcheck = *:* ]] || {
subfam=($( set -- ${fam[@]}; shift;
for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
git rev-parse -q --verify $tpar:"$subdir"
done
))
git rm -rq --cached --ignore-unmatch "$subdir"
if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
# one id same for all entries, copy mapped mom's submod commit
git update-index --add --cacheinfo 160000,$subfam,"$subdir"
else
# no mapped parents or something changed somewhere, make new
# submod commit for current subdir content. The new submod
# commit has all mapped parents' submodule commits as parents:
subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
| git commit-tree $GIT_COMMIT:"$subdir" $(
${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
` &&
git update-index --add --cacheinfo 160000,$subnew,"$subdir"
fi
}
${debug+set +x}
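The script above builds the submodule history with `git commit-tree`, feeding it the subdirectory's tree and the original commit message. Here is a minimal sketch of just that building block, done by hand for two commits; `project`, `utils` and the branch name `utils-history` are assumptions for the demo, and the parent mapping that the real script performs is reduced to a single `-p`.

```shell
#!/bin/sh
# Throwaway demo: "project", "utils" and "utils-history" are hypothetical names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
mkdir utils
echo v1 > utils/tool.txt
echo app > app.txt
git add . && git commit -q -m 'add utils and app'
echo v2 > utils/tool.txt
git commit -q -am 'utils: update tool'

# Build a two-commit history whose trees are the utils/ subtrees.
# commit-tree reads the message from stdin and prints the new commit id.
first=$(git log -1 --format=%B HEAD~1 | git commit-tree 'HEAD~1:utils')
second=$(git log -1 --format=%B HEAD | git commit-tree 'HEAD:utils' -p "$first")

# The new commits live in this repository; a branch makes them reachable.
git branch utils-history "$second"

split_files=$(git ls-tree --name-only "$second")
split_count=$(git rev-list --count utils-history)
echo "files: $split_files, commits: $split_count"
```

Since the split history is created inside the same object store, nothing has to be cloned or fetched during the rewrite, which matches the author's remark that building the history locally is faster.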
The current answer by @knittl using filter-branch gets us very close to the desired effect, but when I tried it, Git threw a warning at me:
WARNING: git-filter-branch has a glut of gotchas generating mangled history
rewrites. Hit Ctrl-C before proceeding to abort, then use an
alternative filtering tool such as 'git filter-repo'
(https://github.com/newren/git-filter-repo/) instead. See the
filter-branch manual page for more details; to squelch this warning,
set FILTER_BRANCH_SQUELCH_WARNING=1.
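Now, 9 years after this question was first asked and answered, filter-branch is deprecated in favor of git filter-repo. Indeed, when I looked through my history with git log --all --oneline --graph, it was full of irrelevant commits.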
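A quick way to gauge how much of a history is relevant to a directory is to compare the total commit count with the number of commits that touch that path. This is a rough sketch on a throwaway repository with made-up names (`project`, `lib`, `other.txt`); on histories with merges the path-limited count is subject to git's history simplification, so read it as an estimate.

```shell
#!/bin/sh
# Throwaway demo: "project", "lib" and "other.txt" are hypothetical names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
mkdir lib
echo a > lib/a.txt
git add . && git commit -q -m 'lib: a'
echo x > other.txt
git add . && git commit -q -m 'other: x'
echo y >> other.txt
git add . && git commit -q -m 'other: y'

# All commits reachable from any ref versus those that touch lib/.
total=$(git rev-list --count --all)
touching_lib=$(git rev-list --count --all -- lib)
echo "$touching_lib of $total commits touch lib/"
```

If the extracted repository ends up with far more commits than the path-limited count, that is a hint the filter kept history it should have dropped.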
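So how do you use git filter-repo instead? GitHub has a good article outlining this. (Note that it has to be installed separately from git. I used the Python version: pip3 install git-filter-repo)
In case they decide to move or delete the article, I will summarize and generalize their procedure below: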
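git clone <your_old_project_remote> <your_submodule>
cd <your_submodule>
# --path keeps the directory at its original location in the new repository;
# use --subdirectory-filter instead if its contents should become the root
git filter-repo --path path/to/your/submodule
# filter-repo removes the origin remote by default, so add it again
# (if origin is still configured, use git remote set-url origin instead)
git remote add origin <your_new_submodule_remote>
git push -u origin <branch_name>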
From there, you just need to register the new repository as a submodule wherever you want it:
cd <path/to/your/parent/module>
git submodule add <your_new_submodule_remote>
git submodule update
git commit