When you adopt an open-source project that you do not own, you may need to modify parts of it to meet your specific requirements. The upstream project will keep changing, however, so sooner or later an upstream update will conflict with a file that you have also modified.
Typically, you can manage this process using Git. You would create a fork of the upstream repository, make your modifications on your fork, and when it’s time to update your fork with the latest version of the upstream repository, you would perform a merge and resolve any conflicts in the modified files.
However, there are instances where your requirements may prevent you from forking the upstream Git project. For example, you might need to make changes directly to artifacts such as tar.gz or RPM files, particularly if you want to avoid or cannot repeat the upstream build processes.
This situation arises with certain NetEye components, where the NetEye CI re-packages some upstream RPM files. In these cases, we take the original RPM, apply our customizations, and then rebuild the RPM.
Since we cannot rely on Git’s functionalities to manage our forks in this context, we have developed a custom solution that allows us to easily and safely maintain the modifications made to the upstream project.
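To manage our forks efficiently and safely while integrating updates from the upstream project, we aimed for two features: detecting during the build any upstream change to a file we have customized, so that no upstream change is silently lost, and applying our customizations automatically, so that manual intervention is needed only when upstream actually changes such a file.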
Let’s consider customizing the configuration file located at /etc/cool_project.conf for the upstream project.
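Our primary objective is to maintain a Git repository that tracks the following two files:
cool_project.conf.customized: our customized version of the file.
cool_project.conf.orig: an unmodified copy of the upstream /etc/cool_project.conf on which our customization is based.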
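Tracking both files means the customization itself can always be recovered as the difference between them. A minimal sketch of that idea, assuming Python and its standard difflib module; the function name customization_diff and the sample file contents are hypothetical and used only for illustration:

```python
import difflib


def customization_diff(orig_text: str, customized_text: str) -> str:
    """Return a unified diff from the upstream original to our customized file."""
    diff = difflib.unified_diff(
        orig_text.splitlines(keepends=True),
        customized_text.splitlines(keepends=True),
        fromfile="cool_project.conf.orig",
        tofile="cool_project.conf.customized",
    )
    return "".join(diff)


# Hypothetical file contents, for illustration only.
orig = "port = 8080\nlog_level = info\n"
customized = "port = 9090\nlog_level = info\n"
print(customization_diff(orig, customized), end="")
```

An empty result means the two files are identical, that is, nothing has been customized yet.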
At this stage, we can utilize the following pseudo-code during the build process to implement our two desired features:
new_cool_project_version = "1.2.3-4"
# download upstream artifact (e.g. tar.gz or rpm)
new_cool_project_artifact = get_artifact(new_cool_project_version)
# unpack the artifact in a temporary folder
unpacked_artifact_folder = "/tmp/my_unpacked_artifact_folder"
unpack_artifact(new_cool_project_artifact, unpacked_artifact_folder)
forked_file_path = unpacked_artifact_folder + "/etc/cool_project.conf"
# compare the file contents (not the paths): did upstream change the file we forked?
if (file_content(forked_file_path) != file_content("cool_project.conf.orig")) {
    print_diff(forked_file_path, "cool_project.conf.orig")
    print("Please do the following: port the changes to cool_project.conf.customized, update cool_project.conf.orig and relaunch the build")
    exit 1
} else {
    # upstream did not touch the file: nothing to port
}
# install the customized file over the upstream one
copy("cool_project.conf.customized", forked_file_path)
# reconstruct the artifact, which now contains our customized file
repack(unpacked_artifact_folder)
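The pseudo-code above could be sketched in runnable form roughly as follows. This is only an illustration under stated assumptions, not the NetEye CI implementation: it assumes a trusted tar.gz artifact (an RPM would need different unpack and repack steps), Python's standard library, and hypothetical names (FORKED_FILE, upstream_changed, print_diff, rebuild).

```python
import difflib
import filecmp
import shutil
import sys
import tarfile
from pathlib import Path

# Path of the forked file inside the artifact (assumption for this sketch).
FORKED_FILE = "etc/cool_project.conf"


def upstream_changed(upstream_file: Path, orig_file: Path) -> bool:
    """True if upstream's file differs in content from our recorded original."""
    return not filecmp.cmp(upstream_file, orig_file, shallow=False)


def print_diff(orig_file: Path, upstream_file: Path) -> None:
    """Print what upstream changed relative to the recorded original."""
    sys.stdout.writelines(difflib.unified_diff(
        orig_file.read_text().splitlines(keepends=True),
        upstream_file.read_text().splitlines(keepends=True),
        fromfile=str(orig_file),
        tofile=str(upstream_file),
    ))


def rebuild(artifact: Path, workdir: Path, orig_file: Path,
            customized_file: Path, output: Path) -> bool:
    """Unpack, verify, customize and repack; return False if upstream changed."""
    workdir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(artifact) as tar:
        tar.extractall(workdir)  # assumes the artifact is trusted
    forked_file = workdir / FORKED_FILE
    if upstream_changed(forked_file, orig_file):
        print_diff(orig_file, forked_file)
        print("Port the changes to cool_project.conf.customized, "
              "update cool_project.conf.orig and relaunch the build")
        return False
    # Upstream file unchanged: install our customized version over it.
    shutil.copy(customized_file, forked_file)
    # Repack everything in the work directory into the output archive.
    with tarfile.open(output, "w:gz") as tar:
        for entry in sorted(workdir.iterdir()):
            tar.add(entry, arcname=entry.name)
    return True
```

In a build script, a False return value from rebuild would be turned into a non-zero exit code so that the build fails until the upstream changes have been ported. The output archive should be written outside the work directory, otherwise it would be packed into itself.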
In this blog post, we explored a low-effort method that enables you to efficiently and safely maintain a fork of an external project. This is particularly useful when you need to work directly with artifacts such as tar.gz or RPM files, where the conventional method of forking a Git project may not be applicable.
Based on our experience, this approach offers a straightforward way to keep your forks synchronized with the upstream project. It minimizes the risk of “losing” changes from upstream while keeping manual intervention to a minimum during updates. This method has proven especially effective for forks of configuration files, which are typically limited in number and infrequently updated by upstream projects.