How to push and pull from github without sharing s

Posted 2019-03-02 09:49

Question:

When I pull from github to a server repository I want to avoid overwriting localized sensitive information in certain files, for example config.php.

Note: it's not an open-source type repo; I have full control over the repository, I'm the only user, it's private, but critically, it's based on an open-source framework that might change the structure of the config files. I just want to be able to pull from it to test, staging, and production and not accidentally have production's config end up on test, etc. But I can't re-code the config files to pull data from somewhere else without making for tough merging situations later if the framework gets updated.

Ideally I'd want to be able to tell Git, when fetching from REPO_URI during a pull, to always discard any hunks that would change the information currently on line 24 of FILE_PATH. However, I gather that is not possible (correct me if I'm wrong).

Unless someone can offer a way to do the above, please read the solution below and let me know whether it seems like the ideal way to do this:

I would use keyword expansion, i.e. Git's smudge and clean filters, as described in Git's user guide. Below I'll describe how I would do this and then at the bottom ask some questions about this approach.

Description of Method

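First I'd write two scripts, "sensitive_values_inserter" and "sensitive_values_remover". The inserter swaps certain dummy keywords (which are what the GitHub repo's master branch contains) for the particular sensitive information like passwords, usernames, database paths, etc., and the remover does the reverse. The inserter template looks like this:

#! /bin/sh -f
# Git pipes the file content to a filter on stdin; with no arguments sed reads stdin.
sed -e 's/@USERNAME@/dummyvalue/' -e 's/@PASSWORD@/dummyvalue/' "$@"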

etc.
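Only one direction is shown above, so here is a minimal sketch of the pair, written as shell functions so the round trip can be tried without touching a repository. The values appuser and s3cret-example are made up for illustration, and the sketch assumes the real values contain no characters that are special to sed (/, & or \), which would need escaping.

```shell
# Hypothetical values, for illustration only.
REAL_USER='appuser'
REAL_PASS='s3cret-example'

# Smudge direction: placeholders -> real values (stdin to stdout).
sensitive_values_inserter() {
  sed -e "s/@USERNAME@/$REAL_USER/" -e "s/@PASSWORD@/$REAL_PASS/"
}

# Clean direction: real values -> placeholders (stdin to stdout).
sensitive_values_remover() {
  sed -e "s/$REAL_USER/@USERNAME@/" -e "s/$REAL_PASS/@PASSWORD@/"
}

template='$db = connect("@USERNAME@", "@PASSWORD@");'
smudged=$(printf '%s\n' "$template" | sensitive_values_inserter)
cleaned=$(printf '%s\n' "$smudged" | sensitive_values_remover)

printf '%s\n%s\n' "$smudged" "$cleaned"
```

The property that matters is that cleaning a smudged file gives back exactly the placeholder version; otherwise Git would see the file as modified after every checkout. Note that the remover replaces the real value wherever it occurs, so a value that also appears as ordinary text in the file would be turned into a placeholder too.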

Second I would make three versions of this script, one for each environment: test/staging/production. Each version would contain the specific passwords, usernames, and database paths relevant to the environment it belongs to, instead of the dummy values. I'd place each one of these scripts in a path relative to each of these code repositories, like this:

/live/filters/sensitive_values_inserter
/live/filters/sensitive_values_remover
/live/repo/{LIVE}
/test/filters/sensitive_values_inserter
/test/filters/sensitive_values_remover
/test/repo/{TEST}
/stag/filters/sensitive_values_inserter
/stag/filters/sensitive_values_remover
/stag/repo/{STAG}

Each of these filters would have the specific values for the relevant setups.
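As a rough sketch of that layout, the loop below builds the three filter directories under a temporary directory rather than at /live, /test and /stag, and generates each pair of scripts from the environment name. The values live_user, test_secret and so on are invented for the example; real scripts would hold the actual credentials of each environment.

```shell
# Build the layout under a scratch directory instead of the filesystem root.
root=$(mktemp -d)

for env in live test stag; do
  mkdir -p "$root/$env/filters" "$root/$env/repo"

  # Inserter (smudge): placeholders -> this environment's made-up values.
  cat > "$root/$env/filters/sensitive_values_inserter" <<EOF
#!/bin/sh
sed -e 's/@USERNAME@/${env}_user/' -e 's/@PASSWORD@/${env}_secret/'
EOF

  # Remover (clean): this environment's values -> placeholders.
  cat > "$root/$env/filters/sensitive_values_remover" <<EOF
#!/bin/sh
sed -e 's/${env}_user/@USERNAME@/' -e 's/${env}_secret/@PASSWORD@/'
EOF

  chmod +x "$root/$env/filters/sensitive_values_inserter" \
           "$root/$env/filters/sensitive_values_remover"
done

# The same placeholder line smudges differently in each environment.
line='$db_user = "@USERNAME@";'
live_out=$(printf '%s\n' "$line" | sh "$root/live/filters/sensitive_values_inserter")
test_out=$(printf '%s\n' "$line" | sh "$root/test/filters/sensitive_values_inserter")
back=$(printf '%s\n' "$live_out" | sh "$root/live/filters/sensitive_values_remover")

printf '%s\n%s\n%s\n' "$live_out" "$test_out" "$back"
```

Because the scripts live outside each repo directory, they are never tracked or pushed, which is the point of keeping them in a sibling filters directory.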

Then each repo's local Git config would be modified like this:

$ git config filter.infosafe.smudge '../filters/sensitive_values_inserter'
$ git config filter.infosafe.clean '../filters/sensitive_values_remover'

Finally in the server repository do this:

$ echo 'config.php filter=infosafe' >> .gitattributes

That way whenever pulling from the main server, if I understand this correctly, these filters would replace the "dummy" values with the ones I want to use.
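To check that understanding, the whole setup can be tried in a throwaway repository. This is only a sketch: it assumes git is installed, uses a temporary directory, a made-up committer identity and invented values (live_user, live_secret), and relies on Git running the filter commands from the top of the working tree so that the relative ../filters paths resolve.

```shell
set -e
demo=$(mktemp -d)
mkdir -p "$demo/filters" "$demo/repo"

# Hypothetical filters for one environment.
cat > "$demo/filters/sensitive_values_inserter" <<'EOF'
#!/bin/sh
sed -e 's/@USERNAME@/live_user/' -e 's/@PASSWORD@/live_secret/'
EOF
cat > "$demo/filters/sensitive_values_remover" <<'EOF'
#!/bin/sh
sed -e 's/live_user/@USERNAME@/' -e 's/live_secret/@PASSWORD@/'
EOF
chmod +x "$demo/filters/sensitive_values_inserter" "$demo/filters/sensitive_values_remover"

cd "$demo/repo"
git init -q .
git config user.email demo@example.invalid
git config user.name demo
git config filter.infosafe.smudge '../filters/sensitive_values_inserter'
git config filter.infosafe.clean '../filters/sensitive_values_remover'
echo 'config.php filter=infosafe' > .gitattributes

# The working copy holds the environment's values.
printf '%s\n' '$user = "live_user";' '$pass = "live_secret";' > config.php
git add .gitattributes config.php
git commit -q -m 'add config'

# What was actually stored (and would be pushed): the clean filter ran on add.
committed=$(git cat-file -p HEAD:config.php)

# Force a fresh checkout of the file: the smudge filter runs on the way out.
rm config.php
git checkout -q -- config.php
restored=$(cat config.php)

printf 'committed:\n%s\nworking tree:\n%s\n' "$committed" "$restored"
```

The committed blob should contain only @USERNAME@ and @PASSWORD@, while the checked-out file contains the environment's values again. The same mechanism applies on pull: files updated by the merge are written to the working tree through the smudge filter.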

Note: to get this to work, as pointed out in another Stack Overflow question, after setting up everything as mentioned above you must re-checkout the working tree so the filters are applied to files that are already checked out:

cd /path/to/your/repo
git stash save
git checkout HEAD -- "$(git rev-parse --show-toplevel)"
git stash pop

In between the checkout and stash pop I had to commit all the changes to the files where the clean operation had taken place. Don't worry, after you commit them, the ones in the working directory get smudged. (It's kind of counter-intuitive, but it works.)

I was able to successfully push to github and only the clean values appear.

(There is an alternate, more advanced technique along these lines that involves using one .gitattributes per branch, and two drivers and two filters per branch. This allows live passwords to be cleaned out when switching to the test branch, and vice versa. The trick is to invoke the cleaners for both branches in the .gitattributes of each branch, but only invoke the smudger of the branch that the .gitattributes belongs to, so that it restores its own password. Even in this scenario, when pushing to GitHub all sensitive information remains cleaned out, which is nice. I could go into detail on that if anyone is interested.)

Questions About This Method & Alternatives

I tested this, and it works. But...

Is there a better way to do this using Git? I might add that it's not an option to just ignore the files that contain the sensitive information, and it's not an option to ignore changes to them when merging, because I want to be able to pull changes to these files while retaining certain configuration values. That is why I don't want to simply use git update-index --assume-unchanged FILENAME to permanently ignore future local modifications to the entire files.

Thanks.