How to use git metadata strategies compared to Cle

Posted 2019-05-17 07:23

Question:

In my previous developer life, ClearCase was my version control tool for 10+ years. The organisation I work for moved over to git 4 years ago. ClearCase offers easily accessible metadata constructs, such as attributes, at every level: repositories, branches, or labels. git notes exists, but after some web searching I have not come across any clear, good way of using it efficiently, nor an explanation of why it would be the right way. For example, the UCM ClearCase baseline promotion level is a good concept that I wish were as simple in git.

The numbers for the development community I represent in this particular problem: < 100 developers, < 5 major release branches, < 100 customer patch branches, code base size: < 1,000,000 lines of code.

Hence the need for some appropriate metadata strategy and tooling.

In ClearCase the following metadata constructs exist:

  • labels (common usage: pointing out all file revisions included in an external SW delivery)
  • attributes, can be applied to labels or branches:

    • label attribute, can have any values, common usage: telling the status of a label: TEST_RESULT:OK|NOK or CUSTOMER_AVAILABILITY:GENERAL|LIMITED|INTERNAL_ONLY
    • branch attribute, common usage: BRANCH_STATUS:ACTIVE|OBSOLETE
  • UCM baselines, which are a form of label with a status attribute (see for example: https://www-304.ibm.com/support/docview.wss?uid=swg21135893)

  • hyperlinks (used, for instance, to indicate merge directions)

In particular, I am interested in git equivalents of:

  • the label + attribute construction, which can be used for TEST_RESULT
  • the branch + attribute construction, which can bring clarity on BRANCH_STATUS

Answer 1:

I confirm, after using ClearCase for 10+ years and git for 7+ years, that git is about simple metadata: tag, branch, blob, commit, date, author, execution bits... that is pretty much it.
Any additional property would be managed with git notes.
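
As a rough sketch of what such a note-based attribute could look like: the commands below attach a TEST_RESULT value to a commit in a dedicated notes ref and read it back. The throwaway repository, the ref name `test-result`, and the `KEY:VALUE` text format are assumptions made for this illustration; git itself only stores free-form text in a note.

```shell
set -e

# Throwaway repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Release candidate"

# Attach an "attribute" to the commit. Using a dedicated notes ref
# (refs/notes/test-result) keeps it apart from the default notes.
# The ref name and the KEY:VALUE format are conventions assumed here.
git notes --ref=test-result add -m "TEST_RESULT:OK" HEAD

# Read the attribute back.
result=$(git notes --ref=test-result show HEAD)
echo "$result"

# Notes are not pushed or fetched by default; sharing them needs an
# explicit refspec, for example:
#   git push origin refs/notes/test-result
```

One consequence of this design is that the note can be rewritten later (with `git notes add -f`) without touching the commit it describes, which is close to how a ClearCase attribute on a label behaves.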

You can see Git compared to ClearCase in my old answer "What are the basic ClearCase concepts every developer should know?".

Release-management metadata is managed through either:

  • a merge workflow (and a branch strategy). git-flow is the most famous one, but certainly not the only one.
  • a publication workflow, where you manage multiple instances of the same repo (in the distributed model used by git, a repo can and should be cloned around).
    You can push to a QA repo where tests are triggered, before pushing to the blessed repo, which only accepts "valid" commits (meaning you know the code compiles and passes the tests).
    This is a "guarded commits" approach, used for continuous integration or code review.
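
The publication workflow described above can be sketched with local bare repositories. Everything named here (`qa.git`, `blessed.git`, the `run_tests` function) is hypothetical; in a real setup the gate would typically be a CI job or a server-side hook rather than a shell function.

```shell
set -e

work=$(mktemp -d)
cd "$work"

# Two published instances of the same repo: a QA repo and a blessed repo.
git init -q --bare qa.git
git init -q --bare blessed.git

# A developer repo with one commit on "main".
git init -q dev
cd dev
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Feature work"

# Developers only ever push to the QA repo.
git remote add qa "$work/qa.git"
git push -q qa main

# Stand-in for the real test suite: here it only checks that the
# pushed branch can be cloned and checked out.
run_tests() {
  git clone -q --branch main "$work/qa.git" "$work/checkout" &&
    test -d "$work/checkout/.git"
}

# Gatekeeper: promote from QA to blessed only if the tests pass.
if run_tests; then
  git -C "$work/qa.git" push -q "$work/blessed.git" main
fi

dev_head=$(git -C "$work/dev" rev-parse main)
blessed_head=$(git -C "$work/blessed.git" rev-parse main)
echo "blessed main is at $blessed_head"
```

In this arrangement the "promotion level" is not stored as an attribute at all: it is implied by which repo a commit has reached.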

Don't forget that, in a distributed model, some other metadata is unavailable by design: anything related to authentication or authorization is gone, as I detail in "Distributed Version Control Systems and the Enterprise - a Good mix?".


  • labels: those are done with git tag (which applies to the whole repo)
  • attributes: managed with git notes, with dedicated branches, or with dedicated repos.
  • UCM baselines: again tags (with a naming convention if you want to distinguish them from regular labels)
  • hyperlinks: not needed in git (a tag references the commit directly, without any intermediate "hyperlink"). Merges are recorded as a "merge commit" with two parent commits, which clearly indicates the direction of the merge.
    Since there are no parent/child streams in git (only branches), you do not have the same "deliver/rebase" semantics.
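
Two of the points in the list above can be illustrated together: a tag naming convention standing in for baselines, and the two parents of a merge commit recording the merge direction. The `baseline/` prefix, the branch names, and the throwaway repository are assumptions made for this sketch.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Diverging history: one commit on "patch", one on "main".
git commit -q --allow-empty -m "Base"
git checkout -q -b patch
git commit -q --allow-empty -m "Customer fix"
git checkout -q main
git commit -q --allow-empty -m "Mainline work"

# Merge patch INTO main: this creates a merge commit with two parents.
git merge -q --no-ff -m "Merge patch into main" patch

# Annotated tag used as a "baseline"; the "baseline/" prefix is only a
# naming convention to tell baselines apart from ordinary labels.
git tag -a baseline/1.0 -m "Baseline 1.0" HEAD
baselines=$(git tag --list 'baseline/*')

# The first parent is the branch merged into, the second the branch merged from.
parents=$(git rev-list --parents -n 1 HEAD)
parent_count=$(( $(echo "$parents" | wc -w) - 1 ))
first_parent=$(git rev-parse HEAD^1)
second_parent=$(git rev-parse HEAD^2)
echo "baselines: $baselines"
echo "merged $second_parent into $first_parent"
```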

Remember: in git, a repo is similar to a UCM component.