
How to run a shell command at analysis time in Bazel

Posted 2020-06-29 06:43

Question:

I'm trying to bake the mercurial version into my Bazel file, so that I can get something like this:

# These I set manually, since they're "semantic"
MAJOR_VERSION = 2
MINOR_VERSION = 3
BUGFIX_VERSION = 1

# This should be the result of `hg id -n`
BUILD_VERSION = ?

apple_bundle_version(
    name = "my_version",
    build_version = "{}.{}.{}.{}".format(MAJOR_VERSION, MINOR_VERSION, BUGFIX_VERSION, BUILD_VERSION),
    short_version_string = "{}.{}.{}".format(MAJOR_VERSION, MINOR_VERSION, BUGFIX_VERSION),
)

This obviously is not hermetic, so I know it violates some of Bazel's assumptions, so I'm open to other options.

Here are some possible options:

  1. Actually run hg id -n during Bazel analysis, which I don't know how to do.

  2. Pass the build version in via command line, e.g., --define=build_version=$(hg id -n). Unfortunately, this requires a separate command to wrap bazel build.

  3. Manually set the BUILD_VERSION. Obviously, this would be annoying.

Is there a way to do #1? What are my other options?

Answer 1:

Yes, you can do this with a custom --workspace_status_command and a genrule that processes the information and generates source files with this data.

EDIT: I removed the parts about the --stamp flag, it's not needed.

Summary

  1. Build with --workspace_status_command=/path/to/binary with a custom binary or shell script that runs hg and outputs the information you need.
  2. Write a genrule with stamp=1.

Details

1. --workspace_status_command=/path/to/binary

The --workspace_status_command=<path> flag lets you specify a binary.

Bazel runs this binary before each build. The binary should write key-value pairs to stdout. Bazel partitions the keys into two buckets: "stable" and "volatile". (The names "stable" and "volatile" are a bit counter-intuitive, so don't think much about them.)

Bazel then writes the key-value pairs into two files:

  • bazel-out/stable-status.txt contains all keys and values where the key's name starts with STABLE_
  • bazel-out/volatile-status.txt contains the rest of the keys and their values
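To make the split concrete, here is a rough illustration of the prefix rule in plain bash. This is not Bazel's code, and the function name partition_status and the file names are made up for the example. The sample output further down shows that Bazel also adds keys of its own, such as BUILD_HOST and BUILD_TIMESTAMP, which this sketch ignores.

```shell
#!/bin/bash
# Illustration only: mimics the "STABLE_ prefix" rule, not Bazel's implementation.

# Read "KEY value" lines on stdin; write STABLE_* lines to $1 and the rest to $2.
partition_status() {
  local stable_file="$1" volatile_file="$2" line
  : > "$stable_file"
  : > "$volatile_file"
  while IFS= read -r line; do
    case "$line" in
      STABLE_*) printf '%s\n' "$line" >> "$stable_file" ;;
      *)        printf '%s\n' "$line" >> "$volatile_file" ;;
    esac
  done
}

# Example: feed it the kind of output a status script would print.
tmp="$(mktemp -d)"
printf '%s\n' "STABLE_GIT_BRANCH d3fed12" "MY_TIMESTAMP 1512379211" \
  | partition_status "$tmp/stable.txt" "$tmp/volatile.txt"
cat "$tmp/stable.txt"    # STABLE_GIT_BRANCH d3fed12
cat "$tmp/volatile.txt"  # MY_TIMESTAMP 1512379211
```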

The contract is:

  • "stable" keys' values should change rarely, if possible. If the contents of stable-status.txt change, it invalidates the actions that depend on them, e.g. the genrule.cmd if that genrule has stamp=1. In other words, if a stable key's value changes, it'll make Bazel rebuild stamped actions. Therefore the stable status should not contain things like timestamps, because they change all the time, and would make Bazel rebuild the stamped actions with each build.
  • "volatile" keys' values may change often. Bazel expects them to change all the time, like timestamps do, and duly updates the volatile-status.txt file. In order to avoid rebuilding stamped actions all the time though, Bazel pretends that the volatile file never changes. In other words, if the volatile status file is the only one whose contents changed, that will not invalidate actions that depend on it. If other inputs of the actions have changed, then Bazel rebuilds that action, and the action may then use the updated volatile status. But just the volatile status changing alone will not invalidate the action.

Example for my-status.sh:

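#!/bin/bash
# Keys starting with STABLE_ go to stable-status.txt; all others go to volatile-status.txt.
# Despite the key's name, this prints the current commit hash, not the branch name.
echo STABLE_GIT_BRANCH $(git rev-parse HEAD)
echo MY_TIMESTAMP $(date)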
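Since the question is about Mercurial, the same script shape should work with hg. The following is a minimal sketch, assuming hg is on the PATH and the command runs inside the repository. The key name STABLE_HG_REVISION and the helper hg_revision are my own choices, not anything Bazel defines. The fallback to "unknown" is there because a failing status command would otherwise fail the build.

```shell
#!/bin/bash
# Hypothetical workspace status script for a Mercurial checkout.
# Prints "KEY value" lines on stdout, one pair per line.

# Print the local revision number, or "unknown" if hg is unavailable or fails.
hg_revision() {
  local rev
  if rev="$(hg id -n 2>/dev/null)" && [ -n "$rev" ]; then
    # `hg id -n` appends "+" when the working copy has uncommitted changes;
    # strip it so the value stays purely numeric.
    echo "${rev%+}"
  else
    echo "unknown"
  fi
}

# Stable key: stamped actions are rebuilt when the revision changes.
echo "STABLE_HG_REVISION $(hg_revision)"
# Volatile key: changes on every build without triggering rebuilds by itself.
echo "MY_TIMESTAMP $(date +%s)"
```

Making the revision a stable key is a design choice: it means stamped targets pick up a new revision immediately, at the cost of rebuilding them after every commit.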

2. Write a genrule with stamp=1.

This attribute is undocumented, which surprises me. I'll file a bug about that.

Example for foo/BUILD:

genrule(
    name = "x",
    srcs = ["input.txt"],
    outs = ["x.txt"],
    cmd = " ; ".join([
        "( echo 'volatile data:'",
        "cat bazel-out/volatile-status.txt",
        "echo ---",
        "echo 'stable data:'",
        "cat bazel-out/stable-status.txt",
        ") > $@",
    ]),
    stamp = 1,
)
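The genrule above dumps both files wholesale. In practice you usually want one value, for example to write it into a generated source file. Here is a sketch of a shell helper for that; the function name status_value is made up, and the sample file merely stands in for bazel-out/stable-status.txt.

```shell
#!/bin/bash
# Print the value stored under a key in a Bazel status file.
# Each line of the file has the form "KEY value"; the value may contain spaces.
status_value() {
  local key="$1" file="$2"
  # Match on the first field, then drop the key and the single space after it.
  awk -v k="$key" '$1 == k { sub(/^[^ ]+ ?/, ""); print }' "$file"
}

# Example with a hand-written file standing in for bazel-out/stable-status.txt.
sample="$(mktemp)"
printf '%s\n' "BUILD_EMBED_LABEL " "STABLE_GIT_BRANCH d3fed12" > "$sample"
status_value STABLE_GIT_BRANCH "$sample"   # d3fed12
```

If you inline something like this in a genrule's cmd, remember that every shell `$` has to be written as `$$` there, because a single `$` introduces a Make variable such as `$@`.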

Putting it all together

The genrule isn't rebuilt

...when only bazel-out/volatile-status.txt changes:

  $ bazel build --workspace_status_command=/tmp/foo/ws.sh //foo:x &>/dev/null && cat bazel-genfiles/foo/x.txt
volatile data:
BUILD_TIMESTAMP 1512379211456
MY_TIMESTAMP Mon Dec 4 10:20:11 CET 2017
---
stable data:
BUILD_EMBED_LABEL 
BUILD_HOST <redacted>
BUILD_USER <redacted>
STABLE_GIT_BRANCH d3fed125d00f6f61bfbfe05f4566656cdac1ea6e

  $ cat bazel-out/volatile-status.txt 
BUILD_TIMESTAMP 1512379425898
MY_TIMESTAMP Mon Dec 4 10:23:45 CET 2017

  $ bazel build --workspace_status_command=/tmp/foo/ws.sh //foo:x &>/dev/null && cat bazel-genfiles/foo/x.txt
volatile data:
BUILD_TIMESTAMP 1512379211456
MY_TIMESTAMP Mon Dec 4 10:20:11 CET 2017
---
stable data:
BUILD_EMBED_LABEL 
BUILD_HOST <redacted>
BUILD_USER <redacted>
STABLE_GIT_BRANCH d3fed125d00f6f61bfbfe05f4566656cdac1ea6e

  $ cat bazel-out/volatile-status.txt 
BUILD_TIMESTAMP 1512379441919
MY_TIMESTAMP Mon Dec 4 10:24:01 CET 2017

The genrule is rebuilt

...when the stable status or the genrule's inputs change:

  $ echo bar > foo/input.txt 

  $ bazel build --workspace_status_command=/tmp/foo/ws.sh //foo:x &>/dev/null && cat bazel-genfiles/foo/x.txt
volatile data:
BUILD_TIMESTAMP 1512379566646
MY_TIMESTAMP Mon Dec 4 10:26:06 CET 2017
---
stable data:
BUILD_EMBED_LABEL 
BUILD_HOST <redacted>
BUILD_USER <redacted>
STABLE_GIT_BRANCH d3fed125d00f6f61bfbfe05f4566656cdac1ea6e


  $ git checkout HEAD~1 &>/dev/null

  $ bazel build --workspace_status_command=/tmp/foo/ws.sh //foo:x &>/dev/null && cat bazel-genfiles/foo/x.txt
volatile data:
BUILD_TIMESTAMP 1512379594890
MY_TIMESTAMP Mon Dec 4 10:26:34 CET 2017
---
stable data:
BUILD_EMBED_LABEL 
BUILD_HOST <redacted>
BUILD_USER <redacted>
STABLE_GIT_BRANCH b3da717469e23f5293297175a80709956416fd2c


Answer 2:

By default, the apple_bundle_version rule uses regular expressions to parse whatever you pass in to --embed_label on the Bazel command line.

So, the simplest way to achieve this would be first writing something like this:

bazel build //your:target --embed_label="$(hg id -n)"

That will set BUILD_EMBED_LABEL to your Mercurial version in your workspace info file, for example:

BUILD_EMBED_LABEL 156
BUILD_HOST ...
BUILD_USER ...

Then you have to tell apple_bundle_version what your version numbers should look like:

VERSION_PREFIX = "{}.{}.{}".format(MAJOR_VERSION, MINOR_VERSION, BUGFIX_VERSION)

apple_bundle_version(
    name = "my_version",
    # This is where you define that BUILD_EMBED_LABEL will look like
    # this, where each thing inside {} is one of the capture groups
    # below. So in this case, it will be a number.
    build_label_pattern = "{build}",
    capture_groups = {
        "build": "\\d+",
    },
    # CFBundleVersion should be the VERSION_PREFIX above, plus a dot
    # followed by {build} extracted from BUILD_EMBED_LABEL.
    build_version = VERSION_PREFIX + ".{build}",
    # CFBundleShortVersionString will just be the VERSION_PREFIX.
    short_version_string = VERSION_PREFIX,
)

Something like that should do the trick.

Getting more advanced

If you don't want to manually pass the result of hg id -n each time you build, the apple_bundle_version rule is extensible so you can make it happen automatically. The only requirement of the version attribute on ios_application and related rules is that it point to a target that returns an AppleBundleVersionInfo provider—it doesn't have to be apple_bundle_version.

That provider propagates a small JSON file containing your bundle version and short bundle version string, so you could write your own rule that invokes hg id -n as a custom action, writes it out to a JSON formatted file with the right keys, and then use that target as your version attribute on your application instead.
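As a rough idea of what such an action's script could look like, here is a sketch in bash. The JSON key names build_version and short_version_string are my assumption about what the provider's file contains, so check them against the AppleBundleVersionInfo documentation for your rules_apple version. The function name write_version_json and its arguments are hypothetical, and the build number is passed in as an argument rather than computed here.

```shell
#!/bin/bash
# Sketch: emit the small JSON file a custom version rule's action might produce.
# The key names below are an assumption; verify them against your rules_apple version.
write_version_json() {
  local prefix="$1" build="$2" out="$3"
  printf '{"build_version": "%s.%s", "short_version_string": "%s"}\n' \
    "$prefix" "$build" "$prefix" > "$out"
}

# Example: version prefix 2.3.1 with build number 156.
out="$(mktemp)"
write_version_json "2.3.1" "156" "$out"
cat "$out"
# {"build_version": "2.3.1.156", "short_version_string": "2.3.1"}
```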