How to pass optimization flags to bazel build for

Posted 2019-07-13 01:28

I am trying to build TensorFlow (TF) for Android with Bazel. I noticed that when I build TF with the Makefile, the C++ code is optimized and runs almost 2 times faster than the library produced by Bazel. What could be the reason for this? Here is the modified tf_copts():

def tf_copts():
  return ([
           "-Wno-sign-compare",
           "-fno-exceptions",
           ] +
          if_cuda(["-DGOOGLE_CUDA=1"]) +
          if_android_arm(["-mfpu=neon", "-mfloat-abi=softfp"]) +
          if_x86(["-msse4.1"]) +
          select({
              "//tensorflow:android": [
                  "-DNDEBUG",
                  "-std=c++11",
                  "-DTF_LEAN_BINARY",
                  "-O2",
                  "-fno-rtti",
                  "-DGOOGLE_PROTOBUF_NO_RTTI",
                  "-DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER",
                  "-fPIE",
                  "-finline-functions",
                  "-funswitch-loops",
                  "-fpredictive-commoning",
                  "-fgcse-after-reload",
                  "-ftree-loop-distribute-patterns",
                  "-fvect-cost-model",
                  "-ftree-partial-pre",
                  "-fpeel-loops"
              ],
              "//tensorflow:darwin": [],
              "//tensorflow:windows": [
                "/DLANG_CXX11",
                "/D__VERSION__=\\\"MSVC\\\"",
                "/DPLATFORM_WINDOWS",
                "/DEIGEN_HAS_C99_MATH",
                "/DTENSORFLOW_USE_EIGEN_THREADPOOL",
              ],
              "//tensorflow:ios": ["-std=c++11"],
              "//conditions:default": ["-pthread"]}))

And here is the build command that I use:

bazel build -c opt //tensorflow/contrib/android:libtensorflow_inference.so \
    --crosstool_top=//external:android/crosstool \
    --host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
    --cpu=armeabi-v7a

For comparison, here is the C++ flags section of the Makefile:

CXXFLAGS += \
    --sysroot $(NDK_ROOT)/platforms/android-$(ANDROID_API_VERSION)/arch-$(sysroot_arch) \
    -Wno-narrowing \
    -fPIE \
    -DGOOGLE_PROTOBUF_NO_RTTI \
    -DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER \
    -DTF_LEAN_BINARY \
    -O2 \
    -finline-functions \
    -funswitch-loops \
    -fpredictive-commoning \
    -fgcse-after-reload \
    -ftree-loop-distribute-patterns \
    -fvect-cost-model \
    -ftree-partial-pre \
    -fpeel-loops \
    -mfloat-abi=softfp \
    -mfpu=neon \
    -march=armv7-a
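To make the comparison concrete, the two flag sets quoted above can be diffed mechanically. The following is only a rough sketch: the lists are transcribed by hand from the question, it assumes that if_android_arm() applies to the armeabi-v7a build and that if_cuda() and if_x86() contribute nothing, and it leaves out the Makefile's --sysroot option because that is a path setting rather than an optimization flag. The name flag_diff is a hypothetical helper, not part of Bazel or TensorFlow.

```python
# Flags the Android build would receive from the modified tf_copts(),
# transcribed from the question (assumption: if_android_arm() applies,
# if_cuda() and if_x86() add nothing for this target).
BAZEL_COPTS = [
    "-Wno-sign-compare",
    "-fno-exceptions",
    "-mfpu=neon",
    "-mfloat-abi=softfp",
    "-DNDEBUG",
    "-std=c++11",
    "-DTF_LEAN_BINARY",
    "-O2",
    "-fno-rtti",
    "-DGOOGLE_PROTOBUF_NO_RTTI",
    "-DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER",
    "-fPIE",
    "-finline-functions",
    "-funswitch-loops",
    "-fpredictive-commoning",
    "-fgcse-after-reload",
    "-ftree-loop-distribute-patterns",
    "-fvect-cost-model",
    "-ftree-partial-pre",
    "-fpeel-loops",
]

# Flags from the Makefile's CXXFLAGS, transcribed from the question
# (the --sysroot path option is deliberately omitted).
MAKEFILE_CXXFLAGS = [
    "-Wno-narrowing",
    "-fPIE",
    "-DGOOGLE_PROTOBUF_NO_RTTI",
    "-DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER",
    "-DTF_LEAN_BINARY",
    "-O2",
    "-finline-functions",
    "-funswitch-loops",
    "-fpredictive-commoning",
    "-fgcse-after-reload",
    "-ftree-loop-distribute-patterns",
    "-fvect-cost-model",
    "-ftree-partial-pre",
    "-fpeel-loops",
    "-mfloat-abi=softfp",
    "-mfpu=neon",
    "-march=armv7-a",
]


def flag_diff(left, right):
    """Return the flags present in `left` but absent from `right`, in order."""
    right_set = set(right)
    return [flag for flag in left if flag not in right_set]


only_in_makefile = flag_diff(MAKEFILE_CXXFLAGS, BAZEL_COPTS)
only_in_bazel = flag_diff(BAZEL_COPTS, MAKEFILE_CXXFLAGS)

print("Only in Makefile:  ", only_in_makefile)
print("Only in tf_copts():", only_in_bazel)
```

Under these assumptions the listed flags differ in only a handful of entries, so the lists by themselves may not account for the speed gap. The toolchain selected by --crosstool_top can add flags of its own, so it may be worth inspecting the real compiler invocations with bazel build --subcommands; extra flags can also be passed from the command line with --copt.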

0 answers