how to prevent gitlab ci from downloading sbt ever

2019-03-10 04:38发布

问题:

We have a play2/scala application which we are building with gitlab ci.

Our .gitlab-ci.yml (at least the important part) looks as follows:

image: hseeberger/scala-sbt

variables:
  SBT_GLOBAL_BASE_DIR: "$CI_PROJECT_DIR/cache/.sbt"
  IVY2_CACHE_DIR: "$CI_PROJECT_DIR/cache/.ivy2"
  SBT_BOOT_DIR:  "$CI_PROJECT_DIR/cache/.sbt/boot"
  M2_HOME_DIR: "$CI_PROJECT_DIR/cache/.m2"

before_script:
  # Log the sbt version
  - sbt sbt-version

build:
  stage: build
  script:
    - ./build.sh

with build.sh:

sbt -Dsbt.global.base=$SBT_GLOBAL_BASE_DIR \
  -Dsbt.ivy.home=$IVY2_CACHE_DIR \
  -Dsbt.boot.directory=$SBT_BOOT_DIR \
  compile

Unfortunately, our pipeline always runs for around 30-40 minutes with all the steps (build, verification, deploy). Most of the time it spends by downloading sbt over and over again what is really annoying.

I might not know enough about gitlab ci runners but from my understand, by using hseeberger/scala-sbt as the image, sbt should be globally available and there should be no need to download it.

Then also this solution from gitlab would not be necessary.

Anyhow, I would be glad if sbt would not be downloaded totally 6 times during each deployment whenever the server runs any sbt command.

Can someone explain me how to use the right image or the image in the right way or otherwise how I can cache the sbt stuff?

Update

Over the last days I fought a lot with docker and gitlab ci. I found that this problems is pretty much the same as described in don't downloading the internet. It seems that this is some hard task to have all the dependencies and should be best done by mounting them. That's unfortunately not possible as such on a shared gitlab ci runner.

I went on and discovered sbt-docker which allows you to build docker containers from a build.sbt file. With the package basic approach I tried to include all the locally available dependencies for the project into the container as global sbt plugins. But also this didn't help.

My last discovery was this answer regarding the maven solution and tried to translate that into our sbt project:

.gitlab-ci.yml

image: hseeberger/scala-sbt

variables:
  MAVEN_OPTS: -Dmaven.repo.local=/cache/maven.repository

stages:
  - build
  - test
  - staging
  - deploy

build:
  stage: build
  script:
    - sbt compile -Dsbt.ivy.home=/cache/.ivy2 -Dsbt.global.base=/cache/.sbt/0.13 -Dsbt.boot.directory=/cache/.sbt/boot -Dsbt.repository.config=/cache/.sbt/repositories

I can access the gitlab ci logs again. They look basically as follows:

[info] Loading project definition from /builds/kwiqjobs/backend/project
[info] Updating {file:/builds/kwiqjobs/backend/project/}backend-build...
[info] Resolving com.typesafe.play#sbt-plugin;2.5.4 ...

[info] Resolving com.typesafe.play#sbt-plugin;2.5.4 ...

[info] Resolving com.typesafe.play#sbt-routes-compiler_2.10;2.5.4 ...

[info] Resolving com.typesafe.play#sbt-routes-compiler_2.10;2.5.4 ...

[info] Resolving org.scala-lang#scala-library;2.10.6 ...

[info] Resolving com.typesafe.play#twirl-api_2.10;1.1.1 ...

[info] Resolving com.typesafe.play#twirl-api_2.10;1.1.1 ...

... a **lot** more

[info]  [SUCCESSFUL ] com.typesafe.sbt#sbt-twirl;1.1.1!sbt-twirl.jar (1033ms)
[info] downloading https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/com.typesafe.sbt/sbt-native-packager/scala_2.10/sbt_0.13/1.0.3/jars/sbt-native-packager.jar ...
[info]  [SUCCESSFUL ] com.typesafe.sbt#sbt-native-packager;1.0.3!sbt-native-packager.jar (954ms)
[info] downloading https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/com.typesafe.sbt/sbt-web/scala_2.10/sbt_0.13/1.3.0/jars/sbt-web.jar ...
[info]  [SUCCESSFUL ] com.typesafe.sbt#sbt-web;1.3.0!sbt-web.jar (1010ms)
[info] downloading https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/com.typesafe.sbt/sbt-js-engine/scala_2.10/sbt_0.13/1.1.3/jars/sbt-js-engine.jar ...
[info]  [SUCCESSFUL ] com.typesafe.sbt#sbt-js-engine;1.1.3!sbt-js-engine.jar (1147ms)
[info] downloading https://repo1.maven.org/maven2/com/typesafe/play/twirl-api_2.10/1.1.1/twirl-api_2.10-1.1.1.jar ...
[info]  [SUCCESSFUL ] com.typesafe.play#twirl-api_2.10;1.1.1!twirl-api_2.10.jar (89ms)
[info] downloading https://repo1.maven.org/maven2/commons-io/commons-io/2.4/commons-io-2.4.jar ...
[info]  [SUCCESSFUL ] commons-io#commons-io;2.4!commons-io.jar (48ms)

a **lot** more
[info] Done updating.
[info] Compiling 228 Scala sources and 4 Java sources to /builds/kwiqjobs/backend/target/scala-2.11/classes...
[info] 'compiler-interface' not yet compiled for Scala 2.11.8. Compiling...
[info]   Compilation completed in 17.735 s
[success] Total time: 149 s, completed Jan 20, 2017 2:22:52 PM
Build succeeded

And I would like to get rid off all the downloading.

回答1:

If you don't want to use custom made images, the best solution is to use Gitlab CI's caching mechanism.

It's a little hard to get it right, but this blog post describes how to do it for SBT.

Example .gitlab-ci.yml

Quoted from the blog post, minor errors corrected by myself:

# some parts originally from https://github.com/randm-ch/units-of-information/blob/master/.gitlab-ci.yml

image: "hseeberger/scala-sbt"

variables:
  SBT_VERSION: "0.13.9"
  SBT_OPTS: "-Dsbt.global.base=sbt-cache/.sbtboot -Dsbt.boot.directory=sbt-cache/.boot -Dsbt.ivy.home=sbt-cache/.ivy"

cache:
  key: "$CI_BUILD_REF_NAME" # contains either the branch or the tag, so it's caching per branch
  untracked: true
  paths:
    - "sbt-cache/.ivy/.cache"
    - "sbt-cache/.boot"
    - "sbt-cache/.sbtboot"
    - "sbt-cache/target"

stages:
  - test

test:
  script:
    - sbt test

Second example, also including apt-get caching

This is what I used for my project, usable for more general use cases and Docker images:

image: java:8

stages:
  - test

variables:
  SBT_VERSION: "0.13.9"
  SBT_OPTS: "-Dsbt.global.base=sbt-cache/.sbtboot -Dsbt.boot.directory=sbt-cache/.boot -Dsbt.ivy.home=sbt-cache/.ivy"
  SBT_CACHE_DIR: "sbt-cache/.ivy/cache"

cache:
  key: "$CI_BUILD_REF_NAME" # contains either the branch or the tag, so it's caching per branch
  untracked: true
  paths:
    - "apt-cache/"
    - "sbt-cache/.ivy/cache"
    - "sbt-cache/.boot"
    - "sbt-cache/.sbtboot"
    - "sbt-cache/target"

before_script:
  - export APT_CACHE_DIR=`pwd`/apt-cache
  - mkdir -pv $APT_CACHE_DIR
  - ls $APT_CACHE_DIR || echo "no apt-cache dir found"
  - apt-get -o dir::cache::archives=$APT_CACHE_DIR update -y
  - apt-get -o dir::cache::archives=$APT_CACHE_DIR install apt-transport-https -y
  # Install SBT
  - mkdir -pv $SBT_CACHE_DIR
  - ls $SBT_CACHE_DIR || echo "no ivy2 cache fir found"
  - echo "deb http://dl.bintray.com/sbt/debian /" | tee -a /etc/apt/sources.list.d/sbt.list
  - apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 642AC823
  - apt-get -o dir::cache::archives=$APT_CACHE_DIR update -y
  - apt-get -o dir::cache::archives=$APT_CACHE_DIR install sbt -y
  - sbt -v sbtVersion

test:
  stage: test
  script:
     - sbt -v sbtVersion


回答2:

There are 4 things you can do:

  1. own docker image
  2. cache
  3. artifacts
  4. use own caching solution

The best solution will be a mix of them all.

  1. You can build your own docker image with all the dependencies you need, it's the fastest solution as it won't have to download everything but it will introduce another piece of the puzzle you need to take care of. You can use the in-built gitlab repository for storing it and have gitlab build it and then use it.
  2. You can use cache in Gitlab CI jobs so it won't have to download everything all the time. The default cache for sbt seems to be ~/.ivy2 so add
cache:
  paths:
    - ~/.ivy2/

As the first lines in your gitlab file to use the caches for each stage.

  1. Define the target directory as the artifact to pass it between builds and so you don't have to build it on each stage.
artifacts:
  paths:
    - target/
  1. You can store the files you need to cache in your own s3/minio/nfs if the options supplied by gitlab are not enough. This will be a very custom solution so you will have to find your own way around it.