I am running daily jobs on GKE with Kubernetes. Each day, based on a cron schedule configured in Kubernetes, a new container spins up and tries to insert some data into BigQuery.
We have two different projects in GCP: in one project we maintain the data in BigQuery, and in the other we run everything on GKE. When GKE has to interact with a resource in a different project, my guess is that I have to set an environment variable named GOOGLE_APPLICATION_CREDENTIALS that points to a service account JSON file. But since Kubernetes spins up a new container every day, I am not sure how and where I should set this variable.
Thanks in advance!
NOTE: this file is parsed as a golang template by the drone-gke plugin.
---
apiVersion: v1
kind: Secret
metadata:
  name: my-data-service-account-credentials
type: Opaque
data:
  sa_json: "bas64JsonServiceAccount"
---
apiVersion: v1
kind: Pod
metadata:
  name: adtech-ads-apidata-el-adunit-pod
spec:
  containers:
  - name: adtech-ads-apidata-el-adunit-container
    volumeMounts:
    - name: service-account-credentials-volume
      mountPath: "/etc/gcp"
      readOnly: true
  volumes:
  - name: service-account-credentials-volume
    secret:
      secretName: my-data-service-account-credentials
      items:
      - key: sa_json
        path: sa_credentials.json
This is our cron job for loading the AdUnit data:
apiVersion: batch/v2alpha1
kind: CronJob
metadata:
  name: adtech-ads-apidata-el-adunit
spec:
  schedule: "*/5 * * * *"
  suspend: false
  concurrencyPolicy: Replace
  successfulJobsHistoryLimit: 10
  failedJobsHistoryLimit: 10
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: adtech-ads-apidata-el-adunit-container
            image: {{.image}}
            args:
            - -cp
            - opt/nyt/DFPDataIngestion-1.0-jar-with-dependencies.jar
            - com.nyt.cron.AdUnitJob
            env:
            - name: ENV_APP_NAME
              value: "{{.env_app_name}}"
            - name: ENV_APP_CONTEXT_NAME
              value: "{{.env_app_context_name}}"
            - name: ENV_GOOGLE_PROJECTID
              value: "{{.env_google_projectId}}"
            - name: ENV_GOOGLE_DATASETID
              value: "{{.env_google_datasetId}}"
            - name: ENV_REPORTING_DATASETID
              value: "{{.env_reporting_datasetId}}"
            - name: ENV_ADBRIDGE_DATASETID
              value: "{{.env_adbridge_datasetId}}"
            - name: ENV_SALESFORCE_DATASETID
              value: "{{.env_salesforce_datasetId}}"
            - name: ENV_CLOUD_PLATFORM_URL
              value: "{{.env_cloud_platform_url}}"
            - name: ENV_SMTP_HOST
              value: "{{.env_smtp_host}}"
            - name: ENV_TO_EMAIL
              value: "{{.env_to_email}}"
            - name: ENV_FROM_EMAIL
              value: "{{.env_from_email}}"
            - name: ENV_AWS_USERNAME
              value: "{{.env_aws_username}}"
            - name: ENV_CLIENT_ID
              value: "{{.env_client_id}}"
            - name: ENV_REFRESH_TOKEN
              value: "{{.env_refresh_token}}"
            - name: ENV_NETWORK_CODE
              value: "{{.env_network_code}}"
            - name: ENV_APPLICATION_NAME
              value: "{{.env_application_name}}"
            - name: ENV_SALESFORCE_USERNAME
              value: "{{.env_salesforce_username}}"
            - name: ENV_SALESFORCE_URL
              value: "{{.env_salesforce_url}}"
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: "/etc/gcp/sa_credentials.json"
            - name: ENV_CLOUD_SQL_URL
              valueFrom:
                secretKeyRef:
                  name: secrets
                  key: cloud_sql_url
            - name: ENV_AWS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: secrets
                  key: aws_password
            - name: ENV_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  name: secrets
                  key: dfp_client_secret
            - name: ENV_SALESFORCE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: secrets
                  key: salesforce_password
          restartPolicy: OnFailure
So, if your GKE project is project my-gke, and the project containing the services/things your GKE containers need access to is project my-data, one approach is to:
1. Create a service account in the my-data project. Give it whatever GCP roles/permissions are needed (e.g. roles/bigquery.dataViewer if you have some BigQuery tables that your my-gke GKE containers need to read).
2. Create a key for that service account and download the .json file containing the SA credentials.
3. Create a Kubernetes secret resource for those service account credentials. It might look something like the Secret manifest above.
4. Mount the credentials in the container that needs access, as in the Pod manifest above.
5. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable in the container to point to the path of the mounted credentials, as in the env entry of the CronJob above.
With that, any official GCP client (e.g. the GCP Python client, GCP Java client, gcloud CLI, etc.) should respect the GOOGLE_APPLICATION_CREDENTIALS env var and, when making API requests, automatically use the credentials of the my-data service account whose .json key file you created and mounted.
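The CronJob in the question sets GOOGLE_APPLICATION_CREDENTIALS but does not itself declare a volume, so the mount and environment-variable steps would probably need to be applied to its pod template rather than to a standalone Pod. A minimal sketch of how that might look follows. It assumes the Secret is named my-data-service-account-credentials with the key sa_json (as in the Secret manifest above) and that it lives in the same namespace as the CronJob. The other env entries and args from the question are omitted for brevity, not removed.

```yaml
# Sketch only: the question's CronJob with the credentials Secret mounted.
# Still a Go template (rendered by drone-gke), so {{.image}} is filled in before apply.
apiVersion: batch/v2alpha1   # same API version as the CronJob in the question
kind: CronJob
metadata:
  name: adtech-ads-apidata-el-adunit
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: adtech-ads-apidata-el-adunit-container
            image: {{.image}}
            env:
            # ... other env entries from the question go here ...
            - name: GOOGLE_APPLICATION_CREDENTIALS
              # mountPath ("/etc/gcp") + items[].path ("sa_credentials.json")
              value: "/etc/gcp/sa_credentials.json"
            volumeMounts:
            - name: service-account-credentials-volume
              mountPath: "/etc/gcp"
              readOnly: true
          volumes:
          - name: service-account-credentials-volume
            secret:
              # assumed: the Secret defined earlier, in the same namespace
              secretName: my-data-service-account-credentials
              items:
              - key: sa_json
                path: sa_credentials.json
          restartPolicy: OnFailure
```

Because the volume and the env var are part of the job's pod template, every pod the CronJob creates on its schedule should get the same mounted file and variable, so nothing has to be set by hand for each daily run.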