Using a Cloud Function to load data into a BigQuery table

Posted 2020-07-23 06:32

Question:

I have a Cloud Function that takes a .csv file landing in Cloud Storage and loads it into a BigQuery table. The issue is that it appends to the table, and I need it to overwrite instead. I found a way to do this from the command line using the --replace flag, but I'm not sure how to do the same in the job metadata from a Cloud Function. Below is my current code:

exports.ToBigQuery_Stage = (event, callback) => {
  const file = event.data;
  const context = event.context;

  const BigQuery = require('@google-cloud/bigquery');
  const Storage = require('@google-cloud/storage');

  const projectId = "gas-ddr";
  const datasetId = "gas_ddr_qc_stage";
  const bucketName = file.bucket;
  const filename = file.name;



  // Do not use the ftp_files bucket, to ensure that the bucket does not get crowded.
  // Change bucket to gas_ddr_files_staging.
  // Set the table name (tableId) to the full file name including the date;
  // this gives each table a distinct name and keeps a record of all of the files received.
  // This may not be the best way to do this... at some point we will need to archive and delete prior records.
  const dashOffset = filename.indexOf('-');
  const tableId = filename.substring(0, dashOffset) + "_STAGE";


  console.log(`Load ${filename} into ${tableId}.`);

  // Instantiates clients
  const bigquery = new BigQuery({
    projectId: projectId,
  });

  const storage = Storage({
    projectId: projectId,
  });

  const metadata = {
    allowJaggedRows: true,
    skipLeadingRows: 1
  };

  let job;

  // Loads data from a Google Cloud Storage file into the table
  bigquery
    .dataset(datasetId)
    .table(tableId)
    .load(storage.bucket(bucketName).file(filename),metadata)
    .then(results => {
      job = results[0];
      console.log(`Job ${job.id} started.`);

      // Wait for the job to finish
      return job;
    })
    .then(metadata => {
      // Check the job's status for errors
      const errors = metadata.status.errors;
      if (errors && errors.length > 0) {
        throw errors;
      }
    })
    .then(() => {
      console.log(`Job ${job.id} completed.`);
    })
    .catch(err => {
      console.error('ERROR:', err);
    });

  callback();
};

Answer 1:

You can add the writeDisposition option to your metadata:

const metadata = {
  allowJaggedRows: true,
  skipLeadingRows: 1,
  writeDisposition: 'WRITE_TRUNCATE'
};

You can find the other options in the BigQuery load job configuration documentation.
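For completeness, here is a rough sketch of how that metadata plugs into the load call from the question, reusing the bigquery, storage, bucketName, filename, datasetId, and tableId values already defined there. writeDisposition: 'WRITE_TRUNCATE' is the API-side equivalent of the command line --replace flag; the default for load jobs is WRITE_APPEND, which is why the table kept growing.

const metadata = {
  allowJaggedRows: true,
  skipLeadingRows: 1,
  // Overwrite the table instead of appending (the default is WRITE_APPEND)
  writeDisposition: 'WRITE_TRUNCATE'
};

bigquery
  .dataset(datasetId)
  .table(tableId)
  .load(storage.bucket(bucketName).file(filename), metadata)
  .then(results => {
    const job = results[0];
    console.log(`Job ${job.id} started.`);
  })
  .catch(err => {
    console.error('ERROR:', err);
  });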