Google Cloud Dataflow BigQueryIO.Write fails with unknown error (HTTP code 500)

Published 2019-05-22 14:13

Question:

Has anyone run into the same problem as me, where Google Cloud Dataflow BigQueryIO.Write fails with an unknown error (HTTP code 500)?

I use Dataflow to process data for April, May, and June. With the same code, processing the April data (400 MB) and writing it to BigQuery succeeds, but processing the May (60 MB) or June (90 MB) data fails.

  • The data format for April, May, and June is the same.
  • If I change the writer from BigQuery to TextIO, the job succeeds, so I think the data format is fine.
  • The Log Dashboard shows no error logs at all.
  • The system only reports the same unknown error.

The code I wrote is here: http://pastie.org/10907947

Error Message after "Executing BigQuery import job":

Workflow failed. Causes: 
(cc846): S01:Read Files/Read+Window.Into()+AnonymousParDo+BigQueryIO.Write/DataflowPipelineRunner.BatchBigQueryIOWrite/DataflowPipelineRunner.BatchBigQueryIONativeWrite failed., 
(e19a27451b49ae8d): BigQuery import job "dataflow_job_631261" failed., (e19a745a666): BigQuery creation of import job for table "hi_event_m6" in dataset "TESTSET" in project "lib-ro-123" failed., 
(e19a2749ae3f): BigQuery execution failed., 
(e19a2745a618): Error: Message: An internal error occurred and the request could not be completed. HTTP Code: 500

Answer 1:

Sorry for the frustration. It looks like you are hitting a limit on the number of files being written to BigQuery. This is a known issue that we're in the process of fixing.

In the meantime, you can work around this issue by either decreasing the number of input files or resharding the data (do a GroupByKey and then ungroup the data -- semantically it's a no-op, but it forces the data to be materialized so that the parallelism of the write operation isn't constrained by the parallelism of the read).
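To make that reshard workaround concrete, here is a minimal sketch of such a transform. It is written against the Apache Beam 2.x Java SDK for readability (the 1.x Dataflow SDK uses different package names but the same pattern), and the Reshard class and numShards parameter are illustrative names, not anything from the original pipeline:

import java.util.concurrent.ThreadLocalRandom;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.GroupByKey;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.WithKeys;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;

// Semantically a no-op: key each element randomly, group, then ungroup.
// The GroupByKey forces the data to be materialized, so the write's
// parallelism is no longer tied to the parallelism of the read.
public class Reshard<T> extends PTransform<PCollection<T>, PCollection<T>> {

  private final int numShards;  // illustrative parameter: how many random keys to spread over

  public Reshard(int numShards) {
    this.numShards = numShards;
  }

  @Override
  public PCollection<T> expand(PCollection<T> input) {
    final int shards = numShards;
    return input
        // Tag each element with a random key in [0, shards).
        .apply("AssignRandomKey",
            WithKeys.<Integer, T>of(x -> ThreadLocalRandom.current().nextInt(shards))
                .withKeyType(TypeDescriptors.integers()))
        // Grouping materializes the data at this point.
        .apply("Group", GroupByKey.<Integer, T>create())
        // Ungroup: emit every element again and drop the temporary key.
        .apply("Ungroup", ParDo.of(new DoFn<KV<Integer, Iterable<T>>, T>() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            for (T value : c.element().getValue()) {
              c.output(value);
            }
          }
        }))
        // Coder inference can fail for the generic output type, so reuse the input's coder.
        .setCoder(input.getCoder());
  }
}

You would apply it to the rows just before the BigQueryIO write, e.g. rows.apply(new Reshard<TableRow>(100)). Newer Beam releases also ship Reshuffle.viaRandomKey(), which packages the same idea as a built-in transform.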



Answer 2:

Dataflow SDK for Java 1.x: as a workaround, you can enable this experiment by passing: --experiments=enable_custom_bigquery_sink
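For reference, here is a minimal sketch of a driver class (the RunWithCustomSink name is mine, and I am assuming the 1.x SDK's com.google.cloud.dataflow.sdk packages) showing that the flag is just another command-line pipeline option forwarded through PipelineOptionsFactory:

import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;

public class RunWithCustomSink {
  public static void main(String[] args) {
    // Launch with, for example:
    //   --runner=DataflowPipelineRunner --project=<your-project> \
    //   --stagingLocation=gs://<your-bucket>/staging \
    //   --experiments=enable_custom_bigquery_sink
    DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
        .withValidation()
        .as(DataflowPipelineOptions.class);

    Pipeline p = Pipeline.create(options);
    // ... build the read / transform / BigQueryIO.Write steps here ...
    p.run();
  }
}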

In Dataflow SDK for Java 2.x, this behavior is the default and no experiments are necessary.

Note that in both versions, temporary files in GCS may be left over if your job fails.

Hope that helps!