OpsCenter backup to S3 location fails

Posted 2019-05-31 17:49

Question:

Using OpsCenter 5.1.1 and DataStax Enterprise 4.5.1 on a 3-node cluster in AWS, I set up a scheduled backup to the local server and also to a bucket in S3. The on-server backup finished successfully on all 3 nodes. The S3 backup runs slowly and fails on all 3 nodes.

Some keyspaces are backed up and files are created in the S3 bucket, but it appears that not all tables are backed up. Looking at /var/log/opscenter/opscenterd.log, I see an OOM error. Why should there be an out-of-memory error when writing to S3 when the local backup succeeds?

EDIT: The data is about 6 GB and I'm backing up all keyspaces. There are fewer than 100 tables altogether. The backup is scheduled to run once daily.

Here's a snippet from the log:

2015-03-31 14:30:34+0000 []  WARN: Marking request 15ae726b-abf6-42b6-94b6-e87e6b0cb592 as failed: {'sstables': {'solr_admin': {u'solr_resources': {'total_size': 186626, 'total_files': 18, 'done_files': 18, 'errors': []}}, 'stage_scheduler': {u'schedule_servers': {'total_size': 468839, 'total_files': 12, 'done_files': 12, 'errors': []}, u'lock_flags': {'total_size': 207313249, 'total_files': 30, 'done_files': 25, 'errors': [u'java.lang.OutOfMemoryError: Java heap space', u'java.lang.OutOfMemoryError: Java heap space', u'java.lang.OutOfMemoryError: Java heap space', u'java.lang.OutOfMemoryError: Java heap space', u'java.lang.OutOfMemoryError: Java heap space']}, u'scheduled_tasks': {'total_size': 3763468, 'total_files': 18, 'done_files': 18, 'errors': []}

Answer 1:

Increase the memory allocated to OpsCenter's datastax-agent:

One option is to increase the size of the Java heap allocated to the OpsCenter agent to avoid the OOM:

On each node in your cluster, look for the datastax-agent-env.sh file and modify the following properties:

-Xmx128M
-Djclouds.mpu.parts.size=16777216

The -Xmx setting controls the heap size of the agent. The -Djclouds.mpu.parts.size setting controls the chunk size used for files uploaded to S3. Since S3 multipart uploads allow a maximum of 10,000 parts per file, the chunk size determines how large a file can be uploaded; the 16 MB default, for example, caps a single file at roughly 160 GB. Increasing the chunk size also requires more memory on the agent, so the agent heap size needs to be increased as well. Here are example settings that allow backing up 250 GB SSTables:

-Xmx256M
-Djclouds.mpu.parts.size=32000000

These settings increase the chunk size to 32 MB and the heap size to 256 MB, allowing for the larger SSTable sizes.
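
For reference, here is a minimal sketch of what the edited datastax-agent-env.sh could look like. The JVM_OPTS lines and the file's location are assumptions based on a typical packaged agent install rather than something confirmed in this answer, so adapt them to whatever structure your file already uses:

# Hypothetical excerpt of datastax-agent-env.sh on one node (path varies by install method).
# Agent heap: raise from 128M to 256M so the larger multipart chunks fit in memory.
JVM_OPTS="$JVM_OPTS -Xmx256M"
# S3 multipart chunk size in bytes (~32 MB). With S3's 10,000-part limit this caps a
# single uploaded file at roughly 32 MB x 10,000 = ~320 GB, up from ~160 GB at the 16 MB default.
JVM_OPTS="$JVM_OPTS -Djclouds.mpu.parts.size=32000000"

The agent only reads these JVM options at startup, so after editing the file on each node, restart the datastax-agent service (for example, sudo service datastax-agent restart on package installs) before re-running the backup.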

Please add the following information to your post:

1) How many tables are you backing up, and how large are they per node?

2) How frequently did you configure your backups to run?