Django haystack with Elasticsearch cannot find dat

Posted 2019-08-04 09:00

Question:

I added Haystack to a Django project that was already successfully deployed to an AWS Elastic Beanstalk instance. Haystack works locally, but in the AWS environment, when I run rebuild_index, I get this error:

Failed to clear Elasticsearch index: ConnectionError(('Connection aborted.', error(111, 'Connection refused'))) caused by: ProtocolError(('Connection aborted.', error(111, 'Connection refused')))
All documents removed.
ERROR:root:Error updating api using default 
Traceback (most recent call last):
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/haystack/management/commands/update_index.py", line 188, in handle_label
    self.update_backend(label, using)
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/haystack/management/commands/update_index.py", line 219, in update_backend
    total = qs.count()
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/models/query.py", line 318, in count
    return self.query.get_count(using=self.db)
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 464, in get_count
    number = obj.get_aggregation(using, ['__count'])['__count']
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 445, in get_aggregation
    result = compiler.execute_sql(SINGLE)
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 838, in execute_sql
    cursor = self.connection.cursor()
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 162, in cursor
    cursor = self.make_debug_cursor(self._cursor())
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 135, in _cursor
    self.ensure_connection()
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 130, in ensure_connection
    self.connect()
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/utils.py", line 97, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 130, in ensure_connection
    self.connect()
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/backends/base/base.py", line 119, in connect
    self.connection = self.get_new_connection(conn_params)
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 176, in get_new_connection
    connection = Database.connect(**conn_params)
  File "/opt/python/run/venv/local/lib/python2.7/site-packages/psycopg2/__init__.py", line 164, in connect
    conn = _connect(dsn, connection_factory=connection_factory, async=async)
OperationalError: could not connect to server: Connection refused
        Is the server running on host "localhost" (127.0.0.1) and accepting
        TCP/IP connections on port 5432?

It appears that Haystack is trying to connect to the database from my local settings instead of the Postgres RDS instance I configured for my AWS Elastic Beanstalk environment, even though the DATABASES setting works on AWS for ./manage.py loaddata.

    if 'RDS_DB_NAME' in os.environ:
        DATABASES = {
            'default': {
                'ENGINE': 'django.db.backends.postgresql_psycopg2',
                'NAME': os.environ['RDS_DB_NAME'],
                'USER': os.environ['RDS_USERNAME'],
                'PASSWORD': os.environ['RDS_PASSWORD'],
                'HOST': os.environ['RDS_HOSTNAME'],
                'PORT': os.environ['RDS_PORT'],
            }
        }
    else:
        DATABASES = {
            'default': {
                'ENGINE': 'django.db.backends.postgresql_psycopg2',
                'NAME': 'hhwc',
                'HOST': 'localhost',
                'PORT': '5432',
            }
        }

Is there something wrong with this DATABASES setting, or does Haystack look somewhere else to find the database it should connect to when generating indexes?

Any help troubleshooting this is welcome. Thanks in advance.

Answer 1:

From the answer to another SO question:

The environment variables are not set within the virtualenv, but by another script. First you have to activate the virtualenv.

source /opt/python/run/venv/bin/activate

Then you need to load the variables by activating the env script in the 'current' directory.

source /opt/python/current/env

The Beanstalk RDS variables are now set and ready to use by any script you execute over SSH.
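
To check whether the variables are actually visible in your SSH session, a small script like the one below may help. It is only a sketch: the helper names are made up for illustration, and the variable names are the ones read by the settings snippet in the question. It mirrors the if/else in that snippet, so it shows which host Django would try to reach.

```python
import os

# Variable names taken from the settings snippet in the question.
RDS_VARS = ("RDS_DB_NAME", "RDS_USERNAME", "RDS_PASSWORD", "RDS_HOSTNAME", "RDS_PORT")


def missing_rds_vars(environ=os.environ):
    """Return the RDS_* variables that are absent from the given environment."""
    return [name for name in RDS_VARS if name not in environ]


def database_host(environ=os.environ):
    """Mirror the if/else in settings.py: which host will Django try to reach?"""
    if "RDS_DB_NAME" in environ:
        return environ["RDS_HOSTNAME"]
    # Same fallback as the else branch of the settings snippet.
    return "localhost"


if __name__ == "__main__":
    print("missing:", missing_rds_vars())
    print("database host:", database_host())
```

If this reports "localhost" on the Beanstalk instance, the RDS variables have not been loaded into the shell, which would match the traceback in the question.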



Answer 2:

The error is not related to the database, but to your Haystack configuration. Check the URL you've used there. Make sure you put :80 after the hostname, as Haystack defaults to port 9200 if you don't explicitly give one, and AWS sets it up on port 80.
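
As a rough illustration of the point about the port, the sketch below (Python 3) adds an explicit port to a URL that lacks one and uses the result in HAYSTACK_CONNECTIONS. The helper name and the endpoint are hypothetical, and the URL is assumed to include its scheme (http:// or https://).

```python
from urllib.parse import urlsplit, urlunsplit


def with_explicit_port(url, port=80):
    """Return url with ":port" added to the host if no port is present.

    The url must include a scheme, otherwise the host cannot be parsed.
    """
    parts = urlsplit(url)
    if parts.port is not None:
        return url  # a port was already given; leave it alone
    netloc = "%s:%d" % (parts.netloc, port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Hypothetical endpoint; substitute your own Elasticsearch domain.
ES_URL = with_explicit_port("http://search-example.us-east-1.es.amazonaws.com/")

HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
        "URL": ES_URL,
        "INDEX_NAME": "haystack",
    },
}
```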



Answer 3:

The error basically means that your Django project is unable to connect to the Amazon Elasticsearch instance. Here is one way to connect to AWS Elasticsearch.

First, install requests_aws4auth:

    sudo pip install requests_aws4auth

Then configure the connection to the Amazon Elasticsearch instance:

    from requests_aws4auth import AWS4Auth
    from elasticsearch import RequestsHttpConnection

    host = 'YOUR_HOST'  # hostname without the port number
    # Signs each request with your AWS credentials for the 'es' service.
    awsauth = AWS4Auth('YOUR_ACCESS_KEY', 'YOUR_SECRET_KEY', 'REGION', 'es')

    HAYSTACK_CONNECTIONS = {
        'default': {
            'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
            'URL': host,
            'INDEX_NAME': 'haystack',
            'KWARGS': {
                'port': 443,
                'http_auth': awsauth,
                'use_ssl': True,
                'verify_certs': True,
                'connection_class': RequestsHttpConnection,
            },
        },
    }

If you still run into problems, you may need to create the index manually:

    curl -XPUT 'Your_AWS_ELASTICSEARCH_URL/haystack/'
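
If curl is not available, roughly the same PUT request can be built with the Python 3 standard library. This is a sketch only: the function name and the endpoint are hypothetical, and like the curl command it sends no AWS signature, so it presumably only succeeds if the domain's access policy allows unsigned requests from your host.

```python
from urllib.request import Request


def create_index_request(base_url, index_name="haystack"):
    """Build (but do not send) the PUT request that creates an index."""
    url = "%s/%s/" % (base_url.rstrip("/"), index_name)
    return Request(url, method="PUT")


# Hypothetical endpoint; substitute your own Elasticsearch URL.
request = create_index_request("https://search-example.us-east-1.es.amazonaws.com")
print(request.get_method(), request.full_url)
# To actually send it, pass the request to urllib.request.urlopen().
```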