pyhive, sqlalchemy can not connect to hadoop sandb

2019-09-18 09:33发布

I have installed,

pip install thrift
pip install PyHive
pip install thrift-sasl

and since pip install sasl failed I downloaded sasl‑0.2.1‑cp27‑cp27m‑win_amd64.whl file and installed it in my Windows 8.1 PC.

Then I wrote this code,

from pyhive import hive
cursor = hive.connect('192.168.1.232', port=10000, auth='NONE')
cursor.execute('SELECT * from sample_07 LIMIT 5',async=True)
print cursor.fetchall()

this gives the error:

Traceback (most recent call last):
  File "C:/DigInEngine/scripts/UserManagementService/fd.py", line 37, in <module>
    cursor = hive.connect('192.168.1.232', port=10000, auth = 'NONE')
  File "C:\Python27\lib\site-packages\pyhive\hive.py", line 63, in connect
    return Connection(*args, **kwargs)
  File "C:\Python27\lib\site-packages\pyhive\hive.py", line 104, in __init__
    self._transport.open()
  File "C:\Python27\lib\site-packages\thrift_sasl\__init__.py", line 72, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2

and this code gives,

from sqlalchemy import create_engine
engine = create_engine('hive://192.168.1.232:10000/default')
try:
    connection = engine.connect()
except Exception, err:
    print err
result = connection.execute('select * from sample_07;')
engine.dispose()

this error,

Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2

I have downloaded Hortonworks sandbox from here and use it in a separate server.

NOTE: I went through this as well but the accepted answer is not working for me, because importing ThriftHive from hive gives Import error although I have pip installed hive. So I decided to use pyhive or sqlalchemy

How can I connect to hive and execute a query easily?

2条回答
狗以群分
2楼-- · 2019-09-18 10:21

Here are steps to build SASL on Windows, but your mileage may vary: A lot of this depends on your particular system's paths and available libraries.

Please also note that these instructions are specific to Python 2.7 (which I see you are using from the paths in your question).

The high-level overview is that you're installing this project: https://github.com/cyrusimap/cyrus-sasl. In order to do that, you have to use the legacy C++ compiler that was used to build Python 2.7. There are a couple of other steps to getting this to work.

Pre-build Steps:

  1. Install Microsoft Visual C++ Compiler for Python 2.7. Use the default installation paths - take note of where it got installed for the next 2 steps (2 options are included in the list below)
  2. Copy this file to whichever of the include locations is appropriate for your install
  3. Make a unistd.h file from this answer in the same include directory

Build steps:

  1. git clone https://github.com/cyrusimap/cyrus-sasl
  2. Open the "VS2013 x64 Native Tools Command Prompt" that's installed with the Compiler from step 1
  3. Change directory to the directory created by step 4, then the lib sub-directory
  4. nmake /f ntmakefile STATIC=no prefix=C:\sasl64
  5. nmake /f ntmakefile prefix=C:\sasl64 STATIC=no install see note below
  6. copy /B C:\sasl64\lib\libsasl.lib /B C:\sasl64\lib\sasl2.lib
  7. pip install thrift_sasl --global-option=build_ext \ --global-option=-IC:\\sasl64\\include \ --global-option=-LC:\\sasl64\\lib

'Include' locations:

  • "C:\Program Files (x86)\Common Files\Microsoft\Visual C++ for Python\9.0\VC\include\stdint.h"
  • "%USERPROFILE%\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\include"

Here's a reference to these same steps, with some additional annotations and explanations: http://java2developer.blogspot.co.uk/2016/08/making-impala-connection-from-python-on.html.

Note The referenced instructions also executed step (8) in the include and win32\include sub-directories, you may have to do that as well.

查看更多
家丑人穷心不美
3楼-- · 2019-09-18 10:25

While using pyhive no authentication can be passed as auth="NOSASL", instead of "None", so your code should look like this:

from pyhive import hive
cursor = hive.connect('192.168.1.232', port=10000, auth='NOSASL')
cursor.execute('SELECT * from sample_07 LIMIT 5',async=True)
print cursor.fetchall()
查看更多
登录 后发表回答