I have some data in HDFS, and I need to access it using Python. Can anyone tell me how to access data from Hive using Python?
You can use the hive library to access Hive from Python. To do so, import the ThriftHive class: from hive import ThriftHive
An example follows:
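A minimal sketch of what such an example might look like, assuming the old HiveServer (v1) Thrift bindings are on your Python path. The host, port, and table name are placeholders, and the helper names open_hive_client and run_query are made up for illustration:

```python
def open_hive_client(host="localhost", port=10000):
    """Open a Thrift connection to a HiveServer (v1) endpoint.

    host and port are placeholders; adjust them for your cluster.
    """
    # Imported lazily so this file still loads where the bindings are absent.
    from hive import ThriftHive
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol

    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = ThriftHive.Client(protocol)
    transport.open()
    return client, transport


def run_query(client, query):
    """Execute a query and return every result row."""
    client.execute(query)
    return client.fetchAll()


# Usage (needs a reachable server; my_table is a hypothetical table name):
# client, transport = open_hive_client("localhost", 10000)
# try:
#     for row in run_query(client, "SELECT * FROM my_table LIMIT 10"):
#         print(row)
# finally:
#     transport.close()
```

Closing the transport in a finally block keeps the socket from leaking if the query fails.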
A much simpler solution if you're on Windows uses pyodbc. As long as you have an ODBC driver and a DSN, that's all you need.
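Something along these lines should work, assuming a Hive ODBC driver is installed and a DSN has been configured for it. The DSN name "HiveDSN" and the function name are placeholders I made up. autocommit=True is passed because Hive ODBC drivers generally do not support transactions:

```python
def query_hive_odbc(sql, dsn="HiveDSN"):
    """Run a query through an ODBC DSN and return all rows.

    "HiveDSN" is a placeholder; use the DSN name you configured.
    """
    # Imported lazily so this file still loads where pyodbc is absent.
    import pyodbc

    conn = pyodbc.connect("DSN=" + dsn, autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()


# Usage (my_table is a hypothetical table name):
# rows = query_hive_odbc("SELECT * FROM my_table LIMIT 10")
```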
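To install, you'll need a few libraries. Assuming the client is PyHive, which the SASL note below suggests, these would be sasl, thrift, thrift-sasl, and PyHive, all installable with pip.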
If you're on Linux, you may need to install SASL separately before installing them. Install the package libsasl2-dev using apt-get or yum or whatever package manager your distribution uses. For Windows there are some options on GNU.org. On a Mac, SASL should be available if you've installed the Xcode developer tools (xcode-select --install).
After installation, you can connect to Hive like this:
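As a rough sketch, assuming the library in question is PyHive (imported as lowercase pyhive) and a HiveServer2 instance is reachable. The host, port, and username are placeholders, and connect_to_hive is just an illustrative wrapper name:

```python
def connect_to_hive(host, port=10000, username=None):
    """Return a DB-API style connection to HiveServer2 via PyHive.

    All arguments are placeholders for your own cluster settings.
    """
    # Imported lazily so this file still loads where PyHive is absent.
    from pyhive import hive

    return hive.Connection(host=host, port=port, username=username)


# Usage (hypothetical host and user):
# conn = connect_to_hive("YOUR_HIVE_HOST", port=10000, username="YOU")
```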
Now that you have the Hive connection, you have options for how to use it. You can just straight-up query:
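A direct query could look roughly like this, assuming conn is a DB-API style connection such as the one a Hive client library returns. The function name and the table in the usage comment are hypothetical:

```python
def fetch_rows(conn, sql):
    """Run one query on an open DB-API connection and return all rows."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        cursor.close()


# Usage (hive_table and cool_stuff are hypothetical names):
# for row in fetch_rows(conn, "SELECT cool_stuff FROM hive_table"):
#     print(row)
```

Note that fetchall pulls every row into memory, so for large results you may prefer iterating with fetchmany or adding a LIMIT.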
...or use the connection to make a pandas DataFrame:
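For instance, a sketch using pandas.read_sql, assuming pandas is installed and conn is an open DB-API connection. The function name and the table in the usage comment are hypothetical, and recent pandas versions may warn when given a raw connection that is not SQLAlchemy or sqlite3:

```python
def query_to_dataframe(conn, sql):
    """Load the result of a query into a pandas DataFrame."""
    # Imported lazily so this file still loads where pandas is absent.
    import pandas as pd

    return pd.read_sql(sql, conn)


# Usage (hive_table and cool_stuff are hypothetical names):
# df = query_to_dataframe(conn, "SELECT cool_stuff FROM hive_table")
# print(df.head())
```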