azure.datalake.store.AdlFileSystem not found in Sp

Posted 2019-09-26 02:01

Question:

I am trying to use Spark SQL to query a CSV file stored in Data Lake Store. When I run the query, I get "java.lang.ClassNotFoundException: Class com.microsoft.azure.datalake.store.AdlFileSystem not found".

How can I use Spark SQL to query a file stored in Data Lake Store? Please help me with a sample.

Example CSV:

Id     Name     Designation
1      aaa      bbb
2      ccc      ddd
3      eee      fff

Thanks in advance, Sowandharya

Answer 1:

At present, HDInsight Spark clusters are not available with Azure Data Lake Storage. Once that support is in place, this should work seamlessly. In the meantime, you can use Azure Data Lake Analytics to do the same job on ADLS with U-SQL queries. For reference, please visit: https://azure.microsoft.com/en-us/documentation/articles/data-lake-analytics-get-started-portal/ We are working on this support, and it is currently targeted for some time before summer 2016. Hope this helps.

Thanks, Sourabh.



Answer 2:

I spent hours today trying to figure this out, so I'm leaving it here in case someone else needs help!

For Hadoop 3.0.1, make sure the line below is uncommented in the hadoop-env.sh file:

export HADOOP_OPTIONAL_TOOLS



Answer 3:

It seems that you didn't configure the Cluster AAD Identity for Data Lake Store when creating the HDInsight cluster.

You can try creating an HDInsight Spark cluster with Data Lake Store in the Azure portal; please see https://azure.microsoft.com/en-us/documentation/articles/data-lake-store-hdinsight-hadoop-use-portal/
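
Since the question asks for a sample, here is a rough PySpark sketch of what the query might look like once the cluster can actually reach the store. It is not taken from any of the answers above. It assumes Spark 2.x or later, a comma-delimited file with a header row, and a cluster whose ADL file system class and credentials are already set up. The account name `mydatalakestore`, the path `/data/employees.csv`, and the helper names `adl_uri`, `query_csv`, and `main` are all made up for illustration.

```python
def adl_uri(account, file_path):
    """Build an adl:// URI for a file in a Data Lake Store account.

    `account` is the store name only (hypothetical here), without the
    azuredatalakestore.net suffix.
    """
    return "adl://{}.azuredatalakestore.net/{}".format(
        account, file_path.lstrip("/")
    )


def query_csv(spark, uri, view="employees"):
    """Load a headered CSV into a temp view and query it with Spark SQL.

    Assumes the file is comma-delimited; a different separator would need
    an extra .option("sep", ...) on the reader.
    """
    df = (
        spark.read
        .option("header", "true")       # first row holds Id, Name, Designation
        .option("inferSchema", "true")  # let Spark guess the column types
        .csv(uri)
    )
    df.createOrReplaceTempView(view)
    return spark.sql("SELECT Id, Name, Designation FROM {}".format(view))


def main():
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("adl-csv-query").getOrCreate()
    # Hypothetical account name and file path; replace with your own.
    uri = adl_uri("mydatalakestore", "/data/employees.csv")
    query_csv(spark, uri).show()
```

Calling `main()` through spark-submit on the cluster should print the rows of the file. If the ClassNotFoundException from the question still shows up, the ADL connector is not on the cluster's classpath and the setup in the answers above needs to be sorted out first.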