SparkSQL read from MySQL database table using Python

Posted 2019-04-05 10:07

Question:

This question already has an answer here:

  • How to work with MySQL and Apache Spark? [closed] 10 answers

I have a 'user' table in MySQL and want to read it into my Spark SQL program. How can I read a MySQL table into Apache Spark's SparkSQL module using Python? Is there a connector I can use for this task? Thanks.

Answer 1:

A similar question has already been answered. Start pyspark like this:

./bin/pyspark --packages mysql:mysql-connector-java:5.1.38

Then just run

sqlContext.read.format("jdbc").options(
    url="jdbc:mysql://localhost/mysql",
    driver="com.mysql.jdbc.Driver",
    dbtable="user",
    user="root",
    password=""
).load().take(10)

This will most likely just work, but it depends on your MySQL setup. If it doesn't, try changing the password, username, database URL, and other settings.
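Since the connection settings are the part most likely to need adjusting, it can help to gather them in one place. The following is a minimal sketch, not part of the original answer: the helper names mysql_jdbc_options and read_mysql_table are hypothetical, the port default of 3306 assumes a standard MySQL install, and the driver class is the one used above for Connector/J 5.1.x. It assumes a SparkSession (Spark 2.0 or later) or any object exposing the same read interface.

```python
def mysql_jdbc_options(host, database, table, user, password, port=3306):
    """Build the option dict for Spark's JDBC data source.

    Hypothetical helper: the keys mirror the options used in the answer
    above; port 3306 is assumed to be the MySQL port in use.
    """
    return {
        "url": "jdbc:mysql://{}:{}/{}".format(host, port, database),
        "driver": "com.mysql.jdbc.Driver",  # driver class for Connector/J 5.1.x
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_mysql_table(spark, **connection):
    """Return a DataFrame for one MySQL table.

    `spark` is expected to be a SparkSession (or SQLContext); nothing is
    fetched from MySQL until an action such as take() or show() runs.
    """
    options = mysql_jdbc_options(**connection)
    return spark.read.format("jdbc").options(**options).load()
```

With pyspark started using the --packages flag shown earlier, a call such as read_mysql_table(spark, host="localhost", database="mysql", table="user", user="root", password="").take(10) should behave like the original snippet. Changing the host, port, or credentials then means editing a single call.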