PYODBC to Pandas - DataFrame not working - Shape o

Posted 2019-03-11 05:38

Question:

I have used pyodbc with Python before, but I have now installed it on a new machine (Win 8 64-bit, Python 2.7 64-bit, PythonXY with Spyder).

Previously I would do the following (more realistic examples are at the bottom):

columns = [column[0] for column in cursor.description]
temp = cursor.fetchall()
data = pandas.DataFrame(temp,columns=columns)

and it would work fine. Now DataFrame no longer seems able to convert the data fetched from the cursor. It raises:

Shape of passed values is (x,y), indices imply (w,z)

I think I see where the issue is. Suppose I fetch only one row: DataFrame wants to shape it as (1, 1), a single element, whereas I want (1, X), where X is the length of the row.

I am not sure why the behavior changed. Maybe it is the pandas version I have, or the pyodbc version, but updating is problematic. I tried to update some modules, but it breaks everything whichever method I use (binaries for the right machine/installation, pip install, easy_install, anything), which is very frustrating. I will probably avoid Win 8 64-bit for Python from now on.

Real examples:

sql = 'Select * from TABLE'
cursor.execute(sql)
columns = [column[0] for column in cursor.description]
data = cursor.fetchall()
con.close()
results = DataFrame(data, columns=columns)

Returns: ValueError: Shape of passed values is (1, 1540), indices imply (51, 1540)

Notice that:

ipdb> type(data)
<type 'list'>
ipdb> np.shape(data)
(1540, 51)
ipdb> type(data[0])
<type 'pyodbc.Row'>

Now, for example, if we do:

ipdb> DataFrame([1,2,3],columns=['a','b','c'])

*** ValueError: Shape of passed values is (1, 3), indices imply (3, 3)

and if we do:

ipdb> DataFrame([[1,2,3]],columns=['a','b','c'])

   a  b  c
0  1  2  3

However, even trying:

ipdb> DataFrame([data[0]], columns=columns)
*** ValueError: Shape of passed values is (1, 1), indices imply (51, 1)

or

ipdb> DataFrame(data[0], columns=columns)
*** PandasError: DataFrame constructor not properly called!

Please help :) Thanks!

Answer 1:

As of Pandas 0.12 (I believe) you can do:

import pandas
import pyodbc

sql = 'select * from table'
cnn = pyodbc.connect(...)

data = pandas.read_sql(sql, cnn)

Prior to 0.12, you could do:

import pandas
from pandas.io.sql import read_frame
import pyodbc

sql = 'select * from table'
cnn = pyodbc.connect(...)

data = read_frame(sql, cnn)
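
To try the read_sql approach without an ODBC data source, here is a minimal sketch that swaps in Python's built-in sqlite3 driver for pyodbc. That swap is an assumption made purely so the example runs anywhere, and the table name demo and its columns are made up for illustration. With pyodbc the call should have the same shape: pass the query and the open connection.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the ODBC connection (assumption).
cnn = sqlite3.connect(":memory:")
cnn.execute("CREATE TABLE demo (id INTEGER, name TEXT)")  # hypothetical table
cnn.executemany("INSERT INTO demo VALUES (?, ?)", [(1, "a"), (2, "b")])
cnn.commit()

# read_sql runs the query and builds the DataFrame, column names included,
# so there is no need to touch cursor.description or fetchall() yourself.
data = pd.read_sql("SELECT * FROM demo ORDER BY id", cnn)
cnn.close()

print(data.shape)
print(list(data.columns))
```

Because pandas does the fetching, the Row objects never reach the DataFrame constructor directly, which is why this route sidesteps the shape error.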


Answer 2:

This is because the cursor returns not a list of tuples but a list of Row objects. These are similar to tuples (arguably better), but they confuse the pandas DataFrame constructor. In the original example, do this before creating the DataFrame:

for i in range(len(temp)):
    temp[i] = tuple(temp[i])
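
The same conversion can be wrapped in a small helper. This is only a sketch, not part of pyodbc or pandas: FakeRow is a hypothetical stand-in for pyodbc.Row so the snippet runs without a database, and rows_to_records and rows_to_frame are illustrative names. It assumes each row is iterable and that the description entries carry the column name first, as in the question.

```python
def rows_to_records(rows, description):
    """Return (column_names, list_of_tuples) from DB-API style results."""
    columns = [col[0] for col in description]
    # Plain tuples give pandas an unambiguous rows-by-columns structure.
    records = [tuple(row) for row in rows]
    return columns, records


def rows_to_frame(rows, description):
    """Build a DataFrame from the converted rows (requires pandas)."""
    import pandas as pd  # imported here so rows_to_records works without pandas
    columns, records = rows_to_records(rows, description)
    return pd.DataFrame.from_records(records, columns=columns)


class FakeRow(object):
    """Hypothetical stand-in for pyodbc.Row: iterable, but not a tuple."""

    def __init__(self, *values):
        self._values = values

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


if __name__ == "__main__":
    description = [("id", int), ("name", str)]  # mimics cursor.description
    rows = [FakeRow(1, "a"), FakeRow(2, "b")]   # mimics cursor.fetchall()
    columns, records = rows_to_records(rows, description)
    print(columns)
    print(records)
```

With a real cursor, the call would presumably look like rows_to_frame(cursor.fetchall(), cursor.description), made before the connection is closed.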