Given this data frame from some other question:
               Constraint Name     TotalSP   Onpeak    Offpeak
Constraint_ID
77127          aaaaaaaaaaaaaaaaaa  -2174.5   -2027.21  -147.29
98333          bbbbbbbbbbbbbbbbbb  -1180.62  -1180.62  0
1049           cccccccccccccccccc  -1036.53  -886.77   -149.76
It seems like there is an index, Constraint_ID. When I try to read it in with pd.read_clipboard, this is how it gets loaded:
   Constraint     Name                TotalSP   Onpeak    Offpeak
0  Constraint_ID  NaN                 NaN       NaN       NaN
1  77127          aaaaaaaaaaaaaaaaaa  -2174.50  -2027.21  -147.29
2  98333          bbbbbbbbbbbbbbbbbb  -1180.62  -1180.62  0.00
3  1049           cccccccccccccccccc  -1036.53  -886.77   -149.76
This is clearly wrong. How can I correct this?
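One possible fix, sketched under a couple of assumptions: the copied text still has two or more spaces between columns (as a printed DataFrame does), and io.StringIO with read_csv stands in for the clipboard so the example runs anywhere. read_clipboard hands the clipboard text to read_csv, so the same keyword arguments apply. The variable name clipboard_text is made up for illustration.

```python
import io

import pandas as pd

# Stand-in for the clipboard contents. Assumption: columns are separated by
# two or more spaces, while "Constraint Name" contains only a single space.
clipboard_text = """\
               Constraint Name     TotalSP   Onpeak    Offpeak
Constraint_ID
77127          aaaaaaaaaaaaaaaaaa  -2174.5   -2027.21  -147.29
98333          bbbbbbbbbbbbbbbbbb  -1180.62  -1180.62  0
1049           cccccccccccccccccc  -1036.53  -886.77   -149.76
"""

# Split only on runs of two or more whitespace characters, so the single
# space inside "Constraint Name" no longer breaks the header apart.
# A regex separator requires the Python parsing engine.
df = pd.read_csv(io.StringIO(clipboard_text), sep=r"\s\s+", engine="python")

print(df.index.name)
print(list(df.columns))
print(df)
```

With the table actually on the clipboard, the equivalent call would be pd.read_clipboard(sep=r"\s\s+").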
read_clipboard by default uses whitespace to separate the columns. The problem you see is caused by the whitespace in the name of the first column, "Constraint Name". If you specify two or more spaces as the separator (for example sep='\s\s+'), it will figure out the index column itself from the table format.
The index_col argument can also be used to tell pandas that the first column is the index, in case the structure cannot be inferred from the separator alone.
This is not as cool as @ayhan's answer, but it works pretty well most of the time. Assuming you are using IPython or Jupyter, just copy and paste the data into a %%file cell.
Then do some quick edits. With multi-indexes, just move the index name up a line so that it sits on the header row (also shortening "Constraint_ID" to "ID" to save a little space in this case).
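A minimal sketch of that workflow, with some assumptions: outside a notebook there is no %%file magic, so an io.StringIO object stands in for the saved file, and the variable name edited_text is hypothetical. The text is shown after the manual edit, with the index name moved onto the header row and shortened to "ID".

```python
import io

import pandas as pd

# Stand-in for the file written by %%file, after the manual edit:
# the index name sits on the header row and is shortened to "ID".
edited_text = """\
ID     Constraint Name     TotalSP   Onpeak    Offpeak
77127  aaaaaaaaaaaaaaaaaa  -2174.5   -2027.21  -147.29
98333  bbbbbbbbbbbbbbbbbb  -1180.62  -1180.62  0
1049   cccccccccccccccccc  -1036.53  -886.77   -149.76
"""

# read_fwf infers the column boundaries from the whitespace alignment,
# so the space inside "Constraint Name" does not split the column.
df = pd.read_fwf(io.StringIO(edited_text))

# Promote the first column to the index afterwards.
df = df.set_index("ID")
print(df)
```

In a notebook, the same call would take the file name given to %%file instead of the StringIO object.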
read_fwf generally works pretty well on tabular stuff like this, and (usually) deals correctly with spaces in column names. Of course, you can also use this basic method with read_csv.
The nice thing about this method is that for small sample data you can deal with just about any of the weird ways that users post data here. And there are a lot of weird ways. ;-)
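For the read_csv variant, a rough sketch under the same assumptions as before: the edited text keeps at least two spaces between columns, io.StringIO stands in for the saved file, and edited_text is a made-up name. Here index_col=0 tells pandas explicitly that the first column is the index.

```python
import io

import pandas as pd

# Hand-edited sample: index name on the header row, columns separated
# by two or more spaces (an assumption about how the text was pasted).
edited_text = """\
ID     Constraint Name     TotalSP   Onpeak    Offpeak
77127  aaaaaaaaaaaaaaaaaa  -2174.5   -2027.21  -147.29
98333  bbbbbbbbbbbbbbbbbb  -1180.62  -1180.62  0
1049   cccccccccccccccccc  -1036.53  -886.77   -149.76
"""

# Regex separator (needs the Python engine) plus an explicit index column.
df = pd.read_csv(
    io.StringIO(edited_text),
    sep=r"\s\s+",
    engine="python",
    index_col=0,
)
print(df)
```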