
Can we load a pandas DataFrame in .NET with IronPython?

Posted 2020-06-12 06:13

Question:

Can we load a pandas DataFrame in .NET using IronPython? If not, I am thinking of writing the DataFrame to a CSV file and then reading that file on the .NET side.

Answer 1:

No. pandas is tightly tied to CPython: it depends on NumPy and on compiled extension modules built against CPython's C API. As you said, your best bet is to do the analysis in CPython with pandas and export the result to CSV.
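The export step can be sketched as follows. This is a minimal example, not the asker's actual code: the sample DataFrame, its column names, and the file name `result.csv` are all assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the real analysis result
df = pd.DataFrame({"name": ["a", "b"], "value": [1.5, 2.0]})

# Hypothetical output location; any path the .NET process can read works
csv_path = os.path.join(tempfile.gettempdir(), "result.csv")

# index=False leaves out the pandas row index, so the file holds only the
# data columns; an explicit encoding avoids guessing on the .NET side
df.to_csv(csv_path, index=False, encoding="utf-8")

# Read the file back as plain text to check what the .NET reader will see
with open(csv_path, encoding="utf-8", newline="") as f:
    lines = f.read().splitlines()
```

On the .NET side, the file can then be read with whatever CSV parser is already in use, provided it is opened with the same encoding.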



Answer 2:

Regarding the serialization option:

I'm still investigating a similar case: we want to process the data in Python and then use the results in C#. Our requirement was to keep the Python part platform independent if possible, so that we can run our number crunching on either Linux or Windows. Long story short, we decided to use binary serialization and deserialization with MessagePack: http://msgpack.org/index.html

We convert the DataFrame values to a list of rows and serialize it to a file:

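import msgpack as mp
# Convert the DataFrame into a list of rows that msgpack can serialize
data_as_list = df.values.tolist()
# Open in binary mode; the with block closes the file after writing
with open("d:\\msgpack1.mp", "wb") as f:
    mp.pack(data_as_list, f)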

Then, on the C# side, we use the .NET implementation of MessagePack to deserialize the data:

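using System.IO;
using MsgPack;
using MsgPack.Serialization;
var serializer = SerializationContext.Default.GetSerializer<MessagePackObject[][]>();
// Dispose the file stream once every row has been read
MessagePackObject[][] unpackedObject;
using (var stream = File.OpenRead("d:\\msgpack1.mp"))
    unpackedObject = serializer.Unpack(stream);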

Main advantages of binary serialization:

  • it is less prone to encoding-related issues than text-based serialization formats such as CSV, JSON, or XML
  • depending on the data, it can be faster than CSV (it was in our case): http://matthewrocklin.com/blog/work/2015/03/16/Fast-Serialization/
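Before handing the file to C#, the Python side can be checked with a quick in-memory round trip. The sketch below assumes a small, made-up DataFrame, and it uses `msgpack.packb` and `msgpack.unpackb` on bytes instead of writing a file. It also shows one thing to watch for: `df.values` converts mixed numeric columns to a common dtype, so integers may arrive as floats.

```python
import msgpack
import pandas as pd

# Hypothetical sample data; "id" is an integer column, "score" a float column
df = pd.DataFrame({"id": [1, 2], "score": [0.5, 0.75]})

# Same conversion as above: one inner list per DataFrame row.
# Column names and the index are not included in this payload.
data_as_list = df.values.tolist()

# Serialize to bytes and immediately deserialize to verify the payload
packed = msgpack.packb(data_as_list)
restored = msgpack.unpackb(packed, raw=False)

# df.values produced a float64 array, so the ids come back as floats
id_types = {type(row[0]) for row in restored}
```

If the column names or the original integer types matter on the C# side, they have to be sent separately or the payload has to be built differently, since this list-of-rows form carries neither.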


Answer 3:

It is possible to call CPython from .NET using Python.NET:

https://github.com/pythonnet/pythonnet/tree/develop