I have a couple of MongoDB documents in which one of the fields is best represented as a matrix (numpy array). I would like to save such a document to MongoDB. How do I do this?
{
'name' : 'subject1',
'image_name' : 'blah/foo.png',
'feature1' : np.array(...)
}
For a 1D numpy array, you can use lists:
# serialize 1D array x
record['feature1'] = x.tolist()
# deserialize 1D array x (np.fromiter requires an explicit dtype; float is assumed here)
x = np.fromiter( record['feature1'], dtype=float )
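To make the list round trip concrete, here is a minimal sketch that runs without a database. The plain dict only stands in for a MongoDB document, and the sample values are made up for illustration. Note that the list itself does not carry a dtype, so it has to be inferred or given when rebuilding the array.

```python
import numpy as np

# A plain dict stands in for the MongoDB document (assumption: no database involved).
record = {'name': 'subject1', 'image_name': 'blah/foo.png'}

x = np.array([1.0, 2.5, 4.0])

# serialize: tolist() produces plain Python floats, which BSON can store
record['feature1'] = x.tolist()

# deserialize, option 1: let numpy infer the dtype from the list contents
x_inferred = np.array(record['feature1'])

# deserialize, option 2: np.fromiter with the dtype spelled out
x_explicit = np.fromiter(record['feature1'], dtype=float)
```

If the original dtype matters (say float32), it would need to be stored in the document as well, for example as a string field next to the list.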
For multidimensional data, I believe you'll need to use pickle and pymongo.binary.Binary:
# serialize 2D array y
record['feature2'] = pymongo.binary.Binary( pickle.dumps( y, protocol=2 ) )
# deserialize 2D array y
y = pickle.loads( record['feature2'] )
We've built an open source library for storing numeric data (Pandas, numpy, etc.) in MongoDB:
https://github.com/manahl/arctic
Best of all, it's really easy to use, pretty fast, and supports data versioning, multiple data libraries, and more.
The code pymongo.binary.Binary(...) didn't work for me; maybe we need to use bson, as @tcaswell suggested.
Anyway, here is one solution for a multidimensional numpy array:
>>> from bson.binary import Binary
>>> import pickle
# convert numpy array to Binary, store record in mongodb
>>> record['feature2'] = Binary(pickle.dumps(npArray, protocol=2), subtype=128)
# get record from mongodb, convert Binary to numpy array
>>> npArray = pickle.loads(record['feature2'])
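As an alternative to pickle, the array can be written to bytes in numpy's own .npy format, which keeps dtype and shape and avoids unpickling data read back from the database. This is only a sketch under assumptions: array_to_bytes and bytes_to_array are hypothetical helper names, the dict stands in for the document, and the bytes would still be wrapped in Binary (or stored directly as bytes on Python 3) when inserting.

```python
import io
import numpy as np

def array_to_bytes(arr):
    """Serialize an array to .npy-format bytes (dtype and shape included)."""
    buf = io.BytesIO()
    np.save(buf, arr, allow_pickle=False)
    return buf.getvalue()

def bytes_to_array(data):
    """Rebuild an array from .npy-format bytes."""
    return np.load(io.BytesIO(data), allow_pickle=False)

npArray = np.random.random((4, 5))

# A plain dict stands in for the MongoDB document (assumption: no database involved).
record = {'name': 'subject1'}
record['feature2'] = array_to_bytes(npArray)   # wrap in bson.binary.Binary(...) if desired

restored = bytes_to_array(record['feature2'])
```

With allow_pickle=False, object arrays are rejected on both save and load, so this only suits plain numeric arrays.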
That said, credit goes to MongoWrapper; I used code written by its authors.
Have you tried Monary?
They have examples on the site
http://djcinnovations.com/index.php/archives/103
Have you tried MongoWrapper? I think it is simple:
Declare a connection to the MongoDB server and the collection in which to save your numpy data.
import mongowrapper as mdb
db = mdb.MongoWrapper(dbName='test',
collectionName='test_collection',
hostname="localhost",
port="27017")
my_dict = {"name": "Important experiment",
"data":np.random.random((100,100))}
The dictionary's just as you'd expect it to be:
print my_dict
{'data': array([[ 0.773217, 0.517796, 0.209353, ..., 0.042116, 0.845194,
0.733732],
[ 0.281073, 0.182046, 0.453265, ..., 0.873993, 0.361292,
0.551493],
[ 0.678787, 0.650591, 0.370826, ..., 0.494303, 0.39029 ,
0.521739],
...,
[ 0.854548, 0.075026, 0.498936, ..., 0.043457, 0.282203,
0.359131],
[ 0.099201, 0.211464, 0.739155, ..., 0.796278, 0.645168,
0.975352],
[ 0.94907 , 0.363454, 0.912208, ..., 0.480943, 0.810243,
0.217947]]),
'name': 'Important experiment'}
Save the data to MongoDB:
db.save(my_dict)
To load the data back:
my_loaded_dict = db.load({"name":"Important experiment"})