AttributeError using pyBrain _splitWithPortion - o

Posted 2019-03-18 01:05

I'm testing out PyBrain by following the basic classification tutorial here, and a different take on it that uses some more realistic data here. However, calling trndata._convertToOneOfMany() raises this error:

AttributeError: 'SupervisedDataSet' object has no attribute '_convertToOneOfMany'

The data set is created as a classification.ClassificationDataSet object, but calling splitWithProportion seems to turn it into a supervised.SupervisedDataSet object. Being fairly new to Python, I don't find the error itself surprising, since supervised.SupervisedDataSet doesn't have that method and classification.ClassificationDataSet does. The code is below.

However, the exact same code is used across so many tutorials that I feel I must be missing something, as plenty of other people have it working. I've looked at changes to the codebase on GitHub and there's nothing around this function. I've also tried running under Python 3 vs 2.7, but it makes no difference. If anyone has any pointers to get me back on the right path, that would be very much appreciated.

import numpy as np
from pybrain.datasets import ClassificationDataSet

# X and y are loaded elsewhere (not shown)
# flatten the 64x64 data into one-dimensional vectors of 4096 values
ds = ClassificationDataSet(4096, 1, nb_classes=40)
for k in range(len(X)):  # length of X is 400
    # add a new sample consisting of input and target
    ds.addSample(np.ravel(X[k]), y[k])

print(type(ds))
tstdata, trndata = ds.splitWithProportion(0.25)
print(type(trndata))

# these calls raise the AttributeError quoted above
trndata._convertToOneOfMany()
tstdata._convertToOneOfMany()

6 answers
Luminary・发光体
#2 · 2019-03-18 01:15

I had the same problem. I added the following code to make it work on my machine.

from pybrain.datasets import ClassificationDataSet

# alldata is the original ClassificationDataSet (2 inputs, 1 target, 3 classes)
tstdata_temp, trndata_temp = alldata.splitWithProportion(0.25)

# copy each split, sample by sample, into a fresh ClassificationDataSet
tstdata = ClassificationDataSet(2, 1, nb_classes=3)
for n in range(tstdata_temp.getLength()):
    tstdata.addSample(tstdata_temp.getSample(n)[0], tstdata_temp.getSample(n)[1])

trndata = ClassificationDataSet(2, 1, nb_classes=3)
for n in range(trndata_temp.getLength()):
    trndata.addSample(trndata_temp.getSample(n)[0], trndata_temp.getSample(n)[1])

This converts tstdata and trndata back to the ClassificationDataSet type.
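
The same copy loop can be wrapped in a small helper. What follows is only a sketch: rebuild_dataset is a hypothetical name, not a PyBrain function, and FakeDataSet is a stand-in that mimics the three methods used above (getLength, getSample, addSample) so the example runs without PyBrain installed.

```python
def rebuild_dataset(source, target):
    """Copy every (input, target) sample from `source` into `target`.

    Hypothetical helper: `source` needs getLength() and getSample(n),
    `target` needs addSample(inp, out), as in the loops above.
    """
    for n in range(source.getLength()):
        sample = source.getSample(n)
        target.addSample(sample[0], sample[1])
    return target


class FakeDataSet:
    """Stand-in for a PyBrain data set, for illustration only."""

    def __init__(self):
        self.samples = []

    def addSample(self, inp, out):
        self.samples.append((inp, out))

    def getLength(self):
        return len(self.samples)

    def getSample(self, n):
        return self.samples[n]


src = FakeDataSet()
src.addSample([0.1, 0.2], [1])
src.addSample([0.3, 0.4], [2])

dst = rebuild_dataset(src, FakeDataSet())
print(dst.getLength())
```

With PyBrain, the call would presumably look like rebuild_dataset(tstdata_temp, ClassificationDataSet(2, 1, nb_classes=3)), with the dimensions and class count matching your own data.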

干净又极端
#3 · 2019-03-18 01:21

So, I did the following without getting an error:

from pybrain.datasets import ClassificationDataSet
ds = ClassificationDataSet(4096, 1 , nb_classes=40)
for k in range(400):
    ds.addSample(k,k%4)
print(type(ds))
# <class 'pybrain.datasets.classification.ClassificationDataSet'>
tstdata, trndata = ds.splitWithProportion(0.25)
print(type(trndata))
# <class 'pybrain.datasets.classification.ClassificationDataSet'>
print(type(tstdata))
# <class 'pybrain.datasets.classification.ClassificationDataSet'>
trndata._convertToOneOfMany()
tstdata._convertToOneOfMany()

The only difference I see between my code and yours is your use of X. Perhaps you can confirm that my code works on your machine, and if so, we could look into what it is about X that is confusing things?

何必那么认真
#4 · 2019-03-18 01:24

I tried the workaround suggested by Muhammed Miah, but I was still tripped up when running the tutorial, at this line:

print( trndata['input'][0], trndata['target'][0], trndata['class'][0])

trndata['class'] was an empty array, so indexing it with [0] raised an error.

I was able to work around this by writing my own ConvertToOneOfMany function:

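import numpy as np
from pybrain.datasets import ClassificationDataSet

def ConvertToOneOfMany(d, nb_classes, bounds=(0, 1)):
    # copy every sample into a fresh ClassificationDataSet
    d2 = ClassificationDataSet(d.indim, d.outdim, nb_classes=nb_classes)
    for n in range(d.getLength()):
        d2.addSample(d.getSample(n)[0], d.getSample(n)[1])
    oldtarg = d.getField('target')
    # one row per sample, one column per class, filled with the lower bound
    # ('int32' rather than 'Int32', which newer NumPy releases reject)
    newtarg = np.zeros([len(d), nb_classes], dtype='int32') + bounds[0]
    for i in range(len(d)):
        # set the column of this sample's class to the upper bound
        newtarg[i, int(oldtarg[i, 0])] = bounds[1]
    # keep the original labels in 'class'; 'target' becomes the one-of-many matrix
    d2.setField('class', oldtarg)
    d2.setField('target', newtarg)
    return d2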
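
For readers unfamiliar with the encoding itself, here is a dependency-free sketch of what the one-of-many conversion does to the labels. The name one_of_many is hypothetical and this is not PyBrain's implementation; it only mirrors the bounds logic of the function above using plain lists.

```python
def one_of_many(labels, nb_classes, bounds=(0, 1)):
    """Return one row per label, with bounds[1] in the label's column
    and bounds[0] everywhere else (illustrative helper, not PyBrain)."""
    low, high = bounds
    rows = []
    for label in labels:
        row = [low] * nb_classes
        row[int(label)] = high
        rows.append(row)
    return rows


encoded = one_of_many([0, 2, 1], nb_classes=3)
print(encoded)  # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

# Passing different bounds changes the "off" and "on" values
print(one_of_many([1], nb_classes=2, bounds=(-1, 1)))  # [[-1, 1]]
```

Because nb_classes is passed explicitly, every row has the same width even when some class never occurs in the labels.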
(account banned)
#5 · 2019-03-18 01:24

The implementation of splitWithProportion changed between PyBrain versions 0.3.2 and 0.3.3, introducing this bug, which breaks polymorphism.
As of now, the library hasn't been updated since January 2015, so using some kind of workaround is the only course of action at the moment.

You can check the responsible commit here: https://github.com/pybrain/pybrain/commit/2f02b8d9e4e9d6edbc135a355ab387048a00f1af
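
As a rough illustration of how a change like this can break polymorphism, consider the toy classes below. They are hypothetical and are not PyBrain's source: a base-class method that names the base class explicitly when building its result hands back base-class objects even when called on a subclass, while one that uses type(self) keeps the caller's class.

```python
class Base:
    """Toy stand-in for a base data set class (not PyBrain code)."""

    def __init__(self, items=()):
        self.items = list(items)

    def split_hardcoded(self, k):
        # Names Base explicitly, so a subclass instance is "downgraded"
        return Base(self.items[:k]), Base(self.items[k:])

    def split_polymorphic(self, k):
        # type(self) is the caller's actual class, so the subclass survives
        cls = type(self)
        return cls(self.items[:k]), cls(self.items[k:])


class Labeled(Base):
    """Toy subclass that adds a method the base class lacks."""

    def convert(self):
        return [str(i) for i in self.items]


ds = Labeled([1, 2, 3, 4])

left, right = ds.split_hardcoded(1)
print(type(left).__name__, hasattr(left, "convert"))    # Base False

left2, right2 = ds.split_polymorphic(1)
print(type(left2).__name__, hasattr(left2, "convert"))  # Labeled True
```

Note that type(self)(...) only works when the subclass constructor accepts the same arguments as the base class, which is one reason a real fix may need more than a one-line change.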

乱世女痞
#6 · 2019-03-18 01:25

The simplest workaround I found was to call splitWithProportion() first, then update the number of classes, and then call _convertToOneOfMany().

tstdata, trndata = alldata.splitWithProportion( 0.25 )
tstdata.nClasses = alldata.nClasses
trndata.nClasses = alldata.nClasses
tstdata._convertToOneOfMany(bounds=[0, 1])
trndata._convertToOneOfMany(bounds=[0, 1])

Updating nClasses on both tstdata and trndata guarantees that their target fields end up with the same dimensions.
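
To see why the target widths can otherwise differ, here is a small dependency-free sketch. It assumes the class count is inferred separately for each subset from the labels that happen to be present in it; infer_class_count is a hypothetical helper, not a PyBrain function.

```python
def infer_class_count(labels):
    # Assumption for illustration: count the distinct labels that are present
    return len(set(labels))


all_labels = [0, 1, 2, 0, 1, 0]

# A split in which class 2 happens to land only in the training part
test_labels = all_labels[:2]    # [0, 1]
train_labels = all_labels[2:]   # [2, 0, 1, 0]

print(infer_class_count(test_labels))   # 2
print(infer_class_count(train_labels))  # 3

# Copying the count from the full data set gives both parts the same width
shared_count = infer_class_count(all_labels)
print(shared_count)                      # 3
```

With a small test split and many classes, a class missing from one part is quite likely, which is when copying the count from the full data set matters.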

I was getting errors when working with a ClassificationDataSet regardless of whether I called _convertToOneOfMany first and splitWithProportion second, or the other way around. So I suggested an update to the splitWithProportion function. You can see the whole code in this pull request.

神经病院院长
#7 · 2019-03-18 01:30

I have the same issue and think I fixed it. See this pull request.

(Python 2.7.6, PyBrain 0.3.3, OS X 10.9.5)
