
Question:

I typically get PCA loadings like this:

from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X_t = pca.fit(X).transform(X)
loadings = pca.components_

If I run PCA using a scikit-learn pipeline ...

from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2)),
])
X_t = pipeline.fit_transform(X)

... is it possible to get the loadings?

Simply trying loadings = pipeline.components_ fails:

AttributeError: 'Pipeline' object has no attribute 'components_'

Thanks!

(Also interested in extracting attributes like coef_ from learning pipelines.)

Answer 1:

Did you look at the documentation? http://scikit-learn.org/dev/modules/pipeline.html I feel it is pretty clear.

Update: as of scikit-learn 0.21 you can simply use square brackets with the step name:

pipeline['pca']

or with an index:

pipeline[1]
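
As a minimal sketch of the bracket syntax, assuming scikit-learn 0.21 or later: the random matrix X below is only a stand-in for the asker's data, and the names pca_by_name and pca_by_index are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data (an assumption, not from the question): 50 samples, 4 features.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 4))

pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2)),
])
X_t = pipeline.fit_transform(X)

# Both lookups return the same fitted PCA object.
pca_by_name = pipeline['pca']
pca_by_index = pipeline[1]

# components_ has shape (n_components, n_features).
loadings = pca_by_name.components_
```

Note that components_ holds the unit-length principal axes. Some definitions of "loadings" additionally scale each axis by the square root of the matching entry in explained_variance_, so check which convention you need.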

There are two ways to get to the steps in a pipeline, either using indices or using the string names you gave:

pipeline.named_steps['pca']
pipeline.steps[1][1]

This will give you the fitted PCA object, from which you can read components_. With named_steps you can also use attribute access with a dot, which allows autocompletion:

pipeline.named_steps.pca.components_
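
The same lookups apply to learning pipelines, which the question also asks about. The following is a sketch under stated assumptions: the data, the labels y, the step name 'clf', and the choice of LogisticRegression as the final estimator are all illustrative and do not come from the question.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data (an assumption): 60 samples, 4 features, binary labels.
rng = np.random.RandomState(0)
X = rng.normal(size=(60, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf_pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2)),
    ('clf', LogisticRegression()),  # hypothetical final estimator
])
clf_pipeline.fit(X, y)

# Dictionary-style and attribute-style access reach the same fitted step.
loadings = clf_pipeline.named_steps['pca'].components_
same_loadings = clf_pipeline.named_steps.pca.components_

# coef_ lives on the fitted final estimator, not on the Pipeline itself.
coef = clf_pipeline.named_steps['clf'].coef_

# steps is a list of (name, estimator) tuples, so [-1][1] is the last estimator.
final_step = clf_pipeline.steps[-1][1]
```

The coefficients here refer to the two PCA components that the classifier actually saw, not to the four original features.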

Answer 2:

Using Neuraxle

Working with pipelines is simpler using Neuraxle. For instance, you can do this:

from neuraxle.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Create and fit the pipeline: 
pipeline = Pipeline([
    StandardScaler(),
    PCA(n_components=2)
])
pipeline, X_t = pipeline.fit_transform(X)

# Get the components: 
pca = pipeline[-1]
components = pca.components_

You can access the PCA step in any of these three ways:

pipeline['PCA']
pipeline[-1]
pipeline[1]

Neuraxle is a pipelining library built on top of scikit-learn. It supports managing spaces of hyperparameter distributions, nested pipelines, saving and reloading, REST API serving, and more. It is also designed to work with deep learning algorithms and to allow parallel computing.

Nested pipelines:

You can nest pipelines within pipelines, as below.

# Create and fit the pipeline: 
pipeline = Pipeline([
    StandardScaler(),
    Identity(),
    Pipeline([
        Identity(),  # Note: an Identity step is a step that does nothing. 
        Identity(),  # We use it here for demonstration purposes. 
        Identity(),
        Pipeline([
            Identity(),
            PCA(n_components=2)
        ])
    ])
])
pipeline, X_t = pipeline.fit_transform(X)

Then you'd need to do this:

# Get the components: 
pca = pipeline["Pipeline"]["Pipeline"][-1]
components = pca.components_
