I forked the rasa_nlu repository to work on a part of the code I want to modify: in the file model.py there is a function train(...) that calls component.train(...), and that call seems to trigger warnings without giving their origin. I want to find out what triggers them.
Basically it applies this function to a list of components:
[<rasa_nlu.utils.spacy_utils.SpacyNLP object at 0x7f3abbfbd780>,
 <rasa_nlu.tokenizers.spacy_tokenizer.SpacyTokenizer object at 0x7f3abbfbd710>,
 <rasa_nlu.featurizers.spacy_featurizer.SpacyFeaturizer object at 0x7f3abbfbd748>,
 <rasa_nlu.featurizers.regex_featurizer.RegexFeaturizer object at 0x7f3abbd1a630>,
 <rasa_nlu.extractors.crf_entity_extractor.CRFEntityExtractor object at 0x7f3abbd1a748>,
 <rasa_nlu.extractors.entity_synonyms.EntitySynonymMapper object at 0x7f3abbd1a3c8>,
 <rasa_nlu.classifiers.sklearn_intent_classifier.SklearnIntentClassifier object at 0x7f3abbd1a240>]
And it seems that the last one, the SklearnIntentClassifier, triggers the warnings.
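For context, here is a minimal sketch (plain sklearn with made-up labels, not Rasa code) of the situation that produces this warning: when a class appears in the true labels but is never predicted, its F-score is undefined, so sklearn sets it to 0.0 and warns. I suspect the cross-validated grid search visible in the output below ("Fitting 2 folds for each of 6 candidates") runs into exactly this case for some intents.

from sklearn.metrics import f1_score

# class 2 exists in y_true but is never predicted, so its F-score is
# ill-defined; sklearn sets it to 0.0 and emits UndefinedMetricWarning
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 1, 1, 0, 1, 1]
print(f1_score(y_true, y_pred, average="weighted"))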
I tried to modify the function train() in the components.py file of the repository and it didn't change anything, so I suspect it is not the right one. Anyway, here is the code of train(...) in the file model.py:
...
import rasa_nlu
from rasa_nlu import components, utils, config
from rasa_nlu.components import Component, ComponentBuilder
from rasa_nlu.config import RasaNLUModelConfig, override_defaults
from rasa_nlu.persistor import Persistor
from rasa_nlu.training_data import TrainingData, Message
from rasa_nlu.utils import create_dir, write_json_to_file
...
class Trainer(object):
    """Trainer will load the data and train all components.

    Requires a pipeline specification and configuration to use for
    the training."""

    # Officially supported languages (others might be used, but might fail)
    SUPPORTED_LANGUAGES = ["de", "en"]

    def __init__(self,
                 cfg,  # type: RasaNLUModelConfig
                 component_builder=None,  # type: Optional[ComponentBuilder]
                 skip_validation=False  # type: bool
                 ):
        # type: (...) -> None

        self.config = cfg
        self.skip_validation = skip_validation
        self.training_data = None  # type: Optional[TrainingData]

        if component_builder is None:
            # If no builder is passed, every interpreter creation will result in
            # a new builder. hence, no components are reused.
            component_builder = components.ComponentBuilder()

        # Before instantiating the component classes, lets check if all
        # required packages are available
        if not self.skip_validation:
            components.validate_requirements(cfg.component_names)

        # build pipeline
        self.pipeline = self._build_pipeline(cfg, component_builder)

    ...
    def train(self, data, **kwargs):
        # type: (TrainingData) -> Interpreter
        """Trains the underlying pipeline using the provided training data."""

        self.training_data = data

        context = kwargs  # type: Dict[Text, Any]

        for component in self.pipeline:
            updates = component.provide_context()
            if updates:
                context.update(updates)

        # Before the training starts: check that all arguments are provided
        if not self.skip_validation:
            components.validate_arguments(self.pipeline, context)

        # data gets modified internally during the training - hence the copy
        working_data = copy.deepcopy(data)

        for i, component in enumerate(self.pipeline):
            logger.info("Starting to train component {}"
                        "".format(component.name))
            component.prepare_partial_processing(self.pipeline[:i], context)
            print("before train")
            updates = component.train(working_data, self.config,
                                      **context)
            logger.info("Finished training component.")
            print("before updates")
            if updates:
                context.update(updates)

        return Interpreter(self.pipeline, context)
And the output is:
before train
before updates
before train
before updates
before train
before updates
before train
before updates
before train
before updates
before train
before updates
before train
Fitting 2 folds for each of 6 candidates, totalling 12 fits
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 0.1s finished
before updates
trainer.persist:
You can see here the warnings that I want to catch and trace back to their origin: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
So, can you see where these warnings come from? What calls sklearn/metrics/classification.py?
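One approach I am considering (just a sketch, assuming it runs before Trainer.train() is called, e.g. near the imports of model.py) is to either turn the warning into an exception, so the resulting traceback shows the exact caller, or to hook warnings.showwarning and print the call stack every time the warning fires:

import traceback
import warnings

from sklearn.exceptions import UndefinedMetricWarning

# Option 1: escalate the warning to an exception; the traceback then shows
# exactly which call reaches sklearn/metrics/classification.py.
# warnings.filterwarnings("error", category=UndefinedMetricWarning)

# Option 2: keep the training running, but print the call stack each time
# the warning is emitted.
_original_showwarning = warnings.showwarning

def _showwarning_with_stack(message, category, filename, lineno,
                            file=None, line=None):
    if issubclass(category, UndefinedMetricWarning):
        traceback.print_stack()
    _original_showwarning(message, category, filename, lineno, file, line)

warnings.showwarning = _showwarning_with_stack

With either variant in place, the printed stack should show the chain from component.train(...) on the SklearnIntentClassifier down into sklearn's metric code, which is what I would like to confirm.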