Loss of previous learning when changing/updating the hyperparameters #710
-
Hi team. With the online learning methodology, when the user updates the model's hyperparameters after learning on a certain number of batches using `set_params`, the previously learned state is lost. Could you provide a solution or suggestions for this? Below is sample code for this scenario.

```python
# Import packages
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from river import metrics
from river import ensemble
from collections import Counter

# Load the Iris dataset and assign the features and target to X and y
data_iris = load_iris()
X = data_iris.data
y = data_iris.target
X_feature_names = data_iris.feature_names
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                    random_state=42)

# River metric to calculate accuracy from predictions
model_accuracy = metrics.Accuracy()


def river_learn(X, Y, X_feature_names, model, model_accuracy):
    """Trains the River model on the train dataset and calculates
    the training accuracy."""
    y_predicted_all = []
    for x, y in zip(X, Y):
        X_sample_featurename_dict = dict(zip(X_feature_names, x))
        y_predicted_one = model.predict_one(X_sample_featurename_dict)
        y_predicted_all.append(y_predicted_one)
        model.learn_one(X_sample_featurename_dict, y)
        model_accuracy.update(y, y_predicted_one)
    return model_accuracy, model, y_predicted_all


def river_predict(X, Y, X_feature_names, model, model_accuracy):
    """Receives a trained model as a parameter, predicts on the given
    dataset, and calculates the accuracy."""
    y_predicted_all = []
    for x, y in zip(X, Y):
        X_sample_featurename_dict = dict(zip(X_feature_names, x))
        y_predicted_one = model.predict_one(X_sample_featurename_dict)
        y_predicted_all.append(y_predicted_one)
        model_accuracy.update(y, y_predicted_one)
    return model_accuracy, y_predicted_all


# Training the model
Adaptive_random_forest = ensemble.AdaptiveRandomForestClassifier()
model_accuracy, Trained_model_RF, y_predicted_all = river_learn(
    X_train, y_train, X_feature_names, Adaptive_random_forest, model_accuracy)
print(Counter(y_predicted_all[1:]))
print(model_accuracy)
# Counter({1: 39, 0: 30, 2: 30})
# Accuracy: 87.00%

# Evaluating the trained model (here still on the training data)
model_accuracy, y_predicted_all = river_predict(
    X_train, y_train, X_feature_names, Trained_model_RF, model_accuracy)
print(Counter(y_predicted_all))
print(model_accuracy)
# Counter({2: 35, 1: 34, 0: 31})
# Accuracy: 91.00%

# Updating a hyperparameter: every prediction becomes None
Updated_param_Trained_model_RF_1 = Trained_model_RF._set_params({'n_models': 25})
model_accuracy, y_predicted_all = river_predict(
    X_train, y_train, X_feature_names, Updated_param_Trained_model_RF_1, model_accuracy)
print(Counter(y_predicted_all))
# Counter({None: 100})

Updated_param_Trained_model_RF_2 = Trained_model_RF._set_params({'max_features': 'log2'})
model_accuracy, y_predicted_all = river_predict(
    X_train, y_train, X_feature_names, Updated_param_Trained_model_RF_2, model_accuracy)
print(Counter(y_predicted_all))
# Counter({None: 100})
```
Replies: 2 comments 4 replies
-
Hello. I've updated your post to make it readable. Could you please make a better effort next time? Thanks though for making it somewhat reproducible. I don't have any time at the moment but I'll dig in when I can. Cheers.
-
Ok I understand. Unfortunately, you can't modify parameters on the fly. At least, the `_set_params` method "resets" the model. You could simply update the parameters manually. But in your case you're changing `n_models`, so that's not going to work.

May I ask, why do you want to change parameters on-the-fly? I agree that it's a nice thing to have, but it sounds overkill for most tasks.
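To make the "resets" behaviour concrete, here is a toy sketch. This is plain Python, not River's actual implementation — the class name and internals are invented for illustration — but it mirrors the pattern: a `_set_params`-style method builds a fresh, untrained estimator, whereas mutating an attribute in place keeps the learned state.

```python
class ToyOnlineClassifier:
    """Toy online classifier: predicts the majority class seen so far,
    and returns None before it has learned anything (like an untrained
    River classifier)."""

    def __init__(self, n_models=10):
        self.n_models = n_models      # a hyperparameter
        self.class_counts = {}        # the learned state

    def learn_one(self, x, y):
        self.class_counts[y] = self.class_counts.get(y, 0) + 1

    def predict_one(self, x):
        if not self.class_counts:
            return None               # untrained: no prediction yet
        return max(self.class_counts, key=self.class_counts.get)

    def _set_params(self, new_params):
        # Rebuild from the parameters alone -> learned state is dropped
        params = {'n_models': self.n_models}
        params.update(new_params)
        return ToyOnlineClassifier(**params)


model = ToyOnlineClassifier()
for label in [1, 1, 0]:
    model.learn_one({}, label)

print(model.predict_one({}))              # 1: trained

fresh = model._set_params({'n_models': 25})
print(fresh.predict_one({}))              # None: state was lost

model.n_models = 25                       # manual in-place update
print(model.predict_one({}))              # 1: state preserved
```

This also shows why the manual route breaks down for `n_models` specifically: changing the number of ensemble members isn't just flipping an attribute, it requires constructing new member models, which is exactly the rebuild that discards what was learned.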