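The error "ValueError: y contains previously unseen labels" occurs because the LabelEncoder's inverse_transform step received values it cannot map back to a label. inverse_transform accepts only the integer codes 0 through len(classes_) - 1, so any other value (for example, an imputed value that falls outside that range) triggers the error.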
To fix this error, add a check that the values passed to inverse_transform are ones the LabelEncoder can decode. One way to do this is to use the classes_ attribute of the LabelEncoder object, which stores the unique labels seen during fitting (fit or fit_transform).
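As a rough illustration of how classes_ relates to the codes that inverse_transform accepts, here is a minimal sketch. The labels and the candidate array are made up for the example and are not taken from your data.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Toy labels, purely illustrative
le = LabelEncoder()
codes = le.fit_transform(["cat", "dog", "bird", "dog"])

# classes_ holds the sorted unique labels seen during fitting
known_labels = list(le.classes_)

# inverse_transform accepts only the integer codes 0 .. len(classes_) - 1
valid_codes = np.arange(len(le.classes_))

candidate = np.array([0, 2, 5])  # 5 was never produced by this encoder
is_valid = np.isin(candidate, valid_codes)

# Decoding the full array fails because of the out-of-range code
try:
    le.inverse_transform(candidate)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Decoding only the valid codes works
decoded = le.inverse_transform(candidate[is_valid])
```

Checking against valid_codes before decoding is what lets you skip or handle the problematic values instead of hitting the exception.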
Here's an example of how you could modify the code:
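X_train_obj_imputed_encoded = pd.DataFrame(
    impute_test_rf.transform(X_train_obj), columns=X_train_obj.columns
)

def decode_if_known(series):
    # Valid codes are the integers 0 .. len(le.classes_) - 1
    present = series.dropna()
    if not present.isin(list(range(len(le.classes_)))).all():
        return series  # keep the encoded values
    decoded = pd.Series(index=series.index, dtype=object)  # NaN where the input was null
    decoded.loc[present.index] = le.inverse_transform(present.astype(int).to_numpy())
    return decoded

# Inverse transform the encoded data
X_train_obj_imputed = X_train_obj_imputed_encoded.apply(decode_if_known)

In this modified code, the inverse_transform step is applied only to columns whose non-null values are all valid codes for le, that is, integers from 0 to len(le.classes_) - 1. The check compares against these codes rather than against le.classes_ itself, because inverse_transform takes encoded integers as input, not the original labels. Null values stay null in the decoded column, and for any column that contains a value outside the valid range, the original encoded values are kept.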
This should resolve the ValueError you encountered.
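If you want to try the idea in isolation first, here is a small self-contained sketch. The encoder, the labels, and the DataFrame are toy stand-ins (assumptions for illustration), not your actual le or imputed data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy encoder standing in for `le` (labels are made up)
le = LabelEncoder().fit(["high", "low", "medium"])
valid_codes = np.arange(len(le.classes_))  # 0, 1, 2

# Toy stand-in for the imputed, encoded frame: floats, one NaN, one bad code
encoded = pd.DataFrame({
    "risk": [0.0, 2.0, np.nan, 1.0],
    "tier": [1.0, 7.0, 0.0, 2.0],  # 7 is not a valid code
})

def decode_if_known(series):
    """Decode a column if all non-null codes are valid; else return it unchanged."""
    present = series.dropna()
    if not present.isin(valid_codes).all():
        return series
    decoded = pd.Series(index=series.index, dtype=object)  # starts as all NaN
    decoded.loc[present.index] = le.inverse_transform(present.astype(int).to_numpy())
    return decoded

decoded_frame = encoded.apply(decode_if_known)
```

Here the "risk" column is decoded back to labels with its missing entry left as NaN, while "tier" is returned unchanged because it contains the code 7. Whether keeping the encoded values is the right fallback depends on your pipeline; you may prefer to mask or re-impute the invalid entries instead.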