TreeFeatureSelectionTransform

class TreeFeatureSelectionTransform(model: Union[Literal['catboost'], Literal['random_forest'], sklearn.tree._classes.DecisionTreeRegressor, sklearn.tree._classes.ExtraTreeRegressor, sklearn.ensemble._forest.RandomForestRegressor, sklearn.ensemble._forest.ExtraTreesRegressor, sklearn.ensemble._gb.GradientBoostingRegressor, catboost.core.CatBoostRegressor], top_k: int, features_to_use: Union[List[str], Literal['all']] = 'all', return_features: bool = False)

Bases: etna.transforms.feature_selection.base.BaseFeatureSelectionTransform

Transform that selects features according to the feature importances of a tree-based model.

Notes

The transform works with any type of features; however, most models work only with regressors. Therefore, it is recommended to pass regressors into feature selection transforms.

Init TreeFeatureSelectionTransform.

Parameters
  • model (Union[Literal['catboost'], Literal['random_forest'], sklearn.tree._classes.DecisionTreeRegressor, sklearn.tree._classes.ExtraTreeRegressor, sklearn.ensemble._forest.RandomForestRegressor, sklearn.ensemble._forest.ExtraTreesRegressor, sklearn.ensemble._gb.GradientBoostingRegressor, catboost.core.CatBoostRegressor]) –

    Model used for selection; it should have a feature_importances_ property (e.g. all tree-based regressors in sklearn).

    If a catboost.CatBoostRegressor is given without the cat_features parameter, cat_features is set during fit to the columns of category type.

    Pre-defined options are also available:

    • catboost: catboost.CatBoostRegressor(iterations=1000, silent=True);

    • random_forest: sklearn.ensemble.RandomForestRegressor(n_estimators=100, random_state=0).

  • top_k (int) – number of features to select; if there are fewer than top_k features, all of them are selected

  • features_to_use (Union[List[str], Literal['all']]) – columns of the dataset to select from; if “all” value is given, all columns are used

  • return_features (bool) – indicates whether to return features or not.
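The interplay of top_k and features_to_use described above can be illustrated with a minimal pure-Python sketch. This is not etna's implementation: select_top_k and the importances dictionary are hypothetical names introduced only for illustration, and the importance values are made up rather than taken from a fitted model.

```python
from typing import Dict, List, Union


def select_top_k(
    importances: Dict[str, float],
    top_k: int,
    features_to_use: Union[List[str], str] = "all",
) -> List[str]:
    """Hypothetical helper: keep the top_k most important candidate features."""
    if features_to_use == "all":
        # Every known feature is a candidate.
        candidates = list(importances)
    else:
        # Only the listed columns are candidates.
        candidates = [name for name in features_to_use if name in importances]
    # Rank candidates from most to least important.
    ranked = sorted(candidates, key=lambda name: importances[name], reverse=True)
    # Slicing past the end returns everything, so too few features means all are kept.
    return ranked[:top_k]


# Made-up importances, standing in for a fitted model's feature_importances_.
importances = {"lag_7": 0.50, "lag_14": 0.30, "holiday": 0.15, "weekday": 0.05}

top_two = select_top_k(importances, top_k=2)
restricted = select_top_k(
    importances, top_k=2, features_to_use=["holiday", "weekday", "lag_14"]
)
everything = select_top_k(importances, top_k=10)
```

Here top_two is ['lag_7', 'lag_14'], restricted is ['lag_14', 'holiday'], and everything contains all four features because top_k exceeds the number of candidates.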

Methods

fit(ts)

Fit the transform.

fit_transform(ts)

Fit and transform TSDataset.

get_regressors_info()

Return the list with regressors created by the transform.

inverse_transform(ts)

Inverse transform TSDataset.

load(path)

Load an object.

params_to_tune()

Get default grid for tuning hyperparameters.

save(path)

Save the object.

set_params(**params)

Return new object instance with modified parameters.

to_dict()

Collect all information about etna object in dict.

transform(ts)

Transform TSDataset in place.

params_to_tune() → Dict[str, etna.distributions.distributions.BaseDistribution]

Get default grid for tuning hyperparameters.

This grid tunes parameters: model, top_k. Other parameters are expected to be set by the user.

For the model parameter, only the pre-defined options are suggested. For the top_k parameter, the maximum suggested value is not greater than self.top_k.

Returns

Grid to tune.

Return type

Dict[str, etna.distributions.distributions.BaseDistribution]
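The constraints on the tuning grid can be sketched with plain Python stand-ins. The names sketch_grid and sample_params are hypothetical, and plain lists replace etna's distribution objects. The lower bound of 1 for top_k is an assumption, since the description above states only the upper bound.

```python
import random
from typing import Any, Dict, List

# The two pre-defined model options named in the parameter description.
PREDEFINED_MODELS = ["catboost", "random_forest"]


def sketch_grid(top_k: int) -> Dict[str, List[Any]]:
    """Hypothetical stand-in for the grid, using plain lists of candidate values."""
    return {
        "model": list(PREDEFINED_MODELS),
        # Assumption: candidates start at 1; the upper bound is the current top_k.
        "top_k": list(range(1, top_k + 1)),
    }


def sample_params(grid: Dict[str, List[Any]], rng: random.Random) -> Dict[str, Any]:
    """Draw one candidate configuration from the sketched grid."""
    return {name: rng.choice(values) for name, values in grid.items()}


grid = sketch_grid(top_k=5)
candidate = sample_params(grid, random.Random(0))
```

Any sampled candidate uses one of the pre-defined models and a top_k no greater than the value the transform was created with; the other parameters (features_to_use, return_features) stay as set by the user.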