Version 1.8.X
=============

Version 1.8.1
-------------

Deployed: 1st Sep 2024

Contributors
~~~~~~~~~~~~

- `Cainã Max Couto da Silva `_
- `Gurjinder Kaur `_
- `Sergio Benito Martin `_
- `Ranja Sarkar `_
- `Hector Patino `_
- `Alessandro Benetti `_
- `olikra `_
- `Kanan Mahammadli `_
- `Soledad Galli `_

In this release, we fix several bugs and future deprecation warnings from pandas and numpy. In addition, we expand the functionality of some feature selection classes to return the standard deviation of the derived feature importance. We have also updated and expanded various pages of our documentation.

Thank you very much to all contributors to this release and to `Vasco Schiavo `_ and `Gleb Levitski `_ for actively discussing many of our PRs and issues.

If you value what we do, consider `sponsoring us `_, so that we can keep updating Feature-engine at a fast pace.

Enhancements
~~~~~~~~~~~~

- `ProbeFeatureSelection` can now also determine feature importance through single feature model performance (`Soledad Galli `_)
- `ProbeFeatureSelection` can now return the standard deviation of the feature importance (`Soledad Galli `_)
- `RecursiveFeatureElimination` and `RecursiveFeatureAddition` can now return the standard deviation of the feature importances (`Soledad Galli `_)
- `SelectByShuffling`, `SelectBySingleFeaturePerformance` and `SelectByTargetMeanPerformance` can now return the standard deviation of the feature importances (`Soledad Galli `_)
- All feature selection classes can now implement group cross-validation through the `groups` parameter (`Kanan Mahammadli `_)

Bug fixes
~~~~~~~~~

- The `cv` parameter of the recursive feature selectors can now take cv generators of the type `KFold.split(X, y)` (`Alessandro Benetti `_)
- The `cv` parameter of the remaining feature selection classes can now take cv generators of the type `KFold.split(X, y)` (`Soledad Galli `_)
- `LogCpTransformer()` adds a constant only to those variables that are strictly non-positive during fit (`Soledad Galli `_)
- Fix bug in `MatchVariables` that prevented the transformer from working when missing values were raised (`Soledad Galli `_)
- Fix bug in `inverse_transform()` from `YeoJohnsonTransformer()` (`Soledad Galli `_)
- Fix pandas future warnings (`Soledad Galli `_)
- Fix numpy future warnings (`olikra `_)

Code improvements
~~~~~~~~~~~~~~~~~

- Expand coverage of various tests (`olikra `_)

Documentation
~~~~~~~~~~~~~

- Expand user guide for `ReciprocalTransformer()` (`Sergio Benito Martin `_)
- Expand user guide for `YeoJohnsonTransformer()` (`Ranja Sarkar `_)
- Expand user guide for `WoEEncoder()` (`Hector Patino `_)
- Expand user guide for `OrdinalEncoder()` (`Gurjinder Kaur `_)
- Expand user guide for `MeanMedianImputer()` (`Cainã Max Couto da Silva `_)
- Expand `CyclicalFeatures` documentation to explain how `max_values` are calculated and discrepancies with Scikit-learn's documentation (`Soledad Galli `_)
- Add contribute.MD file to repository (`Soledad Galli `_)

Version 1.8.0
-------------

Deployed: 26th May 2024

Contributors
~~~~~~~~~~~~

- `Cainã Max Couto da Silva `_
- `Gurjinder Kaur `_
- `Gleb Levitski `_
- `Lorenzo Vitali `_
- `Soledad Galli `_

In this release, we make some breaking changes. The `DecisionTreeEncoder()` no longer has the encoding pipeline. In its place, we added an `encoding_dict_` attribute that stores the mappings from category to predictions of the decision tree. This also allowed us to implement handling of unseen categories and the `inverse_transform` method.

We also expanded the functionality of the `DecisionTreeDiscretiser()`, which can now replace the continuous attributes with the decision tree predictions, interval limits, or bin number.

In addition, we introduce a new transformer, the `DecisionTreeFeatures()`, which adds new features to the data, resulting from predictions of decision trees trained on one or more features.

The classes from the module `outliers` can now automatically select the limit for the boundaries for outliers.

Finally, we have updated and expanded various pages of our documentation.

Thank you very much to all contributors to this release and to `Vasco Schiavo `_ and `Gleb Levitski `_ for actively reviewing many of our PRs.

If you value what we do, please consider `sponsoring us `_, so that we can keep updating Feature-engine at a fast pace.

New
~~~

- `DecisionTreeFeatures` is a new transformer from the creation module that adds features based on predictions of decision trees (`Soledad Galli `_)

Enhancements
~~~~~~~~~~~~

- `DecisionTreeEncoder` now supports encodings for unseen categories, `inverse_transform`, and provides an encoding dictionary instead of the pipeline (`Soledad Galli `_, `Gleb Levitski `_ and `Lorenzo Vitali `_)
- The `DecisionTreeDiscretiser()` can now replace the continuous attributes with the decision tree predictions, interval limits, or bin number (`Soledad Galli `_)
- The `OutlierTrimmer()` and `Winsorizer()` can now adjust the strength of the outlier search automatically based on the statistical method (param `fold="auto"`) (`Gleb Levitski `_)

Documentation
~~~~~~~~~~~~~

- Improve user guide for `PowerTransformer()` (`Cainã Max Couto da Silva `_)
- Improve user guide for `EqualFrequencyDiscretiser()` and `EqualWidthDiscretiser()` (`Cainã Max Couto da Silva `_)
- Improve user guide for the categorical encoding module (`Gurjinder Kaur `_)