xgboost/NEWS.md
Vadim Khotilovich a375ad2822 [R] maintenance Apr 2017 (#2237)
* [R] make sure things work for a single split model; fixes #2191

* [R] add option use_int_id to xgb.model.dt.tree

* [R] add example of exporting tree plot to a file

* [R] set save_period = NULL as default in xgboost() to be the same as in xgb.train; fixes #2182

* [R] it's a good practice after CRAN releases to bump up package version in dev

* [R] allow xgb.DMatrix construction from integer dense matrices

* [R] xgb.DMatrix: silent parameter; improve documentation

* [R] xgb.model.dt.tree code style changes

* [R] update NEWS with parameter changes

* [R] code safety & style; handle non-strict matrix and inherited classes of input and model; fixes #2242

* [R] change to x.y.z.p R-package versioning scheme and set version to 0.6.4.3

* [R] add an R package versioning section to the contributors guide

* [R] R-package/README.md: clean up the redundant old installation instructions, link the contributors guide
2017-05-01 22:51:34 -07:00

120 lines
5.0 KiB
Markdown

XGBoost Change Log
==================
This file records the changes in xgboost library in reverse chronological order.
## in progress version
* Refactored gbm to allow more friendly cache strategy
- Specialized some prediction routine
* Automatically remove nan from input data when it is sparse.
- This can solve some of user reported problem of istart != hist.size
* Minor fixes
- Thread local variable is upgraded so it is automatically freed at thread exit.
* Migrate to C++11
- The current master version now requires C++11 enabled compiled(g++4.8 or higher)
* R package:
- New parameters:
- `silent` in `xgb.DMatrix()`
- `use_int_id` in `xgb.model.dt.tree()`
- Default value of the `save_period` parameter in `xgboost()` changed to NULL (consistent with `xgb.train()`).
## v0.6 (2016.07.29)
* Version 0.5 is skipped due to major improvements in the core
* Major refactor of core library.
- Goal: more flexible and modular code as a portable library.
- Switch to use of c++11 standard code.
- Random number generator defaults to ```std::mt19937```.
- Share the data loading pipeline and logging module from dmlc-core.
- Enable registry pattern to allow optionally plugin of objective, metric, tree constructor, data loader.
- Future plugin modules can be put into xgboost/plugin and register back to the library.
- Remove most of the raw pointers to smart ptrs, for RAII safety.
* Add official option to approximate algorithm `tree_method` to parameter.
- Change default behavior to switch to prefer faster algorithm.
- User will get a message when approximate algorithm is chosen.
* Change library name to libxgboost.so
* Backward compatiblity
- The binary buffer file is not backward compatible with previous version.
- The model file is backward compatible on 64 bit platforms.
* The model file is compatible between 64/32 bit platforms(not yet tested).
* External memory version and other advanced features will be exposed to R library as well on linux.
- Previously some of the features are blocked due to C++11 and threading limits.
- The windows version is still blocked due to Rtools do not support ```std::thread```.
* rabit and dmlc-core are maintained through git submodule
- Anyone can open PR to update these dependencies now.
* Improvements
- Rabit and xgboost libs are not thread-safe and use thread local PRNGs
- This could fix some of the previous problem which runs xgboost on multiple threads.
* JVM Package
- Enable xgboost4j for java and scala
- XGBoost distributed now runs on Flink and Spark.
* Support model attributes listing for meta data.
- https://github.com/dmlc/xgboost/pull/1198
- https://github.com/dmlc/xgboost/pull/1166
* Support callback API
- https://github.com/dmlc/xgboost/issues/892
- https://github.com/dmlc/xgboost/pull/1211
- https://github.com/dmlc/xgboost/pull/1264
* Support new booster DART(dropout in tree boosting)
- https://github.com/dmlc/xgboost/pull/1220
* Add CMake build system
- https://github.com/dmlc/xgboost/pull/1314
## v0.47 (2016.01.14)
* Changes in R library
- fixed possible problem of poisson regression.
- switched from 0 to NA for missing values.
- exposed access to additional model parameters.
* Changes in Python library
- throws exception instead of crash terminal when a parameter error happens.
- has importance plot and tree plot functions.
- accepts different learning rates for each boosting round.
- allows model training continuation from previously saved model.
- allows early stopping in CV.
- allows feval to return a list of tuples.
- allows eval_metric to handle additional format.
- improved compatibility in sklearn module.
- additional parameters added for sklearn wrapper.
- added pip installation functionality.
- supports more Pandas DataFrame dtypes.
- added best_ntree_limit attribute, in addition to best_score and best_iteration.
* Java api is ready for use
* Added more test cases and continuous integration to make each build more robust.
## v0.4 (2015.05.11)
* Distributed version of xgboost that runs on YARN, scales to billions of examples
* Direct save/load data and model from/to S3 and HDFS
* Feature importance visualization in R module, by Michael Benesty
* Predict leaf index
* Poisson regression for counts data
* Early stopping option in training
* Native save load support in R and python
- xgboost models now can be saved using save/load in R
- xgboost python model is now pickable
* sklearn wrapper is supported in python module
* Experimental External memory version
## v0.3 (2014.09.07)
* Faster tree construction module
- Allows subsample columns during tree construction via ```bst:col_samplebytree=ratio```
* Support for boosting from initial predictions
* Experimental version of LambdaRank
* Linear booster is now parallelized, using parallel coordinated descent.
* Add [Code Guide](src/README.md) for customizing objective function and evaluation
* Add R module
## v0.2x (2014.05.20)
* Python module
* Weighted samples instances
* Initial version of pairwise rank
## v0.1 (2014.03.26)
* Initial release