* Disabled excessive Spark logging in tests.
* Fixed the signature of XGBoostModel.predict. Prior to this commit, XGBoostModel.predict produced an RDD with one array of predictions per partition, effectively changing the shape with respect to the input RDD. A more natural contract for a prediction API is that, given an RDD, it returns a new RDD with the same number of elements, which lets users easily match inputs with predictions (see the sketch after this list). This commit removes one layer of nesting in the XGBoostModel.predict output. Even though the change is clearly non-backward compatible, it is well justified.
* Removed boxing in XGBoost.fromDenseToSparseLabeledPoints.
* Inlined XGBoost.repartitionData: an if is more explicit than an opaque method name.
* Moved XGBoost.convertBoosterToXGBoostModel to XGBoostModel.
* Check the input dimension in DMatrix.setBaseMargin. Prior to this commit, providing an array of incorrect dimensions would have resulted in memory corruption. Maybe backport this to C++?
* Reduced nesting in XGBoost.buildDistributedBoosters.
* Ensured consistent naming of the params map.
* Cleaned up DataBatch to make it easier to comprehend.
* Made scalastyle happy.
* Added baseMargin to XGBoost.train and trainWithRDD.
* Deprecated XGBoost.train: it is ambiguous and works only for RDDs.
* Addressed review comments.
* Revert "Fixed the signature of XGBoostModel.predict". This reverts commit 06bd5dcae7780265dd57e93ed7d4135f4e78f9b4.
* Addressed more review comments.
* Fixed NullPointerException in buildDistributedBoosters.
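To make the predict contract above concrete, here is a minimal sketch; the scoring function and names are placeholders, not the actual XGBoost4J-Spark API. With one prediction per input element, the output keeps the input's partitioning and element counts:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object PredictContract {
  // Toy stand-in for XGBoostModel.predict under the proposed contract:
  // one prediction per input element, so the output shape matches the input.
  def predict(features: RDD[Array[Float]]): RDD[Float] =
    features.map(_.sum) // placeholder scoring; the real model would score here

  def demo(sc: SparkContext): Unit = {
    val inputs: RDD[Array[Float]] = sc.parallelize(Seq(
      Array(1.0f, 2.0f), Array(3.0f, 4.0f), Array(5.0f, 6.0f)))
    val predictions = predict(inputs) // same element count as `inputs`
    // Because shapes line up element-for-element, inputs can be paired
    // with their predictions directly:
    inputs.zip(predictions).collect().foreach {
      case (x, p) => println(s"${x.mkString(",")} -> $p")
    }
  }
}
```

RDD.zip requires matching partition layouts and element counts, which a shape-preserving map guarantees; with the per-partition array output, users had to flatten and realign predictions themselves.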
[GPU-Plugin] Multi-GPU gpu_id bug fixes for grow_gpu_hist and grow_gpu methods, and additional documentation for the gpu plugin. (#2463)
[GPU-Plugin] Change GPU plugin to use tree_method parameter, bump cmake version to 3.5 for GPU plugin, add compute architecture 3.5, remove unused cmake files (#2455)
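The commits above move GPU training behind the tree_method parameter. A hedged sketch of what the parameter map looks like with a GPU-enabled build (values follow the GPU plugin documentation of this era; your build may differ):

```scala
// GPU-enabled build assumed; "gpu_hist" selects the fast histogram
// algorithm on the GPU, and "gpu_id" picks which device to run on.
val gpuParams = Map(
  "tree_method" -> "gpu_hist",
  "gpu_id" -> 0,
  "max_depth" -> 6,
  "objective" -> "binary:logistic")
```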
eXtreme Gradient Boosting
Documentation | Resources | Installation | Release Notes | RoadMap
XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT or GBM) that solves many data science problems in a fast and accurate way. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems with billions of examples.
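As a quick taste of the library, here is a minimal single-machine training sketch using the XGBoost4J Scala API; the data file and parameter values are illustrative, and exact signatures may vary between releases:

```scala
import ml.dmlc.xgboost4j.scala.{DMatrix, XGBoost}

object TrainExample {
  def main(args: Array[String]): Unit = {
    // LibSVM-format training data, assumed to be present locally
    val train = new DMatrix("agaricus.txt.train")
    val params = Map(
      "eta" -> 0.1,                     // learning rate
      "max_depth" -> 6,                 // tree depth limit
      "objective" -> "binary:logistic") // binary classification
    // Train for 10 boosting rounds, reporting training error each round
    val booster = XGBoost.train(train, params, 10, Map("train" -> train))
    booster.saveModel("xgb.model")
  }
}
```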
What's New
- XGBoost GPU support with fast histogram algorithm
- XGBoost4J: Portable Distributed XGBoost in Spark, Flink and Dataflow, see JVM-Package
- Story and Lessons Behind the Evolution of XGBoost
- Tutorial: Distributed XGBoost on AWS with YARN
- XGBoost brick Release
Ask a Question
- For reporting bugs please use the xgboost/issues page.
- For generic questions or to share your experience using XGBoost please use the XGBoost User Group
Help to Make XGBoost Better
XGBoost has been developed and used by a group of active community members. Your help is very valuable to make the package better for everyone.
- Check out call for contributions and Roadmap to see what can be improved, or open an issue if you want something.
- Contribute to the documents and examples to share your experience with other users.
- Add your stories and experience to Awesome XGBoost.
- Please add your name to CONTRIBUTORS.md after your patch has been merged.
- Please also update NEWS.md on changes and improvements in API and docs.
License
© Contributors, 2016. Licensed under an Apache-2.0 license.
Reference
- Tianqi Chen and Carlos Guestrin. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.
- XGBoost originates from a research project at the University of Washington; see also the Project Page at UW.