Eric Liu 9b2e41340b make DMatrix._init_from_npy2d only copy data when necessary (#1637 )

* make DMatrix._init_from_npy2d only copy data when necessary

When creating DMatrix from a 2d ndarray, it can unnecessarily copy the input data. This can be problematic when the data is already very large--running out of memory. The copy is temporary (going out of scope at the end of this function) but it still adds to peak memory usage.

``numpy.array`` copies its input no matter what by default. By adding ``copy=False``, it will only do so when necessary. Since XGDMatrixCreateFromMat is readonly on the input buffer, this copy is not needed.

Also added comments explaining when a copy can happen (if data ordering/layout is wrong or if type is not 32-bit float).

* remove whitespace
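
The copy behavior described in this commit message can be illustrated with a small sketch. It is not the code from the commit: the helper name `to_float32_c_buffer` is hypothetical, and it uses `numpy.asarray` rather than `numpy.array(..., copy=False)`. The swap is deliberate, because in NumPy 2.0 and later `copy=False` raises an error when a copy cannot be avoided, whereas `numpy.asarray` copies only when necessary in both old and new NumPy versions.

```python
import numpy as np


def to_float32_c_buffer(data):
    """Hypothetical helper: return `data` as a C-ordered float32 array.

    np.asarray copies only when the dtype or memory layout has to change;
    otherwise it returns the input array itself.
    """
    return np.asarray(data, dtype=np.float32, order="C")


# Already 32-bit float and C-ordered: no copy is needed.
already_ok = np.zeros((4, 3), dtype=np.float32)
same = to_float32_c_buffer(already_ok)

# Wrong dtype (64-bit float): a converted copy is unavoidable.
needs_cast = np.zeros((4, 3), dtype=np.float64)
cast = to_float32_c_buffer(needs_cast)

# Wrong layout (a transposed view is Fortran-ordered): a copy is unavoidable.
needs_reorder = np.zeros((3, 4), dtype=np.float32).T
reordered = to_float32_c_buffer(needs_reorder)

# Plain np.array copies by default, even when the input is already suitable.
always_copied = np.array(already_ok, dtype=np.float32)

print("suitable input reused:", same is already_ok)
print("cast shares memory:", np.shares_memory(cast, needs_cast))
print("reordered shares memory:", np.shares_memory(reordered, needs_reorder))
print("np.array shares memory:", np.shares_memory(always_copied, already_ok))
```

Only the first case avoids the temporary copy, which is the case the commit targets: input that is already a 32-bit float array in the expected layout.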

2016-10-20 09:30:52 -07:00

amalgamation

[R-package] GPL2 dependency reduction and some fixes (#1401 )

2016-07-27 00:05:04 -07:00

demo

[jvm-packages] add Spark and XGBoost tutorial (#1649 )

2016-10-11 09:41:24 -07:00

dmlc-core @ f35f14f308

Fix the issue 1474 (#1615 )

2016-09-29 19:29:47 -07:00

doc

Update build.md - added link to nightly windows binaries (#1601 )

2016-09-21 23:13:56 -07:00

include/xgboost

[jvm-packages] XGBoost4j Windows fixes (#1639 )

2016-10-18 08:35:25 -04:00

jvm-packages

[jvm-packages] XGBoost4j Windows fixes (#1639 )

2016-10-18 08:35:25 -04:00

make

Fixed OpenMP installation on MacOSX with gcc-6 (#1460 )

2016-08-22 10:30:34 -07:00

plugin

Update dmlc-core

2016-02-10 13:11:21 -08:00

python-package

make DMatrix._init_from_npy2d only copy data when necessary (#1637 )

2016-10-20 09:30:52 -07:00

R-package

simplify installation of R pkg devel version (#1653 )

2016-10-18 10:24:01 -07:00

rabit @ a9a2a69dc1

Fix warnings from g++5 or higher (#1510 )

2016-08-26 16:14:10 -07:00

src

correct CalcDCG in rank_metric.cc and rank_obj.cc (#1642 )

2016-10-18 10:23:41 -07:00

tests

Fix mknfold using new StratifiedKFold API (#1660 )

2016-10-12 14:43:37 -07:00

.gitignore

[jvm-packages] Integration with Spark Dataframe/Dataset (#1559 )

2016-09-11 15:02:58 -04:00

.gitmodules

[REFACTOR] cleanup structure

2016-01-16 10:24:00 -08:00

.travis.yml

Update .travis.yml

2016-10-09 20:37:57 -07:00

build.sh

Minor fix on installation guide and (the probably deprecated) build script

2016-02-24 12:46:37 +08:00

CMakeLists.txt

fix the problem that there is no libxgboost.dll (#1674 )

2016-10-18 09:56:48 -07:00

CONTRIBUTORS.md

[jvm-packages] XGBoost4j Windows fixes (#1639 )

2016-10-18 08:35:25 -04:00

ISSUE_TEMPLATE.md

issue template (#1475 )

2016-08-17 22:50:37 -07:00

LICENSE

update year in LICENSE, conf.py and README.md files

2016-03-15 16:51:34 +03:00

Makefile

Add option on OSX to use macports (#1675 )

2016-10-18 09:56:00 -07:00

NEWS.md

[CORE] Refactor cache mechanism (#1540 )

2016-09-02 20:39:07 -07:00

README.md

Broken Link in README (#1275 )

2016-06-13 15:41:24 -07:00

README.md

eXtreme Gradient Boosting

Documentation | Resources | Installation | Release Notes | RoadMap

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT, GBM) that solves many data science problems in a fast and accurate way. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems beyond billions of examples.

What's New

Ask a Question

For reporting bugs please use the xgboost/issues page.
For generic questions or to share your experience using xgboost, please use the XGBoost User Group.

Help to Make XGBoost Better

XGBoost has been developed and used by a group of active community members. Your help is very valuable to make the package better for everyone.

Check out call for contributions and Roadmap to see what can be improved, or open an issue if you want something.
Contribute to the documents and examples to share your experience with other users.
Add your stories and experience to Awesome XGBoost.
Please add your name to CONTRIBUTORS.md after your patch has been merged.
- Please also update NEWS.md on changes and improvements in API and docs.

License

Reference

Tianqi Chen and Carlos Guestrin. XGBoost: A Scalable Tree Boosting System. In 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016
XGBoost originates from a research project at the University of Washington; see also the Project Page at UW.

Languages

C++ 45.5%

Python 20.3%

Cuda 15.2%

R 6.8%

Scala 6.4%

Other 5.6%