xgboost

Author	SHA1	Message	Date
Yang Yang	c7bc739ed2	Fix document about colsample_by* parameter (#4340 ) Correct the calculation mistake in colsample_by* example.	2019-04-08 11:10:04 -07:00
sriramch	2f7087eba1	Improve HostDeviceVector exception safety (#4301 ) * make the assignments of HostDeviceVector exception safe. * storing a dummy GPUDistribution instance in HDV for CPU based code. * change testxgboost binary location to build directory.	2019-03-31 22:48:58 +08:00
Jiaming Yuan	29a1356669	Deprecate `reg:linear' in favor of `reg:squarederror'. (#4267 ) * Deprecate `reg:linear' in favor of `reg:squarederror'. * Replace the use of `reg:linear'. * Replace the use of `silent`.	2019-03-17 17:55:04 +08:00
Jiaming Yuan	cf8d5b9b76	Mark CUDA 10.1 as unsupported. (#4265 )	2019-03-17 16:59:15 +08:00
Jiaming Yuan	7b1b11390a	Mark Scikit-Learn RF interface as experimental in doc. (#4258 ) * Mark Scikit-Learn RF interface as experimental in doc.	2019-03-16 00:45:32 +08:00
Andy Adinets	a36c3ed4f4	Added SKLearn-like random forest Python API. (#4148 ) * Added SKLearn-like random forest Python API. - added XGBRFClassifier and XGBRFRegressor classes to SKL-like xgboost API - also added n_gpus and gpu_id parameters to SKL classes - added documentation describing how to use xgboost for random forests, as well as existing caveats	2019-03-12 22:28:19 +08:00
Rory Mitchell	4eeeded7d1	Remove various synchronisations from cuda API calls, instrument monitor (#4205 ) * Remove various synchronisations from cuda API calls, instrument monitor with nvtx profiler ranges.	2019-03-10 15:01:23 +13:00
Philip Hyunsu Cho	331cd3e4f7	Document limitation of one-split-at-a-time Greedy tree learning heuristic (#4233 )	2019-03-08 10:05:39 -08:00
Jonas	00ea7b83c9	Fix docs for `num_parallel_tree` (#4221 ) Minor formatting correction for `num_parallel_tree`.	2019-03-06 23:47:51 +08:00
Philip Hyunsu Cho	67c38805a1	Update build doc: PyPI wheel now support multi-GPU (#4219 )	2019-03-05 13:25:31 -08:00
Adam November	0c1d5f1120	Fix snapshot artifact name in docs. (#4196 )	2019-03-03 13:27:50 -08:00
Matthew Jones	92b7577c62	[REVIEW] Enable Multi-Node Multi-GPU functionality (#4095 ) * Initial commit to support multi-node multi-gpu xgboost using dask * Fixed NCCL initialization by not ignoring the opg parameter. - it now crashes on NCCL initialization, but at least we're attempting it properly * At the root node, perform a rabit::Allreduce to get initial sum_gradient across workers * Synchronizing in a couple of more places. - now the workers don't go down, but just hang - no more "wild" values of gradients - probably needs syncing in more places * Added another missing max-allreduce operation inside BuildHistLeftRight * Removed unnecessary collective operations. * Simplified rabit::Allreduce() sync of gradient sums. * Removed unnecessary rabit syncs around ncclAllReduce. - this improves performance _significantly_ (7x faster for overall training, 20x faster for xgboost proper) * pulling in latest xgboost * removing changes to updater_quantile_hist.cc * changing use_nccl_opg initialization, removing unnecessary if statements * added definition for opaque ncclUniqueId struct to properly encapsulate GetUniqueId * placing struct defintion in guard to avoid duplicate code errors * addressing linting errors * removing * removing additional arguments to AllReduer initialization * removing distributed flag * making comm init symmetric * removing distributed flag * changing ncclCommInit to support multiple modalities * fix indenting * updating ncclCommInitRank block with necessary group calls * fix indenting * adding print statement, and updating accessor in vector * improving print statement to end-line * generalizing nccl_rank construction using rabit * assume device_ordinals is the same for every node * test, assume device_ordinals is identical for all nodes * test, assume device_ordinals is unique for all nodes * changing names of offset variable to be more descriptive, editing indenting * wrapping ncclUniqueId GetUniqueId() and aesthetic changes * adding synchronization, and tests for distributed * adding to tests * fixing broken #endif * fixing initialization of gpu histograms, correcting errors in tests * adding to contributors list * adding distributed tests to jenkins * fixing bad path in distributed test * debugging * adding kubernetes for distributed tests * adding proper import for OrderedDict * adding urllib3==1.22 to address ordered_dict import error * added sleep to allow workers to save their models for comparison * adding name to GPU contributors under docs	2019-03-02 10:03:22 +13:00
Yanbo Liang	9fefa2128d	[jvm-packages] Fix early stop with xgboost4j-spark (#4176 ) * Fix early stop with xgboost4j-spark * Update XGBoost.java * Update XGBoost.java * Update XGBoost.java To use -Float.MAX_VALUE as the lower bound, in case there is positive metric. * Only update best score if the current score is better (no update when equal) * Update xgboost-spark tutorial to fix early stopping docs.	2019-03-01 13:02:57 -08:00
Jiaming Yuan	754fe8142b	Make `HistCutMatrix::Init' be aware of groups. (#4115 ) * Add checks for group size. * Simple docs. * Search group index during hist cut matrix initialization. Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com> Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-02-16 04:39:41 +08:00
Nan Zhu	ae3bb9c2d5	Distributed Fast Histogram Algorithm (#4011 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * init * allow hist algo * more changes * temp * update * remove hist sync * udpate rabit * change hist size * change the histogram * update kfactor * sync per node stats * temp * update * final * code clean * update rabit * more cleanup * fix errors * fix failed tests * enforce c++11 * fix lint issue * broadcast subsampled feature correctly * revert some changes * fix lint issue * enable monotone and interaction constraints * don't specify default for monotone and interactions * update docs	2019-02-05 05:12:53 -08:00
Jiaming Yuan	8905df4a18	Perform clang-tidy on both cpp and cuda source. (#4034 ) * Basic script for using compilation database. * Add `GENERATE_COMPILATION_DATABASE' to CMake. * Rearrange CMakeLists.txt. * Add basic python clang-tidy script. * Remove modernize-use-auto. * Add clang-tidy to Jenkins * Refine logic for correct path detection In Jenkins, the project root is of form /home/ubuntu/workspace/xgboost_PR-XXXX * Run clang-tidy in CUDA 9.2 container * Use clang_tidy container	2019-02-05 16:07:43 +08:00
Jiaming Yuan	1088dff42c	Prevent training without setting up caches. (#4066 ) * Prevent training without setting up caches. * Add warning for internal functions. * Check number of features. * Address reviewer's comment.	2019-02-03 01:03:29 -08:00
Nan Zhu	e0094d996e	fix doc about max_depth (#4078 ) * fix doc * Update doc/parameter.rst Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>	2019-01-30 12:53:44 -08:00
Tatsuhito KATO	15fe2f1e7c	fix typos (#4027 )	2018-12-28 00:36:47 +08:00
Jiaming Yuan	7735252925	Document num_parallel_tree. (#4022 )	2018-12-25 22:00:58 +08:00
Nan Zhu	c055a32609	[jvm-packages]support multiple validation datasets in Spark (#3910 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * wrap iterators * enable copartition training and validationset * add parameters * converge code path and have init unit test * enable multi evals for ranking * unit test and doc * update example * fix early stopping * address the offline comments * udpate doc * test eval metrics * fix compilation issue * fix example	2018-12-17 21:03:57 -08:00
Andy Adinets	42bf90eb8f	Column sampling at individual nodes (splits). (#3971 ) * Column sampling at individual nodes (splits). * Documented colsample_bynode parameter. - also updated documentation for colsample_by* parameters * Updated documentation. * GetFeatureSet() returns shared pointer to std::vector. * Sync sampled columns across multiple processes.	2018-12-14 22:37:35 +08:00
Jiaming Yuan	e0a279114e	Unify logging facilities. (#3982 ) * Unify logging facilities. * Enhance `ConsoleLogger` to handle different verbosity. * Override macros from `dmlc`. * Don't use specialized gamma when building with GPU. * Remove verbosity cache in monitor. * Test monitor. * Deprecate `silent`. * Fix doc and messages. * Fix python test. * Fix silent tests.	2018-12-14 19:29:58 +08:00
Rory Mitchell	93f9ce9ef9	Single precision histograms on GPU (#3965 ) * Allow single precision histogram summation in gpu_hist * Add python test, reduce run-time of gpu_hist tests * Update documentation	2018-12-10 10:55:30 +13:00
Philip Hyunsu Cho	4f26053b09	Fix typo in Feature Interaction Constraints tutorial (#3975 )	2018-12-06 19:38:40 -08:00
Philip Hyunsu Cho	e9ab4a1c6c	Address #3933 : document limitation of DMLC CSV parser + recommend Pandas (#3934 )	2018-11-23 04:13:36 -08:00
Jiaming Yuan	daf77ca7b7	Enable running objectives with 0 GPU. (#3878 ) * Enable running objectives with 0 GPU. * Enable 0 GPU for objectives. * Add doc for GPU objectives. * Fix some objectives defaulted to running on all GPUs.	2018-11-13 20:19:59 +13:00
Jiacheng Xu	d810e6dec9	Fix a typo in the R-package documentation: max.deph -> max.depth (#3890 ) Signed-off-by: Jiacheng Xu <xjcmaxwellcjx@gmail.com>	2018-11-12 01:43:23 -08:00
Philip Hyunsu Cho	828d75714d	Fix #3857 : take down AWS YARN tutorial, as it is outdated (#3885 )	2018-11-08 23:08:32 -08:00
Jiaming Yuan	f1275f52c1	Fix specifying gpu_id, add tests. (#3851 ) * Rewrite gpu_id related code. * Remove normalised/unnormalised operatios. * Address difference between `Index' and `Device ID'. * Modify doc for `gpu_id'. * Better LOG for GPUSet. * Check specified n_gpus. * Remove inappropriate `device_idx' term. * Clarify GpuIdType and size_t.	2018-11-06 18:17:53 +13:00
Philip Hyunsu Cho	c22e90d5d2	Correct typo	2018-11-04 05:22:53 -08:00
Philip Hyunsu Cho	6da462234e	Move MinGW-w64 + Python section to the end, since it's 'advanced' (#3863 )	2018-11-04 05:12:27 -08:00
Philip Hyunsu Cho	a650131fc3	Update doc: colsample_bylevel now works for tree_method=hist (#3862 ) This feature was introduced by #3635	2018-11-04 02:25:25 -08:00
Philip Hyunsu Cho	583c88bce7	[jvm-packages] Require vanilla Apache Spark (#3854 )	2018-11-01 19:15:40 -07:00
Jonathan Friedman	45d321da28	Fix typo in docs (#3852 ) Fix typo in docs	2018-11-01 13:03:59 -07:00
Zhao Hang	e3c1afac6b	Update parameter.rst (#3843 )	2018-10-31 00:19:45 +13:00
Nan Zhu	5fbe230636	[jvm-packages] documenting tracker (#3831 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * documenting tracker * Make it a separate note	2018-10-25 18:53:46 -07:00
Nan Zhu	4ae225a08d	[Blocking][jvm-packages] fix the early stopping feature (#3808 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * temp * add method for classifier and regressor * update tutorial * address the comments * update	2018-10-23 14:53:13 -07:00
KOLANICH	5480e05173	Added some instructions on using MinGW-built XGBoost with python. (#3774 ) * Added some instructions on using MinGW-built XGBoost with python. * Changes according to the discussion and some additions * Fixed wording and removed redundancy. * Even more fixes * Fixed links. Removed redundancy. * Some fixes according to the discussion * fixes * Some fixes * fixes	2018-10-09 09:07:00 -07:00
Philip Hyunsu Cho	ca33bf6476	Document gblinear parameters: feature_selector and top_k (#3780 )	2018-10-08 22:41:54 -07:00
Philip Hyunsu Cho	813d2436d3	Produce xgboost.so for XGBoost-R on Mac OSX, so that `make install` works (#3767 ) * Produce xgboost.so for XGBoost-R on Mac OSX, so that `make install` works * Modernize R build instructions * Fix crossref	2018-10-07 14:09:54 -07:00
Philip Hyunsu Cho	91903ac5d4	Fix broken doc build due to Matplotlib 3.0 release (#3764 )	2018-10-07 13:34:37 -07:00
Dmitriy Rybalko	7bbb44182a	update eval_metric doc (#3687 )	2018-09-14 08:47:05 -07:00
mrgutkun	4b43810f51	Fix #3663 : Allow sklearn API to use callbacks (#3682 ) * Fix #3663: Allow sklearn API to use callbacks * Fix lint * Add Callback API to Python API doc	2018-09-07 13:51:26 -07:00
Philip Hyunsu Cho	5a8bbb39a1	Revert #3677 and #3674 (#3678 ) * Revert "Add scikit-learn as dependency for doc build (#3677)" This reverts commit `308f664ade`. * Revert "Add scikit-learn tests (#3674)" This reverts commit `d176a0fbc8`.	2018-09-06 20:43:17 -07:00
Philip Hyunsu Cho	308f664ade	Add scikit-learn as dependency for doc build (#3677 )	2018-09-06 14:56:05 -07:00
Philip Hyunsu Cho	190d888695	Document LambdaMART objectives: pairwise, listwise (#3672 ) * Document LambdaMART objectives * Distinguish between pairwise and listwise objectives	2018-09-06 09:54:37 -07:00
Philip Hyunsu Cho	9344f081a4	Add numpy and matplotlib as requirements for doc build (#3669 )	2018-09-04 20:56:18 -07:00
Andrew Thia	9254c58e4d	[TREE] add interaction constraints (#3466 ) * add interaction constraints * enable both interaction and monotonic constraints at the same time * fix lint * add R test, fix lint, update demo * Use dmlc::JSONReader to express interaction constraints as nested lists; Use sparse arrays for bookkeeping * Add Python test for interaction constraints * make R interaction constraints parameter based on feature index instead of column names, fix R coding style * Fix lint * Add BlueTea88 to CONTRIBUTORS.md * Short circuit when no constraint is specified; address review comments * Add tutorial for feature interaction constraints * allow interaction constraints to be passed as string, remove redundant column_names argument * Fix typo * Address review comments * Add comments to Python test	2018-09-04 09:35:39 -07:00
gorogm	7ef2b599c7	Link fixed. (#3640 )	2018-08-27 20:25:50 -07:00

482 Commits