xgboost

Author	SHA1	Message	Date
Jonas	00ea7b83c9	Fix docs for `num_parallel_tree` (#4221 ) Minor formatting correction for `num_parallel_tree`.	2019-03-06 23:47:51 +08:00
Philip Hyunsu Cho	67c38805a1	Update build doc: PyPI wheel now support multi-GPU (#4219 )	2019-03-05 13:25:31 -08:00
Adam November	0c1d5f1120	Fix snapshot artifact name in docs. (#4196 )	2019-03-03 13:27:50 -08:00
Matthew Jones	92b7577c62	[REVIEW] Enable Multi-Node Multi-GPU functionality (#4095 ) * Initial commit to support multi-node multi-gpu xgboost using dask * Fixed NCCL initialization by not ignoring the opg parameter. - it now crashes on NCCL initialization, but at least we're attempting it properly * At the root node, perform a rabit::Allreduce to get initial sum_gradient across workers * Synchronizing in a couple of more places. - now the workers don't go down, but just hang - no more "wild" values of gradients - probably needs syncing in more places * Added another missing max-allreduce operation inside BuildHistLeftRight * Removed unnecessary collective operations. * Simplified rabit::Allreduce() sync of gradient sums. * Removed unnecessary rabit syncs around ncclAllReduce. - this improves performance _significantly_ (7x faster for overall training, 20x faster for xgboost proper) * pulling in latest xgboost * removing changes to updater_quantile_hist.cc * changing use_nccl_opg initialization, removing unnecessary if statements * added definition for opaque ncclUniqueId struct to properly encapsulate GetUniqueId * placing struct defintion in guard to avoid duplicate code errors * addressing linting errors * removing * removing additional arguments to AllReduer initialization * removing distributed flag * making comm init symmetric * removing distributed flag * changing ncclCommInit to support multiple modalities * fix indenting * updating ncclCommInitRank block with necessary group calls * fix indenting * adding print statement, and updating accessor in vector * improving print statement to end-line * generalizing nccl_rank construction using rabit * assume device_ordinals is the same for every node * test, assume device_ordinals is identical for all nodes * test, assume device_ordinals is unique for all nodes * changing names of offset variable to be more descriptive, editing indenting * wrapping ncclUniqueId GetUniqueId() and aesthetic changes * adding synchronization, and tests for distributed * adding to tests * fixing broken #endif * fixing initialization of gpu histograms, correcting errors in tests * adding to contributors list * adding distributed tests to jenkins * fixing bad path in distributed test * debugging * adding kubernetes for distributed tests * adding proper import for OrderedDict * adding urllib3==1.22 to address ordered_dict import error * added sleep to allow workers to save their models for comparison * adding name to GPU contributors under docs	2019-03-02 10:03:22 +13:00
Yanbo Liang	9fefa2128d	[jvm-packages] Fix early stop with xgboost4j-spark (#4176 ) * Fix early stop with xgboost4j-spark * Update XGBoost.java * Update XGBoost.java * Update XGBoost.java To use -Float.MAX_VALUE as the lower bound, in case there is positive metric. * Only update best score if the current score is better (no update when equal) * Update xgboost-spark tutorial to fix early stopping docs.	2019-03-01 13:02:57 -08:00
Jiaming Yuan	754fe8142b	Make `HistCutMatrix::Init' be aware of groups. (#4115 ) * Add checks for group size. * Simple docs. * Search group index during hist cut matrix initialization. Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com> Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-02-16 04:39:41 +08:00
Nan Zhu	ae3bb9c2d5	Distributed Fast Histogram Algorithm (#4011 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * init * allow hist algo * more changes * temp * update * remove hist sync * udpate rabit * change hist size * change the histogram * update kfactor * sync per node stats * temp * update * final * code clean * update rabit * more cleanup * fix errors * fix failed tests * enforce c++11 * fix lint issue * broadcast subsampled feature correctly * revert some changes * fix lint issue * enable monotone and interaction constraints * don't specify default for monotone and interactions * update docs	2019-02-05 05:12:53 -08:00
Jiaming Yuan	8905df4a18	Perform clang-tidy on both cpp and cuda source. (#4034 ) * Basic script for using compilation database. * Add `GENERATE_COMPILATION_DATABASE' to CMake. * Rearrange CMakeLists.txt. * Add basic python clang-tidy script. * Remove modernize-use-auto. * Add clang-tidy to Jenkins * Refine logic for correct path detection In Jenkins, the project root is of form /home/ubuntu/workspace/xgboost_PR-XXXX * Run clang-tidy in CUDA 9.2 container * Use clang_tidy container	2019-02-05 16:07:43 +08:00
Jiaming Yuan	1088dff42c	Prevent training without setting up caches. (#4066 ) * Prevent training without setting up caches. * Add warning for internal functions. * Check number of features. * Address reviewer's comment.	2019-02-03 01:03:29 -08:00
Nan Zhu	e0094d996e	fix doc about max_depth (#4078 ) * fix doc * Update doc/parameter.rst Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>	2019-01-30 12:53:44 -08:00
Tatsuhito KATO	15fe2f1e7c	fix typos (#4027 )	2018-12-28 00:36:47 +08:00
Jiaming Yuan	7735252925	Document num_parallel_tree. (#4022 )	2018-12-25 22:00:58 +08:00
Nan Zhu	c055a32609	[jvm-packages]support multiple validation datasets in Spark (#3910 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * wrap iterators * enable copartition training and validationset * add parameters * converge code path and have init unit test * enable multi evals for ranking * unit test and doc * update example * fix early stopping * address the offline comments * udpate doc * test eval metrics * fix compilation issue * fix example	2018-12-17 21:03:57 -08:00
Andy Adinets	42bf90eb8f	Column sampling at individual nodes (splits). (#3971 ) * Column sampling at individual nodes (splits). * Documented colsample_bynode parameter. - also updated documentation for colsample_by* parameters * Updated documentation. * GetFeatureSet() returns shared pointer to std::vector. * Sync sampled columns across multiple processes.	2018-12-14 22:37:35 +08:00
Jiaming Yuan	e0a279114e	Unify logging facilities. (#3982 ) * Unify logging facilities. * Enhance `ConsoleLogger` to handle different verbosity. * Override macros from `dmlc`. * Don't use specialized gamma when building with GPU. * Remove verbosity cache in monitor. * Test monitor. * Deprecate `silent`. * Fix doc and messages. * Fix python test. * Fix silent tests.	2018-12-14 19:29:58 +08:00
Rory Mitchell	93f9ce9ef9	Single precision histograms on GPU (#3965 ) * Allow single precision histogram summation in gpu_hist * Add python test, reduce run-time of gpu_hist tests * Update documentation	2018-12-10 10:55:30 +13:00
Philip Hyunsu Cho	4f26053b09	Fix typo in Feature Interaction Constraints tutorial (#3975 )	2018-12-06 19:38:40 -08:00
Philip Hyunsu Cho	e9ab4a1c6c	Address #3933 : document limitation of DMLC CSV parser + recommend Pandas (#3934 )	2018-11-23 04:13:36 -08:00
Jiaming Yuan	daf77ca7b7	Enable running objectives with 0 GPU. (#3878 ) * Enable running objectives with 0 GPU. * Enable 0 GPU for objectives. * Add doc for GPU objectives. * Fix some objectives defaulted to running on all GPUs.	2018-11-13 20:19:59 +13:00
Jiacheng Xu	d810e6dec9	Fix a typo in the R-package documentation: max.deph -> max.depth (#3890 ) Signed-off-by: Jiacheng Xu <xjcmaxwellcjx@gmail.com>	2018-11-12 01:43:23 -08:00
Philip Hyunsu Cho	828d75714d	Fix #3857 : take down AWS YARN tutorial, as it is outdated (#3885 )	2018-11-08 23:08:32 -08:00
Jiaming Yuan	f1275f52c1	Fix specifying gpu_id, add tests. (#3851 ) * Rewrite gpu_id related code. * Remove normalised/unnormalised operatios. * Address difference between `Index' and `Device ID'. * Modify doc for `gpu_id'. * Better LOG for GPUSet. * Check specified n_gpus. * Remove inappropriate `device_idx' term. * Clarify GpuIdType and size_t.	2018-11-06 18:17:53 +13:00
Philip Hyunsu Cho	c22e90d5d2	Correct typo	2018-11-04 05:22:53 -08:00
Philip Hyunsu Cho	6da462234e	Move MinGW-w64 + Python section to the end, since it's 'advanced' (#3863 )	2018-11-04 05:12:27 -08:00
Philip Hyunsu Cho	a650131fc3	Update doc: colsample_bylevel now works for tree_method=hist (#3862 ) This feature was introduced by #3635	2018-11-04 02:25:25 -08:00
Philip Hyunsu Cho	583c88bce7	[jvm-packages] Require vanilla Apache Spark (#3854 )	2018-11-01 19:15:40 -07:00
Jonathan Friedman	45d321da28	Fix typo in docs (#3852 ) Fix typo in docs	2018-11-01 13:03:59 -07:00
Zhao Hang	e3c1afac6b	Update parameter.rst (#3843 )	2018-10-31 00:19:45 +13:00
Nan Zhu	5fbe230636	[jvm-packages] documenting tracker (#3831 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * documenting tracker * Make it a separate note	2018-10-25 18:53:46 -07:00
Nan Zhu	4ae225a08d	[Blocking][jvm-packages] fix the early stopping feature (#3808 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * temp * add method for classifier and regressor * update tutorial * address the comments * update	2018-10-23 14:53:13 -07:00
KOLANICH	5480e05173	Added some instructions on using MinGW-built XGBoost with python. (#3774 ) * Added some instructions on using MinGW-built XGBoost with python. * Changes according to the discussion and some additions * Fixed wording and removed redundancy. * Even more fixes * Fixed links. Removed redundancy. * Some fixes according to the discussion * fixes * Some fixes * fixes	2018-10-09 09:07:00 -07:00
Philip Hyunsu Cho	ca33bf6476	Document gblinear parameters: feature_selector and top_k (#3780 )	2018-10-08 22:41:54 -07:00
Philip Hyunsu Cho	813d2436d3	Produce xgboost.so for XGBoost-R on Mac OSX, so that `make install` works (#3767 ) * Produce xgboost.so for XGBoost-R on Mac OSX, so that `make install` works * Modernize R build instructions * Fix crossref	2018-10-07 14:09:54 -07:00
Philip Hyunsu Cho	91903ac5d4	Fix broken doc build due to Matplotlib 3.0 release (#3764 )	2018-10-07 13:34:37 -07:00
Dmitriy Rybalko	7bbb44182a	update eval_metric doc (#3687 )	2018-09-14 08:47:05 -07:00
mrgutkun	4b43810f51	Fix #3663 : Allow sklearn API to use callbacks (#3682 ) * Fix #3663: Allow sklearn API to use callbacks * Fix lint * Add Callback API to Python API doc	2018-09-07 13:51:26 -07:00
Philip Hyunsu Cho	5a8bbb39a1	Revert #3677 and #3674 (#3678 ) * Revert "Add scikit-learn as dependency for doc build (#3677)" This reverts commit `308f664ade`. * Revert "Add scikit-learn tests (#3674)" This reverts commit `d176a0fbc8`.	2018-09-06 20:43:17 -07:00
Philip Hyunsu Cho	308f664ade	Add scikit-learn as dependency for doc build (#3677 )	2018-09-06 14:56:05 -07:00
Philip Hyunsu Cho	190d888695	Document LambdaMART objectives: pairwise, listwise (#3672 ) * Document LambdaMART objectives * Distinguish between pairwise and listwise objectives	2018-09-06 09:54:37 -07:00
Philip Hyunsu Cho	9344f081a4	Add numpy and matplotlib as requirements for doc build (#3669 )	2018-09-04 20:56:18 -07:00
Andrew Thia	9254c58e4d	[TREE] add interaction constraints (#3466 ) * add interaction constraints * enable both interaction and monotonic constraints at the same time * fix lint * add R test, fix lint, update demo * Use dmlc::JSONReader to express interaction constraints as nested lists; Use sparse arrays for bookkeeping * Add Python test for interaction constraints * make R interaction constraints parameter based on feature index instead of column names, fix R coding style * Fix lint * Add BlueTea88 to CONTRIBUTORS.md * Short circuit when no constraint is specified; address review comments * Add tutorial for feature interaction constraints * allow interaction constraints to be passed as string, remove redundant column_names argument * Fix typo * Address review comments * Add comments to Python test	2018-09-04 09:35:39 -07:00
gorogm	7ef2b599c7	Link fixed. (#3640 )	2018-08-27 20:25:50 -07:00
Philip Hyunsu Cho	cb4de521c1	Document CUDA requirement, lack of external memory on GPU (#3624 ) * Document fact that GPU doesn't support external memory * Document CUDA requirement	2018-08-22 22:47:10 -07:00
Philip Hyunsu Cho	4ed8a88240	Update Python API doc (#3619 ) * Add XGBRanker to Python API doc * Show inherited members of XGBRegressor in API doc, since XGBRegressor uses default methods from XGBModel * Add table of contents to Python API doc * Skip JVM doc download if not available * Show inherited members for XGBRegressor and XGBRanker * Expose XGBRanker to Python XGBoost module directory * Add docstring to XGBRegressor.predict() and XGBRanker.predict() * Fix rendering errors in Python docstrings * Fix lint	2018-08-22 18:59:30 -07:00
Grant W Schneider	57f3c2f252	Remove errant $ (#3618 )	2018-08-21 12:32:38 -07:00
Philip Hyunsu Cho	b13c3a8bcc	Fix #3609 : Removed unused parameter 'use_buffer' (#3610 )	2018-08-21 07:54:15 -07:00
trivialfis	cf2d86a4f6	Add travis sanitizers tests. (#3557 ) * Add travis sanitizers tests. * Add gcc-7 in Travis. * Add SANITIZER_PATH for CMake. * Enable sanitizer tests in Travis. * Fix memory leaks in tests. * Fix all memory leaks reported by Address Sanitizer. * tests/cpp/helpers.h/CreateDMatrix now returns raw pointer.	2018-08-19 16:40:30 +12:00
Philip Hyunsu Cho	983cb0b374	Add option to disable default metric (#3606 )	2018-08-18 11:39:20 -07:00
Grace Lam	caf4a756bf	Add JSON dump functionality documentation (#3600 )	2018-08-16 16:32:04 -07:00
Jakob Richter	725f4c36f2	replace nround with nrounds to match actual parameter (#3592 )	2018-08-15 11:13:53 -07:00

274 Commits