xgboost

Author	SHA1	Message	Date
Jiaming Yuan	e6088366df	Export Python Interface for external memory. (#7070 ) * Add Python iterator interface. * Add tests. * Add demo. * Add documents. * Handle empty dataset.	2021-07-22 15:15:53 +08:00
ZabelTech	1d91f71119	fix typo in `XGDMatrixSetFloatInfo` example (#7097 )	2021-07-10 21:40:25 +08:00
Jeff H	d22b293f2f	Update reference to treelite website (#7084 ) treelite.io is no longer a valid site and re-directs users to a parked domain. Re-directing to the documentation is safer at this point.	2021-07-06 22:15:07 -07:00
Jiaming Yuan	cf06a266a8	[dask][doc] Wrap the example in main guard. (#6979 )	2021-05-25 08:24:47 +08:00
Jiaming Yuan	5cb51a191e	[dask][doc] Add small example for sklearn interface. (#6970 )	2021-05-19 13:50:45 +08:00
Andrew Ziem	3e7e426b36	Fix spelling in documents (#6948 ) * Update roxygen2 doc. Co-authored-by: fis <jm.yuan@outlook.com>	2021-05-11 20:44:36 +08:00
Kai Fricke	c8cc3eacc9	[docs] Add tutorial for XGBoost-Ray (#6884 ) * Add XGBoost-Ray tutorial * Add link to modin	2021-04-22 02:07:13 +08:00
Jiaming Yuan	a5d7094a45	Update documents. (#6856 ) * Add early stopping section to prediction doc. * Remove best_ntree_limit. * Better doxygen output.	2021-04-16 12:41:03 +08:00
Jiaming Yuan	9d62b14591	Fix document. [skip ci] (#6669 )	2021-02-02 20:43:31 +08:00
Jiaming Yuan	87ab1ad607	[dask] Accept `Future` of model for prediction. (#6650 ) This PR changes predict and inplace_predict to accept a Future of model, to avoid sending models to workers repeatedly. * Document is updated to reflect functionality additions in recent changes.	2021-02-02 08:45:52 +08:00
Jiaming Yuan	d8ec7aad5a	[dask] Add a 1 line sample to infer output shape. (#6645 ) * [dask] Use a 1 line sample to infer output shape. This is for inferring shape with direct prediction (without DaskDMatrix). There are a few things that require a known output shape before carrying out actual prediction, including dask meta data, output dataframe columns. * Infer output shape based on local prediction. * Remove set param in predict function as it's not thread safe nor necessary as we now let dask decide the parallelism. * Simplify prediction on `DaskDMatrix`.	2021-01-30 18:55:50 +08:00
Jiaming Yuan	4bf23c2391	Specify shape in prediction contrib and interaction. (#6614 )	2021-01-26 02:08:22 +08:00
Jiaming Yuan	c5876277a8	Drop saving binary format for memory snapshot. (#6513 )	2020-12-17 00:14:57 +08:00
James Lamb	1e2c3ade9e	[doc] [dask] Add example on early stopping with Dask (#6501 ) Co-authored-by: fis <jm.yuan@outlook.com>	2020-12-15 22:23:23 +08:00
James Lamb	afc4567268	[doc] [dask] fix partitioning in Dask example (#6389 )	2020-12-14 18:37:49 +08:00
Jiaming Yuan	a30461cf87	[dask] Support all parameters in regressor and classifier. (#6471 ) * Add eval_metric. * Add callback. * Add feature weights. * Add custom objective.	2020-12-14 07:35:56 +08:00
hzy001	c2ba4fb957	Fix broken links. (#6455 ) Co-authored-by: Hao Ziyu <haoziyu@qiyi.com> Co-authored-by: fis <jm.yuan@outlook.com>	2020-12-02 17:39:12 +08:00
Jiaming Yuan	00218d065a	[dask] Update document. [skip ci] (#6413 )	2020-11-20 19:16:19 +08:00
James Lamb	12d27f43ff	[doc] make Dask distributed example copy-pastable (#6345 )	2020-11-11 20:22:17 -08:00
Jean Lescut-Muller	9564886d9f	Update custom_metric_obj.rst (#6367 )	2020-11-10 22:29:22 +08:00
DIVYA CHAUHAN	4e9c4f2d73	Create a tutorial for using the C API in a C/C++ application (#6285 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-10-27 12:19:20 -07:00
Jiaming Yuan	bed7ae4083	Loop over `thrust::reduce`. (#6229 ) * Check input chunk size of dqdm. * Add doc for current limitation.	2020-10-14 10:40:56 +13:00
Jiaming Yuan	08bdb2efc8	Fix dask doc. [skip ci] (#6108 )	2020-09-11 10:56:12 +08:00
Jiaming Yuan	318bffaa10	Fix custom obj link. [skip ci] (#6100 )	2020-09-09 10:55:38 +08:00
Philip Hyunsu Cho	5a2dcd1c33	[R] Provide better guidance for persisting XGBoost model (#5964 ) * [R] Provide better guidance for persisting XGBoost model * Update saving_model.rst * Add a paragraph about xgb.serialize()	2020-07-31 20:00:26 -07:00
James Bourbeau	3b88bc948f	Update XGBoost + Dask overview documentation (#5961 ) * Add imports to code snippet * Better writing.	2020-07-31 09:58:50 +08:00
Jiaming Yuan	fa3715f584	[Dask] Asyncio support. (#5862 )	2020-07-30 06:23:58 +08:00
Jiaming Yuan	8104f10328	Update document for model dump. (#5818 ) * Clarify the relationship between dump and save. * Mention the schema.	2020-06-22 14:33:54 +08:00
Jiaming Yuan	529b5c2cfd	[DOC] Mention dask blog post in doc. [skip ci] (#5789 )	2020-06-14 13:00:19 +08:00
Philip Hyunsu Cho	ca0d605b34	[Doc] Fix typos in AFT tutorial (#5716 )	2020-05-28 14:04:34 -07:00
Rong Ou	e21a608552	add pointers to the gpu external memory paper (#5684 )	2020-05-19 19:46:16 -07:00
Yuan Tang	dfcdfabf1f	Move dask tutorial closer to other distributed tutorials (#5613 )	2020-04-28 02:24:00 +08:00
Jiaming Yuan	9c1103e06c	[Breaking] Set output margin to True for custom objective. (#5564 ) * Set output margin to True for custom objective in Python and R. * Add a demo for writing multi-class custom objective function. * Run tests on selected demos.	2020-04-20 20:44:12 +08:00
Yuan Tang	9097e8f0d9	Edits on tutorial for XGBoost job on Kubernetes (#5487 )	2020-04-05 07:36:33 -04:00
Philip Hyunsu Cho	30e94ddd04	Add R code to AFT tutorial [skip ci] (#5486 )	2020-04-04 13:06:12 -07:00
Philip Hyunsu Cho	5fc5ec539d	Implement robust regularization in 'survival:aft' objective (#5473 ) * Robust regularization of AFT gradient and hessian * Fix AFT doc; expose it to tutorial TOC * Apply robust regularization to uncensored case too * Revise unit test slightly * Fix lint * Update test_survival.py * Use GradientPairPrecise * Remove unused variables	2020-04-04 12:21:24 -07:00
Avinash Barnwal	dcf439932a	Add Accelerated Failure Time loss for survival analysis task (#4763 ) * [WIP] Add lower and upper bounds on the label for survival analysis * Update test MetaInfo.SaveLoadBinary to account for extra two fields * Don't clear qids_ for version 2 of MetaInfo * Add SetInfo() and GetInfo() method for lower and upper bounds * changes to aft * Add parameter class for AFT; use enum's to represent distribution and event type * Add AFT metric * changes to neg grad to grad * changes to binomial loss * changes to overflow * changes to eps * changes to code refactoring * changes to code refactoring * changes to code refactoring * Re-factor survival analysis * Remove aft namespace * Move function bodies out of AFTNormal and AFTLogistic, to reduce clutter * Move function bodies out of AFTLoss, to reduce clutter * Use smart pointer to store AFTDistribution and AFTLoss * Rename AFTNoiseDistribution enum to AFTDistributionType for clarity The enum class was not a distribution itself but a distribution type * Add AFTDistribution::Create() method for convenience * changes to extreme distribution * changes to extreme distribution * changes to extreme * changes to extreme distribution * changes to left censored * deleted cout * changes to x,mu and sd and code refactoring * changes to print * changes to hessian formula in censored and uncensored * changes to variable names and pow * changes to Logistic Pdf * changes to parameter * Expose lower and upper bound labels to R package * Use example weights; normalize log likelihood metric * changes to CHECK * changes to logistic hessian to standard formula * changes to logistic formula * Comply with coding style guideline * Revert back Rabit submodule * Revert dmlc-core submodule * Comply with coding style guideline (clang-tidy) * Fix an error in AFTLoss::Gradient() * Add missing files to amalgamation * Address @RAMitchell's comment: minimize future change in MetaInfo interface * Fix lint * Fix compilation error on 32-bit target, when size_t == bst_uint * Allocate sufficient memory to hold extra label info * Use OpenMP to speed up * Fix compilation on Windows * Address reviewer's feedback * Add unit tests for probability distributions * Make Metric subclass of Configurable * Address reviewer's feedback: Configure() AFT metric * Add a dummy test for AFT metric configuration * Complete AFT configuration test; remove debugging print * Rename AFT parameters * Clarify test comment * Add a dummy test for AFT loss for uncensored case * Fix a bug in AFT loss for uncensored labels * Complete unit test for AFT loss metric * Simplify unit tests for AFT metric * Add unit test to verify aggregate output from AFT metric * Use EXPECT_* instead of ASSERT_, so that we run all unit tests Use aft_loss_param when serializing AFTObj This is to be consistent with AFT metric * Add unit tests for AFT Objective * Fix OpenMP bug; clarify semantics for shared variables used in OpenMP loops * Add comments * Remove AFT prefix from probability distribution; put probability distribution in separate source file * Add comments * Define kPI and kEulerMascheroni in probability_distribution.h * Add probability_distribution.cc to amalgamation * Remove unnecessary diff * Address reviewer's feedback: define variables where they're used * Eliminate all INFs and NANs from AFT loss and gradient * Add demo * Add tutorial * Fix lint * Use 'survival:aft' to be consistent with 'survival:cox' * Move sample data to demo/data * Add visual demo with 1D toy data * Add Python tests Co-authored-by: Philip Cho <chohyu01@cs.washington.edu>	2020-03-25 13:52:51 -07:00
Jiaming Yuan	cd7d6f7d59	[dask] Fix missing value for scikit-learn interface. (#5435 )	2020-03-20 10:56:01 -04:00
Jiaming Yuan	761a5dbdfc	[dask] Honor `nthreads` from dask worker. (#5414 )	2020-03-16 04:51:24 +08:00
Samrat Pandiri	2d76d40dfd	Update dask.rst to correct a spelling mistake (#5371 ) Change `signle-node` to `single-node`	2020-02-27 20:46:41 +08:00
Rong Ou	d6b31df449	update docs for gpu external memory (#5332 ) * update docs for gpu external memory * add hist limitation	2020-02-22 14:57:40 +08:00
Jiaming Yuan	e433a379e4	Fix changing locale. (#5314 ) * Fix changing locale. * Don't use locale guard. As number parsing is implemented in house, we don't need locale. * Update doc.	2020-02-17 11:31:13 +08:00
Jiaming Yuan	ed2465cce4	Add configuration to R interface. (#5217 ) * Save and load internal parameter configuration as JSON.	2020-02-16 03:01:58 +08:00
Jiaming Yuan	911a902835	Merge model compatibility fixes from 1.0rc branch. (#5305 ) * Port test model compatibility. * Port logit model fix. https://github.com/dmlc/xgboost/pull/5248 https://github.com/dmlc/xgboost/pull/5281	2020-02-13 20:41:58 +08:00
Jiaming Yuan	472ded549d	Save Scikit-Learn attributes into learner attributes. (#5245 ) * Remove the recommendation for pickle. * Save skl attributes in booster.attr * Test loading scikit-learn model with native booster.	2020-01-30 16:00:18 +08:00
Jiaming Yuan	ef19480eda	Add dart to JSON schema. (#5218 ) * Add dart to JSON schema. * Use spaces instead of tab.	2020-01-28 13:29:09 +08:00
Kodi Arfer	f100b8d878	[Breaking] Don't drop trees during DART prediction by default (#5115 ) * Simplify DropTrees calling logic * Add `training` parameter for prediction method. * [Breaking]: Add `training` to C API. * Change for R and Python custom objective. * Correct comment. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-13 21:48:30 +08:00
Jiaming Yuan	7b65698187	Enforce correct data shape. (#5191 ) * Fix syncing DMatrix columns. * notes for tree method. * Enable feature validation for all interfaces except for jvm. * Better tests for boosting from predictions. * Disable validation on JVM.	2020-01-13 15:48:17 +08:00
Jiaming Yuan	1d0ca49761	Example JSON model parser and Schema. (#5137 )	2019-12-23 19:47:35 +08:00
Jiaming Yuan	a4b929385e	Note for `DaskDMatrix`. (#5144 ) * Brief introduction to `DaskDMatrix`. * Add xgboost.dask.train to API doc	2019-12-23 18:55:32 +08:00


186 Commits