16 Commits

Author SHA1 Message Date
Rong Ou
603f8ce2fa
Support hist in the partition builder under column split (#9120) 2023-05-11 05:24:29 +08:00
Jiaming Yuan
a093770f36
Partitioner for multi-target tree. (#8922) 2023-03-16 18:49:34 +08:00
Jiaming Yuan
26209a42a5
Define git attributes for renormalization. (#8921) 2023-03-16 02:43:11 +08:00
Rong Ou
d9688f93c7
Support column-split in row partitioner (#8828) 2023-02-26 04:43:35 +08:00
Jiaming Yuan
43a647a4dd
Fix inference with categorical feature. (#8591) 2022-12-15 17:57:26 +08:00
Jiaming Yuan
3e26107a9c
Rename and extract Context. (#8528)
* Rename `GenericParameter` to `Context`.
* Rename header file to reflect the change.
* Rename all references.
2022-12-07 04:58:54 +08:00
Dmitry Razdoburdin
5bd849f1b5
Unify the partitioner for hist and approx.
Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com>
Co-authored-by: jiamingy <jm.yuan@outlook.com>
2022-10-20 02:49:20 +08:00
Tim Gates
cb40bbdadd
docs: fix simple typo, cannonical -> canonical (#8099)
There is a small typo in src/common/partition_builder.h.

Should read `canonical` rather than `cannonical`.

Signed-off-by: Tim Gates <tim.gates@iress.com>
2022-07-20 21:04:50 +08:00
Jiaming Yuan
1a33b50a0d
Fix compiler warnings. (#7974)
- Remove unused parameters. There are still many warnings that are not yet
addressed. Currently, the warnings in dmlc-core dominate the error log.
- Remove `distributed` parameter from metric.
- Fixes some warnings about signed comparison.
2022-06-06 22:56:25 +08:00
Jiaming Yuan
4fcfd9c96e
Fix and cleanup for column matrix. (#7901)
* Fix missed type dispatching for dense columns with missing values.
* Code cleanup to reduce special cases.
* Reduce memory usage.
2022-05-16 21:11:50 +08:00
Jiaming Yuan
1baad8650c
Small cleanup to Column. (#7898)
* Define forward iterator to hide the internal state.
2022-05-15 12:39:10 +08:00
Jiaming Yuan
fdf533f2b9
[POC] Experimental support for l1 error. (#7812)
Support adaptive tree, a feature supported by both sklearn and lightgbm.  The tree leaf is recomputed based on residue of labels and predictions after construction.

For l1 error, the optimal value is the median (50 percentile).

This is marked as experimental support for the following reasons:
- The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors.
- Some follow-ups are required, for exact, pruner, and optimization for quantile function. Also, we need to calculate the initial estimation.
2022-04-26 21:41:55 +08:00
Jiaming Yuan
83a66b4994
Support categorical data for hist. (#7695)
* Extract partitioner from hist.
* Implement categorical data support by passing the gradient index directly into the partitioner.
* Organize/update document.
* Remove code for negative hessian.
2022-02-25 03:47:14 +08:00
Jiaming Yuan
eee527d264
Add approx partitioner. (#7467) 2021-11-27 15:22:06 +08:00
farfarawayzyt
d7c14496d2
fix typo in arguments of PartitionBuilder::Init (#7113)
Co-authored-by: Yuntian Zhang <zhangyt@lamda.nju.edu.cn>
2021-07-16 15:46:22 +08:00
ShvetsKS
2567404ab6
Simplify sparse and dense CPU hist kernels (#7029)
* Simplify sparse and dense kernels
* Extract row partitioner.

Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>
2021-06-11 18:26:30 +08:00