From ca33bf6476a182a42209e3ce10a386adedd65ba9 Mon Sep 17 00:00:00 2001
From: Philip Hyunsu Cho
Date: Mon, 8 Oct 2018 22:41:54 -0700
Subject: [PATCH] Document gblinear parameters: feature_selector and top_k (#3780)

---
 doc/parameter.rst              | 14 ++++++++++++++
 src/linear/coordinate_common.h |  2 +-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/doc/parameter.rst b/doc/parameter.rst
index 9ec7bb95d..6a4137f04 100644
--- a/doc/parameter.rst
+++ b/doc/parameter.rst
@@ -248,6 +248,20 @@ Parameters for Linear Booster (``booster=gblinear``)
     - ``shotgun``: Parallel coordinate descent algorithm based on shotgun algorithm. Uses 'hogwild' parallelism and therefore produces a nondeterministic solution on each run.
     - ``coord_descent``: Ordinary coordinate descent algorithm. Also multithreaded but still produces a deterministic solution.
 
+* ``feature_selector`` [default= ``cyclic``]
+
+  - Feature selection and ordering method
+
+    * ``cyclic``: Deterministic selection by cycling through features one at a time.
+    * ``shuffle``: Similar to ``cyclic`` but with random feature shuffling prior to each update.
+    * ``random``: A random (with replacement) coordinate selector.
+    * ``greedy``: Select coordinate with the greatest gradient magnitude. It has ``O(num_feature^2)`` complexity. It is fully deterministic. It allows restricting the selection to ``top_k`` features per group with the largest magnitude of univariate weight change, by setting the ``top_k`` parameter. Doing so would reduce the complexity to ``O(num_feature*top_k)``.
+    * ``thrifty``: Thrifty, approximately-greedy feature selector. Prior to cyclic updates, reorders features in descending magnitude of their univariate weight changes. This operation is multithreaded and is a linear complexity approximation of the quadratic greedy selection. It allows restricting the selection to ``top_k`` features per group with the largest magnitude of univariate weight change, by setting the ``top_k`` parameter.
+
+* ``top_k`` [default=0]
+
+  - The number of top features to select in ``greedy`` and ``thrifty`` feature selector. The value of 0 means using all the features.
+
 Parameters for Tweedie Regression (``objective=reg:tweedie``)
 =============================================================
 * ``tweedie_variance_power`` [default=1.5]
diff --git a/src/linear/coordinate_common.h b/src/linear/coordinate_common.h
index 4c4dc3b54..8b7617c76 100644
--- a/src/linear/coordinate_common.h
+++ b/src/linear/coordinate_common.h
@@ -241,7 +241,7 @@ class CyclicFeatureSelector : public FeatureSelector {
 };
 
 /**
- * \brief Similar to Cyclyc but with random feature shuffling prior to each update.
+ * \brief Similar to Cyclic but with random feature shuffling prior to each update.
  * \note Its randomness is controllable by setting a random seed.
  */
 class ShuffleFeatureSelector : public FeatureSelector {
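
Reviewer note: the visiting orders described for ``feature_selector`` can be pictured with a small standalone sketch. This is not the code in ``coordinate_common.h``; the function name ``visit_order`` and the ``magnitudes`` input are hypothetical stand-ins, and ``magnitudes`` is assumed to hold each feature's univariate weight change, which the real updater derives from gradients. ``greedy`` is left out on purpose, because it picks the best coordinate again after every single update and so cannot be shown as one precomputed order.

```python
import random


def visit_order(selector, magnitudes, top_k=0, seed=0):
    """Return the feature indices one pass would visit, in order.

    Illustrative only: `magnitudes` is a hypothetical stand-in for the
    per-feature univariate weight change.
    """
    n = len(magnitudes)
    rng = random.Random(seed)  # seeded so shuffle/random are repeatable
    if selector == "cyclic":
        # Fixed order: feature 0, 1, 2, ...
        return list(range(n))
    if selector == "shuffle":
        # Every feature exactly once, in a random order.
        order = list(range(n))
        rng.shuffle(order)
        return order
    if selector == "random":
        # Sampling with replacement: repeats and omissions are possible.
        return [rng.randrange(n) for _ in range(n)]
    if selector == "thrifty":
        # Sort once by descending magnitude, then keep the first top_k
        # entries; top_k == 0 means keep all features.
        order = sorted(range(n), key=lambda i: -abs(magnitudes[i]))
        keep = n if top_k == 0 else min(top_k, n)
        return order[:keep]
    raise ValueError(f"unsupported selector: {selector}")


if __name__ == "__main__":
    changes = [0.1, -0.9, 0.5, 0.0]
    for name in ("cyclic", "shuffle", "random", "thrifty"):
        print(name, visit_order(name, changes))
    print("thrifty top_k=2", visit_order("thrifty", changes, top_k=2))
```

With ``top_k`` left at 0 the ``thrifty`` branch returns every feature, which matches the documented meaning of the default.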