rank_metric: add AUC-PR (#3172)

* rank_metric: add AUC-PR

Implementation of the AUC-PR calculation for weighted data, proposed by Keilwagen, Grosse and Grau (https://doi.org/10.1371/journal.pone.0092209)

* rank_metric: fix lint warnings

* Implement tests for AUC-PR and fix implementation

* add aucpr to documentation for other languages
This commit is contained in:
Arjan van der Velde
2018-03-23 10:43:47 -04:00
committed by Yuan (Terry) Tang
parent 8fb3388af2
commit 04221a7469
7 changed files with 121 additions and 2 deletions

View File

@@ -34,6 +34,7 @@
#' \item \code{rmse} Rooted mean square error
#' \item \code{logloss} negative log-likelihood function
#' \item \code{auc} Area under curve
#' \item \code{aucpr} Area under PR curve
#' \item \code{merror} Exact matching error, used to evaluate multi-class classification
#' }
#' @param obj customized objective function. Returns gradient and second order

View File

@@ -127,6 +127,7 @@
#' Different threshold (e.g., 0.) could be specified as "error@0."
#' \item \code{merror} Multiclass classification error rate. It is calculated as \code{(# wrong cases) / (# all cases)}.
#' \item \code{auc} Area under the curve. \url{http://en.wikipedia.org/wiki/Receiver_operating_characteristic#'Area_under_curve} for ranking evaluation.
#' \item \code{aucpr} Area under the PR curve. \url{https://en.wikipedia.org/wiki/Precision_and_recall} for ranking evaluation.
#' \item \code{ndcg} Normalized Discounted Cumulative Gain (for ranking task). \url{http://en.wikipedia.org/wiki/NDCG}
#' }
#'

View File

@@ -51,6 +51,7 @@ from each CV model. This parameter engages the \code{\link{cb.cv.predict}} callb
\item \code{rmse} Rooted mean square error
\item \code{logloss} negative log-likelihood function
\item \code{auc} Area under curve
\item \code{aucpr} Area under PR curve
\item \code{merror} Exact matching error, used to evaluate multi-class classification
}}

View File

@@ -186,6 +186,7 @@ The folloiwing is the list of built-in metrics for which Xgboost provides optimi
Different threshold (e.g., 0.) could be specified as "error@0."
\item \code{merror} Multiclass classification error rate. It is calculated as \code{(# wrong cases) / (# all cases)}.
\item \code{auc} Area under the curve. \url{http://en.wikipedia.org/wiki/Receiver_operating_characteristic#'Area_under_curve} for ranking evaluation.
\item \code{aucpr} Area under the PR curve. \url{https://en.wikipedia.org/wiki/Precision_and_recall} for ranking evaluation.
\item \code{ndcg} Normalized Discounted Cumulative Gain (for ranking task). \url{http://en.wikipedia.org/wiki/NDCG}
}