Previously, `libsvm` was used as the default when no format was specified. However, the dmlc data parser is not particularly robust against errors, and the most common error is an unspecified format. For this reason, we recommend that users use other data loaders instead. We will continue to maintain the parsers, since they are currently used in many internal tests, including those for federated learning.
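One way to avoid the unspecified-format error is to name the format in the data URI itself. The sketch below assumes the `?format=libsvm` URI suffix that `xgboost.DMatrix` accepts for text input; the helper `with_explicit_format` and the file name `train.txt` are hypothetical and only illustrate the idea for simple paths.

```python
def with_explicit_format(path: str, fmt: str = "libsvm") -> str:
    """Return a data URI that names the text format explicitly.

    Hypothetical helper: it only builds the string, and it leaves a URI
    that already carries a format untouched.
    """
    if "format=" in path:
        return path
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}format={fmt}"


uri = with_explicit_format("train.txt")  # hypothetical file name
print(uri)

# Assumed usage with XGBoost installed (not executed here):
# import xgboost as xgb
# dtrain = xgb.DMatrix(uri)
```

Keeping the format in the URI means the loader never has to guess, which is the failure mode described above.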
Regression
Using XGBoost for regression is very similar to using it for binary classification. We suggest that you refer to the binary classification demo first. In XGBoost, if we use negative log likelihood as the loss function for regression, the training procedure is the same as training a binary classifier.
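To make the parallel concrete, the sketch below compares the per-example gradient and Hessian of squared error with those of the negative log likelihood (logistic loss). These are the standard textbook formulas, with squared error taken as 0.5 * (pred - label)^2; the function names are illustrative and are not part of XGBoost's API.

```python
import math


def squared_error_grad_hess(pred: float, label: float) -> tuple[float, float]:
    """Gradient and Hessian of 0.5 * (pred - label)**2 with respect to pred."""
    return pred - label, 1.0


def logistic_grad_hess(margin: float, label: float) -> tuple[float, float]:
    """Gradient and Hessian of the negative log likelihood w.r.t. the raw margin."""
    p = 1.0 / (1.0 + math.exp(-margin))  # sigmoid maps the margin into (0, 1)
    return p - label, p * (1.0 - p)


# The label may be a class (0 or 1) or a regression target inside [0, 1];
# the same formulas apply in both cases.
for label in (1.0, 0.3):
    grad, hess = logistic_grad_hess(0.0, label)
    print(f"label={label}: grad={grad:.2f}, hess={hess:.2f}")
```

Nothing in the logistic formulas requires the label to be exactly 0 or 1, which is why regression on targets in [0,1] can reuse the binary classification procedure.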
Tutorial
The dataset we use is the computer hardware dataset from the UCI repository. The regression demo is almost the same as the binary classification demo, except for a small difference in the general parameters:
# General parameter
# this is the only difference from classification: use reg:squarederror for regression with squared loss
# when labels are in [0,1] we can also use reg:logistic
objective = reg:squarederror
...
The input format is the same as for binary classification, except that the label is now the target regression value. We use reg:squarederror here; if we want to use logistic regression (objective = reg:logistic), the labels need to be pre-scaled into [0,1].
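The pre-scaling step can be sketched as a min-max transform. This is one possible choice rather than something the demo prescribes; `scale_labels` and `unscale_prediction` are hypothetical helper names, and the sample targets are made up for illustration.

```python
def scale_labels(labels: list[float]) -> tuple[list[float], float, float]:
    """Min-max scale labels into [0, 1]; also return (lo, hi) for inversion."""
    lo, hi = min(labels), max(labels)
    if hi == lo:
        raise ValueError("labels are constant; min-max scaling is undefined")
    return [(y - lo) / (hi - lo) for y in labels], lo, hi


def unscale_prediction(pred: float, lo: float, hi: float) -> float:
    """Map a prediction in [0, 1] back to the original label range."""
    return lo + pred * (hi - lo)


raw = [10.0, 60.0, 110.0]  # made-up target values
scaled, lo, hi = scale_labels(raw)
print(scaled)  # [0.0, 0.5, 1.0]
print(unscale_prediction(0.5, lo, hi))  # 60.0
```

Keeping `lo` and `hi` from the training labels lets predictions made with reg:logistic be mapped back to the original units.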