Copy CMake parameter from dmlc-core. (#4948)

This commit is contained in:
parent a78d4e7aa8
commit 9fc681001a
@@ -37,6 +37,12 @@ option(RABIT_MOCK "Build rabit with mock" OFF)
 option(USE_CUDA "Build with GPU acceleration" OFF)
 option(USE_NCCL "Build with NCCL to enable distributed GPU support." OFF)
 option(BUILD_WITH_SHARED_NCCL "Build with shared NCCL library." OFF)
+
+## Copied From dmlc
+option(USE_HDFS "Build with HDFS support" OFF)
+option(USE_AZURE "Build with AZURE support" OFF)
+option(USE_S3 "Build with S3 support" OFF)
+
 set(GPU_COMPUTE_VER "" CACHE STRING
   "Semicolon separated list of compute versions to be built against, e.g. '35;61'")
 if (BUILD_WITH_SHARED_NCCL AND (NOT USE_NCCL))
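As a note on the hunk above: the newly copied options are ordinary CMake cache flags, so they could be passed at configure time roughly as below. The flag names come from the diff; the build-directory layout and the ON/OFF choices are purely illustrative.

```shell
# Assemble a CMake invocation enabling the options introduced in this commit.
# USE_HDFS / USE_AZURE / USE_S3 are the flags from the diff; values are examples.
CMAKE_FLAGS="-DUSE_HDFS=ON -DUSE_AZURE=OFF -DUSE_S3=OFF"
echo "cmake .. $CMAKE_FLAGS"
# Typically run from a build directory, e.g.:
#   mkdir -p build && cd build && cmake .. $CMAKE_FLAGS && make -j4
```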
@@ -72,8 +72,10 @@ Our goal is to build the shared library:

 The minimal building requirement is

-- A recent C++ compiler supporting C++11 (g++-4.8 or higher)
-- CMake 3.2 or higher
+- A recent C++ compiler supporting C++11 (g++-5.0 or higher)
+- CMake 3.3 or higher (3.12 for building with CUDA)

+For a list of CMake options, see ``#-- Options`` in CMakeLists.txt on top of source tree.
+
 Building on Ubuntu/Debian
 =========================
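The raised minimums above (g++ 5.0, CMake 3.3, CMake 3.12 with CUDA) can be checked with a small helper; `version_ge` is a hypothetical name, and the comparison relies on `sort -V` from GNU coreutils.

```shell
# version_ge A B: succeed if dot-separated version A is >= version B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Examples against the minimums stated in the diff:
version_ge 3.12 3.3 && echo "CMake 3.12 satisfies the 3.3 minimum"
version_ge 3.2 3.3  || echo "CMake 3.2 is too old after this change"
```

In practice one would feed it the output of `cmake --version` or `g++ -dumpversion`.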
@@ -20,7 +20,7 @@ Installation

 Installation from source
 ========================

-Building XGBoost4J using Maven requires Maven 3 or newer, Java 7+ and CMake 3.2+ for compiling the JNI bindings.
+Building XGBoost4J using Maven requires Maven 3 or newer, Java 7+ and CMake 3.3+ for compiling the JNI bindings.

 Before you install XGBoost4J, you need to define environment variable ``JAVA_HOME`` as your JDK directory to ensure that your compiler can find ``jni.h`` correctly, since XGBoost4J relies on JNI to implement the interaction between the JVM and native libraries.
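A hedged sketch of the XGBoost4J source build that the hunk above documents; Maven and a JDK are assumed to be installed, and the JDK path is only an example to adjust for your system.

```shell
# Point JAVA_HOME at a JDK so the JNI build can locate jni.h.
# The path below is an example default, not a required location.
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/java-8-openjdk-amd64}"
echo "Using JAVA_HOME=$JAVA_HOME"
# Then, from the repository root (CMake >= 3.3 now required per this commit):
#   cd jvm-packages
#   mvn -DskipTests install
```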
@@ -158,7 +158,7 @@ Dealing with missing values

 Strategies to handle missing values (and therefore overcome issues as above):

 In the case that a feature column contains missing values for any reason (could be related to business logic / wrong data ingestion process / etc.), the user should decide on a strategy of how to handle it.
 The choice of approach depends on the value representing 'missing' which fall into four different categories:

 1. 0
@@ -171,7 +171,7 @@ We introduce the following approaches dealing with missing value and their fitti
 1. Skip VectorAssembler (using setHandleInvalid = "skip") directly. Used in (2), (3).
 2. Keep it (using setHandleInvalid = "keep"), and set the "missing" parameter in XGBClassifier/XGBRegressor as the value representing missing. Used in (2) and (4).
 3. Keep it (using setHandleInvalid = "keep") and transform to other irregular values. Used in (3).
 4. Nothing to be done, used in (1).

 Then, XGBoost will automatically learn what's the ideal direction to go when a value is missing, based on that value and strategy.

@@ -241,7 +241,7 @@ Early stopping is a feature to prevent the unnecessary training iterations. By s

 When it comes to custom eval metrics, in additional to ``num_early_stopping_rounds``, you also need to define ``maximize_evaluation_metrics`` or call ``setMaximizeEvaluationMetrics`` to specify whether you want to maximize or minimize the metrics in training. For built-in eval metrics, XGBoost4J-Spark will automatically select the direction.

 For example, we need to maximize the evaluation metrics (set ``maximize_evaluation_metrics`` with true), and set ``num_early_stopping_rounds`` with 5. The evaluation metric of 10th iteration is the maximum one until now. In the following iterations, if there is no evaluation metric greater than the 10th iteration's (best one), the traning would be early stopped at 15th iteration.

 Training with Evaluation Sets
 -----------------------------
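The worked example in the early-stopping hunk above (metric maximized, best value at iteration 10, ``num_early_stopping_rounds`` of 5, stop at iteration 15) can be sketched as a toy simulation of the rule; this is illustrative only, not XGBoost4J-Spark code.

```shell
# Toy early-stopping rule: track the best (maximized) metric and stop once
# `rounds` consecutive iterations pass without improvement.
best=0; best_iter=0; rounds=5; iter=0; stop_iter=0
for m in 1 2 3 4 5 6 7 8 9 10 9 9 8 8 7; do   # metric peaks at iteration 10
    iter=$((iter + 1))
    if [ "$m" -gt "$best" ]; then best=$m; best_iter=$iter; fi
    if [ $((iter - best_iter)) -ge "$rounds" ]; then stop_iter=$iter; break; fi
done
echo "best at iteration $best_iter, stopped at iteration $stop_iter"
```

Running it prints `best at iteration 10, stopped at iteration 15`, matching the paragraph's numbers.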