diff --git a/CMakeLists.txt b/CMakeLists.txt
index 2ffff89f5..ceb44a597 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -37,6 +37,12 @@ option(RABIT_MOCK "Build rabit with mock" OFF)
 option(USE_CUDA "Build with GPU acceleration" OFF)
 option(USE_NCCL "Build with NCCL to enable distributed GPU support." OFF)
 option(BUILD_WITH_SHARED_NCCL "Build with shared NCCL library." OFF)
+
+## Copied From dmlc
+option(USE_HDFS "Build with HDFS support" OFF)
+option(USE_AZURE "Build with AZURE support" OFF)
+option(USE_S3 "Build with S3 support" OFF)
+
 set(GPU_COMPUTE_VER "" CACHE STRING
   "Semicolon separated list of compute versions to be built against, e.g. '35;61'")
 if (BUILD_WITH_SHARED_NCCL AND (NOT USE_NCCL))
diff --git a/doc/build.rst b/doc/build.rst
index 64f76b73b..8f66b100f 100644
--- a/doc/build.rst
+++ b/doc/build.rst
@@ -72,8 +72,10 @@ Our goal is to build the shared library:
 
 The minimal building requirement is
 
-- A recent C++ compiler supporting C++11 (g++-4.8 or higher)
-- CMake 3.2 or higher
+- A recent C++ compiler supporting C++11 (g++-5.0 or higher)
+- CMake 3.3 or higher (3.12 for building with CUDA)
+
+For a list of CMake options, see ``#-- Options`` in CMakeLists.txt at the top of the source tree.
 
 Building on Ubuntu/Debian
 =========================
diff --git a/doc/jvm/index.rst b/doc/jvm/index.rst
index 9b2415b79..436311d25 100644
--- a/doc/jvm/index.rst
+++ b/doc/jvm/index.rst
@@ -20,7 +20,7 @@ Installation
 
 Installation from source
 ========================
 
-Building XGBoost4J using Maven requires Maven 3 or newer, Java 7+ and CMake 3.2+ for compiling the JNI bindings.
+Building XGBoost4J using Maven requires Maven 3 or newer, Java 7+ and CMake 3.3+ for compiling the JNI bindings.
 
 Before you install XGBoost4J, you need to define environment variable ``JAVA_HOME`` as your JDK directory to ensure that your compiler can find ``jni.h`` correctly, since XGBoost4J relies on JNI to implement the interaction between the JVM and native libraries.
diff --git a/doc/jvm/xgboost4j_spark_tutorial.rst b/doc/jvm/xgboost4j_spark_tutorial.rst
index 9e4129257..72767f128 100644
--- a/doc/jvm/xgboost4j_spark_tutorial.rst
+++ b/doc/jvm/xgboost4j_spark_tutorial.rst
@@ -158,7 +158,7 @@ Dealing with missing values
 
 Strategies to handle missing values (and therefore overcome issues as above):
 
-In the case that a feature column contains missing values for any reason (could be related to business logic / wrong data ingestion process / etc.), the user should decide on a strategy of how to handle it.
-The choice of approach depends on the value representing 'missing' which fall into four different categories:
+In the case that a feature column contains missing values for any reason (could be related to business logic / wrong data ingestion process / etc.), the user should decide on a strategy for how to handle it.
+The choice of approach depends on the value representing 'missing', which falls into four different categories:
 
 1. 0
@@ -171,7 +171,27 @@ We introduce the following approaches dealing with missing value and their fitti
 1. Skip VectorAssembler (using setHandleInvalid = "skip") directly. Used in (2), (3).
 2. Keep it (using setHandleInvalid = "keep"), and set the "missing" parameter in XGBClassifier/XGBRegressor as the value representing missing. Used in (2) and (4).
 3. Keep it (using setHandleInvalid = "keep") and transform to other irregular values. Used in (3).
-4. Nothing to be done, used in (1).
+4. Nothing to be done, used in (1).
 
-Then, XGBoost will automatically learn what's the ideal direction to go when a value is missing, based on that value and strategy.
+Then, XGBoost will automatically learn the ideal direction to go when a value is missing, based on that value and the chosen strategy.
 
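+As a purely illustrative sketch of approach 2 (the column names and the sentinel value ``-999`` are hypothetical, not taken from this tutorial):
+
+.. code-block:: scala
+
+  import org.apache.spark.ml.feature.VectorAssembler
+  import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier
+
+  // Keep invalid entries instead of dropping the whole row, then tell
+  // XGBoost which value stands for "missing".
+  val assembler = new VectorAssembler()
+    .setInputCols(Array("f1", "f2", "f3"))  // hypothetical feature columns
+    .setOutputCol("features")
+    .setHandleInvalid("keep")
+
+  val booster = new XGBoostClassifier(
+      Map("objective" -> "binary:logistic", "num_round" -> 100))
+    .setFeaturesCol("features")
+    .setLabelCol("label")
+    .setMissing(-999.0f)  // hypothetical value representing 'missing'
+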
@@ -241,7 +241,19 @@ Early stopping is a feature to prevent the unnecessary training iterations. By s
 
-When it comes to custom eval metrics, in additional to ``num_early_stopping_rounds``, you also need to define ``maximize_evaluation_metrics`` or call ``setMaximizeEvaluationMetrics`` to specify whether you want to maximize or minimize the metrics in training. For built-in eval metrics, XGBoost4J-Spark will automatically select the direction.
+When it comes to custom eval metrics, in addition to ``num_early_stopping_rounds``, you also need to define ``maximize_evaluation_metrics`` or call ``setMaximizeEvaluationMetrics`` to specify whether you want to maximize or minimize the metrics in training. For built-in eval metrics, XGBoost4J-Spark will automatically select the direction.
 
-For example, we need to maximize the evaluation metrics (set ``maximize_evaluation_metrics`` with true), and set ``num_early_stopping_rounds`` with 5. The evaluation metric of 10th iteration is the maximum one until now. In the following iterations, if there is no evaluation metric greater than the 10th iteration's (best one), the traning would be early stopped at 15th iteration.
+For example, suppose we need to maximize the evaluation metrics (set ``maximize_evaluation_metrics`` to true) and set ``num_early_stopping_rounds`` to 5, and the evaluation metric of the 10th iteration is the maximum one so far. In the following iterations, if there is no evaluation metric greater than the 10th iteration's (the best one), training would be stopped early at the 15th iteration.
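+
+As an illustrative sketch (the metric and the parameter values here are hypothetical):
+
+.. code-block:: scala
+
+  import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier
+
+  val booster = new XGBoostClassifier(
+      Map("objective" -> "binary:logistic", "num_round" -> 100))
+    .setEvalMetric("auc")                // a metric we want to maximize
+    .setMaximizeEvaluationMetrics(true)  // maximize, as in the example above
+    .setNumEarlyStoppingRounds(5)        // stop after 5 rounds without improvement
 
 Training with Evaluation Sets
 -----------------------------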