* add qid for https://github.com/dmlc/xgboost/issues/2748
* change names
* change spaces
* change qid to bst_uint type
* change qid type to size_t
* change qid first to SIZE_MAX
* change qid type from size_t to uint64_t
* update dmlc-core
* fix qids name error
* fix group_ptr_ error
* Style fix
* Add qid handling logic to SparsePage
* New MetaInfo format + backward compatibility fix
Old MetaInfo format (1.0) doesn't contain qid field. We still want to be able
to read from MetaInfo files saved in old format. Also, define a new format
(2.0) that contains the qid field. This way, we can distinguish files that
contain qid and those that do not.
* Update MetaInfo test
* Simplify group assignment logic
* Explicitly set qid=nullptr in NativeDataIter
NativeDataIter's callback does not support qid field. Users of NativeDataIter
will need to call setGroup() function separately to set group information.
* Save qids_ in SaveBinary()
* Upgrade dmlc-core submodule
* Add a test for reading qid
* Add contributor
* Check the size of qids_
* Document qid format
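A minimal sketch of the qid handling described above, assuming the LibSVM-style text layout `<label> qid:<id> <index>:<value> ...` and assuming rows that share a qid are contiguous. The helper names `parse_line` and `group_ptr_from_qids` are hypothetical and are not the actual parser in dmlc-core or XGBoost.

```python
def parse_line(line):
    """Parse one LibSVM-style line with an optional qid token."""
    tokens = line.split()
    label = float(tokens[0])
    qid = None
    features = {}
    for tok in tokens[1:]:
        key, value = tok.split(":", 1)
        if key == "qid":
            qid = int(value)
        else:
            features[int(key)] = float(value)
    return label, qid, features


def group_ptr_from_qids(qids):
    """Turn row-wise qids into CSR-style group boundaries.

    Assumes rows with the same qid are stored next to each other.
    """
    if not qids:
        return [0]
    ptr = [0]
    for i in range(1, len(qids)):
        if qids[i] != qids[i - 1]:
            ptr.append(i)  # a new query group starts at row i
    ptr.append(len(qids))
    return ptr


lines = ["1 qid:1 1:0.5 2:1.0", "0 qid:1 1:0.1", "1 qid:2 2:0.7"]
rows = [parse_line(text) for text in lines]
boundaries = group_ptr_from_qids([qid for _, qid, _ in rows])
```

Here `boundaries` marks where each query group starts and ends, which is the same information that a separate group-size call would otherwise have to supply.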
* allow arbitrary cross validation fold indices
- use training indices passed to `folds` parameter in `training.cv`
- update doc string
* add tests for arbitrary fold indices
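One case where explicit training indices matter is a forward-chaining split with a gap, where the training rows are not simply the complement of the test rows. The sketch below builds such `(train, test)` index pairs; `make_gap_folds` is a hypothetical helper written for illustration, not part of XGBoost.

```python
def make_gap_folds(n_rows, n_folds, gap):
    """Build (train_idx, test_idx) pairs for forward-chaining validation.

    Each test block is preceded by `gap` rows that are excluded from
    training, so the training set is NOT the complement of the test set.
    """
    fold_size = n_rows // (n_folds + 1)
    folds = []
    for k in range(1, n_folds + 1):
        start = k * fold_size
        end = start + fold_size
        train_idx = list(range(0, max(0, start - gap)))
        test_idx = list(range(start, end))
        folds.append((train_idx, test_idx))
    return folds


folds = make_gap_folds(n_rows=12, n_folds=3, gap=1)
```

A list like `folds` is the kind of value that could be passed to the `folds` parameter of `cv`, on the assumption that it accepts a list of index pairs as the commit describes.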
* Refactor to allow for custom regularisation methods
* Implement compositional SplitEvaluator framework
* Fixed segfault when no monotone_constraints are supplied.
* Change pid to parentID
* test_monotone_constraints.py now passes
* Refactor ColMaker and DistColMaker to use SplitEvaluator
* Performance optimisation when no monotone_constraints specified
* Fix linter messages
* Fix a few more linter errors
* Update the amalgamation
* Add bounds check
* Add check for leaf node
* Fix linter error in param.h
* Fix clang-tidy errors on CI
* Fix incorrect function name
* Fix clang-tidy error in updater_fast_hist.cc
* Enable SSE2 for Win32 R MinGW
Addresses https://github.com/dmlc/xgboost/pull/3335#issuecomment-400535752
* Add contributor
CI tests were failing because wget prompts "the user" for a response
whenever the google test archive is already on the disk.
Fix: Use `-nc` option to skip download when the archive already
exists
* add back train method but mark as deprecated
* fix scalastyle error
* maven central release
* [jvm-packages] XGBoost Spark integration refactor. (#3313)
* XGBoost Spark integration refactor.
* Make corresponding update for xgboost4j-example
* Address comments.
* [jvm-packages] Refactor XGBoost-Spark params to make it compatible with both XGBoost and Spark MLLib (#3326)
* Refactor XGBoost-Spark params to make it compatible with both XGBoost and Spark MLLib
* Fix extra space.
* [jvm-packages] XGBoost Spark supports ranking with group data. (#3369)
* XGBoost Spark supports ranking with group data.
* Use Iterator.duplicate to prevent OOM.
* Update CheckpointManagerSuite.scala
* Resolve conflicts
* Use sparse page as singular CSR matrix representation
* Simplify dmatrix methods
* Reduce statefulness of batch iterators
* BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.
* GPU binning and compression.
- binning and index compression are done inside the DeviceShard constructor
- in case of a DMatrix with multiple row batches, it is first converted into a single row batch
Currently, `CLIPredict()` saves prediction results in default 6-digit precision which causes precision loss. This PR sets precision to a level so that the conversion back to `bst_float` is lossless.
Related: #3298.
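The precision loss can be reproduced outside XGBoost. The sketch below assumes `bst_float` is a 32-bit IEEE float and uses 9 significant digits as the lossless level, since that is the known round-trip bound for single precision. The commit itself does not state the exact value it chose, and `round_trips` is a hypothetical helper.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def round_trips(value, digits):
    """Check whether printing with `digits` significant digits and
    parsing the text back recovers the same 32-bit float."""
    original = to_float32(value)
    text = "%.*g" % (digits, original)
    return to_float32(float(text)) == original


six_digits_ok = round_trips(0.123456789, 6)    # default-style precision
nine_digits_ok = round_trips(0.123456789, 9)   # single-precision round-trip bound
```

With 6 digits the printed text no longer identifies the original float, while 9 digits is enough for any finite 32-bit value.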
* add back train method but mark as deprecated
* fix scalastyle error
* update 0.80
* Fix print.xgb.Booster
valid_handle should be TRUE when x$handle is NOT null
* Update xgb.Booster.R
Modify is.null.handle to return TRUE for NULL handle
* Add option to use weights when evaluating metrics in validation sets
* Add test for validation-set weights functionality
* simplify case with no weights for test sets
* fix lint issues
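As a rough illustration of what a weighted evaluation metric means here, the sketch below computes a weighted RMSE and falls back to uniform weights when none are given. `weighted_rmse` is a hypothetical helper and does not reproduce the metric code in XGBoost.

```python
import math


def weighted_rmse(preds, labels, weights=None):
    """Root mean squared error where each row contributes by its weight.

    With weights=None every row counts equally, which matches the
    simple unweighted case.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    squared = sum(w * (p - y) ** 2 for p, y, w in zip(preds, labels, weights))
    return math.sqrt(squared / sum(weights))


unweighted = weighted_rmse([1.0, 3.0], [0.0, 0.0])
weighted = weighted_rmse([1.0, 3.0], [0.0, 0.0], weights=[3.0, 1.0])
```

Giving the first row three times the weight pulls the metric toward its smaller error, so `weighted` is lower than `unweighted`.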
* For CRAN submission, remove all #pragma's that suppress compiler warnings
A few headers in dmlc-core contain #pragma's that disable compiler warnings,
which is against the CRAN submission policy. Fix the problem by removing
the offending #pragma's as part of the command `make Rbuild`.
This addresses issue #3322.
* Fix script to improve Cygwin/MSYS compatibility
We need this to pass rmingw CI test
* Remove remove_warning_suppression_pragma.sh from packaged tarball
* add back train method but mark as deprecated
* fix scalastyle error
* static glibc glibc++
* update to build with glibc 2.12
* remove unsupported flags
* update version number
* remove properties
* remove unnecessary command
* update poms
* Update dmlc-core submodule
* Fix dense_parser to work with the latest dmlc-core
* Specify location of Google Test
* Add more source files in dmlc-minimum to get latest dmlc-core working
* Update dmlc-core submodule
* Adjust xgboost entries in .gitignore
They were overly broad. In particular, this was inconvenient when
working with tools such as fzf that use the .gitignore to decide what to
include. As written, such tools would not look into /include/xgboost.
* Make cosmetic improvements to .gitignore
* Remove dmlc-core from .gitignore
This seems unnecessary. It also has a drawback: tools that use
.gitignore to decide which files to skip won't look in dmlc-core, and
being able to inspect the submodule files with them is useful.