2953 Commits

Author SHA1 Message Date
Tong He
f5c85836bf [R] Increase the version number, date and required R version (#1920)
* remove unnecessary line
2016-12-30 21:29:26 -08:00
Qiang Kou (KK)
7948d1c799 disable openmp on solaris (#1912) 2016-12-28 11:32:56 -08:00
adamist521
119763bc49 cross_validation is included in model_selection module since sklearn 0.18 (#1908) 2016-12-26 04:11:56 -05:00
Rory Mitchell
1957e6fb4d Fix cmake build for linux. Update GPU benchmarks. (#1904) 2016-12-23 09:18:56 +01:00
jokari69
fb0fc0c580 option to shuffle data in mknfolds (#1459)
* option to shuffle data in mknfolds

* removed possibility to run as stand alone test

* split function def in 2 lines for lint
2016-12-23 07:53:30 +08:00
Rory Mitchell
b49b339183 GPU Plugin: Add subsample, colsample_bytree, colsample_bylevel (#1895) 2016-12-22 16:30:36 +01:00
wxchan
cee4aafb93 fix dart bug (#1882) 2016-12-19 18:01:28 +01:00
Tong He
fa97259d66 Bump up version number, add cleanup script (#1886)
* fix cran check

* change required R version because of utils::globalVariables

* temporary commit, monotone not working

* fix test

* fix doc

* fix cran note and warning

* improve checks

* fix urls

* fix cran check

* add cleanup and bump up version number

* use clean in build

* Update Makefile
2016-12-18 15:11:43 -08:00
Yixuan Qiu
b14994aeff [R Package] Use the C++ 11 compiler to test OpenMP flags (#1881)
* fix segfault when gctorture() is enabled

* use the C++ 11 compiler to test OpenMP flags

* auto-generated configure script
2016-12-16 15:11:06 -08:00
Qiang Kou (KK)
5ebd8fb809 autoconf for solaris (#1880) 2016-12-16 21:56:10 +01:00
Tong He
674024c53a [R] Fix for cran submission of xgboost 0.6 (#1875)
fix cran check
2016-12-15 12:04:54 -08:00
Rory Mitchell
d943720883 GPU Plugin: Add bosch demo, update build instructions (#1872) 2016-12-15 07:57:27 +01:00
Matthew Drury
edc356f7ec Add monotonic tutorial. (#1870) 2016-12-14 20:17:19 -06:00
Ian
167864da75 python package tree plotting support fmap (#1856)
* to_graphviz and plot_tree support fmap

* [python-package] add model_plot docstring
2016-12-13 07:36:17 -06:00
Liam Huang
49bdb5c97f fix typo in comment. (#1850) 2016-12-11 19:49:04 +01:00
Vadim Khotilovich
b21e658a02 [R-package] JSON dump format and a couple of bugfixes (#1855)
* [R-package] JSON tree dump interface

* [R-package] precision bugfix in xgb.attributes

* [R-package] bugfix for cb.early.stop called from xgb.cv

* [R-package] a bit more clarity on labels checking in xgb.cv

* [R-package] test JSON dump for gblinear as well

* whitespace lint
2016-12-11 19:48:39 +01:00
AbdealiJK
0268dedeea config.mk: Set TEST_COVER to 0 by default (#1853)
Set the TEST_COVER to 0 by default so it uses optimization
-O3 when compiling.
2016-12-11 19:48:15 +01:00
Ruimin Wang
d9584ab82e refactor duplicate evaluation implementation (#1852) 2016-12-08 20:33:40 -08:00
RAMitchell
2b6aa7736f Add benchmarks, fix GCC build (#1848) 2016-12-08 18:59:10 +01:00
Xin Yin
e7fbc8591f [jvm-packages] Scala implementation of the Rabit tracker. (#1612)
* [jvm-packages] Scala implementation of the Rabit tracker.

A Scala implementation of RabitTracker that is interface-interchangeable with the
Java implementation, ported from `tracker.py` in the
[dmlc-core project](https://github.com/dmlc/dmlc-core).

* [jvm-packages] Updated Akka dependency in pom.xml.

* Refactored the RabitTracker directory structure.

* Fixed premature stopping of connection handler.

Added a new finite state "AwaitingPortNumber" to explicitly wait for the
worker to send the port, and close the connection. Stopping the actor
prematurely sends a TCP RST to the worker, causing the worker to crash
on AssertionError.

* Added interface IRabitTracker so that user can switch implementations.

* Default timeout duration changes.

* Dependency for Akka tests.

* Removed the main function of RabitTracker.

* A skeleton for testing Akka-based Rabit tracker.

* waitFor() in RabitTracker no longer throws exceptions.

* Completed unit test for the 'start' command of Rabit tracker.

* Preliminary support for Rabit Allreduce via JNI (no prepare function support yet.)

* Fixed the default timeout duration.

* Use Java container to avoid serialization issues due to intermediate wrappers.

* Added tests for Allreduce/model training using Scala Rabit tracker.

* Added spill-over unit test for the Scala Rabit tracker.

* Fixed a typo.

* Overhaul of RabitTracker interface per code review.

  - Removed methods start() and waitFor() (no arguments) from IRabitTracker.
  - The timeout in start(timeout) is now the worker connection timeout, as a TCP
    socket binding timeout is less intuitive.
  - Dropped time unit from start(...) and waitFor(...) methods; the default
    time unit is millisecond.
  - Moved random port number generation into the RabitTrackerHandler.
  - Moved all Rabit-related classes to package ml.dmlc.xgboost4j.scala.rabit.

* More code refactoring and comments.

* Unified timeout constants. Readable tracker status code.

* Add comments to indicate that allReduce is for tests only. Removed all other variants.

* Removed unused imports.

* Simplified signatures of training methods.

 - Moved TrackerConf into parameter map.
 - Changed GeneralParams so that TrackerConf becomes a standalone parameter.
 - Updated test cases accordingly.

* Changed monitoring strategies.

* Reverted monitoring changes.

* Update test case for Rabit AllReduce.

* Mix in UncaughtExceptionHandler into IRabitTracker to prevent tracker from hanging due to exceptions thrown by workers.

* More comprehensive test cases for exception handling and worker connection timeout.

* Handle executor loss due to unknown cause: the newly spawned executor will attempt to connect to the tracker. Interrupt tracker in such case.

* Per code-review, removed training timeout from TrackerConf. Timeout logic must be implemented explicitly and externally in the driver code.

* Reverted scalastyle-config changes.

* Visibility scope change. Interface tweaks.

* Use match pattern to handle tracker_conf parameter.

* Minor clarification in JNI code.

* Clearer intent in match pattern to suppress warnings.

* Removed Future from constructor. Block in start() and waitFor() instead.

* Revert inadvertent comment changes.

* Removed debugging information.

* Updated test cases that are a bit finicky.

* Added comments on the reasoning behind the unit tests for testing Rabit tracker robustness.
2016-12-07 06:35:42 -08:00
Simon DENEL
7078c41dad Changing omp_get_num_threads to omp_get_max_threads (#1831)
* Updating dmlc-core

* Changing omp_get_num_threads to omp_get_max_threads
2016-12-04 11:26:45 -08:00
AbdealiJK
47ba2de7d4 tests/cpp: Add tests for multiclass_metric.cc 2016-12-04 11:25:57 -08:00
AbdealiJK
a7e20555a3 tests/cpp: Add tests for rank_metrics.cc 2016-12-04 11:25:57 -08:00
AbdealiJK
5912e051b1 rank_metric.cc: Use GetWeight in EvalAMS
GetWeight is a wrapper that returns the correct weight when the
weights vector is not provided, so accessing the weights vector
directly is not recommended.
2016-12-04 11:25:57 -08:00
AbdealiJK
4a2ef130a7 tests/cpp: Add test for elementwise_metric.cc 2016-12-04 11:25:57 -08:00
AbdealiJK
03abd47f49 tests/cpp: Add tests for Metric RMSE 2016-12-04 11:25:57 -08:00
AbdealiJK
582c373274 tests/cpp: Add tests for metric.cc 2016-12-04 11:25:57 -08:00
AbdealiJK
cc859420ba tests/cpp: Add tests for TweedieRegression 2016-12-04 11:25:57 -08:00
AbdealiJK
fa865564f6 tests/cpp: Add tests for GammaRegression 2016-12-04 11:25:57 -08:00
AbdealiJK
401e4b5220 tests/cpp: Add tests for PoissonRegression 2016-12-04 11:25:57 -08:00
AbdealiJK
d41aab4f61 tests/cpp: Add tests for regression_obj.cc
Test the objective functions in regression_obj.cc

tests/cpp: Add tests for objective.cc and RegLossObj
2016-12-04 11:25:57 -08:00
AbdealiJK
fd99d39372 tests/cpp: Add tests for SplitEntry 2016-12-04 11:25:57 -08:00
AbdealiJK
62e3468603 tests/cpp: Add tests for param.h 2016-12-04 11:25:57 -08:00
AbdealiJK
d6407c3746 tests/cpp: Add tests for SparsePageDMatrix
The SparsePageDMatrix, or external-memory DMatrix, reads data through
file I/O rather than loading it into RAM.
2016-12-04 11:25:57 -08:00
AbdealiJK
c3629c91d3 tests/cpp: Add tests for SimpleCSRSource
Test the binary format saved and read by a SimpleDMatrix, which is
internally the SimpleCSRSource.
2016-12-04 11:25:57 -08:00
AbdealiJK
be0f55d563 tests/cpp: Add tests for SimpleDMatrix 2016-12-04 11:25:57 -08:00
AbdealiJK
ef7fe06cf8 tests/cpp/test_metainfo: Add tests to save and load 2016-12-04 11:25:57 -08:00
AbdealiJK
8eb69e0677 travis: Add code coverage on success
Update the code coverage of the project on codecov for easy viewing.

Also, the gcov version on travis cannot find the directory of the
given files, so the directory needs to be specified with the -o flag.
Hence we now loop over the list of files and run gcov on each one
independently.
2016-12-04 11:25:57 -08:00
AbdealiJK
61a9b3a49e travis: Run CPP tests 2016-12-04 11:25:57 -08:00
AbdealiJK
006f9e0760 Makefile: Add CPP code coverage 2016-12-04 11:25:57 -08:00
AbdealiJK
1f2ad36bad Add make commands for tests
This adds the make commands required to build and run tests.
2016-12-04 11:25:57 -08:00
AbdealiJK
b045ccd764 data.cc: Remove redundant ftype variable 2016-12-04 11:25:57 -08:00
JohnStott
1683e07461 Fix issue introduced from correction to log2 (#1837)
https://github.com/dmlc/xgboost/pull/1642
2016-12-04 11:11:56 -08:00
Vadim Khotilovich
a44032d095 [CORE] The update process for a tree model, and its application to feature importance (#1670)
* [CORE] allow updating trees in an existing model

* [CORE] in refresh updater, allow keeping old leaf values and update stats only

* [R-package] xgb.train mod to allow updating trees in an existing model

* [R-package] added check for nrounds when is_update

* [CORE] merge parameter declaration changes; unify their code style

* [CORE] move the update-process trees initialization to Configure; rename default process_type to 'default'; fix the trees and trees_to_update sizes comparison check

* [R-package] unit tests for the update process type

* [DOC] documentation for process_type parameter; improved docs for updater, Gamma and Tweedie; added some parameter aliases; fixed metrics indentation and documented some metrics that were undocumented

* fix my sloppy merge conflict resolutions

* [CORE] add a TreeProcessType enum

* whitespace fix
2016-12-04 09:33:52 -08:00
Nat Wilson
4398fbbe4a fix typo on documentation page (#1836)
replaces "Lanuages" -> "Languages"
2016-12-03 14:41:30 -08:00
Tong He
2f3958a455 Fix for CRAN Submission (#1826)
* fix cran check

* change required R version because of utils::globalVariables

* temporary commit, monotone not working

* fix test

* fix doc

* fix cran note and warning

* improve checks

* fix urls
2016-12-02 20:19:03 -08:00
xgdgsc
27ca50e2c2 change contribution link to open issues (#1834) 2016-12-02 11:03:03 -08:00
ccphillippi
dd477ac903 Move feature_importances_ to base XGBModel for XGBRegressor access (#1591) 2016-12-01 10:17:37 -08:00
AbdealiJK
6f16f0ef58 Use bst_float consistently throughout (#1824)
* Fix various typos

* Add override to functions that are overridden

gcc gives warnings about functions that override a base-class method
without being marked as overridden. This fixes it.

* Use bst_float consistently

Use bst_float for all the variables that involve weight,
leaf value, gradient, hessian, gain, loss_chg, predictions,
base_margin, feature values.

In some cases, where additions and similar operations can make the
value grow larger, double is used.

This ensures that type conversions are minimal and reduces loss of
precision.
2016-11-30 10:02:10 -08:00
Dr. Kashif Rasul
da2556f58a fixed some typos (#1814) 2016-11-25 16:34:57 -05:00