* [jvm-packages] Scala implementation of the Rabit tracker. A Scala implementation of RabitTracker that is interface-interchangable with the Java implementation, ported from `tracker.py` in the [dmlc-core project](https://github.com/dmlc/dmlc-core). * [jvm-packages] Updated Akka dependency in pom.xml. * Refactored the RabitTracker directory structure. * Fixed premature stopping of connection handler. Added a new finite state "AwaitingPortNumber" to explicitly wait for the worker to send the port, and close the connection. Stopping the actor prematurely sends a TCP RST to the worker, causing the worker to crash on AssertionError. * Added interface IRabitTracker so that user can switch implementations. * Default timeout duration changes. * Dependency for Akka tests. * Removed the main function of RabitTracker. * A skeleton for testing Akka-based Rabit tracker. * waitFor() in RabitTracker no longer throws exceptions. * Completed unit test for the 'start' command of Rabit tracker. * Preliminary support for Rabit Allreduce via JNI (no prepare function support yet.) * Fixed the default timeout duration. * Use Java container to avoid serialization issues due to intermediate wrappers. * Added tests for Allreduce/model training using Scala Rabit tracker. * Added spill-over unit test for the Scala Rabit tracker. * Fixed a typo. * Overhaul of RabitTracker interface per code review. - Removed methods start() waitFor() (no arguments) from IRabitTracker. - The timeout in start(timeout) is now worker connection timeout, as tcp socket binding timeout is less intuitive. - Dropped time unit from start(...) and waitFor(...) methods; the default time unit is millisecond. - Moved random port number generation into the RabitTrackerHandler. - Moved all Rabit-related classes to package ml.dmlc.xgboost4j.scala.rabit. * More code refactoring and comments. * Unified timeout constants. Readable tracker status code. * Add comments to indicate that allReduce is for tests only. Removed all other variants. * Removed unused imports. * Simplified signatures of training methods. - Moved TrackerConf into parameter map. - Changed GeneralParams so that TrackerConf becomes a standalone parameter. - Updated test cases accordingly. * Changed monitoring strategies. * Reverted monitoring changes. * Update test case for Rabit AllReduce. * Mix in UncaughtExceptionHandler into IRabitTracker to prevent tracker from hanging due to exceptions thrown by workers. * More comprehensive test cases for exception handling and worker connection timeout. * Handle executor loss due to unknown cause: the newly spawned executor will attempt to connect to the tracker. Interrupt tracker in such case. * Per code-review, removed training timeout from TrackerConf. Timeout logic must be implemented explicitly and externally in the driver code. * Reverted scalastyle-config changes. * Visibility scope change. Interface tweaks. * Use match pattern to handle tracker_conf parameter. * Minor clarification in JNI code. * Clearer intent in match pattern to suppress warnings. * Removed Future from constructor. Block in start() and waitFor() instead. * Revert inadvertent comment changes. * Removed debugging information. * Updated test cases that are a bit finicky. * Added comments on the reasoning behind the unit tests for testing Rabit tracker robustness.
125 lines
4.9 KiB
XML
125 lines
4.9 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<project xmlns="http://maven.apache.org/POM/4.0.0"
|
|
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
|
|
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
|
|
<modelVersion>4.0.0</modelVersion>
|
|
<parent>
|
|
<groupId>ml.dmlc</groupId>
|
|
<artifactId>xgboost-jvm</artifactId>
|
|
<version>0.7</version>
|
|
</parent>
|
|
<artifactId>xgboost4j</artifactId>
|
|
<version>0.7</version>
|
|
<packaging>jar</packaging>
|
|
<profiles>
|
|
<profile>
|
|
<id>NotWindows</id>
|
|
<activation>
|
|
<os>
|
|
<family>!windows</family>
|
|
</os>
|
|
</activation>
|
|
<build>
|
|
<plugins>
|
|
<plugin>
|
|
<groupId>org.apache.maven.plugins</groupId>
|
|
<artifactId>maven-javadoc-plugin</artifactId>
|
|
<version>2.10.3</version>
|
|
<configuration>
|
|
<show>protected</show>
|
|
<nohelp>true</nohelp>
|
|
</configuration>
|
|
</plugin>
|
|
<plugin>
|
|
<groupId>org.apache.maven.plugins</groupId>
|
|
<artifactId>maven-assembly-plugin</artifactId>
|
|
<configuration>
|
|
<skipAssembly>false</skipAssembly>
|
|
</configuration>
|
|
</plugin>
|
|
<plugin>
|
|
<artifactId>exec-maven-plugin</artifactId>
|
|
<groupId>org.codehaus.mojo</groupId>
|
|
<executions>
|
|
<execution><!-- Run our version calculation script -->
|
|
<id>native</id>
|
|
<phase>generate-sources</phase>
|
|
<goals>
|
|
<goal>exec</goal>
|
|
</goals>
|
|
<configuration>
|
|
<executable>create_jni.sh</executable>
|
|
</configuration>
|
|
</execution>
|
|
</executions>
|
|
</plugin>
|
|
</plugins>
|
|
</build>
|
|
</profile>
|
|
<profile>
|
|
<id>Windows</id>
|
|
<activation>
|
|
<os>
|
|
<family>windows</family>
|
|
</os>
|
|
</activation>
|
|
<build>
|
|
<plugins>
|
|
<plugin>
|
|
<groupId>org.apache.maven.plugins</groupId>
|
|
<artifactId>maven-javadoc-plugin</artifactId>
|
|
<version>2.10.3</version>
|
|
<configuration>
|
|
<show>protected</show>
|
|
<nohelp>true</nohelp>
|
|
</configuration>
|
|
</plugin>
|
|
<plugin>
|
|
<groupId>org.apache.maven.plugins</groupId>
|
|
<artifactId>maven-assembly-plugin</artifactId>
|
|
<configuration>
|
|
<skipAssembly>false</skipAssembly>
|
|
</configuration>
|
|
</plugin>
|
|
<plugin>
|
|
<artifactId>exec-maven-plugin</artifactId>
|
|
<groupId>org.codehaus.mojo</groupId>
|
|
<executions>
|
|
<execution><!-- Run our version calculation script -->
|
|
<id>native</id>
|
|
<phase>generate-sources</phase>
|
|
<goals>
|
|
<goal>exec</goal>
|
|
</goals>
|
|
<configuration>
|
|
<executable>create_jni.bat</executable>
|
|
</configuration>
|
|
</execution>
|
|
</executions>
|
|
</plugin>
|
|
</plugins>
|
|
</build>
|
|
</profile>
|
|
</profiles>
|
|
<dependencies>
|
|
<dependency>
|
|
<groupId>junit</groupId>
|
|
<artifactId>junit</artifactId>
|
|
<version>4.11</version>
|
|
<scope>test</scope>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>com.typesafe.akka</groupId>
|
|
<artifactId>akka-actor_${scala.binary.version}</artifactId>
|
|
<version>2.3.11</version>
|
|
<scope>compile</scope>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>com.typesafe.akka</groupId>
|
|
<artifactId>akka-testkit_${scala.binary.version}</artifactId>
|
|
<version>2.3.11</version>
|
|
<scope>test</scope>
|
|
</dependency>
|
|
</dependencies>
|
|
</project>
|