* Add objective and metric.
* Some refactoring for CPU/GPU dispatching using linalg module.
* Add `Tensor` class.
* Add elementwise kernel for CPU and GPU.
* Add unravel index.
* Move some computation to compile time.
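
For reviewers unfamiliar with the term, "unravel index" usually means converting a flat offset into per-dimension indices. The following is a minimal sketch of that idea, not the code in this change: the name `UnravelIndex`, its signature, the row-major layout, and the C++17 requirement are all assumptions made for illustration. It also assumes every dimension in the shape is non-zero.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not the project's actual API: converts a flat
// row-major offset into per-dimension indices for the given shape.
// Assumes every entry of `shape` is non-zero.
template <std::size_t D>
constexpr std::array<std::size_t, D> UnravelIndex(
    std::size_t flat, std::array<std::size_t, D> const& shape) {
  std::array<std::size_t, D> idx{};
  // Walk from the last (fastest-varying) dimension to the first,
  // peeling off one index per dimension.
  for (std::size_t d = D; d > 0; --d) {
    idx[d - 1] = flat % shape[d - 1];
    flat /= shape[d - 1];
  }
  return idx;
}

// Because the function is constexpr, the result can be computed and
// checked at compile time (needs C++17 for constexpr std::array access).
constexpr std::array<std::size_t, 3> kShape{2, 3, 4};
constexpr auto kIdx = UnravelIndex(17, kShape);
static_assert(kIdx[0] == 1 && kIdx[1] == 1 && kIdx[2] == 1,
              "17 == 1*12 + 1*4 + 1 for shape (2, 3, 4)");
```

An elementwise kernel can use a helper of this kind to map each flat thread or loop index back to tensor coordinates, so the same indexing logic serves both the CPU and GPU paths.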