spelling and grammar changes
parent 16781ac8f9
commit 030a4e4e25

doc/faq.md · 28 changed lines
@@ -1,6 +1,6 @@
-Frequent Asked Questions
+Frequently Asked Questions
 ========================
-This document contains the frequent asked question to xgboost.
+This document contains frequently asked questions about xgboost.
 
 How to tune parameters
 ----------------------
@@ -13,7 +13,7 @@ See [Introduction to Boosted Trees](model.md)
 
 I have a big dataset
 --------------------
-XGBoost is designed to be memory efficient. Usually it could handle problems as long as the data fit into your memory
+XGBoost is designed to be memory efficient. Usually it can handle problems as long as the data fit into your memory
 (This usually means millions of instances).
 If you are running out of memory, checkout [external memory version](external_memory.md) or
 [distributed version](https://github.com/dmlc/wormhole/tree/master/learn/xgboost) of xgboost.
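As a quick illustration of the external-memory answer touched by this hunk, here is a minimal Python sketch. The `train.libsvm` file name, the cache name, and the parameter values are placeholders; the `#cache` suffix follows the usage described in the external_memory.md page linked above.

```python
# Minimal sketch: train from a libsvm file without holding it all in RAM.
# Appending "#dtrain.cache" asks xgboost to spill data to an on-disk cache.
import xgboost as xgb

dtrain = xgb.DMatrix("train.libsvm#dtrain.cache")  # placeholder file and cache names
params = {"objective": "binary:logistic", "max_depth": 4, "eta": 0.1}
bst = xgb.train(params, dtrain, num_boost_round=100)
```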
@@ -23,30 +23,30 @@ Running xgboost on Platform X (Hadoop/Yarn, Mesos)
 --------------------------------------------------
 The distributed version of XGBoost is designed to be portable to various environment.
 Distributed XGBoost can be ported to any platform that supports [rabit](https://github.com/dmlc/rabit).
-You can directly run xgboost on Yarn. In theory Mesos and other resource allocation engine can be easily supported as well.
+You can directly run xgboost on Yarn. In theory Mesos and other resource allocation engines can be easily supported as well.
 
 
 Why not implement distributed xgboost on top of X (Spark, Hadoop)
 -----------------------------------------------------------------
 The first fact we need to know is going distributed does not necessarily solve all the problems.
-Instead, it creates more problems such as more communication over head and fault tolerance.
-The ultimate question will still come back into how to push the limit of each computation node
+Instead, it creates more problems such as more communication overhead and fault tolerance.
+The ultimate question will still come back to how to push the limit of each computation node
 and use less resources to complete the task (thus with less communication and chance of failure).
 
 To achieve these, we decide to reuse the optimizations in the single node xgboost and build distributed version on top of it.
-The demand of communication in machine learning is rather simple, in a sense that we can depend on a limited set of API (in our case rabit).
-Such design allows us to reuse most of the code, and being portable to major platforms such as Hadoop/Yarn, MPI, SGE.
-Most importantly, pushs the limit of the computation resources we can use.
+The demand of communication in machine learning is rather simple, in the sense that we can depend on a limited set of API (in our case rabit).
+Such design allows us to reuse most of the code, while being portable to major platforms such as Hadoop/Yarn, MPI, SGE.
+Most importantly, it pushes the limit of the computation resources we can use.
 
 
 How can I port the model to my own system
 -----------------------------------------
-The model and data format of XGBoost is exchangable.
-Which means the model trained by one langauge can be loaded in another.
+The model and data format of XGBoost is exchangable,
+which means the model trained by one language can be loaded in another.
 This means you can train the model using R, while running prediction using
-Java or C++, which are more common in production system.
-You can also train the model using distributed version,
-and load them in from python to do some interactive analysis.
+Java or C++, which are more common in production systems.
+You can also train the model using distributed versions,
+and load them in from Python to do some interactive analysis.
 
 
 Do you support LambdaMART
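To make the model-portability answer in this hunk concrete, here is a minimal Python sketch of the round trip. The toy data and the `xgb.model` file name are placeholders; the same binary model file could just as well have been produced by the R package, the CLI, or a distributed job and then consumed from Java or C++.

```python
import numpy as np
import xgboost as xgb

# Stand-in for a model trained elsewhere (R, CLI, a distributed job, ...).
X = np.random.rand(100, 5)
y = (X[:, 0] > 0.5).astype(int)
dtrain = xgb.DMatrix(X, label=y)
bst = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=10)

# Save to xgboost's own binary model format; the file is language-agnostic.
bst.save_model("xgb.model")

# Load it back (e.g. in an interactive Python session) and predict.
loaded = xgb.Booster(model_file="xgb.model")
preds = loaded.predict(dtrain)
```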