From e353a2e51cd0269a31bbc1dac4001fa5193d312a Mon Sep 17 00:00:00 2001 From: Ajinkya Kale Date: Fri, 24 Jul 2015 17:00:02 -0700 Subject: [PATCH 1/5] restructuring the README with an index --- README.md | 87 ++++++++++++++++++++++++++++++------------------------- 1 file changed, 47 insertions(+), 40 deletions(-) diff --git a/README.md b/README.md index 4fabb7362..7a4cfa4c8 100644 --- a/README.md +++ b/README.md @@ -1,29 +1,32 @@ -DMLC/XGBoost -================================== +XGBoost +======= [![Build Status](https://travis-ci.org/dmlc/xgboost.svg?branch=master)](https://travis-ci.org/dmlc/xgboost) [![Gitter chat for developers at https://gitter.im/dmlc/xgboost](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/dmlc/xgboost?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) -An optimized general purpose gradient boosting library. The library is parallelized, and also provides an optimized distributed version. +An optimized general-purpose gradient boosting library. The library is parallelized and also provides an optimized distributed version. + It implements machine learning algorithms under the [Gradient Boosting](https://en.wikipedia.org/wiki/Gradient_boosting) framework, including [Generalized Linear Model](https://en.wikipedia.org/wiki/Generalized_linear_model) (GLM) and [Gradient Boosted Decision Trees](https://en.wikipedia.org/wiki/Gradient_boosting#Gradient_tree_boosting) (GBDT). XGBoost can also be [distributed](#features) and scale to Terascale data -Check out our [Committers and Contributors](CONTRIBUTORS.md) who help make xgboost better. - -Documentation: [Documentation of dmlc/xgboost](doc/README.md) - -Issue Tracker: [https://github.com/dmlc/xgboost/issues](https://github.com/dmlc/xgboost/issues?q=is%3Aissue+label%3Aquestion) - -Please join [XGBoost User Group](https://groups.google.com/forum/#!forum/xgboost-user/) to ask questions and share your experience on xgboost.
- - Use issue tracker for bug reports, feature requests etc. - - Use the user group to post your experience, ask questions about general usages. -Distributed Version: [Distributed XGBoost](multi-node) -Highlights of Usecases: [Highlight Links](doc/README.md#highlight-links) - XGBoost is part of [Distributed Machine Learning Common](http://dmlc.github.io/) projects +Contents +-------- +* [What's New](#whats-new) +* [Version](#version) +* [Documentation](doc/README.md) +* [Build Instructions](doc/build.md) +* [Features](#features) +* [Distributed XGBoost](multi-node) +* [Use Cases](doc/README.md#highlight-links) +* [Bug Reporting](#bug-reporting) +* [Contributing to XGBoost](#contributing-to-xgboost) +* [Committers and Contributors](CONTRIBUTORS.md) +* [License](#license) +* [XGBoost in Graphlab Create](#xgboost-in-graphlab-create) + What's New -========== +---------- + * XGBoost helps Chenglong Chen to win [Kaggle CrowdFlower Competition](https://www.kaggle.com/c/crowdflower-search-relevance) - Check out the winning solution at [Highlight links](doc/README.md#highlight-links) * XGBoost-0.4 release, see [CHANGES.md](CHANGES.md#xgboost-04) @@ -31,42 +34,46 @@ What's New - Check out the winning solution at [Highlight links](doc/README.md#highlight-links) * [External Memory Version](doc/external_memory.md) -Contributing to XGBoost -========= -XGBoost has been developed and used by a group of active community members. Everyone is more than welcome to contribute. It is a way to make the project better and more accessible to more users. -* Check out [Feature Wish List](https://github.com/dmlc/xgboost/labels/Wish-List) to see what can be improved, or open an issue if you want something. -* Contribute to the [documents and examples](https://github.com/dmlc/xgboost/blob/master/doc/) to share your experience with other users. -* Please add your name to [CONTRIBUTORS.md](CONTRIBUTORS.md) after your patch has been merged.
+Version +------- + +* Current version xgboost-0.4, a lot improvment has been made since 0.3 + - Change log in [CHANGES.md](CHANGES.md) + - This version is compatible with 0.3x versions Features -======== -* Easily accessible in python, R, Julia, CLI -* Fast speed and memory efficient - - Can be more than 10 times faster than GBM in sklearn and R +-------- + +* Easily accessible through python, R, Julia, CLI +* Fast and memory efficient - Can be more than 10 times faster than GBM in sklearn and R. [benchm-ml numbers](https://github.com/szilard/benchm-ml) - Handles sparse matrices, support external memory * Accurate prediction, and used extensively by data scientists and kagglers - See [highlight links](https://github.com/dmlc/xgboost/blob/master/doc/README.md#highlight-links) * Distributed and Portable - The distributed version runs on Hadoop (YARN), MPI, SGE etc. - Scales to billions of examples and beyond + +Bug Reporting +------------- -Build -======= -* Run ```bash build.sh``` (you can also type make) - - Normally it gives what you want - - See [Build Instruction](doc/build.md) for more information +* For reporting bugs, please use the [xgboost/issues](https://github.com/dmlc/xgboost/issues) page. +* For generic questions or to share your experience using xgboost, please use the [XGBoost User Group](https://groups.google.com/forum/#!forum/xgboost-user/) -Version -======= -* Current version xgboost-0.4, a lot improvment has been made since 0.3 - - Change log in [CHANGES.md](CHANGES.md) - - This version is compatible with 0.3x versions + +Contributing to XGBoost +----------------------- + +XGBoost has been developed and used by a group of active community members. Everyone is more than welcome to contribute. It is a way to make the project better and more accessible to more users. +* Check out [Feature Wish List](https://github.com/dmlc/xgboost/labels/Wish-List) to see what can be improved, or open an issue if you want something.
+* Contribute to the [documents and examples](https://github.com/dmlc/xgboost/blob/master/doc/) to share your experience with other users. +* Please add your name to [CONTRIBUTORS.md](CONTRIBUTORS.md) after your patch has been merged. License -======= +------- © Contributors, 2015. Licensed under an [Apache-2](https://github.com/dmlc/xgboost/blob/master/LICENSE) license. XGBoost in Graphlab Create -========================== +-------------------------- * XGBoost is adopted as part of boosted tree toolkit in Graphlab Create (GLC). Graphlab Create is a powerful python toolkit that allows you to do data manipulation, graph processing, hyper-parameter search, and visualization of TeraBytes scale data in one framework. Try the Graphlab Create in http://graphlab.com/products/create/quick-start-guide.html * Nice blogpost by Jay Gu about using GLC boosted tree to solve kaggle bike sharing challenge: http://blog.graphlab.com/using-gradient-boosted-trees-to-predict-bike-sharing-demand From cbdcbfc49c63c8c0201b429839e8b64c6a81ef52 Mon Sep 17 00:00:00 2001 From: Ajinkya Kale Date: Sat, 25 Jul 2015 12:46:28 -0700 Subject: [PATCH 2/5] some more changes to remove redundant information --- README.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index 7a4cfa4c8..18c5b77c1 100644 --- a/README.md +++ b/README.md @@ -28,17 +28,17 @@ What's New ---------- * XGBoost helps Chenglong Chen to win [Kaggle CrowdFlower Competition](https://www.kaggle.com/c/crowdflower-search-relevance) - - Check out the winning solution at [Highlight links](doc/README.md#highlight-links) + Check out the [winning solution](doc/README.md#highlight-links) * XGBoost-0.4 release, see [CHANGES.md](CHANGES.md#xgboost-04) * XGBoost helps three champion teams to win [WWW2015 Microsoft Malware Classification Challenge (BIG 2015)](http://www.kaggle.com/c/malware-classification/forums/t/13490/say-no-to-overfitting-approaches-sharing) - - Check out the winning 
solution at [Highlight links](doc/README.md#highlight-links) + Check out the [winning solution](doc/README.md#highlight-links) * [External Memory Version](doc/external_memory.md) Version ------- -* Current version xgboost-0.4, a lot improvment has been made since 0.3 - - Change log in [CHANGES.md](CHANGES.md) +* Current version xgboost-0.4 + - [Change log](CHANGES.md) - This version is compatible with 0.3x versions Features @@ -48,8 +48,7 @@ Features * Fast and memory efficient - Can be more than 10 times faster than GBM in sklearn and R. [benchm-ml numbers](https://github.com/szilard/benchm-ml) - Handles sparse matrices, support external memory -* Accurate prediction, and used extensively by data scientists and kagglers - - See [highlight links](https://github.com/dmlc/xgboost/blob/master/doc/README.md#highlight-links) +* Accurate prediction, and used extensively by data scientists and kagglers - [highlight links](https://github.com/dmlc/xgboost/blob/master/doc/README.md#highlight-links) * Distributed and Portable - The distributed version runs on Hadoop (YARN), MPI, SGE etc. - Scales to billions of examples and beyond @@ -75,5 +74,5 @@ License XGBoost in Graphlab Create -------------------------- -* XGBoost is adopted as part of boosted tree toolkit in Graphlab Create (GLC). Graphlab Create is a powerful python toolkit that allows you to do data manipulation, graph processing, hyper-parameter search, and visualization of TeraBytes scale data in one framework. Try the Graphlab Create in http://graphlab.com/products/create/quick-start-guide.html +* XGBoost is adopted as part of boosted tree toolkit in Graphlab Create (GLC). Graphlab Create is a powerful python toolkit that allows you to do data manipulation, graph processing, hyper-parameter search, and visualization of TeraBytes scale data in one framework. 
Try [Graphlab Create](http://graphlab.com/products/create/quick-start-guide.html) * Nice blogpost by Jay Gu about using GLC boosted tree to solve kaggle bike sharing challenge: http://blog.graphlab.com/using-gradient-boosted-trees-to-predict-bike-sharing-demand From 9a936721d84873c5d97fda04f9c96760e82500e5 Mon Sep 17 00:00:00 2001 From: Ajinkya Kale Date: Sun, 26 Jul 2015 20:12:51 -0700 Subject: [PATCH 3/5] dropping raw graphlab url --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 18c5b77c1..0a09c5168 100644 --- a/README.md +++ b/README.md @@ -75,4 +75,4 @@ License XGBoost in Graphlab Create -------------------------- * XGBoost is adopted as part of boosted tree toolkit in Graphlab Create (GLC). Graphlab Create is a powerful python toolkit that allows you to do data manipulation, graph processing, hyper-parameter search, and visualization of TeraBytes scale data in one framework. Try [Graphlab Create](http://graphlab.com/products/create/quick-start-guide.html) -* Nice blogpost by Jay Gu about using GLC boosted tree to solve kaggle bike sharing challenge: http://blog.graphlab.com/using-gradient-boosted-trees-to-predict-bike-sharing-demand +* Nice [blogpost](http://blog.graphlab.com/using-gradient-boosted-trees-to-predict-bike-sharing-demand) by Jay Gu about using GLC boosted trees to solve the Kaggle bike sharing challenge. From f2eb55683cc20f8e7885add55cca50916bb7ad5f Mon Sep 17 00:00:00 2001 From: Ajinkya Kale Date: Sun, 26 Jul 2015 20:30:59 -0700 Subject: [PATCH 4/5] some more links and restructuring --- README.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index 0a09c5168..51ed94633 100644 --- a/README.md +++ b/README.md @@ -44,14 +44,13 @@ Version Features -------- -* Easily accessible through python, R, Julia, CLI -* Fast and memory efficient - - Can be more than 10 times faster than GBM in sklearn and R.
[benchm-ml numbers](https://github.com/szilard/benchm-ml) - - Handles sparse matrices, support external memory +* Easily accessible through CLI, [python](guide-python/basic_walkthrough.py), + [R](../R-package/demo/basic_walkthrough.R), + [Julia](https://github.com/antinucleon/XGBoost.jl/blob/master/demo/basic_walkthrough.jl) +* It's fast! Benchmark numbers comparing xgboost, H2O, Spark, and R - [benchm-ml numbers](https://github.com/szilard/benchm-ml) +* Memory efficient - Handles sparse matrices, supports external memory * Accurate prediction, and used extensively by data scientists and kagglers - [highlight links](https://github.com/dmlc/xgboost/blob/master/doc/README.md#highlight-links) +* Distributed version runs on Hadoop (YARN), MPI, SGE etc., and scales to billions of examples. Bug Reporting ------------- From fc27e2f32d79632261d9a905e3a34f4df42061e7 Mon Sep 17 00:00:00 2001 From: Ajinkya Kale Date: Sun, 26 Jul 2015 20:31:51 -0700 Subject: [PATCH 5/5] adding DMLC back to the title --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 51ed94633..df53f5e46 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -XGBoost +DMLC/XGBoost ======= [![Build Status](https://travis-ci.org/dmlc/xgboost.svg?branch=master)](https://travis-ci.org/dmlc/xgboost) [![Gitter chat for developers at https://gitter.im/dmlc/xgboost](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/dmlc/xgboost?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)