code simplification

El Potaeto 2015-03-12 23:44:08 +01:00
parent 09091884be
commit 93a019d174

@@ -153,7 +153,7 @@ head(sparse_matrix)
Create the output `numeric` vector (not as a sparse `Matrix`):
```{r}
-output_vector = df[,Y:=0][Improved == "Marked",Y:=1][,Y]
+output_vector = df[,Improved] == "Marked"
```
1. set `Y` vector to `0`;
@@ -261,21 +261,21 @@ Let's check some **Chi2** between each of these features and the label.
Higher **Chi2** means better correlation.
```{r, warning=FALSE, message=FALSE}
-c2 <- chisq.test(df$Age, df$Y)
+c2 <- chisq.test(df$Age, output_vector)
print(c2)
```
Pearson correlation between Age and illness disappearing is **`r round(c2$statistic, 2 )`**.
```{r, warning=FALSE, message=FALSE}
-c2 <- chisq.test(df$AgeDiscret, df$Y)
+c2 <- chisq.test(df$AgeDiscret, output_vector)
print(c2)
```
Our first simplification of Age gives a Pearson correlation of **`r round(c2$statistic, 2)`**.
```{r, warning=FALSE, message=FALSE}
-c2 <- chisq.test(df$AgeCat, df$Y)
+c2 <- chisq.test(df$AgeCat, output_vector)
print(c2)
```
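
Review note on the `output_vector` change: the sketch below is not part of the commit. It is a rough check, on a hypothetical toy table (`toy` and its values are made up; the real `df` is built earlier in the vignette), that the two forms of the expression encode the same labels. It assumes the `data.table` package is installed.

```r
library(data.table)

# Hypothetical toy data standing in for the vignette's `df`.
toy <- data.table(Improved = c("None", "Marked", "Some", "Marked"))

# Old form: add a 0/1 column `Y` by reference, then extract it.
# copy() keeps the := side effect away from `toy` itself.
old_vector <- copy(toy)[, Y := 0][Improved == "Marked", Y := 1][, Y]

# New form: compare directly. The result is a logical vector,
# and no `Y` column is added to the table.
new_vector <- toy[, Improved] == "Marked"

# Same labels once the logical vector is coerced to numeric.
stopifnot(identical(old_vector, as.numeric(new_vector)))
stopifnot(!"Y" %in% names(toy))
```

One visible difference is the type: the old expression returns a numeric 0/1 vector, the new one a logical vector. Because `df` no longer gains a `Y` column, the `chisq.test` calls have to take `output_vector` instead of `df$Y`, which is what the second hunk does.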