# 19 Correlation Analysis

We work with the famous dataset that Pearson collected on the heights of fathers and their sons. Since the data consist of paired numerical measurements, a scatterplot is an appropriate display.

```
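library(UsingR)  # provides the father.son data frame
# Son's height against father's height; asp = 1 puts both axes on the same scale
plot(father.son, pch = 16, xlab = "father's height", ylab = "son's height", asp = 1)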
```

## 19.1 Sample correlations

We compute three types of correlation between the son and father heights: Pearson's, Spearman's, and Kendall's, in that order.
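The calls that produced the values below are not shown. A minimal sketch that should reproduce them, assuming the `father.son` data frame from `UsingR` stores the heights in columns named `fheight` and `sheight` (the names that appear in the test output of the next section):

```r
library(UsingR)  # provides the father.son data frame

# Assumed column names: fheight (father's height), sheight (son's height)
fheight <- father.son$fheight
sheight <- father.son$sheight

# cor() selects the coefficient through its method argument
r_pearson  <- cor(sheight, fheight, method = "pearson")
r_spearman <- cor(sheight, fheight, method = "spearman")
r_kendall  <- cor(sheight, fheight, method = "kendall")

print(r_pearson)
print(r_spearman)
print(r_kendall)
```

All three coefficients are symmetric in their arguments, so swapping `sheight` and `fheight` gives the same values.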

`[1] 0.5013383`

`[1] 0.5058485`

`[1] 0.3492753`
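To see what distinguishes the three coefficients, here is a small sketch on made-up data (hypothetical values, unrelated to the father and son heights). Pearson's correlation measures linear association, whereas Spearman's and Kendall's depend only on the ranks and therefore measure monotone association.

```r
# Toy data (hypothetical): y is a strictly increasing but nonlinear function of x
x <- c(1, 2, 3, 4, 5)
y <- exp(x)

pearson  <- cor(x, y)                       # below 1: the relation is not linear
spearman <- cor(x, y, method = "spearman")  # equals 1: the ranks agree perfectly
kendall  <- cor(x, y, method = "kendall")   # equals 1: every pair is concordant

# Spearman's rho is Pearson's correlation applied to the ranks
spearman_by_hand <- cor(rank(x), rank(y))

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```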

## 19.2 Correlation tests

Although the scatterplot makes it fairly clear that the father and son heights are positively correlated (or, more generally, monotonically associated), we perform the corresponding tests for pedagogical reasons. (See the manual for details on how the p-values are computed. In the present case, the first and third are approximations based on asymptotic theory, while the second would normally rest on exact calculations, but its p-value is only approximated because of ties.)
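The outputs below have the format returned by `cor.test`. The original calls are not shown and may have differed in detail (for instance by attaching the data frame), but a sketch along these lines should generate equivalent results, under the same assumption about the column names `fheight` and `sheight`:

```r
library(UsingR)  # provides the father.son data frame

# Assumed column names: fheight (father's height), sheight (son's height)
fheight <- father.son$fheight
sheight <- father.son$sheight

# Pearson: t statistic with n - 2 degrees of freedom
test_pearson <- cor.test(sheight, fheight, method = "pearson")

# Spearman: the text reports ties, so an exact p-value is unavailable;
# exact = FALSE requests the approximation directly and avoids a warning
test_spearman <- cor.test(sheight, fheight, method = "spearman", exact = FALSE)

# Kendall: normal approximation to the distribution of the statistic
test_kendall <- cor.test(sheight, fheight, method = "kendall")

print(test_pearson)
print(test_spearman)
print(test_kendall)
```

Each call returns an object of class `htest`, so the estimate and p-value can also be extracted directly, for example with `test_pearson$estimate` and `test_pearson$p.value`.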

```
Pearson's product-moment correlation
data: sheight and fheight
t = 19.006, df = 1076, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.4552586 0.5447396
sample estimates:
cor
0.5013383
```

```
Spearman's rank correlation rho
data: sheight and fheight
S = 103172697, p-value < 2.2e-16
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.5058485
```

```
Kendall's rank correlation tau
data: sheight and fheight
z = 17.174, p-value < 2.2e-16
alternative hypothesis: true tau is not equal to 0
sample estimates:
tau
0.3492753
```

## 19.3 Distance covariance (and test)

We also apply the distance covariance test. (The function returns a Monte Carlo permutation p-value based on `R` replicates; here `R` = 1000.)
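The output below matches the format of `dcov.test` from the `energy` package. Assuming that is the function used, a call along these lines would run the test with 1000 replicates. The seed is a hypothetical addition for reproducibility, and the column names `fheight` and `sheight` are assumed as before.

```r
library(UsingR)  # provides the father.son data frame
library(energy)  # provides dcov.test (assumed to be the function used)

set.seed(1)  # hypothetical seed; the permutation p-value is random otherwise

# index = 1 is the default exponent on the Euclidean distances;
# R is the number of permutation replicates
dcov_result <- dcov.test(father.son$fheight, father.son$sheight,
                         index = 1, R = 1000)
print(dcov_result)
```

Because the p-value is estimated from random permutations, it can vary from run to run unless a seed is fixed. The reported value 0.000999 is approximately 1/1001, which appears to be the smallest p-value attainable with 1000 replicates.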

```
dCov independence test (permutation test)
data: index 1, replicates 1000
nV^2 = 742.04, p-value = 0.000999
sample estimates:
dCov
0.8296702
```