Classifying Variables And Probability Calculations
Answer 2.a)
Variable | Classification |
License | Nominal variable (Licensed, Learners permit and Not licensed) |
Sex | Nominal variable (Male and Female) |
Activities | Numerical/quantitative variable |
Transport | Nominal variable (Passenger, Driver and Other) |
MVPA | Numerical/quantitative variable |
logMVPA | Numerical/quantitative variable |
sed | Numerical/quantitative variable |
yourID | Nominal variable |
Answer 2.a)
Total percentages:
Sex / Transport | Drivers | Passengers | Other | Total |
Female | 6.6 | 26.2 | 14.4 | 47.2 |
Male | 16.6 | 24.7 | 11.4 | 52.8 |
Total | 23.2 | 50.9 | 25.8 | 100.0 |
Pearson’s Chi-square test (contribution of each cell to the χ2 statistic):
Sex / Transport | Driver | Passenger | Other |
Female | 4.64 | 0.52 | 1.07 |
Male | 4.16 | 0.47 | 0.95 |
Chi-square test:
χ2 statistic | 11.808 |
Degrees of freedom | 2 |
p-value | 0.002729 |
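As a quick check on the reported values, the sketch below sums the six cell values listed under Pearson’s Chi-square test and recomputes the p-value from the statistic. It relies on the fact that, for 2 degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form exp(-x/2). The variable and function names are illustrative and are not part of the original analysis.

```python
from math import exp

# Cell values from the Pearson's Chi-square table (rows: sex, columns: transport).
cell_values = {
    ("Female", "Driver"): 4.64,
    ("Female", "Passenger"): 0.52,
    ("Female", "Other"): 1.07,
    ("Male", "Driver"): 4.16,
    ("Male", "Passenger"): 0.47,
    ("Male", "Other"): 0.95,
}

# Summing the cells should reproduce the reported statistic up to rounding.
chi2_from_cells = sum(cell_values.values())


def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return exp(-x / 2)


# Reported statistic from the Chi-square test table.
p_value = chi2_upper_tail_df2(11.808)

print(f"sum of cell values = {chi2_from_cells:.2f}")
print(f"p-value            = {p_value:.6f}")
```

The closed form only holds for 2 degrees of freedom, which is the case here because the table has 2 rows and 3 columns, giving (2 - 1) × (3 - 1) = 2.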
Answer 2.b)
The hypotheses are:
Null hypothesis (H0): Sex and transport are independent of each other.
Alternative hypothesis (HA): Sex and transport are associated with each other.
The p-value of the test (0.002729) is less than 0.05. Therefore, the null hypothesis of independence of these two variables is rejected at the 5% level of significance, and the data support the alternative hypothesis that the variables are associated. It can be interpreted that gender (sex) and mode of transportation (transport) are related to each other.
Answer 3.b)
Statistics:
License types | IQR | Skewness | Kurtosis |
Not licensed | 3 | 0.0543 | -0.7815 |
Learners permit | 3 | -0.0268 | -0.8808 |
Licensed | 2 | -0.0928 | -0.5112 |
Shapiro-Wilk normality test:
License types | Statistic (W) | p-value | Significance |
Not licensed | 0.96166 | 0.02912 | Yes |
Learners permit | 0.95517 | 0.006442 | Yes |
Licensed | 0.95826 | 0.0009744 | Yes |
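The "Significance" column can be reproduced by comparing each Shapiro-Wilk p-value with a significance level. The sketch below assumes the conventional 5% level (the level is an assumption here, since the table does not state it) and uses illustrative names.

```python
ALPHA = 0.05  # assumed significance level

# p-values copied from the Shapiro-Wilk table.
shapiro_p_values = {
    "Not licensed": 0.02912,
    "Learners permit": 0.006442,
    "Licensed": 0.0009744,
}

# A p-value below ALPHA means normality is rejected for that group.
rejects_normality = {
    group: p < ALPHA for group, p in shapiro_p_values.items()
}

# The group with the largest p-value shows the weakest evidence against normality.
largest_p_group = max(shapiro_p_values, key=shapiro_p_values.get)

for group, rejected in rejects_normality.items():
    print(f"{group}: p = {shapiro_p_values[group]}, significant = {rejected}")
print("Largest p-value:", largest_p_group)
```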
Answer 3. c)
The skewness values of all three groups are close to zero. The distribution of activities for the licensed group has the largest skewness in magnitude (-0.0928), which indicates a slight left skew, whereas the not licensed group is slightly right skewed (0.0543). The mean activity level of the licensed group is greater than that of the not licensed and learners permit groups. As all the Shapiro-Wilk p-values are less than 5%, the assumption of normality is violated for every group. However, the p-value is comparatively greater for the not licensed group (0.02912). Hence, the activities of the not licensed group show the weakest evidence against normality among the three histograms.
Answer 4.
The scatter plot of self-reported sedentary hours per week (sed) against the number of activities attended in the previous month (Activities) shows a moderately strong negative correlation. Higher values of ‘sed’ tend to go with lower values of ‘Activities’, and lower values of ‘sed’ with higher values of ‘Activities’.
Answer 5.
Initials | Age in years | Gender | Area of study |
HP | 17 | Male | Nursing |
RT | 19 | Male | Accounting |
SK | 20 | Female | Psychology |
KZ | 20 | Male | Psychology |
AN | 21 | Female | Nursing |
KK | 22 | Female | Psychology |
JH | 22 | Male | Psychology |
PV | 25 | Female | Nursing |
If a person is selected at random from this group, the probability that the person is 18 years of age or older = 7/8 = 0.875 (everyone except HP, who is 17).
If a person is selected at random from this group, the probability that the person is a female studying psychology = 2/8 = 0.25 (SK and KK).
If a female is selected at random, the probability that she is 21 years of age or older = 3/4 = 0.75 (AN, KK and PV out of the four females).
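The counting behind these three probabilities can be sketched directly from the table of eight people. The helper name `probability` is hypothetical. The third question is read here as a conditional probability (a female is selected, then her age is checked); the joint probability over the whole group is computed as well, in case that reading was intended.

```python
from fractions import Fraction

# (initials, age in years, gender, area of study), copied from the table.
people = [
    ("HP", 17, "Male", "Nursing"),
    ("RT", 19, "Male", "Accounting"),
    ("SK", 20, "Female", "Psychology"),
    ("KZ", 20, "Male", "Psychology"),
    ("AN", 21, "Female", "Nursing"),
    ("KK", 22, "Female", "Psychology"),
    ("JH", 22, "Male", "Psychology"),
    ("PV", 25, "Female", "Nursing"),
]


def probability(event, group):
    """Favourable outcomes divided by all equally likely outcomes."""
    favourable = sum(1 for person in group if event(person))
    return Fraction(favourable, len(group))


# Aged 18 or older, out of the whole group.
p_adult = probability(lambda p: p[1] >= 18, people)

# Female and studying psychology, out of the whole group.
p_female_psych = probability(
    lambda p: p[2] == "Female" and p[3] == "Psychology", people
)

# Aged 21 or older, given that the person selected is female.
females = [p for p in people if p[2] == "Female"]
p_21_given_female = probability(lambda p: p[1] >= 21, females)

# Alternative reading: female AND aged 21 or older, out of the whole group.
p_female_and_21 = probability(
    lambda p: p[2] == "Female" and p[1] >= 21, people
)

print(p_adult, p_female_psych, p_21_given_female, p_female_and_21)
```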
Answer 6.
The question concerns a random sample of 250 adults that contains 25 or fewer people whose blood group is B, so k = 25. Here, the sample size is n = 250 and the probability that an Australian adult has blood group B is p = 0.1.
Therefore, the probability that the sample contains 25 or fewer people whose blood group is B is P(X ≤ 25) = 0.552995, where X follows a binomial distribution with n = 250 and p = 0.1.
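The cumulative binomial probability can be checked by summing the probability mass function from 0 to k. This is a minimal sketch using only the standard library; the function name `binomial_cdf` is illustrative, and the values n = 250, p = 0.1 and k = 25 are those stated in the answer.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation of the pmf."""
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1)
    )


# Probability that 25 or fewer of 250 sampled adults have blood group B.
prob_25_or_fewer = binomial_cdf(25, 250, 0.1)

print(f"P(X <= 25) = {prob_25_or_fewer:.6f}")
```

Direct summation is adequate at this sample size; for much larger n, a library routine would be a safer choice numerically.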
Answer 6.b)
200 samples were drawn and each of the samples contains exactly 250 people. The samples (sample size = 250) should contain fewer than 102 people with type-B blood.
The z-score is given as 1.
$Z = \frac{x - \mu}{\sigma}$, so $x = \mu + Z\sigma = 8.2 + 1 \times 0.6 = 8.8$.
For 17-year-olds, 8.8 hours of sleep corresponds to a z-score of 1.
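With mean $\mu = 8.2$ and standard deviation $\sigma = 0.6$, the probability is
$P(7.5 < X < 8.0) = P\left(\frac{7.5 - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{8.0 - \mu}{\sigma}\right) = P\left(\frac{7.5 - 8.2}{0.6} < Z < \frac{8.0 - 8.2}{0.6}\right) = P(-1.16667 < Z < -0.33333) = P(Z < -0.33333) - P(Z \le -1.16667) = 0.3707 - 0.123024 = 0.247676$ (Lepetit and Strobel 2013).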
The probability that sleep on a normal night lies between 7.5 and 8 hours = 0.248.
For a random sample of sixteen 17-year-old people, (16 * 0.247676) = 3.96, so about 4 people would be expected to sleep between 7.5 and 8 hours per night.
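The normal-distribution steps above can be sketched with the standard library, using the error function for the normal cumulative probability. The mean of 8.2 hours and standard deviation of 0.6 hours are taken from the working above; the names are illustrative. Because the code does not round the z-scores to two decimals before looking them up, its probability may differ from 0.247676 in the later decimal places.

```python
from math import erf, sqrt

MEAN_SLEEP = 8.2  # hours, from the working above
SD_SLEEP = 0.6    # hours, from the working above


def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal variable with the given mean and sd."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


# Hours of sleep that correspond to a z-score of 1.
hours_at_z1 = MEAN_SLEEP + 1 * SD_SLEEP

# Probability of sleeping between 7.5 and 8 hours.
p_between = (
    normal_cdf(8.0, MEAN_SLEEP, SD_SLEEP)
    - normal_cdf(7.5, MEAN_SLEEP, SD_SLEEP)
)

# Expected number out of sixteen 17-year-olds in that range.
expected_count = 16 * p_between

print(f"hours at z = 1   : {hours_at_z1:.1f}")
print(f"P(7.5 < X < 8.0) : {p_between:.4f}")
print(f"expected out of 16: {expected_count:.2f}")
```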
References
Baty, F., Ritz, C., Charles, S., Brutsche, M., Flandrois, J.P. and Delignette-Muller, M.L., 2015. A toolbox for nonlinear regression in R: the package nlstools. Journal of Statistical Software, 66(5), pp.1-21.
Lepetit, L. and Strobel, F., 2013. Bank insolvency risk and time-varying Z-score measures. Journal of International Financial Markets, Institutions and Money, 25, pp.73-87.