Classifying Variables In A Data Table

Categorical and Numeric Variables

The following figure shows an excerpt of the recoded data from the available data set. The age data were missing from the given data set, so that column was left blank.

Frequency Table

GENDER (rows) by TRANSPORT in past month (columns)

| GENDER | Driver/rider of car or motor cycle | Passenger of car | Others (excluding public transport) |
|---|---|---|---|
| female | 23 | 59 | 37 |
| male | 45 | 76 | 31 |

Frequency Table: Row Percentage

GENDER (rows) by TRANSPORT in past month (columns)

| GENDER | Driver/rider of car or motor cycle | Passenger of car | Others (excluding public transport) | Total |
|---|---|---|---|---|
| female | 19.32773% | 49.57983% | 31.09244% | 100% |
| male | 29.60526% | 50% | 20.39474% | 100% |

Frequency Table: Column Percentage

GENDER (rows) by TRANSPORT in past month (columns)

| GENDER | Driver/rider of car or motor cycle | Passenger of car | Others (excluding public transport) |
|---|---|---|---|
| female | 33.82353% | 43.7037% | 54.41176% |
| male | 66.17647% | 56.2963% | 45.58824% |
| Total | 100% | 100% | 100% |

The row percentages show that 19.33% of females drove themselves to their destinations in the past month, while the largest share, 49.58%, were passengers driven by someone else, and 31.09% reported some other means of transport. Of the males, 50% were passengers, 29.61% drove themselves and 20.39% travelled by some other means. Thus, for both genders, travelling as a passenger was the most common mode of transport.
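The row and column percentages can be reproduced directly from the counts in the frequency table. The following is a minimal Python sketch rather than the R Commander procedure used for the analysis. The short column keys (`driver`, `passenger`, `other`) and the function names are illustrative choices, not names from the data set.

```python
# Counts taken from the frequency table (GENDER by TRANSPORT in past month).
# The short keys are illustrative labels for the three transport columns.
counts = {
    "female": {"driver": 23, "passenger": 59, "other": 37},
    "male": {"driver": 45, "passenger": 76, "other": 31},
}


def row_percentages(table):
    """Express each cell as a percentage of its row total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: 100 * n / total for col, n in cells.items()}
    return result


def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    columns = list(next(iter(table.values())).keys())
    totals = {col: sum(cells[col] for cells in table.values()) for col in columns}
    return {
        row: {col: 100 * n / totals[col] for col, n in cells.items()}
        for row, cells in table.items()
    }


for label, table in (("row", row_percentages(counts)),
                     ("column", column_percentages(counts))):
    for gender, cells in table.items():
        formatted = ", ".join(f"{col}: {pct:.2f}%" for col, pct in cells.items())
        print(f"{label} % | {gender} | {formatted}")
```

Row percentages answer "how do females (or males) split across transport modes", whereas column percentages answer "what share of each transport mode is female or male".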

Grouping

Statistical Measure on Number of Activities in the past month

| License Status | Mean | Standard Deviation | Pearson’s Skewness |
|---|---|---|---|
| not licensed | 6.269 | 2.164326 | -0.14226 |
| learners permit | 6.405 | 2.181031 | -0.10214 |
| licensed | 8.408 | 2.110251 | 0.327959 |

The mean describes the centre of the distribution of the number of activities in the past month. The mean number of activities was 6.269 for the “not licensed” group, 6.405 for the “learner’s permit” group and 8.408 for the “licensed” group. The standard deviation measures the spread of each distribution: it is 2.16 for the not licensed, 2.18 for those with a learner’s permit and 2.11 for the licensed. The skewness coefficient describes the shape of a distribution.

A distribution with Pearson’s skewness greater than 0 is positively skewed (longer right tail), one with skewness less than 0 is negatively skewed (longer left tail), and one with skewness equal to 0 is symmetric, like the Gaussian or normal distribution. The further the distribution is from symmetry, the larger the absolute value of the coefficient. The licensed group is positively skewed (0.328), whereas the other two groups are slightly negatively skewed (-0.142 and -0.102). The table above shows the measures as described.

The results from part (a) and part (b) imply that individuals with a license have the most activity on average. The distributions for the unlicensed and for those with a learner’s permit show slightly greater variation and a mild negative skew.
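To illustrate how the three measures are obtained, here is a minimal Python sketch that computes the mean, standard deviation and a Pearson skewness coefficient for one group. Two assumptions apply. The write-up does not say which Pearson coefficient was used, so the sketch uses the second (median-based) coefficient, 3 × (mean - median) / standard deviation. The sample values are hypothetical, since the raw data are not reproduced here.

```python
import statistics


def pearson_median_skewness(values):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd.

    Assumption: the source does not state which Pearson coefficient
    it reports; this is the median-based variant.
    """
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return 3 * (mean - median) / sd


# Hypothetical numbers of activities for one license group (not real data).
hypothetical_activities = [4, 5, 6, 6, 7, 8, 9, 12]

print("mean:", statistics.mean(hypothetical_activities))
print("sd:", statistics.stdev(hypothetical_activities))
print("skewness:", pearson_median_skewness(hypothetical_activities))
```

A positive result indicates a longer right tail, a negative result a longer left tail, and zero a symmetric sample.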

b.

The number of sedentary hours spent last month and the number of activities last month were found to be negatively related. The line of best fit, depicted in blue in the figure in part (a), is described by the regression equation:

Number of activities = 11.819 - 0.434 × (sedentary hours)

The equation shows that with each unit increase in sedentary hours, the number of activities decreases by 0.434 units. With zero sedentary hours, the predicted number of activities is 11.819.
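The interpretation of the slope and intercept can be checked with a small Python sketch of the fitted line. The coefficients 11.819 and -0.434 are the ones quoted in the text; the function name and the example inputs are illustrative.

```python
# Coefficients quoted in the text for the line of best fit.
INTERCEPT = 11.819  # predicted activities at zero sedentary hours
SLOPE = -0.434      # change in activities per additional sedentary hour


def predicted_activities(sedentary_hours):
    """Predicted number of activities for a given number of sedentary hours."""
    return INTERCEPT + SLOPE * sedentary_hours


# Illustrative inputs (not values from the data set).
for hours in (0, 1, 5, 10):
    print(hours, round(predicted_activities(hours), 3))
```

Each additional sedentary hour lowers the prediction by 0.434, and the line should only be read within the range of sedentary hours observed in the data.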

  1. The probability that a person chosen at random from the 8 students is at least 18 years of age is given by the ratio of the number of students aged 18 years or older to the total number of students. The probability, computed using R Commander, was found to be 0.875.
  1. The probability that a person chosen at random from the 8 students is female and a psychology major is given by the ratio of the number of students who are female and have psychology as their major to the total number of students, that is, 8. The probability was found to be 0.75.
  1. The conditional probability that a student is at least 21 years of age, given that the student is female, is given by the ratio of the number of students who are female and at least 21 years of age to the total number of female students. The probability was found to be 0.25.
  1. The probability that an Australian adult has blood type B was given as 0.1. The probability that a random sample of 250 people contains at most 25 people with blood type B is P(X ≤ 25), where X denotes the number of people in the sample who have blood type B. X follows a binomial distribution with size 250 and probability parameter 0.1. The value 0.0838 obtained from R Commander is the point probability P(X = 25); the cumulative probability P(X ≤ 25) is approximately 0.55.
  1. It is of interest to determine the maximum number of people with blood type B found in the lowest 12% of samples of size 250 of Australian adults, that is, the value x such that P(X ≤ x) = 0.12, where X is binomial(250, 0.1). The value was computed using R Commander as 19. So in about 12% of samples of size 250 drawn from Australian adults, at most 19 people have blood type B.
  2. The mean number of people with blood type B is then computed as the expectation of binomial distribution of size 250 and probability parameter 0.1. The mean value is then 250×0.1 which equals 25.
  3. The z-score for a random variable X following a normal distribution is defined as

Z = (X - mean of X) / (standard deviation of X)

With a z-score of 1, a mean of 8.2 and a standard deviation of 0.6, the value of X is X = 0.6 × Z + 8.2 = 8.8.
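The binomial quantities used in the blood type B questions can be checked without R Commander. The sketch below is a plain Python version using only the standard library. The function names are illustrative; they play roughly the roles of R's `dbinom`, `pbinom` and `qbinom`. It distinguishes the point probability P(X = k) from the cumulative probability P(X ≤ k), which are easy to confuse.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def binom_quantile(q, n, p):
    """Smallest x such that P(X <= x) >= q."""
    cumulative = 0.0
    for x in range(n + 1):
        cumulative += binom_pmf(x, n, p)
        if cumulative >= q:
            return x
    return n


n, p = 250, 0.1  # sample size and P(blood type B) from the text

print("P(X = 25)  =", round(binom_pmf(25, n, p), 4))
print("P(X <= 25) =", round(binom_cdf(25, n, p), 4))
print("12% quantile =", binom_quantile(0.12, n, p))
print("mean = n * p =", n * p)
```

"At most 25" calls for the cumulative function, whereas "exactly 25" calls for the point probability.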

  1. The probability that a variable X denoting the hours of sleep of 17-year-olds, following a normal distribution with mean 8.2 and standard deviation 0.6, takes a value between 7.5 and 8 is given by:

P(7.5 < X < 8.0) = P(X < 8.0) - P(X < 7.5) = 0.248
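Both normal-distribution calculations (converting a z-score back to hours of sleep, and the probability of an interval) can be sketched in Python with the standard library's `statistics.NormalDist`. This is an illustration, not the R Commander workflow; the helper names are illustrative, and the second parameter 0.6 is taken to be the standard deviation, as the z-score calculation implies.

```python
from statistics import NormalDist

# Hours of sleep of 17-year-olds: mean 8.2, standard deviation 0.6.
sleep = NormalDist(mu=8.2, sigma=0.6)


def value_from_z(dist, z):
    """Invert Z = (X - mean) / sd to get X = mean + z * sd."""
    return dist.mean + z * dist.stdev


def prob_between(dist, low, high):
    """P(low < X < high) as a difference of two cumulative probabilities."""
    return dist.cdf(high) - dist.cdf(low)


print("X at z = 1:", round(value_from_z(sleep, 1), 2))
print("P(7.5 < X < 8.0):", round(prob_between(sleep, 7.5, 8.0), 4))
```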

  1. The mean of a sample of size n from a normal distribution with mean ‘m’ and standard deviation ‘s’ is itself normally distributed with mean ‘m’ and standard deviation ‘s’/√n. The mean hours of sleep of a sample of 16 students therefore follows Normal(8.2, 0.6/√16) = Normal(8.2, 0.15). Let the sample mean be denoted by Xbar.

Then the probability that the mean lies between 7.5 and 8.0 is given by P(7.5 < Xbar < 8.0) = P(Xbar < 8.0) - P(Xbar < 7.5), which is approximately 0.091.

  1. The number of students expected to sleep for 7.5 to 8 hours among the 16 students is the expectation of a binomial distribution with size 16 and probability P(7.5 < X < 8.0), the probability for an individual student, which is approximately 0.25. The expected number of students is then 16 × 0.25, that is, approximately 4.
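The last two calculations are easy to mix up, so the Python sketch below keeps them separate. It relies on the standard result that the mean of n independent normal observations has standard deviation s/√n, and it takes the expected count among 16 students to be 16 times the probability for a single student. The function and variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N = 8.2, 0.6, 16  # mean, standard deviation, sample size


def sampling_distribution_of_mean(mu, sigma, n):
    """Distribution of the sample mean: Normal(mu, sigma / sqrt(n))."""
    return NormalDist(mu, sigma / sqrt(n))


individual = NormalDist(MU, SIGMA)
xbar = sampling_distribution_of_mean(MU, SIGMA, N)

# Probability that the sample mean falls between 7.5 and 8.0 hours.
p_mean = xbar.cdf(8.0) - xbar.cdf(7.5)

# Probability that a single student sleeps between 7.5 and 8.0 hours,
# and the expected number of such students out of N (binomial mean N * p).
p_individual = individual.cdf(8.0) - individual.cdf(7.5)
expected_students = N * p_individual

print("sd of sample mean:", xbar.stdev)
print("P(7.5 < Xbar < 8.0):", round(p_mean, 4))
print("expected students:", round(expected_students, 2))
```

The sample mean varies much less than an individual observation, which is why its interval probability is smaller than the probability for a single student.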
