Posted: January 18th, 2021

# Statistics Psych using Excel essay


Need help with my Statistics question – I’m studying for my class.

The goal of this assignment is to help you understand the logic underlying the estimation of RSE (Random Sampling Error) based on simulated computation (estimation) using height data.

See the Excel file.

Sheet 1 of the Excel file contains height data for 1620 people.

These 1620 heights are organized as 54 samples of 30 people each – this means that the sample size (n) is 30 and you have 54 samples.

Therefore, in this assignment we make the following assumptions:

The height data of all 1620 people (54 samples, each containing 30 people’s heights) are the population (I know this is actually a set of samples, but we pretend that it is a population: N = 1620)

The 30 people’s heights within each set form one sample: the sample size is therefore 30 (n = 30), and there are 54 samples.

Based on these assumptions, please compute:

1) Population mean (mean of the 1620 people’s heights)

2) Sample mean (mean of 30 people) – please choose one specific sample from the 54 samples, and compute the sample mean from the 30 heights in that particular sample.

3) Population standard deviation, treating the 1620 people as the population

4) Sample standard deviation (the population standard deviation estimated from your own sample of 30 – so you need to compute the SD of the 30 heights in the sample that you chose)

5) Create a sampling distribution of the mean based on these 54 samples and compare its shape (characteristics) with the population distribution of height that I provided (sheet 2, grouped frequency polygon) by following these steps:

Step 1: compute the mean of the 30 heights for each of the 54 samples – so you need 54 sample means, one per sample.

Step 2: create a grouped frequency distribution table from the 54 computed means – this is the grouped frequency table for the sampling distribution of the mean.
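Outside Excel, the two steps above could be sketched in Python as follows. This is only an illustration under stated assumptions: the heights are simulated (the mean of 170 and SD of 10 are made-up values standing in for Sheet 1), and the class width of 1.0 is an arbitrary choice rather than anything the assignment prescribes.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for Sheet 1: 54 samples of 30 simulated heights each.
# The mean of 170 and SD of 10 are illustrative, not the assignment's data.
samples = [[random.gauss(170, 10) for _ in range(30)] for _ in range(54)]

# Step 1: one mean per sample, giving 54 sample means.
sample_means = [statistics.mean(s) for s in samples]

# Step 2: grouped frequency table with equal-width class intervals.
width = 1.0  # assumed class width; pick whatever suits the real data
low = math.floor(min(sample_means) / width) * width
n_classes = math.floor((max(sample_means) - low) / width) + 1
counts = [0] * n_classes
for m in sample_means:
    counts[math.floor((m - low) / width)] += 1

# Print each class interval with its frequency.
for i, c in enumerate(counts):
    lower = low + i * width
    print(f"{lower:.1f} to {lower + width:.1f}: {c}")
```

The frequencies should add up to 54, one for each sample mean, which is a quick check that no mean was dropped or counted twice.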

Compare the shapes of the two frequency distributions: the population distribution of height (the one I provided) and the sampling distribution of the mean (the 54 means you computed). For your reference, I am providing the grouped frequency polygon representing the population distribution (the third sheet of the Excel file). Then answer the following questions:

5-a. What is the relation between the population mean and the mean of the 54 means – same or different?

5-b. Which of the two distributions (the population distribution of the 1620 heights vs. the sampling distribution of the 54 means) is narrower, i.e. more tightly clustered around the population mean?

5-c. To what extent do the above two observations (a and b) about the sampling distribution lend support to the Central Limit Theorem? – this requires you to read about the CLT and understand it.
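The comparison asked for in a and b can also be checked numerically. The sketch below again uses simulated heights as a hypothetical stand-in for the Excel data, so the printed values will not match the real sheet; it only shows which quantities to put side by side.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for Sheet 1: 54 samples of 30 simulated heights each.
samples = [[random.gauss(170, 10) for _ in range(30)] for _ in range(54)]
population = [h for s in samples for h in s]  # all 1620 heights, N = 1620
sample_means = [statistics.mean(s) for s in samples]

# (a) Centre: population mean vs. mean of the 54 sample means.
pop_mean = statistics.mean(population)
mean_of_means = statistics.mean(sample_means)

# (b) Spread: SD of the population vs. SD of the 54 sample means.
pop_sd = statistics.pstdev(population)
sd_of_means = statistics.pstdev(sample_means)

print(f"population mean = {pop_mean:.3f}, mean of means = {mean_of_means:.3f}")
print(f"population SD   = {pop_sd:.3f}, SD of means   = {sd_of_means:.3f}")
```

Because every sample has the same size (30), the mean of the 54 means equals the population mean as a matter of arithmetic, and the spread of the means cannot exceed the spread of the individual heights. How closely the spread follows what the Central Limit Theorem predicts is the part left for 5-c.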

6) Estimate the Random Sampling Error in the following four different ways, based on your understanding of the definition of RSE we just covered in class:

6-1. RSE as the difference between your own sample mean and the mean of the sampling distribution of the mean (the average of the 54 sample means)

6-2. RSE as the Standard Error of the Mean, which is the standard deviation of the sampling distribution of the mean – this means computing an SD from the 54 sample means. For this, use the sampling distribution of the mean that you created above (you need to use the population standard deviation function in Excel – see below).

6-3. RSE as the Standard Error of the Mean approximated by the population standard deviation (based on the 1620 data points) divided by the square root of n (n = 30)

6-4. RSE as the Standard Error of the Mean approximated by the sample standard deviation (based on your own sample of n = 30, with n-1 in the denominator – the sample standard deviation function in Excel, see below) divided by the square root of n (n = 30)

(This is a bonus of 5 points on top of 30) I assume that 6-3 and 6-4 are different even though they are supposed to be similar according to the lecture. Speculate on why they differ. Based on what you have learned about the four different approaches to estimating RSE, they should be the same, but here they are not. Why? Hint: the nature of the sample (30 people’s heights)? You can include any questions or comments about this process. Your points are not entirely based on whether your answer is correct; they are mainly based on the evidence of THINKING you put here.
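The four estimates 6-1 to 6-4 could be computed along these lines in Python. This is a sketch under assumptions, not the required Excel workflow: the heights are simulated, `samples[0]` is an arbitrary choice of "your own sample", and the names `rse_1` to `rse_4` are hypothetical labels for the four approaches.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

n = 30  # sample size
# Hypothetical stand-in for Sheet 1: 54 samples of n simulated heights each.
samples = [[random.gauss(170, 10) for _ in range(n)] for _ in range(54)]
population = [h for s in samples for h in s]
sample_means = [statistics.mean(s) for s in samples]
my_sample = samples[0]  # arbitrary choice of "your own sample"

# 6-1: own sample mean minus the mean of the 54 sample means.
rse_1 = statistics.mean(my_sample) - statistics.mean(sample_means)

# 6-2: SD of the 54 sample means, using the population formula.
rse_2 = statistics.pstdev(sample_means)

# 6-3: population SD (all 1620 heights) divided by sqrt(n).
rse_3 = statistics.pstdev(population) / math.sqrt(n)

# 6-4: sample SD (n-1 denominator) of the chosen sample, divided by sqrt(n).
rse_4 = statistics.stdev(my_sample) / math.sqrt(n)

for label, value in [("6-1", rse_1), ("6-2", rse_2),
                     ("6-3", rse_3), ("6-4", rse_4)]:
    print(f"{label}: {value:.3f}")
```

Note that 6-1 is a signed difference for one particular sample, while 6-2 to 6-4 are standard deviations and therefore never negative. Choosing a different sample changes 6-1 and 6-4 but leaves 6-2 and 6-3 untouched.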

In computing means and SDs, use Excel’s built-in functions:

For the mean (average), see: https://www.youtube.com/watch?v=5_OHS-18RbU

For the standard deviation, see: https://www.youtube.com/watch?v=uZWQXQG37Zs

There are STDEV.P (population, where the denominator is n) and STDEV.S (sample, where the denominator is n-1). Be careful to use the appropriate one. You should, by now, know which one to use when. If you have questions about this, please send me an email.
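To make the difference between the two denominators concrete, here is a small Python sketch that computes both standard deviations by hand and checks them against the standard library. Excel’s STDEV.P corresponds to `statistics.pstdev` and STDEV.S to `statistics.stdev`; the 30 simulated heights are a hypothetical stand-in for one sample from the sheet.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical sample of 30 simulated heights (not the assignment's data).
sample = [random.gauss(170, 10) for _ in range(30)]
n = len(sample)

mean = statistics.mean(sample)
sum_sq = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations

sd_p = math.sqrt(sum_sq / n)        # population formula: denominator n
sd_s = math.sqrt(sum_sq / (n - 1))  # sample formula: denominator n-1

# The hand-computed values agree with the library functions.
assert math.isclose(sd_p, statistics.pstdev(sample))
assert math.isclose(sd_s, statistics.stdev(sample))

print(f"n denominator:   {sd_p:.4f}")
print(f"n-1 denominator: {sd_s:.4f}")
```

The n-1 version is always slightly larger for the same data, by a factor of sqrt(n / (n-1)), which is about 1.017 when n = 30.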

In sum, your assignment needs to address all of these questions:

1) Population mean

2) Sample mean

3) Population standard deviation

4) Sample standard deviation (population standard deviation estimated from your own sample)

6-1. RSE as the difference between your sample mean and the mean of the sampling distribution of the mean (the average of the 54 sample means)

6-2. RSE as the Standard Error of the Mean, which is the standard deviation computed on the sampling distribution of the mean.

6-3. RSE as the Standard Error of the Mean approximated by the population standard deviation divided by the square root of n (n = 30)

6-4. RSE as the Standard Error of the Mean approximated by the sample standard deviation (of your own sample) divided by the square root of n (n = 30)

(Bonus points) Consideration of why 6-3 and 6-4 are different.

5-a. What is the relation between the population mean and the mean of the 54 means?

5-b. Which of the two distributions (population vs. sampling distribution of the means) has a narrower distribution clustered around the population mean?

5-c. To what extent do the above two observations (a and b) about the sampling distribution lend support to the Central Limit Theorem?

Requirements: .doc file
