Q1. Why do we study samples when we want to know about populations?
Samples that represent the population are preferable because:
Cost: Cost is one of the main arguments in favor of sampling, because often a sample can furnish data of sufficient accuracy at a much lower cost than a census.
Q2. Why do we study random sample instead of just any sample?
Q3. Why is the median sometimes better than the mean as an indicator of the central tendency?
Q5. Why is P(A and B) = P(A)P(B|A) = P(B)P(A|B)?
By definition, P(A|B) = P(A and B)/P(B) provided P(B) is non-zero. Similarly, P(B|A) = P(A and B)/P(A) provided P(A) is non-zero.
Multiplying each definition through by its denominator gives both factorizations. Right?
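As a quick sanity check, both factorizations can be verified by exhaustive enumeration on a small sample space; the two-dice events below are hypothetical examples chosen for illustration:

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair six-sided dice, each outcome equally likely.
space = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(space))  # probability of each elementary outcome

A = {(d1, d2) for (d1, d2) in space if d1 + d2 == 7}  # event A: sum is 7
B = {(d1, d2) for (d1, d2) in space if d1 == 3}       # event B: first die shows 3

P_A = len(A) * p
P_B = len(B) * p
P_AB = len(A & B) * p

# Conditional probabilities from the definition.
P_B_given_A = P_AB / P_A
P_A_given_B = P_AB / P_B

# Both factorizations agree with P(A and B).
assert P_AB == P_A * P_B_given_A == P_B * P_A_given_B
print(P_AB)  # 1/36
```

Using exact fractions rather than floats keeps the equalities exact, so the check is not muddied by rounding.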
Q6. Why is P(A or B) = P(A) + P(B) − P(A and B)?
Adding P(A) and P(B) counts every outcome lying in both events twice, so one copy must be subtracted:
P(A or B) = P(A) + P(B) − P(both) = P(A) + P(B) − P(A∩B).
Right?
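The addition rule can likewise be checked by enumeration; the single-die events A and B here are made-up examples:

```python
from fractions import Fraction

# Sample space: one fair six-sided die.
space = set(range(1, 7))
p = Fraction(1, 6)

A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

P_A, P_B = len(A) * p, len(B) * p
P_or = len(A | B) * p    # P(A or B)
P_and = len(A & B) * p   # P(A and B)

# Inclusion-exclusion holds exactly.
assert P_or == P_A + P_B - P_and
print(P_or)  # 2/3
```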
Q7. If in an experiment there are three possible outcomes (a, b, c) and their probabilities are P(a) = .3, P(b) = .4, and P(c) = .5, why must at least two of the three outcomes not be independent of each other?
Q8. Why do we use Σ(x − x̄)² to measure variability instead of Σ(x − x̄)?
Because if we add up all the positive and negative deviations, we always get zero, i.e., Σ(x − x̄) = 0. So, to deal with this problem, we square the deviations. Why not use the fourth power (the third will not work, since odd powers preserve the sign of each deviation)? Squaring does the trick; why make life more complicated than it is?
Notice also that squaring magnifies the larger deviations; therefore it works to our advantage in measuring the quality of the data.
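A small numeric illustration (the data values are made up) shows the raw deviations cancelling while the squared deviations do not:

```python
# Demonstration that raw deviations cancel while squared deviations do not.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = sum(data) / len(data)   # 5.0

raw = sum(x - mean for x in data)            # always 0
squared = sum((x - mean) ** 2 for x in data)

print(raw)                   # 0.0
print(squared)               # 32.0
print(squared / len(data))   # population variance: 4.0
```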
Q9. To approximate the binomial distribution, why do we sometimes use the Poisson distribution and sometimes use the normal distribution?
The Poisson approximation to the binomial is a discrete-to-discrete approximation; therefore it is preferable to the normal approximation. However, just as the binomial table is limited in its scope, so is the Poisson table; therefore one may have to approximate both by the normal.
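A sketch comparing both approximations against the exact binomial probability; the parameter choices (n, p, k) below are illustrative, not from the text:

```python
import math

def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Poisson regime: n large, p small, np moderate (here np = 3).
exact_p = binom_pmf(2, 1000, 0.003)
approx_p = poisson_pmf(2, 1000 * 0.003)

# Normal regime: np and n(1-p) both large (here both are 50);
# continuity correction: P(X = 55) is approximated by Φ(55.5) − Φ(54.5).
mu, sigma = 100 * 0.5, math.sqrt(100 * 0.5 * 0.5)
exact_n = binom_pmf(55, 100, 0.5)
approx_n = normal_cdf(55.5, mu, sigma) - normal_cdf(54.5, mu, sigma)

print(exact_p, approx_p)   # both ≈ 0.224
print(exact_n, approx_n)   # both ≈ 0.048
```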
Q10. Why is the (1 − α)100% confidence interval equal to x ± z_{α/2}·σ_x?
It is the case of a single observation, i.e., n = 1. Therefore, if the population is normal with known standard deviation σ_x, then the above confidence interval is correct.
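A minimal sketch of this single-observation interval, assuming a made-up value x = 12 and a known σ_x = 3:

```python
from statistics import NormalDist

# Hypothetical single observation (n = 1) from a normal population
# whose standard deviation σ_x is known.
x = 12.0
sigma_x = 3.0
z = NormalDist().inv_cdf(0.975)   # z_{α/2} for a 95% interval (α = 0.05)

lower, upper = x - z * sigma_x, x + z * sigma_x
print(round(lower, 2), round(upper, 2))  # 6.12 17.88
```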
Whenever we have a mixture of populations, no standard statistical technique is applicable. In such a case one must take a random sample from each stratum and then apply statistical tools to each sub-population. Never mix apples with oranges.
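The stratified procedure above (sample randomly within each stratum, analyze each sub-population, then combine) can be sketched as follows; the strata and their means are invented for illustration:

```python
import random

random.seed(0)

# Hypothetical population made of three distinct strata.
strata = {
    "young":  [random.gauss(40, 5) for _ in range(500)],
    "middle": [random.gauss(55, 5) for _ in range(300)],
    "senior": [random.gauss(70, 5) for _ in range(200)],
}

# Draw a random sample *within each stratum* (proportional allocation).
samples = {name: random.sample(units, k=len(units) // 10)
           for name, units in strata.items()}

# Analyze each sub-population separately, then weight by stratum size.
total = sum(len(units) for units in strata.values())
est_mean = sum((len(strata[name]) / total) * (sum(s) / len(s))
               for name, s in samples.items())
print(round(est_mean, 1))   # close to the true weighted mean, 50.5
```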
It is similar to stratified sampling in its intent; however, in cluster sampling one often samples randomly within each selected cluster.
Q13. Why do we usually test for Type I error instead of Type II error in hypothesis testing?
Because the null hypothesis is always specified in exact form, with an equality (=) sign. Therefore one can talk about rejecting or not rejecting the null hypothesis. However, if the alternative is also specified in exact form with an (=) sign, then one is able to compute both types of error.
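When both hypotheses are exact, both error probabilities can indeed be computed; the setup below (H0: μ = 0 vs H1: μ = 1, n = 25, known σ = 2) is hypothetical:

```python
from statistics import NormalDist

# Hypothetical exact hypotheses: H0: μ = 0 vs H1: μ = 1,
# with n = 25 observations and known σ = 2, so SE = 2/√25 = 0.4.
se = 2 / 25 ** 0.5
under_h0 = NormalDist(0, se)
under_h1 = NormalDist(1, se)

# One-sided test: reject H0 when the sample mean exceeds c,
# with c chosen so that the Type I error is α = 0.05.
c = under_h0.inv_cdf(0.95)

alpha = 1 - under_h0.cdf(c)   # Type I error: rejecting a true H0
beta = under_h1.cdf(c)        # Type II error: accepting a false H0

print(round(alpha, 3), round(beta, 3))  # 0.05 0.196
```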
Q14. Why is the “margin of error” often used as a measure of accuracy in estimation?
When estimating a parameter of a population based on a random sample, one has to provide the degree of accuracy. The accuracy of the estimate is often expressed by a confidence interval with a specific confidence level.
The half-length of the confidence interval is often referred to as the absolute error, the absolute precision, or even the margin of error. However, in its usual usage the “margin of error” refers to the half-length of a confidence interval at 95% confidence.
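A short sketch computing the margin of error as the half-length of a 95% confidence interval; the survey numbers are invented:

```python
from statistics import NormalDist
import math

# Hypothetical survey: n = 400 responses, known σ = 10, sample mean 50.
n, xbar, sigma = 400, 50.0, 10.0
z = NormalDist().inv_cdf(0.975)       # 95% confidence

margin = z * sigma / math.sqrt(n)     # half-length of the interval
print(round(margin, 2))               # 0.98
print(round(xbar - margin, 2), round(xbar + margin, 2))  # the 95% interval
```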
Q15. Why are there so many statistical tables? Which one should we use?
Statistical tables are used to construct confidence intervals in estimation, as well as to reach reasonable conclusions in tests of hypotheses. Depending on the application area, one may, for example, classify the two major statistical tables as follows:
T - Table: used for the expected value of population(s), regression coefficients, and correlation(s).
Z - Table: similar to the T-table, but for large sample sizes (say, over 30).
Q16. Why do we use the p-value? What is it?
The p-value is the tail probability of the test statistic value, given that the null hypothesis is true. Since the p-value is a function of the test statistic, which is in turn a function of the sample data, it is a statistic as well as a conditional probability. This is analogous to the method of maximum likelihood parameter estimation, wherein we consider the data to be fixed and the parameter to be variable.
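A minimal illustration of computing a p-value under the null hypothesis; the one-sided z-test setup below is a made-up example:

```python
from statistics import NormalDist

# Hypothetical one-sided z-test: H0: μ = 100 against H1: μ > 100,
# with known σ = 15 and a sample of n = 36 giving x̄ = 105.
xbar, mu0, sigma, n = 105.0, 100.0, 15.0, 36
z = (xbar - mu0) / (sigma / n ** 0.5)   # observed test statistic: 2.0

# p-value: tail probability of the observed statistic, computed under H0.
p_value = 1 - NormalDist().cdf(z)
print(z, round(p_value, 4))   # 2.0 0.0228
```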
Q17. Why is linear regression a good model when the range of the independent variable is small?
Q18. Why does high correlation not imply causality?
Determination of cause-and-effect is not in the statistician’s job description.
Any specific cause-and-effect claim belongs to a specific area of knowledge and is subject to rigorous experimentation. Correlation measures the strength of a linear numerical relation, called a function. A function simply converts something into something else. Your coffee grinder is a function. The cause in this example is the mechanical force grinding the coffee beans.
Q19. Why would ANOVA and performing t-test for each pair of samples not necessarily give the same conclusion at the same confidence level?
It is because any pair-wise comparison of means is never a substitute for the simultaneous comparison of all means. Moreover, it is not an easy task to compute the exact confidence level from the pair-wise confidence levels.
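One way to see why pair-wise tests are not a substitute: under the rough simplifying assumption that the pair-wise tests are independent, the chance of at least one false rejection compounds with the number of comparisons:

```python
# If k pairwise comparisons are each run at level α = 0.05, and we treat
# them as independent for a rough bound, the chance of at least one
# spurious rejection (the family-wise error rate) grows quickly:
alpha = 0.05
for groups in (3, 5, 10):
    k = groups * (groups - 1) // 2       # number of pairwise t-tests
    fwer = 1 - (1 - alpha) ** k
    print(groups, k, round(fwer, 3))
# 3 groups -> 3 tests, FWER ≈ 0.143
# 5 groups -> 10 tests, FWER ≈ 0.401
# 10 groups -> 45 tests, FWER ≈ 0.901
```

This is exactly why ANOVA tests all means simultaneously at a single stated level rather than leaning on many separate t-tests.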