Chapter 1 Introduction to Statistics
Data
collections of observations
Statistics
The science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data
Population
the complete collection of all individuals (scores, people, etc.) to be studied. The collection is complete in the sense that it includes all the individuals to be studied
Census
the collection of data from every member of the population
Sample
is a subcollection of members selected from the population
Statistically significant
The likelihood of getting these results by chance is very small
Practical significance
A treatment or finding might be statistically significant, but common sense might suggest that it does not make enough of a difference to justify its use or to be practical.
Parameter
a numerical measurement of a population
Statistic
a numerical measure of a sample
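To make the parameter/statistic distinction concrete, here is a minimal Python sketch. The population (ages of 1,000 club members) is made-up data generated for illustration only, and the names `population` and `sample` are assumptions, not part of the definitions above.

```python
import random
from statistics import mean

# Hypothetical population: ages of all 1,000 members of a club (made-up data).
rng = random.Random(0)
population = [rng.randint(18, 80) for _ in range(1000)]

# Parameter: a number computed from EVERY member of the population.
population_mean = mean(population)

# Statistic: the same kind of number, computed from a sample only.
sample = rng.sample(population, 50)
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values will usually be close but not identical, because the statistic depends on which 50 members happened to be drawn.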
Quantitative data
(numerical) consists of numbers representing counts or measurements.
Categorical data
(qualitative or attribute) consists of names or labels that are not numbers representing counts or measurements
Discrete Data
results when the number of possible values is either a finite number or a countable number (1, 2, 3, etc.)
Continuous Data
results from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions, or jumps (such as 1.67 liters or 7.437 pounds)
Nominal level of measurement
is data that consists of names, labels, or categories only. The data cannot be arranged in an ordering scheme (such as low to high). Ex. political party
Ordinal level of measurement
can be arranged in some order, but differences (obtained by subtraction) between data values either cannot be obtained or are meaningless. Ex. Rank, Grades
Interval level of measurement
is like the ordinal level, with the additional property that the difference between any two values is meaningful. However, data at this level do not have a natural zero starting point (where none of the quantity is present). Ex. Temperature in degrees Celsius or Fahrenheit, calendar years
Ratio level of measurement
is the interval level with the additional property that there is also a natural zero starting point (where zero indicates that none of the quantity is present). For values at this level, differences and ratios are both meaningful.
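One way to see the difference between the interval and ratio levels is to change units and check what survives. The sketch below uses illustrative values chosen for this example: Celsius temperatures (interval) and weights in kilograms (ratio). The conversion factor 2.20462 lb per kg is approximate.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval level: temperature. Zero degrees does not mean "no heat".
temps_c = [10.0, 20.0, 30.0]
temps_f = [c_to_f(c) for c in temps_c]      # [50.0, 68.0, 86.0]

# Equal differences stay equal in either unit, so differences are meaningful.
gaps_c = [temps_c[1] - temps_c[0], temps_c[2] - temps_c[1]]   # [10.0, 10.0]
gaps_f = [temps_f[1] - temps_f[0], temps_f[2] - temps_f[1]]   # [18.0, 18.0]

# Ratios depend on the unit, so "twice as hot" is not meaningful.
ratio_c = temps_c[1] / temps_c[0]           # 2.0
ratio_f = temps_f[1] / temps_f[0]           # about 1.36

# Ratio level: weight. Zero means none of the quantity is present.
KG_TO_LB = 2.20462                          # approximate conversion factor
weights_kg = [2.0, 4.0]
weights_lb = [w * KG_TO_LB for w in weights_kg]

# The ratio is the same in either unit, so "twice as heavy" is meaningful.
ratio_kg = weights_kg[1] / weights_kg[0]    # 2.0
ratio_lb = weights_lb[1] / weights_lb[0]    # 2.0 (up to rounding)

print(gaps_c, gaps_f, ratio_c, round(ratio_f, 2), ratio_kg, round(ratio_lb, 2))
```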
Voluntary response Sample
one in which the respondents themselves decide whether to be included.
Observational Study
observe and measure specific characteristics, but we do not attempt to modify the subject studied.
Experiment
apply treatment and then observe its effects on the subject.
Simple random sample
of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen
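A minimal Python sketch of drawing a simple random sample, assuming a hypothetical population of 100 member IDs. The standard library's `random.sample` draws without replacement, so every possible group of size n is equally likely to be chosen.

```python
import random

# Hypothetical population: member IDs 1 through 100.
population = list(range(1, 101))

rng = random.Random(42)   # fixed seed only so the example is repeatable
n = 10

# Draw n distinct members; every subset of size n has the same chance.
simple_random_sample = rng.sample(population, n)

print(sorted(simple_random_sample))
```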
Random sample
members from the population are selected in such a way that each individual member in the population has an equal chance at being selected
Probability Sample
involves selecting members from a population in such a way that each member of the population has a known (but not necessarily the same) chance of being selected
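As a rough illustration of known but unequal chances, the sketch below includes each member independently with its own stated probability. This is only one of several ways to draw a probability sample; the store names, the probabilities, and the helper `probability_sample` are all hypothetical.

```python
import random

def probability_sample(inclusion_prob, rng):
    """Include each member independently with its own known probability."""
    return [member for member, p in inclusion_prob.items() if rng.random() < p]

# Hypothetical population: stores with assumed inclusion probabilities.
# Large stores are deliberately given a higher chance of selection.
inclusion_prob = {
    "large_store_1": 0.8,
    "large_store_2": 0.8,
    "small_store_1": 0.2,
    "small_store_2": 0.2,
    "small_store_3": 0.2,
}

rng = random.Random(7)    # fixed seed only so the example is repeatable
sample = probability_sample(inclusion_prob, rng)
print(sample)
```

Because the chances are known in advance, results from such a sample can later be weighted to account for the unequal selection.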
Systematic sampling
select some starting point and then select every kth (such as every 75th) element.
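The every-kth-element selection can be sketched in a few lines of Python. The definition only says "some starting point"; choosing that start at random among the first k elements is an assumption made here, and the 1,000-item population and the helper `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(items, k, rng):
    """Pick a random start among the first k items, then take every kth item."""
    start = rng.randrange(k)      # index from 0 to k - 1
    return items[start::k]

# Hypothetical population: 1,000 numbered items off a production line.
population = list(range(1, 1001))

rng = random.Random(3)            # fixed seed only so the example is repeatable
sample = systematic_sample(population, 75, rng)

print(sample)
```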
Convenience sampling
use results that are very easy to get
Stratified sampling
subdivide the population into at least two different subgroups (or strata) so that subjects within the same subgroup share the same characteristics (such as gender or age), then we draw a sample from each subgroup (or stratum)
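A small Python sketch of the idea: group the population by a shared characteristic, then sample within every group. The records, the age groups, and the helper `stratified_sample` are hypothetical, and drawing the same number from each stratum is just one possible choice.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, per_stratum, rng):
    """Group records into strata by key(record), then sample from each stratum."""
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for label in sorted(strata):
        # Simple random sample inside each stratum.
        sample.extend(rng.sample(strata[label], per_stratum))
    return sample

# Hypothetical population: (person_id, age_group) pairs.
people = [(i, "40 and over" if i % 3 == 0 else "under 40") for i in range(1, 91)]

rng = random.Random(5)            # fixed seed only so the example is repeatable
sample = stratified_sample(people, key=lambda p: p[1], per_stratum=5, rng=rng)

print(sample)
```

Every stratum is guaranteed to appear in the sample, which is the main contrast with cluster sampling.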
Cluster sampling
first divide the population into sections (or clusters), then randomly select some of those clusters, and then choose all the members from those selected clusters
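A minimal Python sketch of cluster sampling, assuming a hypothetical school whose classrooms act as the clusters. The classroom and student names and the helper `cluster_sample` are made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters, then keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical population: six classrooms with five students each.
clusters = {
    f"classroom_{c}": [f"student_{c}{i}" for i in range(1, 6)]
    for c in "ABCDEF"
}

rng = random.Random(11)           # fixed seed only so the example is repeatable
chosen, members = cluster_sample(clusters, 2, rng)

print(chosen)
print(members)
```

Only some clusters are used, but every member of a chosen cluster is included.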
Cross-sectional study
data are observed, measured, and collected at one point in time.
Retrospective study
(case-control) data are collected from the past by going back in time (through examination of records, interviews, etc.)
Prospective study
(longitudinal) data are collected in the future from groups sharing common factors (called cohorts)
Confounding
occurs in an experiment when you are not able to distinguish among the effects of different factors.
Sampling error
is the difference between a sample result and the true population result; such an error results from chance sample fluctuation.
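Chance sample fluctuation can be seen by drawing several samples from the same population and comparing each sample result with the true population result. The population below (heights in centimeters) is simulated, made-up data, and the helper `sampling_error` is hypothetical.

```python
import random
from statistics import mean

rng = random.Random(1)            # fixed seed only so the example is repeatable

# Hypothetical population: 10,000 simulated heights in centimeters.
population = [rng.gauss(170, 10) for _ in range(10_000)]
true_mean = mean(population)

def sampling_error(n):
    """Sample mean minus the true population mean for one random sample."""
    sample = rng.sample(population, n)
    return mean(sample) - true_mean

# Five samples of the same size give five different errors, purely by chance.
errors = [sampling_error(50) for _ in range(5)]

for e in errors:
    print(f"sampling error: {e:+.2f} cm")
```

Nothing was recorded or measured incorrectly here; the errors come only from which individuals happened to land in each sample.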
Nonsampling error
occurs when the sample data are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective measurement instrument, or copying the data incorrectly).