# 17.1. Distributions: Population, Empirical, Sampling

The population, sampling, and empirical distributions are important concepts that guide us when we make inferences about a model or predictions for new observations. Figure 17.1 provides a diagram that can help distinguish between them. The diagram uses the notions of population and access frame from Chapter 2 and the urn model from Chapter 3. On the left is the population that we are studying, represented as marbles in an urn with one marble for each unit. We have simplified the situation so that the access frame and the population are the same; that is, we can access every unit in the population. (The problems that arise when this is not the case are covered in Chapters 2 and 3.) The arrow from the urn to the sample represents the design, meaning the protocol for selecting the sample from the frame. The diagram shows this selection process as a chance mechanism, represented by draws from an urn filled with indistinguishable marbles. On the right side of the diagram, the collection of marbles constitutes our sample (the data we collected).
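The chance mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the urn size of 1,000, the sample size of 100, and the choice of a simple random sample (draws without replacement) are all hypothetical, and the names `urn` and `drawn` are ours, not part of the figure.

```python
import numpy as np

# Seed chosen only so the sketch is reproducible.
rng = np.random.default_rng(42)

# Hypothetical urn: one marble per unit, labeled 0, 1, ..., 999.
# The access frame is assumed to equal the population.
urn = np.arange(1000)

# Assumed design: a simple random sample of 100 marbles,
# drawn without replacement so that no unit appears twice.
drawn = rng.choice(urn, size=100, replace=False)

print(drawn[:10])  # labels of the first ten marbles drawn
```

A different design (for example, drawing with replacement or drawing from strata) would change only the `rng.choice` step; the urn itself stays the same.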

We are often interested in a summary of the sample measurements, such as the mean, median, slope from a simple linear model, and so on. Typically, this summary statistic is an estimate for a population parameter, such as the population mean or median. The population parameter is shown as $\theta^*$ on the left of the diagram; on the right, the summary statistic, calculated from the sample, is $\hat{\theta}$.
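To make the distinction between the parameter and the statistic concrete, here is a minimal sketch that computes both for a made-up population. Everything in it is an assumption for illustration: the population values are simulated, the design is taken to be a simple random sample, the mean is used as the summary, and the names `theta_star` and `theta_hat` simply mirror the symbols in the diagram.

```python
import numpy as np

# Seed chosen only so the sketch is reproducible.
rng = np.random.default_rng(0)

# Hypothetical population: one measurement per unit (1,000 units).
# In practice we rarely observe all of these values.
population = rng.normal(loc=50, scale=10, size=1000)

# Assumed design: simple random sample of 100 units, without replacement.
sample = rng.choice(population, size=100, replace=False)

# Population parameter: the mean over every unit in the population.
theta_star = population.mean()

# Summary statistic: the same summary, computed from the sample only.
theta_hat = sample.mean()

print(f"theta_star = {theta_star:.2f}, theta_hat = {theta_hat:.2f}")
```

The two numbers will usually be close but not equal, and `theta_hat` would change if a different sample were drawn, while `theta_star` is fixed by the population.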