Stratified sampling

In stratified sampling, we divide the population into groups (called strata) and then take a separate random sample from each group.

Stratified sampling requires us to know some things about the population first.

We start by choosing the groups to divide the population into. Each person should belong to exactly one group. For example, dividing the population by age and gender might give us six groups:

  • Males aged 0-17
  • Males aged 18-64
  • Males aged 65+
  • Females aged 0-17
  • Females aged 18-64
  • Females aged 65+

We also need to know how many people are in each group, or at least what proportion of the overall population each group makes up.

Then we use those figures to survey a proportional number of people from each group. For example, if males aged 0-17 make up 10% of the population, then 10% of our sample should come from that group.
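The proportional step can be sketched in Python. This is only an illustration: the group counts are made up, and the function names proportional_allocation and stratified_sample are hypothetical, not part of any library. When the proportions do not divide evenly, this sketch gives the leftover places to the groups with the largest remainders, which is one reasonable choice among several.

```python
import random

def proportional_allocation(group_sizes, sample_size):
    """Split sample_size across groups in proportion to their population sizes."""
    total = sum(group_sizes.values())
    # Round each group's share down first (integer arithmetic avoids float error).
    alloc = {g: sample_size * n // total for g, n in group_sizes.items()}
    leftover = sample_size - sum(alloc.values())
    # Give any leftover places to the groups with the largest remainders.
    by_remainder = sorted(
        group_sizes,
        key=lambda g: sample_size * group_sizes[g] % total,
        reverse=True,
    )
    for g in by_remainder[:leftover]:
        alloc[g] += 1
    return alloc

def stratified_sample(groups, sample_size, seed=None):
    """Draw a simple random sample from each group, sized proportionally."""
    rng = random.Random(seed)
    sizes = {g: len(members) for g, members in groups.items()}
    alloc = proportional_allocation(sizes, sample_size)
    return {g: rng.sample(groups[g], alloc[g]) for g in groups}

# Made-up population counts for the six age and gender groups (total 10,000).
counts = {
    "males_0_17": 1000, "males_18_64": 3000, "males_65_plus": 800,
    "females_0_17": 950, "females_18_64": 3100, "females_65_plus": 1150,
}
# Each person is represented by a (group, index) pair, purely for illustration.
groups = {g: [(g, i) for i in range(n)] for g, n in counts.items()}

allocation = proportional_allocation(counts, 200)
sample = stratified_sample(groups, 200, seed=1)
print(allocation)
```

Here males aged 0-17 are 10% of the made-up population, so they receive 10% of the 200 sample places.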

This method is more work than simple random sampling, but it has several advantages:

  • It can give us a more representative sample, especially if the groups are very different from each other.
  • It can be more efficient than simple random sampling: for the same sample size, our estimates tend to be more precise when the people within each group are similar and the groups are very different from each other.
  • It can allow us to make comparisons between groups.
  • It lets us ensure that we have enough people from each group in our sample, which can be important if some groups are small.
  • It means that a single group can’t dominate the sample, which can happen with simple random sampling if we get unlucky.

It also has some disadvantages:

  • It can be more work than simple random sampling, especially if there are many groups.
  • We need to know the size of each group in the population in advance, which can be difficult or expensive to find out.
  • If we don’t divide the population into the right groups, we might not get a representative sample. For example, if we divide the population by age and gender, but the important differences are actually based on income, then our sample could still be unrepresentative.
  • If we don’t sample from each group in the right proportions, we might not get a representative sample. For example, if we sample too many people from one group and too few from another, the first group’s answers will count for too much in the overall result.
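The point about efficiency can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not a general proof: it invents a population of two equally sized groups whose values are very different, then repeatedly estimates the population mean using both simple random sampling and stratified sampling with the same sample size. All names and numbers are hypothetical.

```python
import random
import statistics

rng = random.Random(0)

# Made-up population: two equal-sized groups with very different typical values.
group_a = [rng.gauss(20, 2) for _ in range(5000)]
group_b = [rng.gauss(80, 2) for _ in range(5000)]
population = group_a + group_b

def srs_mean(n):
    """Estimate the mean from a simple random sample of the whole population."""
    return statistics.mean(rng.sample(population, n))

def stratified_mean(n):
    """Estimate the mean from a proportional stratified sample.

    The groups are the same size, so half of the sample comes from each.
    """
    half = n // 2
    return statistics.mean(rng.sample(group_a, half) + rng.sample(group_b, half))

# Repeat each method many times and see how much the estimates vary.
srs_estimates = [srs_mean(40) for _ in range(500)]
strat_estimates = [stratified_mean(40) for _ in range(500)]

srs_spread = statistics.stdev(srs_estimates)
strat_spread = statistics.stdev(strat_estimates)
print(f"simple random sampling spread: {srs_spread:.2f}")
print(f"stratified sampling spread:    {strat_spread:.2f}")
```

With groups this different, the stratified estimates should vary much less from run to run, because every sample contains exactly the right share of each group. If the two groups had similar values, the gap between the methods would shrink.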