7a. Sampling Part 1 – The Principles of Sampling
Dr. Shereen Hassan
🎯 Learning Objectives
- Describe each of the four types of units of analysis in criminology: individual, group, organization and social artifact.
- Distinguish between a unit of analysis and a unit of observation.
- Distinguish between populations and samples.
- Define key concepts that relate to sampling: representativeness, generalizability, and sampling error.
Social science researchers come up with all sorts of interesting questions to investigate using scientific research methods. Unfortunately, researchers can’t study entire populations because of feasibility, time and cost constraints. Instead, we systematically select samples from a larger group of interest to draw conclusions about the people, behaviors, or social phenomena we’re interested in. Who or what researchers select for their samples and how they choose their sample impacts the conclusions that can be drawn from scientific research studies. The process of selecting a subset of a population to study is called sampling (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014).
This chapter focuses on key elements and principles of sampling, beginning first with a review of units of analysis. The difference between a population, sampling frame and sample is explained next. Finally, key concepts relevant to sampling, including representativeness and generalizability, are reviewed. In the next chapter, the specific types of sampling strategies will be reviewed, along with how to use information about samples to evaluate claims made based on research findings.
Units of Analysis
The main goal of sampling is to identify a subset of a larger group from which to collect data. To do so, a researcher must first define the larger group or entity they are interested in studying. This larger group is also called the unit of analysis. Unit of analysis refers to the entity that is the target of the investigation (Maxfield & Babbie, 2018). In other words, a unit of analysis is the entity you wish to be able to say something about at the end of your study. In any scientific study, the research question determines the unit of analysis.
There are four different units of analysis in the social sciences: individuals, groups, organizations and social artifacts (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014). These are not mutually exclusive. In fact, for different analyses in the same study, you may have different units of analysis.

The individual is probably the most common unit of analysis in criminological research (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014). As researchers, we are often interested in describing characteristics of offenders or victims of crime. For example, my dissertation research examined offenders in British Columbia who had been designated as long-term offenders since the inception of this designation in 1997 (see Hassan, 2010). One of the goals of the research was to describe the key characteristics of these offenders, including their age, gender, and race. As such, the unit of analysis, or the person/thing I wanted to know more about, was the individual offenders.
Groups are also a common unit of analysis in criminological research (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014). A good example of a unit of analysis at the group level that is relevant to criminologists is gangs. Gangs are usually groups. I say “usually” here because it really depends on what it is about gangs that you want to know more about. If you are interested in describing the characteristics of certain gang members, then you will be studying individuals. But, if you want to compare the features of male gangs versus female gangs, or larger versus smaller gangs, then you are interested in the gangs as entities. In other words, you are interested in the dynamic between the members within the gang and not just the characteristics of each individual gang member. Other examples of units of analysis at the group level that might be of interest to criminologists include cities, neighbourhoods and even households.
Organizations are the third type of unit of analysis social scientists, including criminologists, might study (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014). An organization is essentially a more formal group, like a police department or a correctional facility. You may, for example, want to compare the gender distribution or education level of officers that work at an RCMP detachment versus those of officers that work at a city department throughout the province of B.C., or the age range and ethnic diversity of inmates within correctional facilities in B.C. compared to those inmates in Alberta. While you would have to learn more about the characteristics of individual officers or individual inmates in the two examples provided here, this does not mean that the individuals are what you want to learn more about in your study; rather, it’s the detachments or correctional facilities – or organizations – as entities in and of themselves that you are seeking to examine and how they might compare, as a whole, to other entities at the organizational level.
Social artifacts are the last type of unit of analysis examined by social scientists. These can be defined as “products of social beings and their behaviour” (Maxfield & Babbie, 2018, p. 95). Examples of social artifacts that are common in criminological research are newspaper articles or police reports. In my dissertation research, one of the data sources examined was the psychological assessments of long-term offenders; these assessments were social artifacts. While these news articles, police reports and psychological assessments contain information about individuals, it is the social artifacts you are describing and/or comparing, not the individuals. For example, it was the diagnoses outlined in the psychological assessments, the comments made by the mental health professional pertaining to the offender’s treatability or risk of recidivating that I set out to summarize and describe. It was the discourse within these assessments that was of particular interest.
| Unit of Analysis | Definition | Examples |
|---|---|---|
| Individual | when the study seeks to learn about individual people, such as offenders or victims of crime | sexual assault victims, Indigenous individuals, white-collar criminals |
| Group | when the study seeks to learn about a group of individuals and the group is an entity in and of itself | gangs, First Nations, neighbourhoods, cities |
| Organization | when the study seeks to learn about a formal group of individuals and that organization is an entity in and of itself | police departments, Tribal councils, correctional facilities, universities |
| Social Artifact | when the study seeks to learn more about some product of human interaction | legal cases, psychological assessments, Tribal council minutes, news articles, diaries |
When drawing conclusions from our research findings, it is important that we are crystal clear about the unit of analysis of our study. If a neighbourhood is the unit of analysis and it’s the neighbourhood as a whole you are examining, then you cannot logically draw conclusions about all individuals within that neighbourhood. For example, if the crime rates in a neighbourhood in the suburbs are lower than the crime rates in a neighbourhood in the downtown core, we cannot say it is the individuals who live downtown who are committing these crimes. What else could be going on to lead to this disparity in crime rates between these two neighbourhoods? Perhaps the criminals are commuting for work from the suburbs and committing their crimes downtown on their lunch break. Perhaps the downtown neighbourhood has more tourists from other provinces, and it’s the tourists committing the crimes. Or maybe the police officers in the downtown neighbourhood are more diligent about recording all criminal incidents occurring in their jurisdiction! This example illustrates an error that researchers must be careful not to make, called the ecological fallacy, which involves drawing individual-level conclusions while relying on group-level information (Maxfield & Babbie, 2018).
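The downtown-versus-suburb example can be made concrete with a small Python sketch. All of the incident counts below are invented for illustration and do not come from any study; the point is only that the same records can look very different depending on whether they are counted by neighbourhood (group level) or by where the offender lives (individual level).

```python
from collections import Counter

# Hypothetical incident records, invented for illustration:
# (neighbourhood where the crime was recorded, where the offender lives)
incidents = (
    [("downtown", "suburb")] * 60             # commuters offending on a lunch break
    + [("downtown", "out of province")] * 25  # tourists
    + [("downtown", "downtown")] * 15         # downtown residents
    + [("suburb", "suburb")] * 20             # suburban residents offending near home
)

# Group-level view: incidents counted by the neighbourhood where they were recorded.
recorded_in = Counter(location for location, _ in incidents)

# Individual-level view: the same incidents counted by the offender's home.
offender_lives_in = Counter(home for _, home in incidents)

print("Incidents recorded in:", dict(recorded_in))
print("Offenders live in:", dict(offender_lives_in))
```

In this made-up data set, downtown records five times as many incidents as the suburb, yet most of the offenders live in the suburb. Concluding from the neighbourhood totals that downtown residents commit more crime would be an instance of the ecological fallacy.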
Closely related to the unit of analysis is the unit of observation, which is the item (or items) you actually observe, measure, or collect in the course of trying to learn something about your unit of analysis (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014). As mentioned earlier, your unit of analysis will be determined by your research question. Your unit of observation, on the other hand, is largely determined by the method of data collection you use to answer that research question. In a given study, the unit of observation might be the same as the unit of analysis, but that is not always the case. For example, if I want to learn more about the reasons why long-term offenders in B.C. engaged in the sex crimes that resulted in them receiving this designation, I could interview the individual offenders themselves. After all, if I want to learn about the individual offenders – who are the unit of analysis – I should talk to the individual offenders and also make them the unit of observation, right? Well, that may not be possible for a variety of reasons. Perhaps the warden prohibits researchers from having direct contact with these violent, sexual offenders, or perhaps the offenders do not consent to be interviewed. You may need to figure out other ways to learn more about these individual offenders and collect data from some other source. This could be family members of the offenders or their former teachers, for example. The goal of the study would still be to learn more about the individual offenders and so they remain the unit of analysis, but the unit of observation may have to be someone or something else.
In sum, social scientists might examine many potential units of analysis. Identifying the unit of analysis early on in a research study is important because it shapes the type of data a researcher should collect for their study and who or what they should collect it from. Whichever unit of analysis you choose, it is crucial to acknowledge that it dictates the types of conclusions you can draw at the end of your study.
Populations versus Samples
Once a researcher has defined the unit of analysis, they can then begin to narrow their focus to identifying the population they wish to study. A population can be defined as all the people, things or events that one seeks to better understand in a research study (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014). Populations in research may be rather large, such as “British Columbians,” but they are usually much more specific than that. For example, in a study in which the population of interest really is British Columbians, we will still need to specify which British Columbians, such as adults over the age of 18 or citizens or legal residents. In the study mentioned earlier about why a certain neighbourhood might have a higher crime rate, the unit of analysis would be the neighbourhood, but rather than identify all neighbourhoods as the population, the researcher would probably narrow their focus to neighbourhoods in a particular geographic area at a particular point in time or over a certain timeframe.
If a researcher wanted to have members of a particular First Nation be the target population and they have permission from that Nation to conduct the study, it is important to remember that the Indigenous peoples you are including in the research should be consulted before making decisions about sampling. Not only is collaborating with the Indigenous peoples and respecting their decisions a key aspect of the 5 Rs discussed in chapter 1 on Ethics, but they likely know the unique characteristics of their people better than you. This understanding of typical variations can result in the best decisions for sample selection.
At this point, you might wonder why researchers don’t just gather data from the entire population. In reality, most researchers would rather gather data from their entire population of interest. In fact, I was fortunate enough to be able to come quite close to that in my dissertation. While 91 offenders in total had received this designation in B.C. at the time of writing my dissertation (population N = 91), 67 long-term offender files were physically available to include in the analysis (administrative reasons made obtaining all 91 of the files from the Correctional Service of Canada difficult), and this was a feasible sample size. Technically, this list of 67 offenders would make up what we call a sampling frame, which is a complete and exhaustive list of all the possible elements or units of analysis that could theoretically be included in our study from the larger population (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014). I had the time and resources to actually study all 67 offenders, so my study examined the entire sampling frame.
This type of situation is rare, though. Typically, the entire population that researchers are interested in is far too large to examine, and the time and monetary resources are rarely available for such an endeavour (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014). Consider, for example, if I had set out to study all offenders who received the long-term offender designation in Canada since the inception of the designation in 1997 through to the time of writing my dissertation (target population N = 577). In this case, I would have had to narrow my focus and come up with a way to select some of these offenders to include in my study. The smaller subset of long-term offenders that I selected from the larger population, and from which I would actually gather data, would be the sample.
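The relationship between a population, a sampling frame, and a sample can be sketched as nested sets. The Python example below reuses the counts from the B.C. dissertation example (91 designated offenders, 67 available files); the offender IDs, which 67 files are available, and the sample size of 30 are all hypothetical. Simple random selection is used here only as a placeholder, since specific sampling strategies are covered in the next chapter.

```python
import random

random.seed(1)  # fixed seed so the sketch gives the same result each run

# Population: everyone the study wants to say something about (N = 91).
# The IDs are hypothetical labels, not real case identifiers.
population = [f"LTO-{i:03d}" for i in range(1, 92)]

# Sampling frame: the list of cases that could actually be selected (67 files).
# Which 67 are "available" is chosen at random here purely for illustration.
sampling_frame = sorted(random.sample(population, 67))

# Sample: the cases data are actually collected from (30 is an assumed size).
sample = sorted(random.sample(sampling_frame, 30))

print(len(population), len(sampling_frame), len(sample))
print("First five sampled IDs:", sample[:5])
```

Every element of the sample comes from the sampling frame, and every element of the sampling frame comes from the population. Offenders missing from the frame can never end up in the sample, which is why the quality of the frame matters.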

| Population | Sample |
|---|---|
| News articles discussing the prevalence of gang violence in Canada | 120 news articles from three major newspapers in British Columbia published in 2023, with 40 articles from each newspaper |
| Canadian residents who have been charged with DUI | 300 Canadian residents who have been charged with DUI in the provinces of British Columbia and Quebec between 2020 and 2022 |
| Undergraduate students currently enrolled at colleges across Canada | 750 undergraduate students currently enrolled at one of 13 colleges, one from each of Canada’s 10 provinces and 3 territories |
| Parole officers who supervise long-term offenders in the community in Canada | 30 parole officers who have supervised long-term offenders in B.C. between 2020 and 2024 |
| Indigenous women of childbearing age | 30 Indigenous women who have given birth within the last three years |
Representativeness and Generalizability
Now that we have defined what a population and a sample are and have outlined the difference between the two, let us turn our attention to one of the key concepts relevant to our sampling decisions: representativeness. A representative sample is one that resembles the population from which it was drawn in all the ways that are important for the research being conducted. If, for example, you wish to be able to say something about differences between men and women at the end of your study, you must make sure your sample doesn’t contain only women. That is a bit of an oversimplification, but the point with representativeness is that if your population varies in some way that is important to your study, your sample should contain the same sort of variation.
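One rough way to think about representativeness is to compare the make-up of a sample with the make-up of the population on a characteristic that matters for the study. The Python sketch below does this for gender using invented numbers; the function names `shares` and `largest_gap` are hypothetical helpers written for this illustration, not part of any standard method.

```python
from collections import Counter

def shares(group):
    """Return the proportion of each category in a list of values."""
    counts = Counter(group)
    return {category: count / len(group) for category, count in counts.items()}

def largest_gap(sample, population):
    """Largest difference between sample and population proportions."""
    pop_shares = shares(population)
    sample_shares = shares(sample)
    return max(abs(sample_shares.get(c, 0.0) - pop_shares[c]) for c in pop_shares)

# Hypothetical population of 1,000 people (counts invented for illustration).
population = ["woman"] * 520 + ["man"] * 480

women_only_sample = ["woman"] * 50             # misses men entirely
mixed_sample = ["woman"] * 26 + ["man"] * 24   # mirrors the population split

print("Women-only sample gap:", round(largest_gap(women_only_sample, population), 2))
print("Mixed sample gap:", round(largest_gap(mixed_sample, population), 2))
```

The women-only sample differs sharply from the population on gender, so it could not support conclusions about differences between men and women. The mixed sample matches the population on this one characteristic, although a real check would consider every characteristic relevant to the research question.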

A key point to remember when conducting research with Indigenous peoples is the great variation that can occur between the different communities. The communities may be highly diverse, with unique languages, traditions, and socio-political structures. It is essential to understand the variation you wish to examine when determining the population to study and then determining the sample. For instance, if your question is broad such as, “What is the level of trust between Indigenous peoples and the RCMP in Ontario?”, the population could be all Indigenous peoples in Ontario. Your sample could be drawn from various First Nation communities in the province of Ontario. However, if your question was focused on something specific such as, “Do Indigenous cultural fishing traditions and spiritual beliefs create tension between Indigenous peoples and governmental fishing regulations?”, your population would likely be a specific First Nation, not every First Nation in the province. Selecting one First Nation as the population to be studied would be important since cultural traditions and spiritual practices can vary greatly between the Nations.
Obtaining a representative sample is most important in probability sampling (which we will discuss more in the next chapter) because a key goal of studies that rely on probability samples is generalizability. Generalizability is the second key concept relevant to our sampling decisions and is perhaps the main feature that distinguishes probability samples from nonprobability samples. It refers to the idea that a study’s results will tell us something about a group larger than the sample from which the findings were generated.
Generalizing results from a representative sample to a specific population does not mean that the results automatically generalize to all similar populations, though. For example, opinions on carrying personal weapons on campus from a representative sample of criminology students at one B.C. university may represent well the opinions of all criminology students on that campus; however, their opinions may be much different from those of students at a university in Alberta or of criminal justice students in the U.S. Clearly, you need to be critical when you read findings that go beyond the specific population and sample utilized in a research study. Only through replication with different samples from varying populations can more confidence be attached to such broad claims of generalizability between different populations.
Sampling Error and Sample Size
Although the results from a representative sample are meant to be a close approximation of what would actually be found if an entire population were studied, there is certain to be some degree of difference between the results produced from a sample and those from a population. For example, survey results from a sample of citizens on attitudes toward rehabilitation versus incarceration will likely not be identical to the overall survey results if an entire province’s population of citizens were surveyed.
The difference in results or outcomes between a sample and a population is called sampling error (Maxfield & Babbie, 2018; Rennison & Hart, 2019; Palys & Atchison, 2014). Researchers expect there to be a difference between the sample results and the results from an entire population, even when the sample is representative of the population. The good news is that this margin of error can be estimated and considered in research.
One of the most important factors related to the degree of sampling error is the size of the sample (Palys & Atchison, 2014). A general rule is that the larger the sample, the lower the sampling error. As the sample gets larger, it more closely approximates the population, and the error or difference between the sample and population decreases. When the sample is equal to the population, the error is zero because the sample is the population! This was close to the case in my study on long-term offenders: every long-term offender in B.C. for whom a file was available was included in the research. Conversely, since very small samples are less representative of the population, the results are less generalizable to the population as a whole, and the sampling error is greater. Of course, even if an entire population were selected to participate in a survey, some eligible participants would not respond, and other persons in the population would be unreachable or unknown (e.g., unhoused individuals). These issues are relevant to consider in discussions of sample versus population and sampling error. But as a very general rule, the larger the sample, the smaller the sampling error.
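The general rule that larger samples produce smaller sampling error can be illustrated with a short simulation. The Python sketch below assumes a hypothetical population of 10,000 residents, 60% of whom favour rehabilitation over incarceration; these figures are invented and the random draws simply stand in for a probability sample. For each sample size, it draws many samples and averages how far the sample percentage lands from the population percentage.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1 = favours rehabilitation, 0 = favours incarceration.
# The 60/40 split is invented for illustration only.
population = [1] * 6000 + [0] * 4000
population_share = sum(population) / len(population)

def average_sampling_error(sample_size, draws=200):
    """Average gap between the sample share and the population share."""
    errors = []
    for _ in range(draws):
        sample = random.sample(population, sample_size)
        sample_share = sum(sample) / sample_size
        errors.append(abs(sample_share - population_share))
    return statistics.mean(errors)

errors_by_size = {n: average_sampling_error(n) for n in (25, 100, 400, 1600)}

# Sampling the whole population leaves no gap at all.
census_error = average_sampling_error(len(population), draws=1)

for n, error in errors_by_size.items():
    print(f"n = {n:>5}: average sampling error = {error:.3f}")
print(f"n = {len(population)}: sampling error = {census_error:.3f}")
```

Under these assumptions, the average error shrinks steadily as the sample grows and reaches zero when the sample is the entire population. The simulation ignores nonresponse and unreachable members of the population, which affect real studies even when everyone is selected.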

Students often wonder about the appropriate sample size for a particular research study. The previous discussion indicates that the larger the sample, the better, and this is generally true when considering the notion of sampling error. However, the constraints of the research process – high costs, staffing, tight deadlines – might mean that a larger sample is not feasible. Study constraints notwithstanding, there is no clear-cut rule concerning what constitutes the appropriate sample size. The sample size depends on a number of considerations, including the size of the population and how much variability exists in the population. If the population is homogeneous, it means all elements within the population are similar with respect to the relevant characteristics of interest (Maxfield & Babbie, 2018; Rennison & Hart, 2019). For example, if all the long-term offenders in my dissertation were White, male, and 40 years of age and they all had the exact same criminal histories, then a smaller sample would suffice because they all share key common features. On the flip side, if you are dealing with a heterogeneous population, which means that the population has a lot of variation (say the gender, race, age, and abilities of individuals within that population are quite diverse), you would likely need a larger sample that is strategically selected in order to capture the diversity from the population in your sample.
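The link between population variability and sample size can also be simulated. The Python sketch below builds two hypothetical populations of ages, one homogeneous (everyone is 39 to 41) and one heterogeneous (ages 18 to 80), and compares how far the average age in a random sample tends to fall from the true population average. The age ranges, population sizes, and sample sizes are all assumptions made for illustration.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

# Two hypothetical populations of 5,000 ages each (values invented).
homogeneous = [random.choice([39, 40, 41]) for _ in range(5000)]
heterogeneous = [random.randint(18, 80) for _ in range(5000)]

def typical_error(population, sample_size, draws=300):
    """Average gap between a sample's mean age and the population's mean age."""
    true_mean = statistics.mean(population)
    gaps = []
    for _ in range(draws):
        sample = random.sample(population, sample_size)
        gaps.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(gaps)

small_homogeneous = typical_error(homogeneous, 20)
small_heterogeneous = typical_error(heterogeneous, 20)
large_heterogeneous = typical_error(heterogeneous, 500)

print(f"Homogeneous population, n = 20:    {small_homogeneous:.2f} years")
print(f"Heterogeneous population, n = 20:  {small_heterogeneous:.2f} years")
print(f"Heterogeneous population, n = 500: {large_heterogeneous:.2f} years")
```

With the same sample of 20, the estimate from the homogeneous population is far closer to the true average than the estimate from the heterogeneous one. Raising the sample size narrows the gap for the heterogeneous population, which mirrors the advice that more varied populations call for larger samples.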
Ultimately, the sample size and the emphasis you place on the representativeness of the sample and the generalizability of your findings connect back to what your research goals are and what approach you take to answer your research question (see chapter 3 to review research goals and approaches). This also connects to which specific sampling strategy is the most appropriate to gather the data you need to answer your research question. These sampling strategies will be reviewed in the next chapter.
Conclusion
In this chapter, we reviewed the key elements and principles of sampling. First, we introduced you to the concept of a unit of analysis, which is the target of our investigation, and distinguished this from the unit of observation, which is the unit from which data are collected. We also defined what a population is and how it differs from the sample. Lastly, we reviewed key concepts of sampling: representativeness and generalizability. With this foundation, we can now move on to the next chapter where we engage in a detailed discussion of the specific non-probabilistic (qualitative) and probabilistic (quantitative) sampling strategies that we can choose from or combine in our own research study.
✅ Summary
- The unit of analysis is the larger group, individual, or entity that a researcher wants to be able to say something about at the end of their study.
- The unit of observation is the individual, group or entity that data are actually collected from. It is sometimes, but not always, the same as the unit of analysis.
- Individual, group, organization and social artifact are the four units of analysis relevant to criminologists.
- A population is the entire group or set of entities that a researcher wants to study. By contrast, a sample is a subset of the population from which the researcher gathers data.
- Representativeness and generalizability are the two key concepts relevant to sampling.
🖊️ Key Terms
ecological fallacy: an error pertaining to the unit of analysis that researchers must avoid, which involves making individual-level conclusions while relying on group-level data.
generalizability: one of the key concepts of sampling, it refers to the idea that a study’s results will tell us something about a group larger than the sample from which the findings were generated.
group: this unit of analysis is the larger class of individuals the researcher wants to know more about. The group is the unit of analysis when your goal is to learn more about the dynamics of the group members and the features of the group as an entity in and of itself. A gang is an example of a unit of analysis at the group level.
heterogeneous: when the elements within the population differ with respect to the relevant characteristics of interest, the population is said to be heterogeneous and a larger, more strategically selected sample is needed.
homogeneous: when all elements within the population are similar with respect to the relevant characteristics of interest, the population is said to be homogeneous and a small sample will suffice.
individual: this unit of analysis is the individual person the researcher wants to know more about. The individual is the most common unit of analysis in criminology studies, which typically seek to describe characteristics of individual offenders or victims. Victims of domestic violence are an example of a unit of analysis at the individual level.
organization: this unit of analysis is a more formal class of individuals the researcher wants to know more about. Like the group, the organization is the unit of analysis when your goal is to learn about the dynamics of the organization and the features of the organization in and of itself. Police detachments are an example of a unit of analysis at the organizational level.
population: the large set of all the people, things or events we want to learn more about in our research and from which the sample is drawn.
representativeness: one of the key concepts of sampling, it refers to the extent to which our sample resembles the population from which it is drawn. A representative sample is one that captures the variation that exists in the population.
sample: a subset of the larger target population that is directly included in our research and from which we collect observations.
sampling: the process of selecting a subset of the population to include in your study.
sampling error: the difference in results or outcomes between the sample and a population.
sampling frame: an exhaustive and complete list of all the possible units of analysis that could theoretically be included in our study from the larger population. The sampling frame is a necessary part of probability sampling.
social artifact: this unit of analysis is the product of social beings and their behaviour the researcher wants to know more about. These are inanimate objects, such as news articles.
unit of analysis: the unit we hope to learn more about through the course of our research. The research question determines the unit of analysis. The four units of analysis relevant for criminologists are: individual, group, organization, and social artifact.
unit of observation: the person or thing you actually collect data from in your study. It is sometimes, but not always, the same as the unit of analysis.
🧠 Chapter Review
Discussion Questions
- Think about a research question you might want to answer in a study conducted in your own neighbourhood that focuses on crime and justice in some way. Name the unit of observation and the unit of analysis as well as the target population and the sample.
- What are some potential characteristics of a sample comprised of Indigenous peoples that would highlight the need to collaborate with the Indigenous community/ies in question when devising a sample?
- Provide examples of both heterogeneous and homogeneous Indigenous populations. What hypothetical features of these populations would you need to take into account when making sample size decisions?
References
Hassan, S. (2010). The long-term offender provisions of the Criminal Code: An evaluation [PDF] [Doctoral dissertation, Simon Fraser University]. SFU Summit Research Repository. https://summit.sfu.ca/_flysystem/fedora/sfu_migrate/11562/etd6444_SHassan.pdf
Maxfield, M. G., & Babbie, E. R. (2018). Research methods for criminal justice and criminology (8th ed.). Cengage Learning.
Palys, T. S., & Atchison, C. (2014). Research decisions: Quantitative, qualitative and mixed methods approaches (5th ed.). Nelson Education.
Rennison, C. M., & Hart, T. C. (2019). Research methods in criminal justice and criminology. Sage.
Adaptation Statement
Chapter adapted from
- Research Methods for the Social Sciences by Valerie Sheppard, licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
- Applied Research Methods in Criminal Justice and Criminology by Eric J. Fritsch, Chad R. Trulson, and Ashley G. Blackburn, licensed under Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.
Media Attributions
- Figure 7a.2