Program Evaluation

Program evaluation can be defined as “the systematic collection of information about the activities, characteristics, and outcomes of programs, for use by people to reduce uncertainties, improve effectiveness, and make decisions” (Patton, 2008, p. 39). This utilization-focused definition guides us toward including the goals, concerns, and perspectives of program stakeholders. Stakeholders often use the results of evaluation to improve the program or activity or to increase its capacity. Furthermore, stakeholders can identify program priorities, what constitutes “success,” and the data sources that could serve to answer questions about the acceptability, possible participation levels, and short- and long-term impact of proposed programs.

The community as a whole and individual community groups are both key stakeholders for the evaluation of a community engagement program. This type of evaluation needs to identify the relevant community and establish its perspectives so that the views of engagement leaders and all the important components of the community are used to identify areas for improvement. This approach includes determining whether the appropriate persons or organizations are involved; the activities they are involved in; whether participants feel they have significant input; and how engagement develops, matures, and is sustained.

Program evaluation uses the methods and design strategies of traditional research, but in contrast to the more inclusive, utility-focused approach of evaluation, research is a systematic investigation designed to develop or contribute to generalizable knowledge (MacDonald et al., 2001). Research is hypothesis driven, often initiated and controlled by an investigator, concerned with research standards of internal and external validity, and designed to generate facts, remain value-free, and focus on specific variables. Research establishes a time sequence and control for potential confounding variables. Often, the research is widely disseminated. Evaluation, in contrast, may or may not contribute to generalizable knowledge. The primary purposes of an evaluation are to assess the processes and outcomes of a specific initiative and to facilitate ongoing program management. Evaluation of a program usually includes multiple measures that are informed by the contributions and perspectives of diverse stakeholders.

Evaluation can be classified into five types by intended use: formative, process, summative, outcome, and impact.

Formative evaluation provides information to guide program improvement, whereas process evaluation determines whether a program is delivered as intended to the targeted recipients (Rossi et al., 2004). Formative and process evaluations are appropriate to conduct during the implementation of a program. Summative evaluation informs judgments about whether the program worked (i.e., whether the goals and objectives were met) and requires making explicit the criteria and evidence being used to make “summary” judgments. Outcome evaluation focuses on the observable conditions of a specific population, organizational attribute, or social condition that a program is expected to have changed. Whereas outcome evaluation tends to focus on conditions or behaviors that the program was expected to affect most directly and immediately (i.e., “proximal” outcomes), impact evaluation examines the program’s long-term goals. Summative, outcome, and impact evaluation are appropriate to conduct when the program either has been completed or has been ongoing for a substantial period of time (Rossi et al., 2004).

For example, assessing the strategies used to implement a smoking cessation program and determining the degree to which it reached the target population are process evaluations. In contrast, an outcome evaluation of a smoking cessation program might examine how many of the program’s participants stopped smoking as compared with persons who did not participate. Reduction in morbidity and mortality associated with cardiovascular disease may represent an impact goal for a smoking cessation program (Rossi et al., 2004).
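
As a purely illustrative sketch, and not part of the cited evaluation literature, the outcome comparison described above could be quantified by computing the quit rate in each group and testing whether the difference is larger than chance alone would explain. The counts and function names below are hypothetical and the two-proportion z-test is only one of many analytic choices an evaluator might make.

    # Minimal sketch: comparing quit rates between program participants and a
    # comparison group. All counts are hypothetical; real evaluations would
    # also consider study design, confounding, and stakeholder-defined criteria.
    from math import erf, sqrt

    def quit_rate(quitters: int, enrolled: int) -> float:
        """Proportion of people in a group who stopped smoking."""
        return quitters / enrolled

    def two_proportion_z_test(q1: int, n1: int, q2: int, n2: int):
        """Two-sided z-test for the difference between two proportions."""
        p1, p2 = q1 / n1, q2 / n2
        pooled = (q1 + q2) / (n1 + n2)
        se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        z = (p1 - p2) / se
        # p-value from the standard normal distribution
        p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
        return z, p_value

    # Hypothetical data: 40 of 200 participants quit vs. 15 of 200 non-participants.
    z, p = two_proportion_z_test(40, 200, 15, 200)
    print(f"Participant quit rate: {quit_rate(40, 200):.1%}")
    print(f"Comparison quit rate:  {quit_rate(15, 200):.1%}")
    print(f"z = {z:.2f}, p = {p:.4f}")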

Several institutions have identified guidelines for an effective evaluation. For example, in 1999, CDC published a framework to guide public health professionals in developing and implementing a program evaluation (CDC, 1999). The impetus for the framework was to facilitate the integration of evaluation into public health programs, but the framework focuses on six components that are critical for any evaluation. Although the components are interdependent and might be implemented in a nonlinear order, the earlier components provide a foundation for those that follow. They include:

  • Engage stakeholders to ensure that all partners invested in what will be learned from the evaluation become engaged early in the evaluation process.
  • Describe the program to clearly identify its goals and objectives. This description should include the program’s needs, expected outcomes, activities, resources, stage of development, context, and logic model.
  • Focus the evaluation design so that it is useful, feasible, ethical, and accurate.
  • Gather credible evidence that strengthens the results of the evaluation and its recommendations. Sources of evidence could include people, documents, and observations.
  • Justify conclusions that are linked to the results and judged against standards or values of the stakeholders.
  • Ensure use of the evaluation and deliberately share the lessons learned from it.

Five years before CDC issued its framework, the Joint Committee on Standards for Educational Evaluation (1994) created an important and practical resource for improving program evaluation. The Joint Committee, a nonprofit coalition of major professional organizations concerned with the quality of program evaluations, identified four major categories of standards — propriety, utility, feasibility, and accuracy — to consider when conducting a program evaluation.

Propriety standards focus on ensuring that an evaluation will be conducted legally, ethically, and with regard for promoting the welfare of those involved in or affected by the program evaluation. In addition to the rights of human subjects that are the concern of institutional review boards, propriety standards promote a service orientation (i.e., designing evaluations to address and serve the needs of the program’s targeted participants), fairness in identifying program strengths and weaknesses, formal agreements, avoidance or disclosure of conflict of interest, and fiscal responsibility.

Utility standards are intended to ensure that the evaluation will meet the information needs of intended users. Involving stakeholders, using credible evaluation methods, asking pertinent questions, including stakeholder perspectives, and providing clear and timely evaluation reports represent attention to utility standards.

Feasibility standards are intended to ensure that the evaluation’s scope and methods are realistic. The scope of the information collected should ensure that the data provide stakeholders with sufficient information to make decisions regarding the program.

Accuracy standards are intended to ensure that evaluation reports use valid methods for evaluation and are transparent in the description of those methods. Meeting accuracy standards might, for example, include using mixed methods (e.g., quantitative and qualitative), selecting justifiable informants, and drawing conclusions that are consistent with the data.

Together, the CDC framework and the Joint Committee standards provide a general perspective on the characteristics of an effective evaluation. Both identify the need to be pragmatic and serve intended users with the goal of determining the effectiveness of a program.