Evaluation Approaches, Framework, and Designs

HS 490

Chapter 14

This chapter focuses on evaluation approaches, an evaluation framework, and evaluation designs. House's (1980) taxonomy of eight evaluation approaches is presented. No one approach is useful in all situations; therefore, evaluators should select an approach, or parts of several approaches, to structure the evaluation based on the needs of the stakeholders involved with each program. The Framework for Program Evaluation in Public Health presents a process that is adaptable to all health promotion programs, yet is not prescriptive in nature.

The steps for selecting an evaluation design are also presented, along with a discussion of quantitative and qualitative methods. Evaluation design should be considered early in the planning process. Planners/evaluators need to identify what measurements will be taken, as well as when and how. In doing so, a design should be selected that controls for threats to both internal and external validity.

Evaluation Framework

The evaluation framework can be thought of as the “skeleton” of a plan that can be used to conduct an evaluation. It puts the steps to be followed in order.

Evaluation Design

An evaluation design is used to organize the evaluation and to provide for planned, systematic data collection, analysis, and reporting. A well-planned evaluation design helps ensure that the conclusions drawn about the program will be as accurate as possible.

Evaluation Approaches

House’s (1980) taxonomy categorizes eight different evaluation approaches. A brief description of each is presented here:

Systems analysis uses output measures, such as test scores, to determine if the program has demonstrated the desired change. It also determines whether funds have been efficiently used, as in cost analysis.

Behavioral objectives, or goal-based evaluation, uses the program goals and collects evidence to determine whether the goals have been reached.

Decision making focuses on the decision to be made and presents evidence about the effectiveness of the program to the decision maker (manager or administrator).

Goal-free evaluation does not base the evaluation on program goals; instead, the evaluator searches for all outcomes, often finding unintended side effects.

Art criticism uses the judgment of an expert in the area to increase awareness and appreciation of the program in order to lead to improved standards and better performance.

Professional (accreditation) review uses professionals to judge the work of other professionals; the source of standards and criteria is the professionals conducting the review.

Quasi-legal evaluation uses a panel to hear evidence and consider the arguments for and against the program; a quasi-legal procedure is used for both evaluation and policy making.

Case study uses techniques such as interviews and observations to examine how people view the program.

Systems Analysis Approach

A systems analysis approach to evaluation is based on efficiency: determining which programs are the most effective. It focuses on the organization, determining whether appropriate resources are devoted to goal activities (and to nongoal activities, such as staff training or maintenance of the system).

Economic Evaluations

Economic evaluations are typical strategies used in systems analysis approaches. They have been defined as the comparison of alternative courses of action in terms of both costs and outcomes. Pressure to control rising health care costs has forced many administrators and planners to be concerned about the cost of health promotion programs.

Cost-identification Analysis (Cost-Feasibility)

Cost-identification analysis is used to compare different interventions available for a program, often to determine which intervention would be the least expensive. With this type of analysis, planners identify the different items (e.g., personnel, facilities, curriculum) associated with a given intervention, determine a cost for each item, total the costs for that intervention, and then compare the total costs associated with each of several interventions.
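
To make the arithmetic concrete, the sketch below totals the item costs for several interventions and identifies the least expensive one. All intervention names and dollar figures are invented for illustration.

```python
# Cost-identification sketch: total the cost of each item associated with an
# intervention, then compare interventions on total cost.
# All intervention names and dollar figures below are hypothetical.

interventions = {
    "group classes":  {"personnel": 8000, "facilities": 2000, "curriculum": 1500},
    "online modules": {"personnel": 3000, "software license": 4000, "curriculum": 500},
    "print campaign": {"personnel": 1000, "printing": 2500, "postage": 1200},
}

# Total the costs for each intervention.
totals = {name: sum(items.values()) for name, items in interventions.items()}

# Compare the interventions, least expensive first.
for name, total in sorted(totals.items(), key=lambda pair: pair[1]):
    print(f"{name}: ${total:,}")

least_expensive = min(totals, key=totals.get)
print(f"Least expensive intervention: {least_expensive}")
```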

Cost-benefit Analysis (CBA)

Cost-benefit analysis looks at how resources can best be used. It will yield the dollar benefit received from the dollars invested in the program.
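
As a minimal sketch of that idea, with invented figures: because both costs and outcomes are expressed in dollars, the result can be reported as a net benefit or as a benefit-cost ratio.

```python
# Cost-benefit sketch: costs and outcomes are both expressed in dollars.
# The figures below are hypothetical.

program_cost = 50_000        # dollars invested in the program
dollar_benefit = 120_000     # e.g., estimated savings from averted medical costs

net_benefit = dollar_benefit - program_cost           # 70,000
benefit_cost_ratio = dollar_benefit / program_cost    # 2.4 dollars returned per dollar invested

print(f"Net benefit: ${net_benefit:,}")
print(f"Benefit-cost ratio: {benefit_cost_ratio:.1f}")
```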

Cost-effectiveness Analysis (CEA)

Cost-effectiveness analysis is used to quantify the effects of a program in nonmonetary terms. It is more appropriate for health promotion programs than cost-benefit analysis, because a dollar value does not have to be placed on the outcomes of the program. Instead, a cost-effectiveness analysis indicates how much it costs to produce a certain effect. For example, the cost of a program can be related to effects such as years of life saved, the number of smokers who stop smoking, or changes in morbidity or mortality rates.
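
A minimal sketch of the calculation, using invented figures for two hypothetical smoking cessation programs: the effect stays in natural (nonmonetary) units, and the result is a cost per unit of effect.

```python
# Cost-effectiveness sketch: the outcome stays in nonmonetary units
# (here, the number of smokers who quit), yielding a cost per unit of effect.
# All figures are hypothetical.

program_a_cost, program_a_quitters = 50_000, 125
program_b_cost, program_b_quitters = 30_000, 60

cost_per_quit_a = program_a_cost / program_a_quitters   # $400 per quit
cost_per_quit_b = program_b_cost / program_b_quitters   # $500 per quit

print(f"Program A: ${cost_per_quit_a:,.2f} per smoker who quit")
print(f"Program B: ${cost_per_quit_b:,.2f} per smoker who quit")
```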

Cost-utility Analysis (CUA)

This approach is different from the others in that the values of the outcomes of a program are determined by their subjective value to the stakeholders rather than their monetary cost. For example, an administrator may select a more expensive intervention for a program just because of the good public relations (i.e., the subjective value in the administrator’s eye) for the organization.

Behavioral Objectives, Goal-Attainment Approach, and Goal-Based Approach

Behavioral Objectives Approach

The most common type of evaluation model.

Focuses on the stated goals of the program.

Approaches using this type of goal-directed focus are also known as goal-attainment and goal-based approaches.

In the behavioral objective approach, the program goals serve as the standards for evaluation. This type of evaluation was first used in education, to assess student behaviors. Competency testing is an example of goal-attainment evaluation, determining whether a student is able to pass an exam or advance to the next grade.

Objective-Oriented Approaches

Objective-oriented approaches are probably the most commonly used approaches to health promotion program evaluation. They specify program goals and objectives and determine whether those goals and objectives have been reached. Success or failure is measured by the relationship between the outcome of the program and the stated goals and objectives. This type of approach is based on action, and the dependent variable is defined in terms of outcomes the program participant should be able to demonstrate at the end of the intervention.

Goal Attainment/Goal Based

Emphasis is placed on how objectives are to be measured. This approach is also found in business, where organizations use “management by objectives” to determine how well they are meeting their objectives.

Five Steps in Measuring Goal Attainment:

Specification of the goal to be measured.

Specification of the sequential set of performances that, if observed, would indicate that the goal has been achieved.

Identification of the performances that are critical to the achievement of the goal.

Description of the “indicator behavior” of each performance episode.

Collective testing to determine whether each “indicator behavior” is associated with the others.

Decision-Making Approach

There are three steps to the evaluation process: delineating (focusing of information), obtaining (collecting, organizing, and analyzing information), and providing (synthesizing information so it will be useful).

The decision maker, usually a manager or administrator, wants and needs information to help answer relevant questions regarding a program.

The four types of evaluation in this approach include 1) context, 2) input, 3) process, and 4) product (CIPP), with each providing information to the decision maker.

Context evaluation describes the conditions in the environment, identifies unmet needs and unused opportunities, and determines why these occur.

The purpose of input evaluation is to determine how to use resources to meet program goals.

Process evaluation provides feedback to those responsible for program implementation.

The purpose of product evaluation is to measure and interpret attainments during and after the program.

Goal-Free Approach

Suggests that evaluation should not be based on goals in order to enable the evaluator to remain unbiased. The evaluator must search for all outcomes, including unintended positive or negative side effects. Thus, the evaluator does not base the evaluation on reaching goals and remains unaware of the program goals.

The goal-free approach is not often used in evaluation. It is difficult for evaluators to determine what to evaluate when program objectives are not to be used. One concern is that evaluators will substitute their own goals, since there is a lack of clear methodology as to how to proceed.

Management-Oriented Approaches

Management-oriented approaches focus “on identifying and meeting the informational needs of managerial decision makers.” That is, good decisions are best made on the basis of good evaluative information. In this approach, the evaluators and managers work closely together to identify the decisions that must be made and the information needed to make them. The evaluators then collect the necessary data “about the advantages and disadvantages of each decision alternative to allow for fair judgment based on specified criteria. The success of the evaluation rests on the quality of the teamwork between evaluators and decision makers.”

CIPP

The acronym CIPP stands for the four types of decisions facing managers: context, input, process, and product. Context evaluation describes the conditions in the environment, identifies unmet needs and unused opportunities, and determines why these occur. The purpose of input evaluation is to determine how to use resources to meet program goals. Process evaluation provides feedback to those responsible for program implementation. The purpose of product evaluation is to measure and interpret attainments during and after the program. It is the decision maker, not the evaluator, who uses this information to determine the worth of the program.

Consumer-Oriented Approaches

Consumer-oriented approaches focus on “developing evaluative information on ‘products,’ broadly defined, and accountability, for use by consumers in choosing among competing products.” This approach gets its “label” of consumer-oriented, in part, from the fact that it is an evaluation approach that helps “protect” the consumer by evaluating “products” used by the consumer. The consumer-oriented approach, which is summative in nature, primarily uses checklists and criteria to allow the evaluator to collect data that can be used to rate the “product.” This is the approach used by Consumer Reports when evaluating various consumer products, by principals when evaluating their teachers, and by instructors when evaluating the ability of their students to perform cardiopulmonary resuscitation (CPR). It is an approach that has been used extensively in evaluating educational materials and personnel.

The highest level of checklist in the hierarchy is a COMlist. A COMlist is a checklist comprised of the criteria that essentially define the merit of the “product.” For example, what are the criteria that define an excellent health promotion program, or an outstanding program facilitator, or exemplary instructional materials for a program? The criteria of merit (COM) are identified by being able to answer the question: “What properties are parts of the concept (the meaning) of ‘a good X’?” Thus if we were to apply this to program planning, we would ask the question “What are the criteria that define an excellent health promotion program?” Or, “What are the qualities of an outstanding facilitator?” Or, “What must be included for an instructional material to be considered exemplary?”

Table 14.1 Comparison of the Goal-Attainment and Goal-Free Approaches

Goal-Attainment Approach

Have the objectives been reached?

Has the program met the needs of the target population?

How can the objectives be reached?

Are the needs of the program administrators and funding source being met?

Goal-Free Approach

What is the outcome of the program?

Who has been reached by the program?

How is the program operating?

What has been provided?

Expertise-Oriented Approaches

Expertise-oriented approaches, which are probably the oldest of the approaches to evaluation, rely “primarily on the direct application of professional expertise to judge the quality of whatever endeavor is evaluated.” Most of these approaches can be placed in one of three categories: formal professional review systems, informal professional reviews, and individual reviews. Formal professional reviews are characterized by having:

(1) a structure or organization established to conduct a periodic review; (2) published standards (and possibly instruments) for use in such reviews; (3) a prespecified schedule (for example, every five years) for when reviews will be conducted; (4) opinions of several experts combining to reach the overall judgments of value; and (5) an impact on the status of that which is reviewed, depending on the outcome.

The most common formal professional review system is accreditation. Accreditation is a process by which a recognized professional body evaluates the work of an organization (e.g., schools, universities, hospitals) to determine whether that work meets prespecified standards. If it does, the organization is approved, or accredited. Examples of accreditation processes with which readers may be familiar are those of the National Council for Accreditation of Teacher Education (NCATE), which accredits teacher education programs, including health education programs, and the Joint Commission on Accreditation of Healthcare Organizations (JCAHO), which accredits various healthcare facilities.

Participant-Oriented Approaches

In all of the approaches presented so far in this chapter, the primary focus has been on something other than serving the needs of the priority population. It is not that those who use the previous approaches are unconcerned about the priority population, but the evaluation process does not begin with the priority population. The participant-oriented approaches are different. They focus on a process “in which involvement of participants (stakeholders in that which is evaluated) are central in determining the values, criteria, needs, data, and conclusions for the evaluation.” In addition, their characteristics of less structure and fewer constraints, informal communication and reporting, and less attention to goals and objectives may be a drawback for those who want a more formal, objective-type evaluation.

Fitzpatrick and colleagues (2004) have identified the following common elements of participant-oriented approaches:

1. They depend on inductive reasoning. Understanding an issue or event or process comes from grassroots observation and discovery. Understanding emerges; it is not the end product of some preordinate inquiry plan projected before the evaluation is conducted.

2. They use a multiplicity of data. Understanding comes from the assimilation of data from a number of sources. Subjective and objective, qualitative and quantitative representations of the phenomena being evaluated are used.

3. They do not follow a standard plan. The evaluation process evolves as participants gain experience in the activity. Often the important outcome of the evaluation is a rich understanding of one specific entity with all the idiosyncratic contextual influences, process variations, and life histories. It is important in and of itself for what it tells about the phenomena that occurred.

4. They record multiple rather than single realities. People see things and interpret them in different ways. No one knows everything that happens in a school or in the tiniest program, and no one perspective is accepted as the truth. Because only an individual can truly know what she has experienced, all perspectives are accepted as correct, and a central task of the evaluator is to capture these realities and portray them without sacrificing the program’s complexity.

Table 14.2 Differences between conventional evaluation and participatory evaluation

| |Conventional Evaluation |Participatory Evaluation |
|Who |External experts |Community, project staff, facilitator |
|What |Predetermined indicators of success, primarily cost and health outcomes or gains |People identify their own indicators of success, which may include health outcomes and gains |
|How |Focus on “scientific objectivity,” distancing evaluators from other participants; uniform, complex procedures; delayed, limited access to results |Self-evaluation; simple methods adapted to local culture; open, immediate sharing of results through local involvement in evaluation processes |
|When |Usually completion; sometimes also midterm |Merging of monitoring and evaluation; hence frequent small-scale evaluations |
|Why |Accountability, usually summative, to determine if funding continues |To empower local people to initiate, control, and take corrective action |

Figure 14.1 Framework for Program Evaluation

Framework for Program Evaluation

Once evaluators have selected the approach or approaches that will be used in the evaluation, they are ready to apply an evaluation framework. In 1999, an evaluation framework to be used with public health activities was published. Since the framework is applicable to all health promotion programs, an overview of it is provided here.

The early steps provide the foundation, and each step should be finalized before moving to the next:

Step 1- Engaging stakeholders

This step begins the evaluation cycle. Stakeholders must be engaged to ensure that their perspectives are understood. The three primary groups of stakeholders are 1) those involved in the program operations, 2) those served or affected by the program, and 3) the primary users of the evaluation results. The scope and level of stakeholder involvement will vary with each program being evaluated.

Step 2- Describing the program:

This step sets the frame of reference for all subsequent decisions in the evaluation process. At a minimum, the program should be described in enough detail that the mission, goals, and objectives are known. Also, the program’s capacity to effect change, its stage of development, and how it fits into the larger organization and community should be known.

Step 3- Focusing the evaluation design:

This step entails making sure that the interests of the stakeholders are addressed while using time and resources efficiently. Among the items to consider at this step are articulating the purpose of the evaluation (i.e., gain insight, change practice, assess effects, affect participants), determining the users and uses of the evaluation results, formulating the questions to be asked, determining which specific design type will be used, and finalizing any agreements about the process.

Step 4- Gathering credible evidence:

This step includes many of the items mentioned in Chapter 5 of this text. At this step, evaluators need to decide on the measurement indicators, sources of evidence, quality and quantity of evidence, and logistics for collecting the evidence.

Step 5- Justifying Conclusions

This step includes the comparison of the evidence against the standards of acceptability; interpreting those comparisons; judging the worth, merit, or significance of the program; and creating recommendations for actions based upon the results of the evaluations.

Step 6- Ensuring use and sharing lessons learned:

This step focuses on the use and dissemination of the evaluation results. When carrying out this final step, concern must be given to each group of stakeholders.

In addition to the six steps of the framework, there are four standards of evaluation. These standards are noted in the box at the center of Figure 14.1. The standards provide practical guidelines for evaluators to follow when deciding among evaluation options. For example, these standards help evaluators avoid an evaluation that may be “accurate and feasible but not useful or one that would be useful and accurate but is infeasible.”

The four standards are:

Utility standards ensure that information needs of evaluation users are satisfied.

Feasibility standards ensure that the evaluation is viable and pragmatic.

Propriety standards ensure that the evaluation is ethical (i.e., conducted with regard for the rights and interests of those involved and affected).

Accuracy standards ensure that the evaluation produces findings that are considered correct.

Selecting an Evaluation Design

As noted in the section above, evaluators must give careful consideration to the evaluation design, since the design is critical to the outcome of the evaluation.

There are few perfect evaluation designs, because no situation is ideal and there are always constraining factors, such as limited resources. The challenge is to devise an optimal evaluation, as opposed to an ideal evaluation. Planners should give much thought to selecting the best design for each situation.

The following questions may be helpful in the selection of a design:

How much time do you have to conduct the evaluation?

What financial resources are available?

How many participants can be included in the evaluation?

Are you more interested in qualitative or quantitative data?

Do you have data analysis skills or access to computers and statistical consultants?

In what ways can validity be increased?

Is it important to be able to generalize your findings to other populations?

Are the stakeholders concerned with validity and reliability?

Do you have the ability to randomize participants into experimental and control groups?

Do you have access to a comparison group?

There are four steps in choosing an evaluation design. These four steps are outlined in Figure 14.2.

Step 1

The first step is to orient oneself to the situation. The evaluator must identify resources (time, personnel), constraints, and hidden agendas (unspoken goals). During this step, the evaluator must determine what is to be expected from the program and what can be observed.

Step 2

The second step involves defining the problem: determining what is to be evaluated. During this step, definitions are needed for independent variables (what the sponsors think makes the difference), dependent variables (what will show the difference), and confounding variables (what the evaluator thinks could explain additional differences).

Step 3

The third step involves making a decision about the design: that is, whether to use qualitative or quantitative methods of data collection, or both.

Quantitative Method

The quantitative method is deductive in nature (applying a generally accepted principle to an individual case), so that the evaluation produces numeric (hard) data, such as counts, ratings, scores, or classifications. Examples of quantitative data would be the number of participants in a stress-management program, the ratings on a participant satisfaction survey, and the pretest scores on a nutrition knowledge test. This approach is suited to programs that are well defined and to comparing the outcomes of a program with those of other groups or the general population. It is the method most often used in evaluation designs.

Qualitative Method

The qualitative method is an inductive method (individual cases are studied to formulate a general principle) and produces narrative (soft) data, such as descriptions. This is a good method to use for programs that emphasize individual outcomes or in cases where other descriptive information from participants is needed.

Figure 14.2 Steps in Selecting an Evaluation Design

Figure 14.3 Methods Used in Evaluation

Case studies- in-depth examinations of a social unit, such as an individual, family, household, worksite, community, or any type of institution as a whole

Content analysis- a systematic review identifying specific characteristics of messages

Delphi technique- see Chapter 4 for an in-depth discussion of the Delphi technique

Elite interviewing- interviewing that focuses on a certain type (“elite”) of respondent

Ethnographic studies- a variety of techniques (participant-observer, observation, interviewing, and other interactions with people) used to study an individual group

Films, photographs, and videotape recording (film ethnography)- includes the data collection and study of visual images

Focus group interviewing- see Chapter 4 for an in-depth discussion of focus group interviewing

Historical analysis- a review of historical accounts that may include an interpretation of the impact on current events

In-depth interviewing- A less structured, deeper interview in which the interviewees share their view of the world

Kinesics- “the study of body communication” (p. 233)

Nominal group process- see Chapter 4 for an in-depth discussion of the nominal group process

Participant-observer studies- those in which the observers (evaluators) also participate in what they are observing

Quality circle- a group of people who meet at regular intervals to discuss problems and to identify possible solutions (p. 236)

Unobtrusive techniques- data collection techniques that do not require the direct participation or cooperation of human subjects (p. 236), including such things as unobtrusive observation, review of archival data, and study of physical traces

Rather than choose one method, it may be advantageous to combine quantitative and qualitative methods. Steckler and colleagues (1992) have discussed integrating qualitative and quantitative methods, since, to a certain extent, the weakness of one method is compensated for by the strengths of the other. Figure 14.4 illustrates four ways that the qualitative and quantitative methods might be integrated.

Figure 14.4 Four ways of integrating quantitative and qualitative methods

Experimental and Control Groups

Experimental group- the group of individuals participating in the program that is to be evaluated. The evaluation is designed to determine what effects the program has on these individuals.

Control group- should be as similar to the experimental group as possible, but the members of this group do not receive the program (intervention or treatment) that is to be evaluated.

Without the use of a properly selected control group, the apparent effect of the program could actually be due to a variety of factors, such as differences in participants’ educational background, environment, or experience. By using a control group, the evaluator can show that the results or outcomes are due to the program and not to those other variables.

Since the main purpose of social programs is to help clients, the clients’ viewpoint should be the primary one. It is important to keep this in mind when considering ethical issues in the use of control groups. Conner (1980) identifies four underlying premises for the use of control groups in social program evaluation:

All individuals have a right to status quo services.

All individuals involved in the evaluation are informed about the purpose of the study and the use of a control group.

Individuals have a right to new services, and random selection gives everyone a chance to participate.

Individuals should not be subjected to ineffective or harmful programs.

Comparison Group

When participants cannot be randomly assigned to an experimental or control group, a nonequivalent control group may be selected.

It is important to find a group that is as similar as possible to the experimental group, such as two classrooms of students with similar characteristics or a group of residents in two comparable cities. Factors to consider include:

Participants’ age

Gender

Education

Location

Socioeconomic status

Experience

As well as any other variable that might have an impact on program results.

Evaluation Designs

Measurements used in evaluation designs can be collected at three different times: after the program; both before and after the program; and several times before, during, and after the program.

Measurement is defined as the method or procedure of assigning numbers to objects, events, and people.

Pretest- measurement before the program begins

Posttest- measurement after the completion of the program

Figure 14.5 Evaluation Designs

Experimental Design

Offers the greatest control over the various factors that may influence the results.

Random assignment to experimental and control groups with measurement of both groups.

Quasi-experimental design

Results in interpretable and supportive evidence of program effectiveness.

Usually cannot control for all factors that affect the validity of the results.

There is no random assignment to the groups, and comparisons are made between the experimental and comparison groups.

Non-experimental design

Without the use of a comparison or control group, this design has little control over the factors that affect the validity of the results.

The most powerful design is the experimental design, in which participants are randomly assigned to the experimental and control groups. The difference between I.1 and I.2 in Figure 14.5 is the use of a pretest to measure the participants before the program begins. Use of a pretest helps ensure that the groups are similar. Random assignment should equally distribute any of the variables (such as age, gender, and race) between the different groups. Potential disadvantages of the experimental design are that it requires a relatively large group of participants and that the intervention may be delayed for those in the control group.
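
To illustrate the logic of random assignment with a pretest and posttest, the sketch below simulates participant scores. The score distributions and the 5-point "program effect" are invented purely for illustration; a real evaluation would analyze collected data with an appropriate statistical test.

```python
# Sketch of a pretest-posttest experimental design with random assignment.
# Participant scores are simulated; the 5-point "program effect" is invented
# purely to show how the group comparison works.

import random

participants = list(range(100))
random.shuffle(participants)                 # random assignment
experimental = set(participants[:50])        # receives the program
control = set(participants[50:])             # does not receive the program

pretest = {pid: random.gauss(60, 10) for pid in participants}
posttest = {
    pid: pretest[pid] + random.gauss(0, 5) + (5 if pid in experimental else 0)
    for pid in participants
}

def mean_change(group):
    """Average pretest-to-posttest change for a group."""
    return sum(posttest[p] - pretest[p] for p in group) / len(group)

print(f"Mean change, experimental group: {mean_change(experimental):.1f}")
print(f"Mean change, control group:      {mean_change(control):.1f}")
```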

A design more commonly found in evaluations of health promotion programs is the quasi-experimental pretest-posttest design using a comparison group (II.1 in Figure 14.5). This design is often used when a control group cannot be formed by random assignment. In such a case, a comparison group (a nonequivalent control group) is identified, and both groups are measured before and after the program. For example, a program on fire safety for two fifth-grade classrooms could be evaluated by using pre- and post-program knowledge tests. Two other fifth-grade classrooms not receiving the program could serve as the comparison group. Similar pretest scores between the comparison and experimental groups would indicate that the groups were equal at the beginning of the program. However, without random assignment, it would be impossible to be sure that other variables (a unit on fire safety in a 4-H group, distribution of smoke detectors, information from parents) did not influence the results.

Figure 14.6 Staggered Treatment Design

Internal Validity

The internal validity of an evaluation is the degree to which the program caused the change that was measured. Many factors can threaten internal validity, either singly or in combination, making it difficult to determine whether the outcome was brought about by the program or by some other cause.

History occurs when an event happens between the pretest and posttest that is not part of the health promotion program. An example of history as a threat to internal validity is having a national antismoking campaign coincide with a local smoking cessation program.

Maturation occurs when the participants in the program show pretest-to-posttest differences due to growing older, wiser, or stronger. For example, in tests of muscular strength in an exercise program for junior high students, an increase in strength could be the result of muscular development and not the effect of the program.

Testing occurs when the participants become familiar with the test format due to repeated testing. This is why it is helpful to use a different form of the same test for pretest and posttest comparisons.

Instrumentation occurs when there is a change in the measurement process between pretest and posttest, such as the observers becoming more familiar with or skilled in the use of the testing format over time.

Statistical regression occurs when extremely high or low scores (which are not necessarily accurate) on the pretest move closer to the mean, or average, score on the posttest.

Selection reflects differences between the experimental and comparison groups, generally due to a lack of randomization. Selection can also interact with other threats to validity, such as history, maturation, or instrumentation, producing differences that may appear to be program effects.

Mortality refers to participants who drop out of the program between the pretest and posttest. For example, if most of the participants who drop out of a weight loss program are those with the least (or the most) weight to lose, the group composition is different at the posttest.

Diffusion or imitation of treatments results when participants in the control group interact and learn from the experimental group. Students randomly assigned to an innovative drug prevention program in their school (experimental group) may discuss the program with students who are not in the program (control group), biasing the results.

Compensatory equalization of treatments occurs when the program or services are not available to the control group and there is an unwillingness to tolerate the inequality. For instance, the control group from the previous example (students not enrolled in the innovative drug prevention program) may complain, since they are not able to participate.

Compensatory rivalry is when the control group is seen as the underdog and is motivated to work harder.

Resentful demoralization occurs among participants receiving the less desirable treatment compared to other groups, and the resentment may affect the outcome. For example, an evaluation to compare two different smoking cessation programs may assign one group (control) to the regular smoking cessation program and another group (experimental) to the regular program plus an exercise class. If the participants in the control group become aware that they are not receiving the additional exercise class, they may resent the omission, and this may be reflected in their smoking behavior and attitude toward the regular program.

The major way in which threats to internal validity can be controlled is through randomization. With random selection of participants, random assignment to groups, and random assignment of types of treatment (or no treatment) to groups, any differences between pretest and posttest can be interpreted as a result of the program. When random assignment to groups is not possible and quasi-experimental designs are used, the evaluator must make all threats to internal validity explicit and then rule them out one by one.

External Validity (Generalizability)

The other type of validity that should be considered is external validity: the extent to which the program can be expected to produce similar effects in other populations. This is also known as generalizability. The more the program is tailored to a particular population, the greater the threat to external validity, and the less likely it is that the program can be generalized to another group.

Several factors can threaten external validity. They are sometimes known as reactive effects, since they cause individuals to react in a certain way. The following are several types of threat to external validity:

Social desirability occurs when the individual gives a particular response to try to please or impress the evaluator. An example would be a child who tells the teacher she brushes her teeth every day, regardless of her actual behavior.

Expectancy effect is when attitudes projected onto individuals cause them to act in a certain way. For example, in a drug abuse treatment program, the facilitator may feel that a certain individual will not benefit from the treatment; projecting this attitude may cause the individual to behave in self-defeating ways.

Hawthorne effect refers to a change in behavior because of the special status of those being tested. This effect was first identified in an evaluation of lighting conditions at an electric plant; workers increased their productivity when the level of lighting was raised as well as when it was lowered. The change in behavior seemed to be due to the attention given to the workers during the evaluation process.

Placebo effect causes a change in behavior due to the participants’ belief in the treatment.

Blind- a study in which the participants do not know what group (control or type of experimental group) they are in.

Double blind- a study in which the type of group participants are in is not known by either the participants or the program planners.

Triple blind- a study in which group assignment information is not available to the participants, planners, or evaluators.

It is important to select an evaluation design that provides both internal and external validity. This may be difficult, since lowering the threat to one type of validity may increase the threat to the other. For example, tighter evaluation controls make it more difficult to generalize the results to other situations. There must be enough control over the evaluation to allow evaluators to interpret the findings while sufficient flexibility in the program is maintained to permit the results to be generalized to similar settings.
