I. Introduction

The role of fathers has recently received increased attention from academics, the government, and private foundations. Fatherlessness is not only viewed as a cause of child poverty, but has also been shown to affect child development and children's prospects for academic and labor market success. There is also a perceived link between fatherlessness and social problems such as youth violence, domestic violence, and teen childbearing. The seriousness of father absence has prompted the federal government and organizations such as the Ford Foundation to begin funding programs that promote responsible fatherhood. There is, however, a paucity of evaluation information on the effectiveness of these programs.

The increased interest in programs that promote responsible fatherhood and the limited information currently available on the services provided by these programs and on their effectiveness have generated interest in the systematic evaluation of responsible fatherhood programs. For this reason, the Office of the Assistant Secretary for Planning and Evaluation (ASPE) in the U.S. Department of Health and Human Services and the Ford Foundation have funded The Lewin Group and Johns Hopkins University to conduct an evaluability assessment of responsible fatherhood programs. The goal is to provide the Department and other policymakers with an evaluation design that can be used to evaluate a variety of responsible fatherhood programs. In addition, this report is intended to provide direction to organizations that would support or conduct evaluations by illustrating what is involved in the evaluation process and what mechanisms must be in place before a formal impact evaluation may be undertaken. It may also provide direction to programs that are building the capacity to be evaluated.

In developing this report, we conducted several activities designed to learn more about fatherhood interventions and to identify the specific evaluation issues confronting these programs. These activities include:

In the remainder of this chapter, we provide a brief overview of the aim of fatherhood interventions; discuss the objectives of evaluating fatherhood programs; describe the major components of a program evaluation; and discuss some of the characteristics fatherhood programs must have in order to be ready for an evaluation. In the final section, we provide an overview of the remaining chapters of the report.

II. Fatherhood Programs and Evaluation Objectives

A. The Aim of Fatherhood Interventions

Many non-custodial fathers are responsible parents and want to be actively involved in the lives of their children. However, there may exist substantial barriers that prevent or inhibit a father's involvement with his child. The National Center on Fathers and Families identified seven core findings about fathers based on the experiences of front-line people working with fathers.(2) They include the following:

The core findings provide an important context for understanding the unique challenges faced by young and adult men who want to become responsible fathers and the programs that help them achieve that goal.

In a recent publication, Jim Levine and Ed Pitt compiled the most extensive work to date on responsible fatherhood programs.(3) Their research and analysis of 300 community-based initiatives revealed characteristics common to the programs. Based on their findings, they offer the following strategic objectives as a framework for programs that promote responsible fatherhood:

The Levine and Pitt framework provides a broad view of the aim of fatherhood interventions. Individual programs, however, vary substantially in both the specific outcomes they attempt to achieve and the activities they undertake to achieve them. Among the five programs we visited, we observed substantial variation in the numbers of fathers served, the recruiting methods used, the services fathers received, and program goals (see Appendix B). One common theme, however, was an underlying philosophy that in order to be an effective and responsible father, men needed first to develop the capacity to take care of themselves.

B. Why Evaluation is Important for Fatherhood Interventions

Fatherhood programs and emphasis on male parenting are relatively recent phenomena in the social service sector. Many of the programs currently in place are either very new or, if established, have been experimenting with new interventions or changing the program focus over time to meet the interests and objectives of funders. It is generally the case that fatherhood programs have not adequately documented their performance. This may be because of limited resources, a lack of experience with methods of measuring performance, or simply because the focus of program staff has been on serving fathers rather than proving that methods are effective. While program staff may believe that their activities are helping fathers and resulting in positive impacts on society, others, particularly funders, may be skeptical of evidence of program effectiveness that is limited to anecdotes.

Evaluations of responsible fatherhood programs can serve two important functions:

From the program funding perspective, the results of an evaluation can be used to attract and justify funding from outside sources. The results of an objective evaluation conducted using accepted scientific methods provide believable evidence of a program's effectiveness. In addition to using evaluation findings as evidence of effectiveness, programs can use the findings to demonstrate how their objectives are similar to the objectives of potential funders. Both of these are critical elements for convincing organizations that they should provide funding to a particular fatherhood intervention.

From the program design perspective, an evaluation can address a variety of questions, the answers to which can help program staff tailor their programs to more effectively serve their clients. Examples of questions that might be addressed through an evaluation include:

An evaluation design can provide a structured framework for collecting and analyzing the information necessary to answer these questions.

Systematic evaluation of fatherhood program outcomes is crucial to both program design and funding. Conducting rigorous evaluations using standard scientific methods can assist program operators in effectively planning their programs to meet funding requirements, in improving their work with fathers, and in furthering the development of the field of fatherhood research and policy.

III. Components of Program Evaluation

There are three primary components to conducting a program evaluation: the process evaluation, the impact evaluation, and the cost-benefit or cost-effectiveness evaluation. In Chapter Two, we describe how and why a process evaluation should be conducted, and in Chapters Three through Eight we describe in great detail the steps necessary for conducting an impact evaluation. This report does not address cost-benefit or cost-effectiveness evaluations, but we include a brief discussion of them here because they represent the next logical step once process and impact evaluations have been conducted. In addition to these three components, an important element of the ability to conduct a program evaluation is having a management information system (MIS) in place that is capable of maintaining and processing some of the data necessary for an evaluation.

Before launching into the detailed discussion of process and impact evaluations in the subsequent chapters, we provide a brief overview of the primary evaluation components.

A. Management Information Systems

An automated system for tracking program participants is a precursor to any evaluation effort. A program management information system (MIS) is necessary to document a client's participation in the program, the services he receives and does not receive, and important outcomes related to program participation. If it cannot be shown from a cursory analysis of program administrative data that there are beneficial outcomes related to program participation, then there is often no point in conducting a full-scale impact evaluation. The ability to track a client's progress through the program, both in terms of the services he receives and changes in important outcomes, is not only necessary before an evaluation effort can be undertaken, but is also useful to program managers who may use the information to improve program effectiveness. Mature social service programs often have an MIS in place for administrative purposes, including quality control.
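To make the tracking requirements above more concrete, the following is a minimal sketch of a participant-level record of the kind an MIS might maintain. It is an illustration only, not a description of any system used by the programs we visited. The class names, field names, and service labels are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ServiceContact:
    """One service received by a participant (hypothetical fields)."""
    service: str          # e.g. "parenting class", "job placement"
    contact_date: date


@dataclass
class ParticipantRecord:
    """Minimal participant-level record; all field names are illustrative."""
    participant_id: str
    enrollment_date: date
    services: list[ServiceContact] = field(default_factory=list)
    # Outcome name -> list of (date observed, value), so change can be tracked.
    outcomes: dict[str, list[tuple[date, object]]] = field(default_factory=dict)

    def add_service(self, service: str, contact_date: date) -> None:
        """Document a service the client received."""
        self.services.append(ServiceContact(service, contact_date))

    def record_outcome(self, name: str, observed: date, value: object) -> None:
        """Append a dated observation of an outcome."""
        self.outcomes.setdefault(name, []).append((observed, value))

    def services_not_received(self, offered: set[str]) -> set[str]:
        """Services the program offers that this client has not received."""
        return offered - {s.service for s in self.services}
```

Keeping dated observations for each outcome, rather than only the latest value, is what allows changes in outcomes to be followed over the course of a client's participation.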

B. Process Evaluation

A process evaluation is the systematic collection and synthesis of information on the program environment and processes. It provides contextual information to support analyses of program outcomes, impacts, and costs. The types of information collected in a process evaluation are not only vital inputs for helping to assess program effects, but also provide feedback that can be helpful in efforts to refine the program intervention and to support replication of successful program components at other locations. A process evaluation can tell us if the underlying model for the program was implemented with integrity, as well as identify variations in treatment and participants. It can identify key similarities and differences across program sites in program objectives, participation levels, service delivery strategies, the environment, and a variety of other areas. A process evaluation can also suggest hypotheses to be tested in an impact evaluation.

The types of information collected under a process evaluation include information on: the social, economic, educational, and cultural environment in which the program operates; program goals and objectives; program strategies and interventions; major program components and services; clients' goals and objectives and their flow through the service delivery system; participant characteristics; and funding and referral sources. In general, the information collected for the process evaluation is more qualitative than quantitative in nature.

The steps involved in conducting a process evaluation include: determining the specific information to be collected/questions to be answered; identifying key program stakeholders; developing interview discussion guides; conducting interviews with program stakeholders; analyzing program administrative data; and reporting findings. The information and insights obtained through conducting a process evaluation are extremely useful to evaluators in designing and conducting an impact evaluation.

C. Impact Evaluation

An impact evaluation determines the extent to which a program causes change in the outcomes of interest. The concept of impact assessment implies that there is a set of defined objectives and criteria of success that may be used to measure the impact of the program. Impact evaluations are essential when there is an interest either in comparing different programs or in testing the effectiveness of new efforts to ameliorate a particular community problem.

To conduct an impact evaluation, the evaluator must develop a plan for collecting and analyzing data on program outcomes that will permit him or her to demonstrate that observed impacts are a function of the intervention and not a result of other factors. Impact analyses typically involve the comparison of outcomes for program participants to those of a comparison or control group. To undertake such a comparison, appropriate scientific methods and controls must be employed in the sampling, data collection, and data analysis steps to ensure that the estimated program impacts are unbiased.
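The simplest form of the comparison described above is a difference in mean outcomes between the two groups. The sketch below illustrates that calculation together with a conventional standard error. The function name and the data are hypothetical, and the result can be read as a program impact only if the groups are comparable, for example through random assignment.

```python
from math import sqrt
from statistics import mean, variance


def impact_estimate(treatment, control):
    """Difference in mean outcomes and its standard error.

    Interpretable as a program impact only if the two groups are
    comparable, for example through random assignment.
    """
    diff = mean(treatment) - mean(control)
    # Standard error for a difference of two independent sample means.
    se = sqrt(variance(treatment) / len(treatment)
              + variance(control) / len(control))
    return diff, se


# Hypothetical data: 1 = employed at follow-up, 0 = not employed.
treatment = [1, 1, 0, 1, 1, 0, 1, 1]
control = [0, 1, 0, 0, 1, 0, 1, 0]
diff, se = impact_estimate(treatment, control)
print(f"estimated impact: {diff:.3f} (standard error {se:.3f})")
```

With groups this small the standard error is large relative to the estimate, which is one reason program size matters for an impact evaluation.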

Unless programs have a demonstrable impact, it is difficult to defend their implementation or continued operation. A rigorous impact evaluation provides information about the effectiveness of a particular program that may be used to modify and improve program design and to justify continued funding and operation.

The major steps involved in conducting an impact evaluation include the following:

D. Cost-Benefit and Cost-Effectiveness Evaluations

Establishing the degree to which programs have an impact on desired outcomes, as is the purpose of an impact evaluation, is important to program managers, funders, and policymakers. What may be equally important is the comparison of program outcomes to their costs. A comparison of costs to benefits, whether done formally or informally, is inherent in decisions regarding whether to implement, expand, or continue any social program.

Cost-benefit and cost-effectiveness evaluations provide a formal framework for relating program costs to program outcomes. Cost-benefit evaluations address the issue of economic efficiency. In other words, what are the benefits (to individuals, funders, or society) of allocating resources to a particular program relative to the benefits of allocating those resources to any alternative endeavor? Cost-benefit evaluations attempt to translate all program benefits and costs into dollar values so that what is gained can be compared to what is given up. A cost-benefit evaluation can answer questions such as:

Cost-effectiveness evaluations are more limited in scope. They focus on the cost of producing a particular outcome. Here, the outcome or benefit need not be expressed in monetary values, as with a cost-benefit evaluation. Instead, the effectiveness of a program in attaining a particular outcome is related directly to the costs. Assuming that paternity establishment is the relevant outcome, a cost-effectiveness evaluation can answer questions such as:

In general, a cost-benefit analysis informs questions regarding whether or not an outcome should be pursued at all, while a cost-effectiveness analysis informs questions regarding the most effective method for achieving a desired outcome, assuming the decision to pursue that outcome has already been made.
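As an illustration of how a cost-effectiveness measure relates costs directly to an outcome, the sketch below computes the cost per additional outcome (for example, per additional paternity established) attributable to a program. The function name and all figures are invented for illustration, and the calculation assumes that an impact estimate is already available from an impact evaluation.

```python
def cost_per_additional_outcome(total_cost, participants, impact):
    """Program cost per additional outcome attributable to the program.

    impact: estimated change in the outcome rate caused by the program
            (e.g. 0.10 = ten percentage points), taken from an impact
            evaluation.
    """
    additional = participants * impact
    if additional <= 0:
        raise ValueError("no positive impact; the ratio is undefined")
    return total_cost / additional


# Hypothetical programs A and B; all figures are invented for illustration.
a = cost_per_additional_outcome(total_cost=200_000, participants=500, impact=0.10)
b = cost_per_additional_outcome(total_cost=90_000, participants=150, impact=0.20)
print(f"A: ${a:,.0f} per additional outcome; B: ${b:,.0f} per additional outcome")
```

Note that no dollar value is placed on the outcome itself. The comparison indicates only which program achieves the outcome at lower cost, not whether the outcome is worth pursuing.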

Whether a cost-benefit evaluation, a cost-effectiveness evaluation, or both are conducted will depend on the specific questions a program, funder, or policymaker wants answered and the feasibility of conducting such evaluations. Cost-benefit evaluations are considerably more difficult to perform than cost-effectiveness evaluations because of the difficulty in putting dollar values on the benefits of social programs. Placing a dollar value on outcomes such as paternity establishment and improved father/child relationships is a difficult and controversial task. Cost-benefit analyses must often rely on strong assumptions made by the evaluator when benefits or costs cannot be easily determined. For this reason, cost-effectiveness evaluations are often a more feasible alternative. Neither cost-benefit nor cost-effectiveness evaluations should be undertaken, however, until program impacts have been quantified.

IV. Program Readiness for Evaluation

There are several important traits that programs must develop before a rigorous impact evaluation may be conducted. These include:

Below, we discuss why each of these is important to the evaluation process, and describe where the fatherhood programs we visited are in their development of each trait.

A. Measurable Outcomes

Fatherhood programs need to have clearly stated goals to guide the evaluation process. Program goals may be very broad or quite specific, but in either case, the evaluator must be able to translate the goals of the program into a set of measurable outcomes that can be analyzed in an evaluation of the program. The outcomes that are chosen will play a major role in determining the kinds of data that will be collected, the methods that will be used to collect that data, the required sample size, the methods used to conduct the analysis, and, hence, the cost and feasibility of conducting an evaluation.

Most of the fatherhood programs we visited were able to articulate a set of measurable outcomes believed to be influenced by the program. Among the most common were increased education and employment, reduced alcohol and drug use, improved parenting skills, and increased father involvement with his child(ren). Programs also cited some more difficult-to-measure outcomes, for example, improved attitudes or feelings toward children and improved social and family interactions.

One program had some difficulty defining a set of measurable outcomes influenced by program participation, mostly because the focus of the program was on general attitude change rather than on achieving more easily measured objectives. The primary goal of this program is to reconnect fathers with their children, or, in their words, "to turn the hearts of fathers to their children, and the hearts of children to their fathers." The underlying philosophy and secondary goal of the program is attitude change. Staff at this program believe that reconnecting fathers to their children will lead to changes in attitude and behavior leading to paternity establishment, job placement, and improved relationships with their child and the child's mother. For evaluation purposes, it is difficult or impossible to devise a measure of "turning hearts of fathers to their children" and vice versa. Attitude change is also difficult to measure, but consequences of attitude change, such as paternity establishment, employment, etc., can be measured. Staff were, however, somewhat hesitant to identify specific consequences that could be used in an evaluation of their program, although an assessment of potential program impacts had been previously conducted by outside researchers.

B. Defined Service Components and a Hypothesized Relationship to Outcomes

Before an evaluation is conducted, there should be an established, underlying model relating specific program services to specific outcomes. If a program cannot identify the mechanisms through which it affects outcomes, it may be that its services are not affecting the outcomes of greatest interest to the program. As discussed above, an impact evaluation should not be undertaken unless programs can demonstrate some beneficial change in outcomes among participants, and have a logical reason for attributing the change to program services.

In addition, if there is the intent to evaluate the effectiveness of specific service components, it is necessary to identify those components and characterize them in a manner that may be used to quantify their presence and impact on outcomes of interest. While this is not crucial to an evaluation of overall program outcomes, including information on service components can be useful in gaining a better understanding of the determinants of favorable program outcomes, and can be used to control for differences in treatments both within and across programs.

Of the programs we visited, all were able to define the services they offered and, with the exception of the one program described above, link those services to hypothesized impacts on a set of measurable outcomes. The specific services offered tend to change over time, however. All programs seemed to be in the process of adding new services or refining those already in place. This is probably because most of the programs we visited are only a few years old.

C. Established Recruiting, Enrollment, and Participation Process

Responsible fatherhood programs often recruit their participants through a variety of channels including the courts, welfare agencies, hospitals, mothers, media, and word of mouth. The method of recruitment is an important consideration in designing an evaluation as it can point to potential sources of selection bias, dictate the feasibility of an experimental evaluation approach, and offer innovative ways to derive a comparison group if a non-experimental approach is adopted. For these reasons, the recruiting methods must be thoroughly understood by the evaluator and must remain consistent throughout the evaluation process.

Determining when and how a father actually enrolls and begins participation in the program is also important in conducting an evaluation. There should be an identifiable event that marks the individual as a formal participant receiving the program treatment. If "partial" participants or non-participants are counted as full participants, the effects of the treatment may be underestimated in the evaluation. The enrollment process is also important to consider because it may be a source of selection bias. If programs are using criteria to select participants such that those allowed to participate are most likely to experience successful outcomes, then not controlling for this selection will lead to an overestimate of the program's effect.

Of the programs we visited, most have established recruiting and enrollment practices. Only one program is in the process of experimenting with new recruiting techniques, as it is having difficulty attracting participants. This program also has a rather lengthy pre-screening process that would be difficult to replicate in recruiting control group members if an evaluation were to be conducted. With respect to program participation, two of the programs we visited are having difficulty defining exactly who is an active participant in their program. This is because a number of men in their programs do not participate on a regular basis, periodically returning to the program after long intervals of non-participation.

D. Understanding of the Characteristics of the Target Population, Program Participants, and Program Environment

Having an understanding of the characteristics of the target population, the characteristics of program participants, and the economic, policy, and social environment in which the program operates is important in designing the evaluation. This information can assist the evaluator in developing the sampling methodology to ensure that a study sample representative of the target population is obtained. This information is also important in deciding which variables should be included in the data collection effort and subsequently used in the participation and impact analyses. Finally, an understanding of the characteristics of the population served and the program context can help evaluators interpret the findings once the evaluation has been conducted.

All of the programs we visited seemed to have a good understanding of the population they serve and the environment in which the program operates. Many of the program managers live in or near the neighborhoods in which they operate their programs. While all but one of the programs lack an MIS, most of the programs still produce descriptive statistics on important characteristics of their participants, such as age, race, education, marital status, employment, number of children, and paternity status. In addition, most of the program managers we met seemed to be very knowledgeable about and well-linked to other agencies in the community such as state and local health and welfare agencies, child support enforcement, the criminal justice system, and agencies providing specific services to persons with low income such as housing, employment services, legal services, medical care, and substance abuse treatment.

E. Ability to Collect and Maintain Information

As discussed above, a program MIS is necessary to document a client's participation in the program, the services he receives, and important outcomes related to program participation. The ability to track a client's progress through the program, both in terms of the services he receives and changes in important outcomes, is a necessity for conducting an evaluation.

Only one of the programs we visited has any kind of computerized tracking system, and its system was still being developed and modified at the time of our visit. Another program has an MIS, but it is being used only to track female clients enrolled in its primary program. No computerized tracking of male clients is currently conducted.

F. Adequate Program Size

In order to conduct an impact evaluation, there must be a sufficient number of individuals participating in the program to obtain a reasonable level of statistical precision when estimating the program impacts. The sample size necessary for conducting an evaluation will depend, in part, on the outcomes of interest. Outcomes with values that vary greatly among those in the study population will require a larger sample size for statistical precision. This is also true for program impacts that are small. The smaller the program impact, the greater the sample size necessary to detect it.
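The relationship among outcome variability, impact size, and sample size can be sketched with a standard textbook approximation for comparing the means of two equally sized groups. This is a simplified illustration under stated assumptions (normal approximation, equal variances, a two-sided test), not the method developed later in this report. The function name and the default significance level and power are our own choices.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(std_dev, min_impact, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    Uses the textbook normal approximation
        n = 2 * (z_{1-alpha/2} + z_{power})^2 * std_dev^2 / min_impact^2
    with equal group sizes and equal variances assumed.
    """
    z = NormalDist().inv_cdf
    n = 2 * (z(1 - alpha / 2) + z(power)) ** 2 * std_dev ** 2 / min_impact ** 2
    return ceil(n)


# Hypothetical inputs: an outcome with standard deviation 0.5 and two
# candidate values for the smallest impact worth detecting.
for impact in (0.10, 0.05):
    print(impact, sample_size_per_group(std_dev=0.5, min_impact=impact))
```

Because the impact enters the formula squared, halving the impact to be detected roughly quadruples the required sample, and a more variable outcome raises the requirement in the same way.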

Most of the programs we visited serve a very small number of individuals, so it would be difficult for an evaluator to obtain statistically significant results. Only one program serves a relatively large number of fathers. The caseload of this program at the time of our visit was about 500 fathers. The program receives from 50 to 60 new referrals each month. This program is by far the exception. Three of the programs we visited serve only about 50 new fathers each year. In addition to simply serving more clients, there are ways to enhance sample size for evaluation purposes. If programs operate at multiple sites, or use relatively homogeneous methods to serve fathers, then multiple sites may be pooled for the evaluation. Another way to increase sample size is to increase the period of recruiting study participants for the evaluation. There are some disadvantages (discussed in Chapter Six), however, to prolonged periods of recruiting in conducting an impact evaluation.
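The trade-off between pooling sites and lengthening the recruiting period can be illustrated with simple arithmetic. In the sketch below, the function name and the target study sample of 400 are hypothetical. The figure of about 50 new fathers per year is the approximate intake of the smaller programs we visited, and the calculation assumes that every new father enters the study sample.

```python
from math import ceil


def months_to_reach(target_sample, yearly_enrollment_by_site):
    """Months of recruiting needed for pooled sites to reach target_sample."""
    pooled_yearly = sum(yearly_enrollment_by_site)
    if pooled_yearly <= 0:
        raise ValueError("pooled enrollment must be positive")
    return ceil(12 * target_sample / pooled_yearly)


# Hypothetical target of 400 study participants.
one_site = months_to_reach(400, [50])             # a single small program
three_sites = months_to_reach(400, [50, 50, 50])  # three small programs pooled
print(one_site, three_sites)
```

Under these assumptions, pooling three comparable sites cuts the recruiting period to one third of what a single site would need, although even the pooled period remains long.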

To summarize, most of the programs we visited appear not to be ready for a formal impact evaluation. This is due primarily to three factors: the programs are very new and still at the stage of refining recruiting methods and program services; the programs lack automated systems for tracking and reporting on clients; and the number of fathers served by most of the programs is very small.

V. Overview of the Remaining Chapters

The remainder of the report is organized as follows:

In Chapter Two, we describe the elements necessary for conducting a process evaluation. We begin with a brief overview of the reasons why conducting a process evaluation in conjunction with an impact evaluation is useful, and then describe the evaluation questions and major data sources that can and should be incorporated into a process evaluation of responsible fatherhood programs. We then provide a detailed description of various data collection methods that may be used for obtaining new and existing data. We also provide an overview of an automated participant-level data system that could be used by responsible fatherhood programs to track participant characteristics, service utilization, and outcomes. We conclude with examples of descriptive, comparative, and exploratory analyses that could be conducted to address key process evaluation questions.

In Chapter Three, we discuss two major design choices that must be made in the planning process for an impact evaluation. These choices concern: whether to use an experimental (i.e., randomized program assignment) or non-experimental design, or some hybrid; and whether to evaluate each individual site independently or to pool the data from multiple sites and evaluate them jointly. We describe the options and discuss criteria to be considered in making the choice between design alternatives. The main criteria we discuss include: feasibility, impact estimator bias, estimator precision, and cost. We conclude the chapter with a summary of the most important points with respect to these criteria for each design feature.

In Chapter Four, we describe potential outcomes of fatherhood interventions, suggest specific measures that may be used in an evaluation, and discuss difficulties that may be encountered when developing measures for outcomes of fatherhood interventions. In Chapter Five, we provide a similar discussion for explanatory variables, including a discussion of how and why explanatory variables are used in an impact analysis.

In Chapter Six, we address issues related to the selection of the study sample and methods for collecting data on study participants. We begin with a discussion of the process by which treatment and control/comparison groups may be selected and methods for determining sample size. We then describe methods available to evaluators for collecting data on study participants, including surveys and program administrative data sources. We conclude the chapter with a discussion of the content and timing of baseline and follow-up data collection efforts.

In Chapter Seven, we discuss reasons why a participation analysis should be conducted in conjunction with an impact evaluation of fatherhood interventions, and present methods that may be used to perform such analyses.

In Chapter Eight, we discuss the analyses of the evaluation data that will be necessary to estimate the impacts of responsible fatherhood programs. We present methods of conducting analyses under each of the alternative evaluation designs. We also discuss methods for jointly analyzing the impacts of multiple programs.

In Chapter Nine, we provide summary and concluding comments.

Finally, we include several appendices to the report: in Appendix A, we list the experts interviewed for the project; Appendix B contains site visit summaries of the fatherhood programs we visited; in Appendix C, we provide sample discussion guides for conducting a process evaluation; Appendix D contains preliminary evaluation findings from the Racine Goodwill Industries program; and in Appendix E, we provide a technical discussion of the participation and impact analysis methods presented in Chapters Seven and Eight.



1.  The Technical Review Group members are: Fred Doolittle (Manpower Demonstration Research Corporation), Ronald Ferguson (Kennedy School, Harvard University), and Jeffrey Smith (Department of Economics, University of Western Ontario).

2.  See National Center on Fathers and Families (1994). "Fathers and Families:  Building a Framework to Support Practice and Research," Concept Paper. Philadelphia, PA.

3.  See Levine, Jim and Pitt, Ed (1995). New Expectations: Community Strategies for Responsible Fatherhood, Families and Work Institute. New York, NY.