Marc Robinson's Blog

September 10, 2025

Reimagining Public Finance? (Part 2)

The Reimagining Public Finance (RPF) team has put forward for discussion a thoughtful set of proposals intended to improve technical work in public financial management (PFM) and public finances more generally. These proposals are designed to address the problem of PFM advice that is dominated by a best practice mentality and that sees PFM reforms as ends in their own right. This problem may not be as pervasive as RPF thinks, but it is undoubtedly a real one.

What is the underlying cause of the “best practices” problem? The prevailing PFM analytic framework is, in RPF’s view, to blame. RPF suggests, in other words, that it is the set of principles and concepts that we habitually use to think about PFM that lead to the best practice approach, with its implicit assumption that “one size fits all.” This is why RPF proposes a new and radically different analytic framework. It is also the motivation for RPF’s new “development outcome” based diagnostic methodology, which I discussed in my last blog piece.

My perspective is different. I think that the mainstream PFM analytic framework is, in fact, completely inconsistent with a best practice approach. The principal drivers of the best practice mentality are not PFM theories. They are more mundane and practical. I am therefore not convinced that radical changes to the PFM framework are necessary or helpful. I explain why in what follows.

This blog piece is the second of two intended to contribute to the debate at the World Bank conference on Reimagining Public Finance (September 29-30, 2025) – at which I will be a panelist. The RPF team is to be congratulated for the effort that they have put into developing the proposals to be discussed at this conference, and for encouraging open debate on the issues.

PFM Reforms as Ends rather than Means?

Multiple ways of thinking about PFM  have always coexisted. There is, however, a mainstream PFM analytic framework that has dominated the PFM literature for the last three decades, early formulations of which can be found in the World Bank’s landmark 1998 Public Expenditure Management Handbook and a classic paper by Allen Schick. This mainstream analytic framework explicitly views PFM institutions as instruments to facilitate the achievement of the fundamental objectives of public finance, which in the Handbook version are:

1. Aggregate fiscal discipline

2. Allocation of resources in accordance with strategic priorities

3. Efficient and effective use of resources in the implementation of strategic priorities[1].

In this mainstream approach to PFM, the merits of every potential reform are assessed by reference to its ability to facilitate the achievement of one or more of these public finance objectives. Whether fiscal councils are a good idea in a specific country is, for example, a question that is to be answered by asking whether in that country the establishment of such a council is likely to improve fiscal discipline (objective 1). Whether performance budgeting is an appropriate reform is a question that would be answered by determining whether, in the country concerned, it could reasonably be expected to advance objective 3 and — depending on the model of performance budgeting — perhaps also objective 2. The appropriateness of spending review would be assessed by its potential to contribute to all three objectives.

This is an explicitly instrumental approach to PFM reform that has nothing in common with a view of PFM reforms as ends in their own right. It connects PFM firmly to the government’s broader public finance policies.

The mainstream PFM analytic framework is equally far removed from one-size-fits-all thinking. It has always emphasized the importance of taking into account country circumstances and capacity. This includes guarding against the assumption that what works in an advanced country will necessarily work in a developing country. For mainstream PFM, it is, for example, entirely natural to recognize that, whatever the general merits of fiscal councils may be, the establishment of a fiscal council is not a useful reform in every country in the world.

It would not be difficult to produce dozens of quotations from the core PFM literature to demonstrate that this instrumental/adaptive approach represents mainstream thinking. I will not take up limited space here by doing so.

In my last blog piece, I pointed out that the RPF development outcome-based diagnostic methodology can’t be used when the task is to analyze a component of a country’s PFM system – such as cash management, aggregate fiscal policy formation or the budget classification – and propose reforms to make it more supportive of government as a whole. For this type of work — and most PFM diagnostic work is of this type — the right approach is to apply the principles of the mainstream PFM analytic framework by asking how the relevant PFM institutions can be improved to better support the three objectives of public finance[2].

Over the years, I’ve seen plenty of technical advisory work which has been guided by precisely this mainstream instrumental view of the role of PFM. In such work, analysts anchor their reform proposals on the three objectives of public finance, while giving careful consideration to country institutional and socio-economic characteristics, capacity and resource constraints.

The Best Practice Approach

There is, however, no denying that some PFM work is just as the critics describe. Cookie-cutter reforms are recommended and implemented with little consideration of their relevance to country needs or circumstances. Sometimes the application of the best practices approach has been absurdly inappropriate[3].

But if we want to tackle this problem, we need to properly understand its origins. If it isn’t due to the mainstream PFM analytic framework, where does it come from?

In my view, the problem is mainly practical rather than theoretical. Its causes include the following:

- Work to fully understand national circumstances and the country-specific causes of public finance problems is very demanding and time-consuming. PFM technical advisory work is often not resourced sufficiently to undertake the magnitude of work that would be required. It is much easier to put together a PFM reform program using pre-fabricated components.

- Some PFM consultants do not have the experience or skills required to undertake this type of analytic work.

- Clients themselves are sometimes keen to be advised on what they understand to be cutting-edge reforms – and are therefore themselves sources of demand for “best practice” type advice. Sometimes this is appropriate, sometimes not.

These sorts of practical problems will not be resolved by changes in the PFM analytic framework. They require other solutions, but are not easy to solve.

PEFA and Best Practices

What about PEFA – the Public Expenditure and Financial Accountability diagnostic framework? Doesn’t PEFA promote a best practice mentality?

It’s not entirely clear where RPF stands on PEFA. Other critics of the best practice approach have, however, not been shy in blaming PEFA. This is, in my view, unfair. PEFA was never conceived as an instrument for promoting a best practices approach to PFM reform. It was based instead on the proposition that — even if what works best in many areas of PFM varies considerably from country to country — there is a small set of universal good practices in the PFM domain which all, or almost all, countries should follow. The great majority of the PEFA diagnostic criteria concern this limited group of universal good practices, and are therefore relevant to everybody. It is, for example, objectively better for all countries to have greater budget reliability (PEFA indicators 1, 2 and 3), informative budget documentation (indicator 5), proper debt management (indicator 13), and a clear aggregate fiscal strategy (indicator 15). Viewed in this way, PEFA is entirely consistent with the instrumental view of PFM reforms embodied in the mainstream analytic framework described above.

This doesn’t mean that PEFA is perfect. The scope of PEFA has expanded over the years and there are now certain PEFA indicators that inappropriately go beyond the core of universal good practices[4]. More serious is the misuse of PEFA in PFM reform programs the objective of which is to raise PEFA scores. A reexamination of PEFA and its uses to address these issues would be useful.

There is more to RPF than I have been able to cover in these two blog pieces, which have focused only on the implications of RPF for PFM[5]. I admire the ambition of the RPF agenda. However, I am not persuaded that radically modifying the PFM analytic framework is the right direction to be moving. The mainstream PFM framework is both intellectually sound and widely understood. It also has the advantage that it is less complex than the new framework which RPF is now proposing.

But whatever position one takes on these issues, one thing is certain: the debate which we are now having is valuable, and will help to further improve the way we think about, and practice, public financial management.

[1] These three objectives have been formulated in slightly different ways over the years, but arguably the Handbook version is the best. The three objectives correspond to the economic concepts of (1) fiscal sustainability, (2) allocative efficiency and (3) effectiveness and operational efficiency.

[2] The RPF framework includes four “roles” of public finance that are essentially a modified version of the three mainstream objectives of public finance. I suspect that, in practice, most applications of RPF would use these “roles” as the diagnostic criteria, skipping the “development outcomes” step. This would make RPF analysis very similar to analysis using the mainstream framework. There are, however, a number of questions that can be raised about the way in which the four “roles” are formulated.

[3] I recall, for example, an EU project on medium-term budgeting initiated in the Democratic Republic of the Congo during its civil war, and the massive USAID effort to develop performance budgeting in Afghanistan after the overthrow of the Taliban.

[4] Although the extent of this problem has often been exaggerated. The 2020 paper Advice, Money, Results, for example, criticized PEFA for setting standards for performance budgeting, when in fact it does nothing of the sort. See my critique of that paper.

[5] RPF also suggests that we should abandon analysis narrowly focused on PFM systems in favor of analysis of public finances as a whole. My view is that there is a place for both, but that to require that analysts always cover the totality of public finance issues would be extremely demanding and not always the most useful approach.

September 9, 2025

Reimagining Public Finance? (Part 1)

Reimagining Public Finance (RPF) is an initiative by a committed group within the World Bank to address a significant problem that affects public financial management (PFM) and public finance reform practice. The problem is that too much PFM technical work is driven by a best practice mentality. When operating in best practice mode, analysts don’t proceed by asking what PFM reforms will improve government performance in the country concerned. Instead, they ask what needs to be done to bring country practices into conformity with a set of pre-defined PFM best practices. This makes PFM reforms ends in their own right, rather than instruments for improving government performance in tangible ways.

I don’t think that this approach is as dominant as RPF suggests, but there is no doubt that it characterizes a certain portion of contemporary PFM technical work. RPF is pointing to a real problem, and the issue at stake is how to tackle it.

This blog piece is the first of two intended to contribute to the debate at the World Bank conference on Reimagining Public Finance (September 29-30, 2025) – at which I will be a panelist. The RPF team is to be congratulated for the effort that they have put into developing the proposals to be discussed at this conference, and for encouraging open debate on the issues.

The RPF remedy for this problem is twofold. Firstly, RPF calls for radical change in the PFM analytic framework – that is, in the set of principles and concepts that are used to structure work on these issues. Secondly, it proposes a new diagnostic methodology – i.e. a new way of identifying the problems in PFM systems and public finance policies that require reforms.

In this blog piece, I look at the second of these proposals – the new diagnostic methodology.

I will address the call for a new analytic framework in my next blog piece. By way of preview, my message there will be that it is neither necessary nor useful to reinvent PFM doctrine. This is because there is a well-established mainstream PFM doctrine that is sound, and the problem that RPF addresses is not due to any flaws in this mainstream PFM doctrine. It is mainly due to more mundane and practical factors, which means that it cannot be solved by developing new theories.

The RPF “Development Outcome” Based Diagnostic Methodology

RPF’s proposed new diagnostic methodology is complex. The essential idea is to take as the starting point a specific “development outcome” that government wishes to achieve, such as universal literacy and numeracy. The analysis then proceeds, by a number of steps, to identify “public finance bottlenecks” – including PFM bottlenecks – that present obstacles to the achievement of that development outcome. Reforms to remove those bottlenecks are, finally, proposed. In the case of education, for example, the result would be a set of PFM and other public finance reforms to facilitate the achievement of universal literacy and numeracy.

This is a version of the so-called problem-driven approach to public sector reform. It represents a worthwhile attempt to ensure that PFM reforms are not pursued as ends in their own right, but as ways of improving government service delivery. In the words of the RPF team, it “reverses the logic behind many existing PFM interventions. Rather than starting from an isolated focus on the quality of PFM systems, it considers first the development outcomes that governments are pursuing and that public finance can achieve.”

I have a range of technical reservations about the specifics of the proposed methodology, including its complexity and several conceptual issues. However, if we set aside such technical issues, the broad approach has a lot going for it. Indeed, I currently have the privilege of working with a team from the World Bank and World Health Organization to apply a somewhat similar approach to the overhaul of the FinHealth Toolkit. What we are developing is precisely a methodology to identify PFM bottlenecks that degrade the performance of public health systems and, in doing so, create obstacles to the achievement of universal health coverage.

I therefore look forward to RPF’s continuing work on developing its new methodology for sectoral applications such as these. The project has already commissioned initial work on applying the methodology to several sectors. This should serve as the basis for extensive discussion and, I would expect, subsequent major improvements.

The Limits of the RPF Methodology

The big “but” is that the new RPF methodology has great limitations. Its general approach can work when the focus is on specific sectoral objectives such as universal literacy and numeracy. It cannot be used for more broadly-focused work that looks at the way in which PFM systems can be improved to better support the performance of government as a whole.

Governments pursue hundreds of major outcomes – national security, economic development, low levels of crime, social harmony, faster technological progress, and many others. PFM systems need to support the achievement of all of these many outcomes. There are also important output goals that PFM systems need to serve, of which universal health cover is one[1]. It would be impossible to carry out a diagnostic analysis of a PFM system, or of a specific component of the system, such as budget preparation processes, using a methodology that proceeds by looking, one-by-one, at every one of these goals to identify the specific PFM bottlenecks that obstruct its achievement.

RPF more or less recognizes this: the RPF project “framing” paper acknowledges that the proposed new diagnostic methodology “may” only be suitable for the analysis of public finance problems contributing to “specific development outcomes” in individual country contexts[2].

So when PFM technical advisers are tasked with analyzing and proposing reforms to some part of a country’s PFM system – such as cash management, aggregate fiscal policy formation or the budget classification – they will be disappointed if they look to RPF for new ways of carrying out their mission. Fortunately, however, the existing mainstream PFM analytic framework already provides a good way of approaching such work. I’ll explain why in my next blog piece, and will consider whether RPF has made a convincing case for radical change in the PFM analytic framework.

[1] UHC is often mistakenly referred to as an outcome goal. Because it is a goal concerning who receives a public service — i.e. an output — it is in fact an output goal. The health outcome that UHC serves is reduced mortality and morbidity.

[2] To quote the RPF Framing Paper in full on this point: “Given the complexity of the proposed framework, it might be impossible to apply [the methodology] wholesale to a broad set of development outcomes … Rather, it is probably better used to analyze the contribution of fiscal policies [i.e. public finance policies] and PFM systems to specific development outcomes in individual country contexts …”

August 25, 2025

Putting Accruals to Work

Many governments have invested large sums of money in the development of accrual accounting. Unfortunately, for most of these governments this has to date been a poor investment. Why? Because many are making little or no use of all the information that has been produced. This is a regrettable situation, to say the least.

This gives urgency to the question: how can greater use be made of accrual information to deliver a better return on the investment?

One potential use is to reframe aggregate fiscal policy using accrual measures of the overall fiscal stance. There are good arguments for doing so (see box). However, the main benefits that are typically claimed for accrual accounting do not lie in the realm of aggregate fiscal policy. Rather, accrual accounting is usually proposed as an instrument for promoting improved government performance.

Accruals and Aggregate Fiscal Policy

The role of accrual accounting in aggregate fiscal policy has been discussed extensively in previous blogs[1]. Three points stand out. First, with respect to fiscal sustainability, it makes complete sense to move from narrow cash measures of debt to broader accrual debt measures – e.g. net financial worth. Second — again with respect to fiscal sustainability — it would be a major mistake to shift from a focus on debt to the use of the accrual net worth measure as the key fiscal sustainability measure. Third, there are persuasive arguments for using the accrual operating balance as a measure of the intergenerational equity stance of fiscal policy.
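Schematically, and using broadly GFS-style definitions rather than any particular country’s accounting standards, the aggregates referred to in this box are:

\[
\begin{aligned}
\text{net financial worth} &= \text{financial assets} - \text{liabilities}\\
\text{net worth} &= \text{total assets (financial and nonfinancial)} - \text{liabilities}\\
\text{net operating balance} &= \text{accrual revenue} - \text{accrual expenses (including depreciation)}
\end{aligned}
\]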

The most obvious performance application for accrual accounting is as an instrument for promoting increased operational efficiency. Operational efficiency means producing outputs – in government, mainly services delivered to citizens, such as education and medical treatments – at the lowest possible cost without sacrificing quality. Accrual accounting is the only conceptually sound way of measuring output costs because it measures the costs of the resources used in the production of those outputs. It is because they need to accurately measure their cost of production, as well as revenues and profits, that businesses use accrual accounting.
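To illustrate the point, here is a deliberately simplified sketch – with hypothetical figures, not a prescribed costing method – of the difference between what an agency pays out in cash in a year and the accrual cost of the resources it consumes in producing that year’s outputs:

```python
# Hypothetical figures illustrating why accrual cost, not cash spending,
# measures the resources consumed in producing a year's outputs.

cash_outlays = {
    "salaries_paid": 900_000,
    "supplies_purchased": 150_000,
    "equipment_purchased": 500_000,  # capital outlay paid in full this year
}

accrual_costs = {
    "employee_costs_incurred": 930_000,  # includes accrued leave and pension expense
    "supplies_consumed": 140_000,        # only the stock actually used up
    "depreciation": 100_000,             # equipment cost spread over its useful life
}

print("Cash spending this year:", sum(cash_outlays.values()))   # 1,550,000
print("Accrual cost of outputs:", sum(accrual_costs.values()))  # 1,170,000

# Dividing the accrual figure (not the cash figure) by output volume gives a
# conceptually sound unit cost for efficiency comparisons.
```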

If governments are to make more use of accrual accounting to promote operational efficiency, the two obvious applications are:

- Accrual output unit cost measures: these are the best indicators of operational efficiency because measuring the unit costs of outputs based on accrual accounting is inherently more accurate than measuring cash expenditure per output.

- Output budgeting (a.k.a. unit cost budgeting): this means financing the providers of specific categories of public sector service based on output unit costs and the planned volume of outputs. An example is the system used in Scandinavia and elsewhere of funding government schools primarily on enrolment numbers multiplied by the average annual cost of students at various levels. This makes use of accrual unit cost measures as a budgeting tool rather than “merely” for information purposes. (A simple illustration of the funding arithmetic is sketched below.)
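As a simple illustration of the arithmetic behind output budgeting – all enrolment numbers and unit costs here are hypothetical, and real funding formulas typically include additional weightings:

```python
# Minimal sketch of an output (unit cost) budgeting formula for funding a school.
# All figures are hypothetical.

# Accrual-based average annual cost per student, by level of schooling
unit_costs = {"primary": 8_500, "lower_secondary": 10_200, "upper_secondary": 12_000}

# Planned output volumes: projected enrolments at the school being funded
enrolments = {"primary": 420, "lower_secondary": 310, "upper_secondary": 180}

# Funding = sum over levels of (planned enrolment x accrual unit cost per student)
allocation = sum(enrolments[level] * unit_costs[level] for level in enrolments)

print(f"Budget allocation for the school: {allocation:,}")  # 8,892,000
```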

Any government with sufficient resources and technical capacity should be using accrual accounting for these two purposes. A few governments already do this. Unfortunately, however, they are the exceptions. Even the majority of developed countries are making far too little use of accrual accounting for either output costing or output budgeting.

This “thumbs up” to the expanded use of accrual accounting for output costing and budgeting needs, however, to be heavily qualified by noting that these are applications that only make sense for selected government services. They can work well for appropriate categories of education, health and certain other services, but are either unsuitable or of limited value in many other areas (e.g. defense, policing and emergency services)[2].

Quite demanding accounting is also required. To measure output unit costs, accrual accounting must be combined with complex managerial accounting (in particular, to allocate costs between different categories of outputs). Not all governments have the capacity to do this, and it is not cheap.

The conclusion to which this points is clear. The development of accrual output unit cost measures, and the use of these measures as the basis for output budgeting, should be done on a strictly selective basis. Output unit cost should only be measured for those government services for which unit costs are a sufficiently useful measure of operational efficiency to justify the cost of undertaking the complex accounting needed. As for output budgeting, experience demonstrates that it cannot and should not be the basis for the entire government budget. Governments in a few countries (Australia and New Zealand) tried this thirty years ago, and it didn’t work[3].

Mention of the government-wide budgeting system raises the other big issue: accrual budgeting. Does it make sense to shift the entire government budget onto an accrual basis? We turn to this topic in the next blog piece.

[1] See “A Net Worth Rule?” (May 2023) and “Debt, Not Net Worth, Is What Matters” (May 2022). See also my papers on the topic, including “Accrual Budgeting and Fiscal Policy” (2009).

[2] See my series of four blogs on this topic, starting with “Unit Cost Budgeting?” (January 2022).

[3] See my paper on “Purchaser-Provider Systems” in Marc Robinson (ed), Performance Budgeting: Linking Funding and Results (IMF, 2007).

July 9, 2025

A Strategic Approach to Performance Information

Performance information is crucial for high-performance government. Performance budgeting is part of that: budget decision-making needs to be informed by good information about the effectiveness and efficiency of government spending.

The approach to performance information must, however, be highly strategic. To succeed, it needs to be built on a recognition of two fundamental realities. Firstly, there are considerable limits to our ability to measure and analyze the effectiveness and efficiency of expenditure. Secondly, producing more performance information does not mean that the information will be used.

A recent report from the Tony Blair Institute in the UK appears not to grasp these essential points. The Institute envisages building a government-wide performance dashboard in which all of the outcomes of government programs will be reported in “close to real time.”  This will mean that “failure will have nowhere to hide.”  Armed with this “comprehensive shared visibility of spending against outcomes,” the Ministry of Finance (HM Treasury in the UK) will supposedly be empowered to make continuous decisions about what programs to eliminate or ramp up. No need then for multi-year budgets of the sort that the UK has at present – budgeting will be done on a continuous basis.

This vision of budgeting driven by an outcomes dashboard is, unfortunately, pure fantasy. Governments should certainly arm themselves with the best practical set of outcome indicators. But surely we don’t need to be reminded that quite a few government outcomes are unmeasurable (e.g. the level of national security) or are – because of the extensive influence of “contextual factors” – only very imperfectly measurable (e.g. changes in crime rates are not a good measure of the effectiveness of policing). As for reporting outcomes in “close to real-time,” are we going to test the literacy and numeracy levels of school children on a weekly or daily basis? Will there be daily household surveys to measure the unemployment rate? Or the prevalence of undesirable health behaviors such as smoking? A moment’s thought makes it obvious that many of the outcomes that are important to government can only be measured periodically.

Implicit in the Blair Institute’s vision is what might be called the perfect information illusion – that it is possible to scientifically measure the effectiveness and efficiency of all government expenditure. The reality is, however, that what the economists call “imperfect information” is a fundamental reality of life in government as elsewhere.

Taking a Strategic Approach to Performance Information

What does it mean, then, to take a strategic approach to performance information? How can we increase the role that performance information plays in budgeting and government-wide performance management, while recognizing its inherent limitations?

Here are some key principles:

1. Put more emphasis on using performance information, not just producing it

Governments in many advanced countries have spent massively on developing better performance information over past decades. Regrettably, much of the information that has been produced is never used, or used relatively little. Supply has run ahead of demand. So we need to focus more on how to ensure that this valuable information is used.

In a budgeting context, this means building more systematic processes for reviewing performance as part of the budget preparation process, to ensure that performance is more systematically taken into account in resource allocation decisions. It also means carrying out more spending reviews, using available performance information. (Note that here I am referring to spending review in the international sense of the systematic review of baseline expenditure, not in the UK sense of the preparation of the multi-annual budget.)

2. Be highly selective in the choice of performance indicators.

We need to focus on indicators that have demonstrable decision-making relevance, and for which we can justify the cost of collecting and verifying the data (which can be considerable). There should be no assumption that the more indicators, the better.

For budgeting and government-wide performance management, outcome and output indicators are what is most important. The primary focus should be on those areas of government where outcomes and outputs are most measurable. This means particularly areas of service provision to individual citizens, such as education and health.

There needs to be a better recognition of the difference between the performance indicators that are relevant for internal management within government agencies (many input and activity measures) and those that are relevant for government-wide budgeting and performance management (outcomes and outputs). Too often, the mistake is made of throwing a whole lot of internal management indicators at political leaders and parliaments.

3. Provide narrative advice on how to interpret performance indicators

Performance indicators have great potential to mislead as well as to inform. An outcome indicator might look bad for reasons that have nothing to do with the effectiveness of what government is doing. Crime rates might, for example, be going up for long-term social and economic reasons even though policing is becoming more effective. So when outcome indicators are presented – in performance reports or “dashboards” – there should always be brief discussions of the “contextual factors” that may be influencing them.

The same applies to some output indicators. Unit cost measures, for example, are potentially a very valuable measure of efficiency. But it is often not possible to see whether efficiency is improving or deteriorating simply by looking at the time trend of unit costs. Analysis of other influences that might be involved (such as changing input prices, or changes in average case complexity) is frequently required.
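A stylized example of the kind of adjustment involved – the numbers are made up, and input prices are only one of the influences that would need to be examined:

```python
# Stylized illustration: a rising nominal unit cost need not mean falling efficiency.
# All figures are hypothetical.

nominal_unit_cost = {2023: 100.0, 2024: 104.0}   # cost per unit of output delivered
input_price_index = {2023: 100.0, 2024: 107.0}   # index of wages and other input prices

for year, cost in nominal_unit_cost.items():
    # Re-express the unit cost at constant (2023) input prices
    real_unit_cost = cost * 100.0 / input_price_index[year]
    print(year, round(real_unit_cost, 1))

# The nominal unit cost rose by 4 percent, but at constant input prices it fell from
# 100.0 to about 97.2, consistent with an efficiency gain masked by input price inflation.
```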

More generally, we need to be aware of the potential “perverse effects” that performance indicators – and even more, performance targets – may generate. This refers to people making performance indicators look better by doing things that reduce efficiency or effectiveness.

4. Be even more selective in using evaluation

Evaluation – applied to the right topics – is a very useful part of the performance information armory. However, it is expensive, and its ability to yield robust conclusions about the effectiveness of government programs is widely exaggerated. So evaluation topics, and the tasks assigned to evaluators in conducting those evaluations, should be carefully chosen.

The proposition that all government programs should be regularly subjected to evaluations is misguided. A few governments have tried this in the past – for example, Australia and Canada. It turned out to be a major waste of money that was quickly abandoned.

5. Increase the role of other forms of systematic performance analysis

Efficiency is at least as important as effectiveness, and evaluation isn’t much good for efficiency analysis. We need more reliance on other techniques of efficiency analysis, such as cost analysis and business process analysis. However, the same caveat applies as for evaluation: these analytic methodologies should be applied highly selectively.

February 26, 2025

Evaluation, Spending Review & the Budget

Evaluation should be a key source of information for budget decision-makers. In practice, however, it has had surprisingly little impact on decisions about what to fund and by how much[1]. This raises the question: what can be done to realize the true potential of evaluation to improve budgeting?

The failure of evaluation to contribute much to resource allocation decisions has been on particularly vivid display during episodes where governments have set out to make major spending cuts (e.g. Canada in the 1990s and the UK after the global financial crisis). During these episodes, available evaluation reports proved to be of little assistance in guiding difficult decisions about where to cut spending.

The problem has, however, been a more general one, with finance ministries repeatedly complaining that evaluations are not very useful for their purposes. Indeed, the poor track record of evaluation in supporting budgeting has been one factor contributing to the rise of spending review[2].

This is the third in a 3-part series on evaluation. The first focused on the importance of practical evaluation methods, and the second on the need for more systematic efficiency analysis.

Spending review is not a form of evaluation. Evaluation means the application of formal analytic methods (such as impact evaluation and contribution analysis/program logic analysis)[3]. Spending reviews, by contrast, have no prescribed methodologies and use whatever formal and informal analytic approaches make sense for the topic and are feasible within the time available. Spending reviews are carried out quickly, to provide timely input into budget preparation. Most evaluation methods take considerably more time.

Evaluation should, however, be an important part of the information base for spending review. Although most evaluation takes too much time to be carried out in conjunction with a spending review, spending reviews should be able to make extensive use of evaluations carried out previously. In addition, it should be possible to use certain of the more practical evaluation methods during spending reviews in relevant cases. The better the information base that spending review can draw on – including from evaluation – the better it will do its core job of presenting savings options to budget decision-makers.

More on how evaluation should support spending review may be found in my presentation at a workshop organized by the Brazilian Center for Learning on Evaluation and Results at the Getulio Vargas Foundation as part of the 2024 gLocal Evaluation Week.

Well, that’s the theory. In practice, spending review doesn’t seem to make much use of evaluation. Although they should in principle work closely together, spending review and evaluation are essentially separate processes.

How then can evaluation be made more useful to support spending reviews and budgeting generally? In considering this question, several points stand out.

The first point is that we need to increase the amount of systematic efficiency analysis. Efficiency analysis – that is, analysis aimed at identifying ways of lowering the costs of delivering government outputs – is particularly useful for budgeting. However, as discussed in my last blog piece, evaluation is not much good at efficiency analysis, even though it claims to cover both effectiveness and efficiency. But we shouldn’t get hung up on whether we call efficiency analysis “evaluation” or not. The important thing is to put more effort into the application of methods such as cost analysis and business process analysis. Doing so will yield big dividends for budgeting.

The second point is about how to make evaluations of effectiveness more useful for budgeting. Assessing effectiveness is what evaluation does best. But most effectiveness evaluations are not useful for budgeters. What they are most useful for is helping agencies to make policy and implementation changes that increase the effectiveness of government interventions. This is not to say that evaluations of effectiveness are never useful for budgeting. If an evaluation finds that a program is essentially ineffective, this may be of great relevance to the budget because it may lead to the decision to close it down. However, only a small minority of evaluations conclude that the program being evaluated is ineffective.

It is nevertheless possible to substantially increase the contribution that effectiveness evaluations make to budgeting. The most obvious way of doing this is by initiating more evaluations of programs that are suspected to be ineffective, and which are potential candidates for elimination. But effectiveness evaluations can also make other contributions to budgeting[4].

This leads us to the third point, which is about who decides what will be evaluated and what the criteria of the evaluation will be. The reality is that evaluation will never realize its potential as an instrument for informing budgeting unless ministries of finance have more control over the choice of topics and terms of reference of evaluations.

The distinction between self-evaluation and centralized evaluation is helpful here. Self-evaluation means that line ministries initiate evaluations and carry them out themselves. Centralized evaluation means that central agencies such as the ministry of finance initiate and manage evaluations of line ministry policies and programs. In the Anglo-Saxon countries, almost all evaluation is self-evaluation, whereas in certain other countries (e.g. Chile) most evaluation is centralized. One clear lesson from experience is that self-evaluations are in general of limited value for the ministry of finance or other central decision-makers – for reasons that will no doubt be obvious. It follows that if the ministry of finance is serious about making evaluation a tool that works for it, it is necessary to have a program of centralized evaluations under its control.

This is not to say that self-evaluation is a bad thing. If line ministries are serious about it – and not just undertaking evaluations as a compliance exercise – it has the potential to help them substantially improve their performance. The ideal evaluation system combines both self-evaluation and centralized evaluation.

Evaluation has, in most countries, the potential to make a much bigger contribution to budgeting than it does at present[5]. But action on multiple fronts will be required to realize that potential.

[1] As my colleague Jordi Baños-Rovira puts it, based on a review of OECD survey data, “public policy evaluation shows a very low level of integration with the government budget, playing a minor role in budget decisions.” (El reto de integrar la presupuestación pública con la evaluación de políticas.) I reached the same conclusion in my 2014 study Connecting Evaluation and Budgeting.

[2] An example of spending review driven by “dissatisfaction” with evaluation – more specifically, with its failure to deliver sufficient “useful information for budget decision-making” – is to be found in the Canadian province of Quebec (see Jacob et al, “Evaluation et révision des dépenses publiques” in Praxis de l’évaluation et de la révision des programmes publics).

[3] Most definitions of evaluation state that it involves “systematic” analysis, where systematic refers to the application of formal evaluation methods.

[4] For example, in assessing the extent to which transfer payments and other benefits are correctly targeted – i.e. going to the right people.

[5] Apart from efficiency and effectiveness, there is another evaluation criterion which is significant for budgeting – relevance. Irrelevant spending should by definition be eliminated. Assessing the relevance of spending is something that can generally be done as part of a spending review without using analytical methods which are part of the evaluation toolkit.

February 19, 2025

Evaluation: What about Efficiency?

Is efficiency analysis the poor relation in the current wave of evaluation reforms?

Governments around the world have been establishing government-wide evaluation policies; creating central units and task forces to promote evaluation; and encouraging or requiring ministries to build evaluation capability and carry out more and better evaluations. In these reforms, it is always said that the role of evaluation is to analyze both efficiency and effectiveness. In practice, however, the evaluation reform effort has concentrated overwhelmingly on improving the analysis of effectiveness. Only perfunctory attention has been given to evaluation as a tool for analyzing and improving the efficiency of government service delivery.

In mainstream public sector usage, “effectiveness” means the extent to which an output (such as a medical treatment or school education) achieves its intended outcomes[1] (such as lives saved or literacy). “Efficiency” is about the cost of outputs, so that “improving efficiency means government being able to spend less to achieve the same or greater outputs, or to achieve higher outputs while spending the same amount”[2]. Efficiency is not the same as “cost-effectiveness,” which is about achieving increased outcomes per dollar.

This is unsurprising given the nature of evaluation as a discipline. Although evaluation is always defined as the systematic analysis of both effectiveness and efficiency[3], its focus is in fact almost entirely on the use of social science methods to analyze effectiveness. The typical evaluation methods monograph[4] treats methods for evaluating effectiveness, such as impact evaluation, in great detail. But when it comes to the evaluation of efficiency it limits itself to mention of several economic evaluation methods[5] (methods which in fact analyze cost-effectiveness rather than efficiency[6] and which are of no value in identifying options for reducing the cost of service delivery). There is rarely any acknowledgment of efficiency analysis methods developed by accountants, management experts and others – such as various forms of cost analysis[7] and business process analysis. The implicit message is that evaluators can leave the efficiency analysis to others, while firmly maintaining their focus on effectiveness.

Improving efficiency is as important for governments as improving effectiveness. It is therefore a problem if the evaluation system claims the mandate for efficiency analysis but has neither the competence nor commitment to give this mandate the attention it requires.

There are in principle two ways of responding to this problem.

One is to rein in the pretensions of evaluation by limiting the mandate of the evaluation system to the evaluation of effectiveness[8] and certain related criteria. Efficiency analysis would then be left to others. After all, some advanced countries have made major progress in efficiency analysis without calling it evaluation, and have done so independently of any initiatives they may or may not have taken under the evaluation banner.

Excluding evaluation systems from the analysis of efficiency would have the advantage of clarity. But the disadvantages are obvious. Governments often want programs analyzed from both the efficiency and effectiveness perspectives, and when this is the case it is far better if the two perspectives are integrated.

The alternative approach is to develop an evaluation system which truly covers, and focuses equally on, both effectiveness and efficiency analysis. This would require that central evaluation units and task forces give as much attention to the promotion of methods of efficiency analysis as to methods of effectiveness analysis. It would mean that the organizational units charged with the evaluation function within spending ministries would have people with the right skills to undertake each form of analysis. Evaluation would need to be seen as a multi-disciplinary activity requiring not only people trained in evaluation as conventionally defined, but also people with the right skills from other disciplines including management accounting and business process design.

There are a few examples internationally, such as Chile, of government-wide evaluation systems that have made serious efforts to develop and apply at least some efficiency analysis methods. They are, however, the exceptions.

Which of these two alternative approaches makes the most sense is an issue worthy of further discussion. My feeling is that the answer varies between countries.

One final point: In my last blog piece, which focused on the evaluation of effectiveness, I stressed the importance of including within the evaluation methods toolkit practical evaluation methods as well as more complex “scientific” methods. By practical evaluation methods I mean methods that are less analytically complex, less data-intensive, lower-cost and quicker. The same point applies to efficiency analysis. The most sophisticated forms of cost analysis methods – such as output unit cost benchmarking – are (when applied in appropriate cases) very useful[9]. However, they are very demanding. By contrast, techniques such as business process analysis – the mapping of the processes whereby inputs are turned into outputs and analysis to identify options for streamlining those processes – are less complex, require only limited data, and do not require quite the same level of specialist skill to apply. To achieve the best results, we need to develop and promote a balanced toolkit of efficiency analysis methods.

[1] The evaluation literature makes a distinction between “outcomes” and “impacts,” whereas in mainstream usage the term “outcome” is used to cover both – as it is in this blog. The outcome/impact distinction is arguably artificial because there is no clear dividing line between the two. This is why, in mainstream usage, the outcome/impact distinction is replaced with a reference to lower-level and higher-level outcomes (or equivalent terms).

[2] UK National Audit Office (2021), Efficiency in Government. (Note that accountants sometimes break this concept of efficiency into two components – “economy” and “efficiency.”)

[3] This is made explicit in many of the definitions of evaluation (e.g. that of the 2018 US Evidence Act: “the assessment using systematic data collection and analysis of one or more programs, policies, and organizations intended to assess their effectiveness and efficiency”). But even where this isn’t explicit in the definition, effectiveness and efficiency are identified as “evaluation criteria.”

[4] For examples see Rossi, Lipsey and Henry’s leading textbook Evaluation: A Systematic Approach, Anne Reveillard (ed) Policy Evaluation, the United Nations Evaluation Group’s Compendium of Evaluation Methods and the EvalCommunity webpage on “efficiency” evaluation.

[5] Cost-benefit analysis and cost-effectiveness analysis (and sometimes also data envelopment analysis).

[6] The evaluation literature generally fails to distinguish between cost-effectiveness and efficiency and generally uses the term “efficiency” to mean cost-effectiveness. (The influential OECD/DAC evaluation criteria quite idiosyncratically choose to define efficiency to mean both cost-effectiveness and efficiency as conventionally understood.) The divergence of the evaluation lexicon from mainstream public sector terminology is unfortunate, as it is inevitably a source of confusion.

[7] In Rossi et al’s Evaluation: A Systematic Approach, “cost analysis” is mentioned briefly but in the sense only of obtaining information on how much money is spent on the program being evaluated.

[8] This is in effect what INTOSAI proposes in its guidelines on Evaluation of Public Policies, which define evaluation as exclusively concerned with aspects of effectiveness (the achievement of impacts), and exclude efficiency analysis.

[9] Bearing in mind the limitations on the applicability of unit output cost analysis, as discussed in my series of blogs on unit cost budgeting.

February 12, 2025

Evaluation: Learning from Past Mistakes

A wave of enthusiasm for evaluation has been sweeping governments around the world in recent years. In country after country, governments are acting to increase the amount of evaluation undertaken and ensure that it is used more. Government-wide evaluation policies have been developed, and specialized units and task forces established to promote, guide, and regulate evaluation. Some governments have also passed laws enshrining the obligation to evaluate. All of this is very positive.

There is, however, a risk that much of this will end in disappointment unless we learn from the repeated past failures of government-wide evaluation reforms. I say this as a strong supporter of evaluation who has advised several countries on building government-wide evaluation systems.

One key lesson from past experience is that a prominent place in the evaluation toolkit must be assigned to practical evaluation methods. This means methods that are low-cost, can be implemented quickly, cope with limited data, are action-oriented, and systematic without being so complex that a PhD is required to undertake them. Evaluation policies and units must actively promote – and provide technical guidance on – such methods.

While evaluation might sound like a new frontier in public sector reform, it is anything but. There have been successive waves of government-wide evaluation efforts in advanced countries since the 1960s. Almost all of them petered out in disappointment. This is because, throughout its long history, evaluation has been dogged by a well-documented set of problems. Evaluations have frequently been “too costly and time-consuming compared to their real use and effect”. They have all too often been short on actionable findings/recommendations, difficult for decision-makers to understand, and taken too long to produce to feed into decision-making at the time when they are needed [1]. These are problems that remain even today.

The record is clear: one of the biggest sources of the difficulties that have afflicted evaluation over the decades has been the insistence by an influential part of the evaluation profession that all evaluation must be scientifically “rigorous” — i.e. based on sophisticated social science statistical analysis techniques.

This purist attitude is riding high today. For today’s purists, “rigorous” evaluation essentially means impact evaluation [2] and – for the real hard-liners –  only impact evaluation using randomized controlled trials. Those who take this view are dismissive of other methods, which they regard as insufficiently rigorous to even qualify as evaluation.

Impact evaluation is a great tool for the evaluation of some government programs. It has made considerable progress over past decades, assisted by both methodological advances and digitalization. It is unquestionably one of the evaluation methods that should be part of the analytic toolkit of a well-developed evaluation system.

However, impact evaluation also has major limitations. There are many government interventions for which its use is either not possible [3], not practical, or not cost-effective. It is highly data-intensive, time-consuming and costly – typically taking several years at a cost of hundreds of thousands of dollars. Impact evaluation findings are also much less reliable (“externally valid”) than is often claimed — even when they are obtained using randomized controlled trials [4]. Moreover, while impact evaluation can provide information on whether an intervention is effective or not, it provides no guidance on why it is or is not effective and what might be done to improve its effectiveness.

This is why practical evaluation is so important. For the analysis of effectiveness, so-called theory-based evaluation — variants of which include contribution analysis — is a particularly important type of practical evaluation. The core of theory-based evaluation is the assessment of the conceptual credibility of the reasoning about how a program is supposed to achieve its intended outcomes – what is often called “program logic analysis”. This is complemented by a review of whatever data is available and, optionally, some additional data collection. Such evaluation can be carried out in months rather than years, at modest cost, and can be applied to any program. The method is systematic — as analysis must be to meet the definition of evaluation — but very practical.

The fact that, whatever the country, there are limited financial and specialist human resources available for evaluation reinforces the importance of this type of practical evaluation. But this is particularly true in developing countries setting out to develop evaluation systems.

All of this is about the evaluation of effectiveness. But what about efficiency? After all, evaluation usually claims to cover the analysis of both. The evaluation of efficiency will be the topic of the next piece in this series.

[1] I reviewed this historical experience in a 2014 paper prepared for the World Bank Independent Evaluation Group.

[2] An impact evaluation is an analysis of the effectiveness of a government intervention (i.e. its outcomes) using counterfactual analysis based on experimental or quasi-experimental techniques.

[3] For example, if the outcome that the program seeks to achieve is not measurable.

[4] There is considerable literature outlining the reasons why, as Deaton and Cartwright put it, “any special status for RCTs is unwarranted” (See also, e.g., Pritchett, 2021).

December 10, 2024

Budgeting Lessons from France

France’s budgetary position is disastrous, and the country appears to be moving inexorably towards a fiscal crisis like that which brought Greece to its knees over a decade ago. The land of Molière presents one of the extreme cases of failure to implement fiscal consolidation after the large-scale fiscal stimulus during the COVID-19 pandemic. But the problem is not new. Successive French governments have for decades failed to contain the upward trajectory of public debt.

The reason for France’s chronic fiscal incontinence is first and foremost political, lying in the scale of popular resistance to any significant cut to government expenditure. But does the budgeting system also play a role? This is a question that the French supreme audit institution – the Cour des Comptes (CdC) – set out to answer in a recent report on The Preparation and Execution of the State Budget. The report is excellent, and presents lessons that are highly relevant to other countries. However — as discussed below — one of its main recommendations is open to debate.

The CdC zeros in on the absence of a sufficiently top-down budgeting system. In other words, the French budget preparation process is not adequately designed to ensure that annual budgets respect aggregate expenditure ceilings. More specifically, it is not sufficiently focused on the identification and management of available fiscal space*.
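Schematically – this is my own simplified rendering of the concept, not the CdC’s formulation – the fiscal space available for allocation under a top-down process is:

\[
\text{available fiscal space} = \text{aggregate expenditure ceiling} - \big(\text{baseline expenditure} - \text{savings in the baseline}\big)
\]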

A key problem here, according to the CdC analysis, is the failure to reliably determine the expenditure baseline early in the budget preparation process. This reflects in part the lack of a sufficiently well-established baseline estimation methodology. But it also results from the fact that, although the baseline is in principle supposed to be worked out in the first stage of the French budget preparation process (the “preparatory” stage), in practice spending ministries are allowed to continue to debate the baseline – and to take these debates to the political level – during the next (“arbitrage”) stage when the focus is supposed to be exclusively on the allocation of available fiscal space.

The CdC also finds that the budgeting system has been ineffective in identifying and implementing savings in baseline expenditure. It acknowledges here that the government in 2023 introduced a new spending review mechanism  which explicitly aims to identify major savings options, and which in principle covers all government expenditure. However, this new spending review system suffers, in the CdC’s view, from major institutional defects — including a lack of leadership at the top political level, bureaucratic separation from the budget directorate, and insufficient integration with the budget calendar.

Overall, the CdC considers the French budget system to be too bottom-up. There are insufficient constraints on spending ministries proposing new spending, and a lack of incentives for cooperative behavior from ministries in achieving overall government fiscal objectives.

The report makes a range of valuable recommendations to address these problems. These include action to ensure that in future the estimation of the baseline is completed in the first stage of the budget preparation process, so that the process can then move on to focus on the allocation of fiscal space available in the context of strict respect for the aggregate expenditure ceiling. This accords with what would be widely acknowledged as good top-down budgeting practice.

So far so good. The CdC goes, however, one step further and argues that ministry expenditure ceilings should be set early in the budget preparation process. It finds fault with the fact that the budget framework letter (“lettre de cadrage” – approximately speaking, the budget call circular) issued to spending ministries by the Prime Minister at the end of the “preparatory” stage of the budget preparation process does not include ceilings for each spending ministry, and recommends that henceforth it should do so. This is the headline recommendation of the report.

To understand the significance of this recommendation, the thing to bear in mind is that it is only in the subsequent (“arbitrage”) stage of the process that spending ministry proposals for new spending are supposed to be discussed. In essence, then, the CdC is proposing that ministry ceilings should be set prior to the formal discussion of new spending proposals.

This recommendation clearly reflects the influence of the “hard ministry ceilings” (HMC) doctrine which I have previously critiqued at length. HMC asserts that the appropriate way of enforcing aggregate expenditure ceilings is to split them into ministry ceilings in the budget call circular before spending ministries are permitted to formally present new spending proposals (or savings proposals, for that matter). In its pure form, this doctrine further demands that these ceilings should be hard, with spending ministries under no circumstances being granted additional funds during the rest of the executive budget preparation process. This tough approach is, it is claimed, the best way to contain excessive bottom-up spending pressures.

The HMC doctrine has many flaws. Perhaps the most important is that setting hard ceilings without any prior formal consideration of new spending proposals would undermine the basic budgeting objective of rational resource allocation (allocative efficiency).

Perhaps aware of this problem, the CdC shies away from the pure version of HMC by recommending that what the PM’s budget framework letter should contain are what it calls “pre-ceilings” (pré-plafonds) for each ministry. While no definition of “pre-ceilings” is provided, the only possible interpretation of the CdC’s line of argument is that it means ceilings which can be varied during the later stages of the budget preparation process.

The obvious difficulty with this is that a “ceiling” which can subsequently be varied is no ceiling at all, and cannot serve the function of enforcing expenditure discipline.

The CdC suggests that what it is proposing is standard practice elsewhere, citing Sweden as an example. However, most advanced countries do not set ministry ceilings at an early stage of the budget process. Moreover, although Sweden and a few other countries (e.g. Germany) set early ministry ceilings**, none of these countries (to the best of my knowledge) set these ministry ceilings without having first permitted spending ministries to formally present new spending proposals and have these properly discussed. To be able to set these ministry ceilings early, they have created processes to negotiate major new spending proposals even earlier in the budget preparation process.

If France were to follow the lead of Sweden and Germany and set ministry ceilings at an early stage of the budget preparation process, it would have to bring forward the presentation and negotiation of new spending proposals to an even earlier stage, prior to the setting of these ministry ceilings. The problem here is that in a top-down budgeting system decisions on new spending proposals cannot be made without first determining baselines and estimating available fiscal space. It is only because they have the technical capacity to determine the baseline right at the outset of the budget preparation process that countries like Sweden and Germany can move at such an early stage to set ministry ceilings. This is something which the CdC analysis indicates that France is currently unable to do, and is unlikely to be able to for some years to come. It follows that the CdC recommendation for the early setting of ministry ceilings in France is impractical at present.

Be this as it may, the Cour des Comptes report is excellent and offers much in the way of technical tools to help France restore order to its public finances. Regrettably, however, technical solutions cannot resolve the underlying political problem.

* Recall that fiscal space – the “marge de manoeuvre” as the French call it – is the difference between the aggregate expenditure ceiling and baseline expenditure.

** Or, in Sweden’s case, “expenditure area” ceilings.

Note: France also sets supposedly fixed two-year ceilings for multi-ministry policy areas (“missions”), but as the CdC points out these are not binding on the annual budget and have frequently been exceeded.


March 21, 2024

Does PFM need to be rebuilt?

Contemporary public financial management (PFM) is, according to an emerging school of thought, in a parlous condition and needs a complete rebuild. I disagree.

An influential recent formulation of the views of this school of thought can be found in the 2020 report Advice, Money, Results (AMR). This report issues a challenge to contemporary PFM practice that cannot be left unanswered.

Before starting, we need to acknowledge that there is undoubtedly room for improvement, including in the PFM technical assistance provided to developing countries. But the fact remains that PFM has over past decades developed an impressive toolkit of highly relevant good practice models for fiscal policy formulation, budget preparation, budget execution, public investment management, debt management and a range of other areas. Implementation strategies have also progressed enormously.

To the new critics, however, a recognition of room for improvement is not enough. They believe that PFM is rotten from the foundations up and needs complete rebuilding.

The foundations of modern PFM lie in three basic objectives of public finance – formulated by the World Bank in the 1990s – which PFM aims to support, namely:

1.      Aggregate fiscal discipline

2.      Allocation of resources in accordance with strategic priorities

3.      Efficient and effective use of resources in the implementation of strategic priorities.

The new critics believe that this formulation of public finance objectives is flawed and needs to be completely reworked. They consider that many of the good practice institutional models that PFM has developed are inappropriate. Many of the critics go even further and challenge the basic idea of good practices, as embodied in instruments such as the Public Expenditure and Financial Accountability (PEFA) assessment tool. They argue that national contexts are so enormously variable that few generalizations can be made about good practice.

In a just-completed working paper, I review Advice, Money, Results in detail and find it to be an unpersuasive document. AMR presents a critique which claims that PFM neglects expenditure reprioritization, is biased towards fiscal austerity, and is value-laden and neoliberal. None of this is true. AMR also presents an assessment of the standards-based approach to PFM – as embodied in PEFA – which is too negative and fails to sufficiently credit the benefits of the approach.

AMR contrasts what it refers to as the current “closed” PFM discipline with a new “open” PFM which it favors. However, the nature of this so-called “open” PFM is vague and lacking in operational specificity. The main recommendations of the report are for more review, evaluation and data-gathering, and there is nothing concrete about how we should change our advice or implementation strategies. The result is a report which is of very limited value as a guide to practice.

Advice, Money, Results is, unfortunately, a missed opportunity. It misses the opportunity to provide a credible and informed critique of what’s wrong with current PFM practice and to present a constructive way forward.

PFM is certainly not perfect. In my view – which I know is shared by a large number of PFM practitioners – there are two major areas that require ongoing work. One is refinement of the technical design of the institutions in the PFM toolkit: as good as this is overall, there are certain areas where bad ideas prevail or where the hobbyhorses of particular disciplines (e.g. accounting) have too much influence. The other is better tailoring of institutional design to the circumstances of particular developing countries. For example, while medium-term budgeting is indeed good practice that most countries in the world should adopt – at least, those that already have reliable short-term budgeting – the concrete form that it takes should be significantly different in most developing countries from that in advanced countries.

Recognizing that there is room for improvement should not, however, blind us to the considerable achievements of past decades in building an impressive toolkit of institutions and implementation strategies. We should beware of “throwing the baby out with the bathwater” when thinking of future directions for public financial management. 

Read the working paper Radically Reshaping PFM? A Review of “Advice, Money, Results”


November 21, 2023

“Government Analytics” vs M&E

To what extent can digitalization give us what we need to properly measure government performance? What follows is the last in a series addressing this question, as well as a closely associated question: should we shift our focus from the monitoring and evaluation of performance to government analytics?

Although digitalization is a valuable tool for improving government performance measurement, there are – as outlined in the preceding pieces – major limits to its potential contribution. In particular, the effectiveness of government expenditure can only to a limited extent be measured using administrative data (data on activities carried out and other information routinely collected during service delivery). Outcome and output quality indicators need to draw substantially on data sources beyond administrative data, including surveys, physical sampling, testing, official statistics, client interviews, and expert quality assessments. The biggest contribution digitalization can make lies in indicators of outputs and intermediate services. Even here, however, there are significant limits to this contribution – such as for measuring efficiency*.

We are left with the conclusion that, as valuable as it is, the administrative data to which digitalization gives us easier access is only one of the sources of data required for good public sector performance measurement.

When you think about it, this point is obvious. So why am I making it?

My immediate inspiration comes from having just read the World Bank’s new Government Analytics Handbook. The Handbook is a mine of valuable information and analysis. Nevertheless, there are a few aspects that trouble me. By “government analytics,” the Handbook means the analysis of data from digitalized government business processes, supplemented by some use of surveys. It asserts that government analytics based on these data sources can provide public managers and other stakeholders with real-time performance dashboards that cover all dimensions of performance from inputs through activities and outputs to outcomes. This leads to the recommendation that all government agencies establish government analytics units to carry out this analysis and provide the dashboards.

I’ve got two problems with this. The first is that it is, in my view, inappropriate to discuss government performance measurement with such a narrow focus on two data sources – and with a primary emphasis on administrative data – thereby disregarding the importance of other data sources. This makes me, incidentally, unsympathetic to the idea that we should shift our terminology from “performance measurement” to “government analytics.”

The other thing that worries me is the exclusive emphasis on performance measurement – monitoring – and the failure to acknowledge the crucial role of evaluation. Performance measures are valuable, but often not sufficient. The techniques by which these measures are analyzed – including formal evaluation – are extremely important. The importance of evaluation in government is a cause that the World Bank has long espoused, including in the form of advice to governments to establish “monitoring and evaluation” (M&E) systems. Organizationally, this advice translates into recommendations that government organizations have dedicated M&E units with a broad mandate – as opposed to government analytics units with a narrow focus on the analysis of administrative microdata.

The “government analytics” approach looks to me far too much like an attempt to copy private-sector business analytics into the public sector. Business analytics are fine as far as they go. But they are not enough, precisely because government is not the same as the private sector. In the first place, outcomes matter to government, whereas customer satisfaction – which is not the same thing – is essentially the only thing that matters to businesses. In the second place, many government outputs are not delivered to specific clients/customers, but to the community as a whole. These two fundamental facts make public sector performance measurement significantly different from performance measurement in the private sector. While it is useful to learn from good private-sector practice in performance measurement, simply copying private-sector approaches is not the way to go.

None of this removes my enthusiasm for exploiting digitalization to the full to help improve government performance measurement. But this needs to be part of a much broader M&E strategy.

*Thus, administrative data will provide information on output per unit of labor where staff are devoted exclusively to the delivery of a single type of output. Where staff are involved in delivering multiple types of outputs, however, administrative data will not typically record how their time is allocated between those outputs. To measure output per unit of labor, it is then necessary to introduce time records – i.e. records in which staff record how much time they spend on each of the multiple outputs they help deliver.
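
To illustrate this point, the following minimal Python sketch (with invented figures and a hypothetical data layout, not any official methodology) shows how time records make it possible to apportion total staff hours across outputs and so compute output per unit of labor:

    # Hypothetical illustration: all figures and categories are invented for this sketch.
    outputs_delivered = {"inspections": 1200, "permits": 900}   # counts from administrative data

    # Administrative systems typically record total staff hours, not how they are split across outputs.
    total_staff_hours = 10000

    # Time records (staff-reported shares of time per output) supply the missing allocation.
    time_shares = {"inspections": 0.6, "permits": 0.4}

    for output, count in outputs_delivered.items():
        hours_on_output = total_staff_hours * time_shares[output]
        print(f"{output}: {count / hours_on_output:.3f} units per staff hour")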
