Lies, Damn Lies and Benchmarks: An Injunction for Trustees

Oct 19, 2022 by Richard M. Ennis

Public pension funds and many endowment funds periodically report their investment performance publicly. They accomplish this by comparing the return achieved with that of an ostensibly relevant benchmark, one that supposedly reflects the return of a comparable investment. They use benchmarks of their own devising, typically referred to as strategic (or custom) benchmarks. Most exhibit significant benchmark bias, meaning the chosen benchmarks underperform ones that, in fact, better represent a fair economic return given observed market exposures and risk characteristics. As a result of benchmark bias, the majority of funds give the impression they are performing favorably compared to passive management, when, in fact, they are underperforming by a wide margin. Benchmark bias masks serious agency problems in the management of institutional funds. Investment trustees must step up and take control of benchmarking and performance reporting.





            Purpose of Benchmarking


            The topic of benchmarking institutional portfolios is so muddled that one scarcely knows where to begin in addressing it.[1] I start by describing the purpose of benchmarking multi-asset-class portfolios. If we can’t agree on that, we are unlikely to be able to agree on anything. The paramount purpose of benchmarking is to determine whether a portfolio has benefited from active investment management. This is the focus of what follows.


            Passive Management: the Logical Benchmark Framework


            Passive management is the term used to describe holding capitalization-weighted, market-index-based portfolios whose cost is negligible. Passive management across principal asset classes of marketable securities has been commercially available at next to no cost for more than 30 years. I have found that broad indexes for US stocks, non-US stocks and investment-grade US bonds cover the waterfront of capital market opportunity for US institutions.[2] Research shows that the overwhelming majority of individual active managers underperform passive management over periods of a decade or more.[3] My own work indicates that multi-asset-class institutional funds that rely primarily on active management have underperformed passive investment alternatives by the full margin of their cost.[4] Passive management enables institutional investors to achieve market-like performance at next to no cost. This makes an appropriate passive-management formulation, i.e., a fitting combination of stock and bond indexes, a natural benchmark by which to evaluate the efficacy of active investing for institutional investors. I refer to such a formulation as the Passive Benchmark (PB).





            The Passive Benchmark


            Design of the PB should reflect the tastes of trustees of institutional portfolios. In the field of economics, as it relates to investing, “taste” refers to investor preferences for (1) the level of risk undertaken, (2) liquidity and (3) domestic investments, to name a few. In terms of risk tolerance, public pension fund trustees have exhibited a preference for maintaining average equity exposures that cluster in the vicinity of 72%.[5] For trustees of large endowments, the figure is approximately 84%.[6] Both types of institutions, as a matter of practice, essentially limit their non-tactical investment-grade bond holdings to the US market, reflecting a clear preference for dollar-denominated bond returns. Both types exhibit a decided home bias for equity investments, as do institutional and individual investors globally. Home bias is a universal phenomenon of long standing with no signs of abating.[7]


            I sought to identify implicit PBs for 46 institutional funds based on historical returns. These are the fixed proportions of market indexes that best represented the funds’ average asset allocation over time. Using returns-based style analysis (RBSA), I inferred PBs for 24 large public funds and 22 large endowments based on their return behavior relative to market indexes during the 13-year period ended June 30, 2021.[8] The market indexes employed are Russell 3000 stocks, MSCI ACWI ex-US stocks, and Bloomberg US Universal bonds. The average implied PB asset allocations for the two fund types are shown in Exhibit 1.[9]
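
            To make the RBSA idea concrete, here is a minimal Python sketch of the inference step. It is not the procedure used in the study: Sharpe's method is solved by quadratic programming, whereas this illustration substitutes a brute-force grid search over long-only weights that sum to 100%, which is workable with only three indexes. The objective (minimizing the variance of the fund-minus-mix return difference) is an assumption about the formulation, and the function name, the 1% grid step and every return figure are hypothetical.

```python
from itertools import product


def rbsa_grid(fund, indexes, step_pct=1):
    """Find long-only index weights (summing to 1) that minimize the
    variance of the fund-minus-mix return difference.

    fund     -- list of periodic fund returns
    indexes  -- list of index return lists, each the same length as fund
    step_pct -- grid spacing in whole percentage points (hypothetical default)
    """
    units = 100 // step_pct  # number of grid units that make up 100%
    n_obs = len(fund)
    best_weights, best_var = None, float("inf")
    # Enumerate every way to allocate `units` across the indexes.
    for head in product(range(units + 1), repeat=len(indexes) - 1):
        tail = units - sum(head)
        if tail < 0:
            continue  # weights would exceed 100%
        weights = [u / units for u in head + (tail,)]
        diffs = [
            fund[t] - sum(w * idx[t] for w, idx in zip(weights, indexes))
            for t in range(n_obs)
        ]
        mean = sum(diffs) / n_obs
        var = sum((d - mean) ** 2 for d in diffs) / n_obs
        if var < best_var:
            best_weights, best_var = weights, var
    return best_weights, best_var


# Hypothetical periodic returns for three indexes (made-up numbers).
us_stocks = [0.04, -0.02, 0.07, 0.01, -0.05, 0.03, 0.06, -0.01]
non_us_stocks = [0.02, -0.04, 0.05, 0.03, -0.06, 0.01, 0.04, 0.02]
us_bonds = [0.01, 0.02, -0.01, 0.00, 0.03, 0.01, -0.02, 0.01]

# Hypothetical fund: a 50/20/30 mix of the indexes less a constant cost.
fund = [
    0.5 * a + 0.2 * b + 0.3 * c - 0.001
    for a, b, c in zip(us_stocks, non_us_stocks, us_bonds)
]

weights, residual_var = rbsa_grid(fund, [us_stocks, non_us_stocks, us_bonds])
print([round(w, 2) for w in weights])
```

            Because the made-up fund is an exact index mix less a constant cost, the search recovers the 50/20/30 weights. Real fund returns would leave a nonzero residual, which is what the tracking error statistic measures.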


Exhibit 1

Average Allocation of Implicit Passive Benchmarks by Investor Type, Developed by Means of RBSA

[Table values not preserved. Surviving column: Public Funds. Surviving rows: US Equities; Non-US Equities; US Bonds.]

            Looking forward, the PB is intended to reflect how trustees, in the absence of insights about the future, would embrace the capital markets (1) as a matter of policy and (2) as a function of taste in delegating investment management of institutional assets. The PB does not incorporate a strategic outlook or any active investment bets. It is only revised when fundamental aspects of taste change. The PB is passively investable and fully transparent, making it feasible.[10]
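
            As an illustration of what “passively investable and fully transparent” means in practice, the following Python sketch computes the return of a fixed-weight benchmark from index returns. The weights, the index labels and the return figures are all hypothetical, and rebalancing back to the fixed weights every period is a simplifying assumption.

```python
def blend(weights, index_returns):
    """Per-period return of a fixed-weight index mix, rebalanced every period."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("benchmark weights must sum to 1")
    n_periods = len(next(iter(index_returns.values())))
    return [
        sum(weights[name] * index_returns[name][t] for name in weights)
        for t in range(n_periods)
    ]


def annualized(returns):
    """Geometric average of a series of annual returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(returns)) - 1.0


# Hypothetical fixed weights and annual index returns (made-up numbers).
pb_weights = {"us_stocks": 0.50, "non_us_stocks": 0.20, "us_bonds": 0.30}
annual_index_returns = {
    "us_stocks": [0.12, -0.04, 0.18],
    "non_us_stocks": [0.08, -0.07, 0.11],
    "us_bonds": [0.03, 0.05, 0.01],
}

pb_returns = blend(pb_weights, annual_index_returns)
print([round(r, 4) for r in pb_returns])
print(round(annualized(pb_returns), 4))
```

            The benchmark is fully specified by the weights and the published index returns, so anyone can reproduce it, and nothing in it depends on what the fund actually holds.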


            Few institutional investors report their performance relative to a pure PB. Rather, most use what is commonly referred to as a strategic (or custom) benchmark (SB). These benchmarks originated long ago in performance attribution. As fund managers began to depart from the PB in the implementation of investments, some began to include a provision for their strategic choices, such as the inclusion of real estate assets in the portfolio, in an alternative tracking series. Comparing the return of the SB with that of the PB would reveal whether strategic, i.e., outlook-based, decisions in the investment office were adding value relative to the PB. This is a reasonable attribution framework, in principle at least. As employed traditionally, the attribution framework was more a matter of internal diagnostics than reporting to the outside world. Its original intent was not to supplant the PB, but to augment it for insight into the merit of strategic decision-making and execution.


            Strategic Benchmarks: A Black Art


            As practice evolved, PBs have largely gone by the wayside in public reporting. They have given way to a new breed of SBs, which are often highly customized to fit portfolio circumstances. Such benchmarks may incorporate numerous asset classes, including private market investments and other active strategies (which immediately defeats the benchmark's purpose of evaluating the contribution of active management). Asset classes themselves may have several sub-components, making SBs complex. There are no standards or guidelines for the selection of market indexes. Index returns for alternative investments are typically nebulous, merely representing the past outcomes of a select group of investors reporting in databases like those of Cambridge Associates, Preqin or NCREIF. Reported private equity returns may be internal rates of return (IRRs), which don’t blend easily with time-weighted returns and are subject to manipulation.[11] There is often no explanation in public reporting for the substitution of one index for another. The nature of asset classes and sub-components is often “customized,” usually without explanation of what the customization entails or why it occurs. Sometimes component benchmarks are expressed in non-investable terms, such as “CPI +3%.” Some funds use the actual recorded returns of private market investments in calculating benchmark returns, which tells us nothing about their impact on performance. SBs tend to be updated frequently, sometimes several times a year; the effect is to cause them to conform to actual portfolio exposures over time. (The appendix contains a case study illustrating this phenomenon.) SBs are invariably subjective in their construction, often complex, ambiguously customized, fluid in composition, opaque, and all but indecipherable to readers of financial reports. The fund’s CIO and/or consultant are responsible for the design and maintenance of SBs. Sometimes investment staff members are paid bonuses for outperforming the SB. Having the CIO and/or consultant, who are responsible for designing and implementing the investment program, also do the benchmarking and reporting is a clear conflict of interest and a sign of weak governance. As a gauge of financial performance, SBs are an economist’s worst nightmare.





            In “Cost, Performance, and Benchmark Bias of Public Pension Funds in the United States: An Unflattering Portrait” (Ennis 2022a), I analyzed the primary performance benchmarks used by 24 large public funds in their public reporting. These are benchmarks of the funds’ own devising. As far as I could tell, none of them were of the pure PB type described above. In the course of the analysis, I identified significant bias in the returns of benchmarks used by the funds I studied. By this I mean the principal benchmarks used in public fund reporting have generally produced rates of return that are significantly less than those of benchmarks that, in fact, represent a fair economic return given the funds’ market exposures and risk statistics.


            I devised a unique benchmark for each of the 24 public pension funds using RBSA. The RBSA benchmarks are static combinations of US and non-US stock indexes, as well as the aggregate US bond index, in the general style of PBs. They serve as proxies for fund PBs for the purpose of this analysis. Then I compared the rate of return of those empirically determined benchmarks to the return of the benchmark each fund reported in its annual report for the 10 years ended June 30, 2020, in order to determine benchmark bias. Exhibit 2 shows the results, with funds ranked in order of benchmark bias (from least to greatest). Benchmark bias averaged 1.7 percentage points a year for a decade. Only one fund did not exhibit benchmark bias. Nineteen of the 24 exhibited bias of greater than one percentage point. One fund exhibited benchmark bias of nearly 4.5 percentage points per year. Benchmark bias is significant and pervasive.


            In analysis from Ennis (2022a) not reported here in detail, the 24 funds reported outperforming their custom benchmarks by an average of 0.4 percentage points per year for the decade. In fact, they underperformed their RBSA benchmarks by an average of 1.3 percentage points annually. Thus, the funds give the impression that they performed favorably relative to passive management when, in fact, they underperformed passive by a wide margin.
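
            The link between these figures and the 1.7-point average bias can be shown with simple arithmetic. The sketch below assumes that bias is measured as the RBSA benchmark return minus the reported benchmark return and that the averages can be differenced directly; the function name is hypothetical.

```python
def benchmark_bias(excess_vs_reported, excess_vs_passive):
    """Bias = passive (RBSA) benchmark return minus reported benchmark return.

    excess_vs_reported = fund return - reported benchmark return
    excess_vs_passive  = fund return - passive benchmark return
    Subtracting the second from the first cancels the fund return and
    leaves passive benchmark return - reported benchmark return.
    """
    return excess_vs_reported - excess_vs_passive


# Averages from the text, in percentage points per year:
# +0.4 versus the funds' own benchmarks, -1.3 versus the RBSA benchmarks.
public_fund_bias = benchmark_bias(0.4, -1.3)
print(round(public_fund_bias, 1))  # 1.7
```

            The fund's own return drops out of the calculation, which is why bias is a property of the benchmark rather than of the portfolio.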


            More generally, I have estimated that large public pension funds have, on average, underperformed passive management since the Global Financial Crisis of 2008 (GFC) by 1.2 percentage points per year, and in 12 years out of 13. Their margin of underperformance approximates their independently estimated average expense ratio.[12] This finding is consistent with the finance dictum that, given reasonably efficient markets, diversified portfolios can be expected to underperform properly constructed benchmarks by the margin of their cost.







Exhibit 2

Benchmark Comparison: Public Pension Funds

(10 years ended June 30, 2020)

[Table values not preserved. Surviving column: Pension Fund. Surviving rows, in their original order: Iowa (PERS); California (STRS); North Carolina; Texas (Teachers); N.Y. State (Teachers); California (PERS); Rhode Island; Ohio (School Employees); Vermont (Teachers); Illinois (SERS); New Jersey; South Carolina; Pennsylvania (School Employees); Illinois (TRS); Arizona (SRS); New Mexico (PERA); South Dakota; Missouri (SERS).]

            I performed the same analysis for 22 large endowments. Exhibit 3 shows the results. The average benchmark bias of the endowments is 1.4 percentage points for the same 10-year period. Only three funds did not exhibit benchmark bias. Thirteen of the 22 exhibited bias of more than one percentage point per year. One fund exhibited benchmark bias of nearly 3.4 percentage points per year.


            The 22 endowments reported outperforming their custom benchmarks by an average of 0.1 percentage points per year, when in fact they underperformed their RBSA benchmarks by an average of 1.3 percentage points during the decade. Thus, like the pension funds, the endowments give the false impression that they performed favorably relative to passive management when, in fact, they underperformed by a significant margin.


            More generally, I estimate that large endowments, as represented by the NACUBO large fund composite, have underperformed passive management over the 13 years since the Global Financial Crisis of 2008 (GFC) by as much as 2.5 percentage points per year. Their average margin of underperformance approximates their independently estimated average expense ratio of about 2.5% of asset value.[13]


Exhibit 3

Benchmark Comparisons: Endowments

(10 years ended June 30, 2020)

[Table values not preserved. Surviving rows, in their original order: University of North Carolina; North Carolina State; University of California; UCLA Foundation; University of Michigan; University of Rochester; Penn State; University of Southern California; University of Chicago; University of Virginia; Notre Dame; University of Georgia.]

            I note in passing that research also reveals evidence of benchmark bias in mutual fund performance reporting. Cremers, Fulkerson and Riley (2019) use the terms “benchmark mismatch” (or “discrepancy”) to describe a phenomenon akin to my benchmark bias. They use a holdings-based method to identify a benchmark that closely mimics each mutual fund’s return history. (Holdings-based benchmarks are conceptually similar to RBSA benchmarks in that they also attempt to identify the return drivers of the portfolios, albeit at a more granular level and at a particular point in time.) CFR then compare their benchmark return for each mutual fund to that identified in the fund’s prospectus as deemed being appropriate by the fund manager. Sixty-seven percent of the funds studied exhibited a meaningful benchmark discrepancy. In reporting their results, the authors concentrate on just the funds that did exhibit a discrepancy. Here is a key finding:


Funds with a benchmark discrepancy outperform their prospectus benchmarks on average by 0.66% per year, but underperform their [holdings-based] benchmarks by 0.84% per year. Likewise, using prospectus-benchmark-adjusted returns, funds with a benchmark discrepancy outperform those without a benchmark discrepancy on average by 1.04% per year. If instead [holdings-based] benchmarks are used for funds with benchmark discrepancies, the difference in performance between the two groups is negligible. Taken together, a primary conclusion of our study is that, among funds with a benchmark discrepancy, the prospectus benchmark significantly overstates ex post performance [pp. 3-4].


            For funds exhibiting a benchmark discrepancy in CFR (2019), the average measure is 1.5 percentage points per year, 1.5 being the sum of 0.66 and 0.84 from the paragraph above. In other words, benchmark bias also exists among mutual funds, and its magnitude is similar to that observed for public pension funds and endowments.


            In sum, benchmark bias is prevalent among all three types of institutional portfolios discussed here. It looks like fudging benchmarks has become de rigueur in the field of institutional investing. This is discomfiting, to put it mildly.





            For years we have been witnessing the waning of active money management. Despite the growth of alternative investments, active money management has lost significant market share to index funds. Active managers have also experienced fee compression (at least in public markets and for hedge funds). Investors that do their own manager selection, e.g., mutual fund investors, have steadily shifted assets from active management to passive, and now have a majority of their assets in index funds.[14] Those investors have realized that active money management has long been a losing proposition.[15] On the other hand, institutional investors, such as public pension funds and endowments, have been loath to pick up on the realities of modern capital markets. Since the GFC, they have consistently underperformed passive management by the margin of their substantial costs when evaluated against fair benchmarks. The cost of equity mutual fund investing, with the shift from active to passive, has decreased from 104 bps of managed assets to 50 bps over the past 24 years.[16] Meanwhile, public pension funds and endowments, remaining ardently active, have paid more and more. I estimate that during the last two decades, large public fund expense ratios have approximately doubled, increasing from about 60 bps to about 120 bps.[17] Large endowments spend twice as much as public funds owing to their much greater use of alternative investments, which cost about 10 times as much as traditional ones.[18] What accounts for the behavior of top-level CIOs, presumably well versed in investment theory and evidence?


            In a word: agency. There is big money to be made in asset management—on both sides of the table. For their part, institutional CIOs know they can earn much more when overseeing a complex, active investment program than if they were to adopt a passive approach at next to no cost. The benchmark gaming described here may be the best evidence that the CIOs themselves believe that the odds of beating the market are very long. (Keeping a thumb on the scale as insurance, they are.) And yet, they persist in gambling with moneys entrusted to them. CIOs are prime players in The Great Game of Institutional Investing, one rife with agency.


            What about the professionals, the consultants? Nowadays, as well credentialed as asset managers and CIOs, these are the ostensibly independent advisors to boards of trustees and CIOs. Why are they complicit in the status quo? Again, the answer is agency. Consultants thrive on complexity—the more complexity, the greater the fees. Plus, they are unwilling to concede that they are unable to identify managers that beat the market consistently, despite decades of having failed to do so. In the main, the field of consulting has opted for the role of facilitator in The Great Game, rather than that of firmly grounded, evidence-based advisor.


            What about trustees? Most public fund trustees are well-intentioned. But, as mostly laypersons, they are in way over their heads in attempting to supervise huge, complex, multi-asset-class portfolios with an average of more than 100 asset managers in a politically charged environment. They have no choice but to accept the recommendations of staff and consultant. I note that a reviewer of an early draft of this paper thought the preceding statement inadequate, offering the following observation:


I believe you are correct in saying most [public fund trustees] are well intentioned and in over their heads…. But they do have a choice. They could insist on hearing a diversity of opinions and they could create incentive systems that would do away with the bonus systems that public pensions use where investment staff are rewarded for beating the benchmarks they create. They could and should insist that if investment staff and the people they hire to manage funds can’t beat real benchmarks, they should be fired and just move to passive investing. They could insist on hiring evidence based advisors.  If they can’t then there really is no point in having public trustees.[19]


I could not have put it better myself, so I won’t try.


            Some trustees of elite colleges and universities with large endowments believe their institutions are—by legacy or lore—exceptional investors, and that it is incumbent on them to realize the potential before them. Large endowments are consummately active investors, enamored of pricey alternative investments with an average allocation of 60% there. At least it can be said of endowed institutions that it is their own money they are frittering away on such a conceit. Many endowment trustees are simply playing follow-the-leader. Evidently, no one has informed them that the higher returns earned by some are the result of greater risk-taking, and that those funds would have been better off—greater return with no greater risk—to have had their assets in a stock index fund.[20]


            Alas, and perhaps unwittingly, trustees of both fund types give the appearance of being de facto agents themselves, caught up to varying degrees in The Game. Only the ultimate stakeholders of institutional investing—taxpayers, public workers, and future scholars—are principals in fact. But they have no direct say in investment management.


            In my opinion, gaming performance benchmarks is the biggest problem facing institutional investing today. If conscientious trustees are misled by rosy reporting, how can they be expected to address chronically poor performance?[21] Trustees must step back from investment operations and temper their understandable enthusiasm for institutional success to embrace their paramount responsibility, which is holding the whole stack of agents beneath them accountable. They must take control of performance reporting and see that it is done right. This means adopting a passive benchmark of the type described here and living with it—no tampering or tweaking! Even if trustees know next to nothing about investing, they must learn the rudiments of performance measurement. For they are the ones charged with watching the watchmen.







            The California Public Employees’ Retirement System (CalPERS) is fairly typical in its approach to performance reporting: It uses an SB and tweaks it regularly. In fact, CalPERS’s staff has the authority to revise the benchmark quarterly to keep asset class weights in line with the portfolio. So, in addition to being large and prominent, CalPERS serves as a good referent for the benchmarking issues discussed here. What follows is not intended to single out CalPERS or present it in an unfavorable light, but rather to demonstrate how public funds commonly present their investment results. Exhibit A1 compares CalPERS’s total fund rate of return with that of its SB and an RBSA benchmark of the type described above. The RBSA benchmark is a proxy for a hypothetical CalPERS PB. The CalPERS RBSA benchmark comprises 79% US and non-US stocks and 21% US investment-grade bonds.


Exhibit A1

Benchmarking at CalPERS

[Table values not preserved. Surviving columns: Fiscal Year Ending; CalPERS Reported Return; Strategic Benchmark Return; RBSA Benchmark Return. Surviving summary rows: Annualized Return (10 years); Annualized SD/TE; R2 with Total Fund.]

            CalPERS’s reported return tracks that of the SB extraordinarily closely. This is not surprising inasmuch as CalPERS’s staff has the authority to revise the benchmark to keep asset class weights in line with the portfolio. (Here we have a case of the benchmark hugging the portfolio rather than vice versa.) The 10-year annualized returns for the two series differ by all of 3 bps, 8.54% versus 8.51%. Year to year, the two return series move in virtual lockstep, as demonstrated by the measures of statistical fit (an R2 of 99.5% and annualized tracking error of just 0.5%) and even by simple visual inspection of the annual return differences. There we see that the annual return deviations from the SB for the most recent seven years are never greater than plus or minus 0.4%. This is a skintight fit.

            CalPERS’s RBSA benchmark return series also has a close fit with reported returns in terms of R2 and tracking error, although not as snug as with the SB. (Recall that the RBSA benchmark is inferred empirically rather than being induced, as is the case with the SB. This close fit is testament to the objective inferential power of RBSA.) There is an important difference in the level of returns. Whereas CalPERS’s 10-year annualized return is virtually identical to that of its SB, it underperforms its RBSA benchmark by an average of 114 bps a year. And it does so with remarkable consistency: in 10 years out of 10. The return shortfall relative to the RBSA benchmark is statistically significant, with a t-statistic of –2.9. In other words, CalPERS’s SB return, as garnered by its components, collectively, was much less than that of the stock and bond indexes that mirror CalPERS’s market exposure and risk statistics.
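
            The fit and significance measures cited above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the calculation behind Exhibit A1: it takes tracking error to be the sample standard deviation of annual return differences, R2 to be the squared correlation of the two series, and the t-statistic to be the mean difference divided by its standard error. The function names and the return series are hypothetical.

```python
import math
import statistics


def tracking_error(fund, bench):
    """Sample standard deviation of the fund-minus-benchmark differences."""
    diffs = [f - b for f, b in zip(fund, bench)]
    return statistics.stdev(diffs)


def r_squared(fund, bench):
    """Squared correlation between fund and benchmark returns."""
    mf, mb = statistics.mean(fund), statistics.mean(bench)
    cov = sum((f - mf) * (b - mb) for f, b in zip(fund, bench))
    var_f = sum((f - mf) ** 2 for f in fund)
    var_b = sum((b - mb) ** 2 for b in bench)
    return cov * cov / (var_f * var_b)


def shortfall_t_stat(fund, bench):
    """Mean return difference divided by its standard error."""
    diffs = [f - b for f, b in zip(fund, bench)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


# Hypothetical annual returns for ten fiscal years (made-up numbers,
# not CalPERS data).
fund_returns = [0.20, 0.01, 0.12, 0.17, 0.02, 0.00, 0.11, 0.08, 0.06, 0.05]
bench_returns = [0.22, 0.02, 0.13, 0.19, 0.03, 0.02, 0.12, 0.09, 0.07, 0.06]

print(round(tracking_error(fund_returns, bench_returns), 4))
print(round(r_squared(fund_returns, bench_returns), 4))
print(round(shortfall_t_stat(fund_returns, bench_returns), 2))
```

            With annual observations, the tracking error is already in annual terms. If the t-statistic reported above was computed this way from ten annual observations, a mean shortfall of 1.14 percentage points and a t-statistic of –2.9 would imply a standard deviation of annual differences of roughly 1.2 percentage points.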


I appreciate the comments of Anonymous, Matthew Arnold, John Bass, Mitchell Bollinger, Brian Bruce, Russell Chaplin, Rudy Fichtenbaum, Barry Gillman, Ryan Harvey, Mark Higgins, Jeffrey Hooke, Jeffrey McCurdy, Preston McSwain, Soeren Fryland Moeller, Paul O’Brien, Joseph Pagliari, Jr., Ludovic Phalippou, Mike Sebastian, Brian Schroeder, William Sharpe and Ron Surz.





Cremers, K.J.M., J.A. Fulkerson and T.B. Riley. 2020. “Benchmark Discrepancies and Mutual Fund Performance Evaluation.” Available at SSRN:


Ellis, C.D. 1975. “The Loser’s Game.” Financial Analysts Journal, Volume 31, Issue 4, pp. 19-26.


Ennis, R.M. 2020. “Institutional Investment Strategy and Manager Choice: A Critique.” Journal of Portfolio Management (Fund Manager Selection Issue): 104-117.


Ennis, R.M. 2021. “Endowment Performance.” The Journal of Investing, 30 (3) 6-20.


Ennis, R.M. 2022a. “Cost, Performance and Benchmark Bias of Public Pension Funds: An Unflattering Portrait.” Journal of Portfolio Management, April, 48 (5) 138-150.


Ennis, R.M. 2022b. “Cutting through the Fog of Asset Class Labels.” The Journal of Investing, 31 (2) 6-10.


Ennis, R.M. 2022c. “Overexposed? Public Pension Funds Have A Lot Riding On The U.S. Stock Market, Possibly Even More Than They Realize.” The Journal of Investing, 31 (3) 6-9.


Ennis, R.M. 2022d. “A Universal Investment Portfolio for Public Pension Funds: Making the Most of Our Herding Ways.” The Journal of Investing, December, joi.2022.1.241.


Ennis, R.M. 2022e. “The Modern Endowment Story: A Ubiquitous Equity Factor.” Journal of Portfolio Management, November.


Huang, D. 2022. “Passive Investing, Mutual Fund Skill, and Market Efficiency.” Available at SSRN: or


Investment Company Institute. 2021. “Mutual Fund Expense Ratios Have Declined Substantially over the Past 24 Years.” News Release, March 24. See


Sharpe, W. F. 1988. “Determining a Fund’s Effective Asset Mix.” Investment Management Review 2 (6): 16–29.


Sharpe, W. F. 1992. “Asset Allocation: Management Style and Performance Measurement.” The Journal of Portfolio Management 18 (2): 7–19.


Swedroe, L. 2019. “Global Impact of Investor Home Country Bias.” Alpha Architect, December 19.


[1] See, for example, Ennis (2022b).

[2] See Ennis (2020).

[3] See

[4] Ennis (2022a, 2022e).

[5] Ennis (2022d).

[6] Ennis (2022e).

[7] See

[8] I use the RBSA methodology of Sharpe (1988, 1992). This is a quadratic programming technique that identifies the combination of market indexes that offers the best statistical fit for a particular investment portfolio or portfolio composite.

[9] RBSA is effective in identifying a fixed combination of market indexes that mimics the return of individual funds. This can be attributed to the funds’ asset allocations shifting but gradually over time and the diversification of their portfolios. For the 24 public funds, the average R2 is 98% and tracking error is 2.6%; for the 22 endowments those figures are 93% and 3.8%.

[10] The PB, of course, must also be legitimate. I observe funds reporting performance using a PB that incorporates a 60% allocation to a stock index when the effective equity exposure is 75% or more. In cases such as this, the PB obviously does not reflect risk tolerance. These funds are chasing slow rabbits.

[11] There is no way to accurately gauge the impact of the use of IRRs on the reported private equity returns of institutional funds.

[12] See Ennis (2022a).

[13] See Ennis (2022e).

[14] See Huang (2022).

[15] See Ellis (1975).

[16] See Investment Company Institute (2021).

[17] Public funds’ average allocation to alternative investments was approximately 10% of assets in 2001 and grew to approximately 30% in 2021. The growth in the allocation to alternatives was the primary driver of the increase in expense ratios. See Ennis (2022a).

[18] See Ennis (2022e).

[19] Rudy H. Fichtenbaum, PhD, is Professor Emeritus (economics), Wright State University, Dayton, Ohio and a seasoned trustee of Ohio State Teachers Retirement System.

[20] The effective equity exposure of large endowments has risen steadily in recent years. I found that the NACUBO large fund composite exhibited effective exposures of 97% to the Russell 3000 and 3% to frictional cash over the most recent 5-7 years. This compares to an effective equity exposure of 70% to 75% in the years leading up to the GFC of 2008. However, for the six years ended June 30, 2021, the NACUBO composite underperformed a 97%/3% passive benchmark by more than 4.0 percentage points per year. See Ennis (2021) and (2022d).

[21] Paul O’Brien writes, “...I disagree with your claim that gaming benchmarks is the biggest problem in institutional investing. It’s a big problem, of course, but the biggest is the rise of alternative investments and the resulting gap between governance capabilities versus needs. We are running 21st Century portfolios with mid-20th Century governance models.” Paul O’Brien, PhD (economics), was for many years an institutional bond portfolio manager for Morgan Stanley Asset Management, a senior manager of the Abu Dhabi Investment Authority (sovereign wealth fund), and is currently a trustee of the Wyoming Retirement System.