The burden of knowledge: dealing with open-source risks


Organizations relying on open-source software have a wide range of tools, scorecards, and methodologies to try to assess security, legal, and other risks inherent in their so-called supply chain. However, Max Mehl argued recently in a short talk at FOSS Backstage in Berlin (and online) that all of this objective information and data is insufficient to truly understand and address risk. Worse, this information doesn't provide options to improve the situation and encourages a passive mindset. Mehl, who works as part of the CTO group at DB Systel, encouraged better risk assessment using qualitative data and direct participation in open source.

Mehl started with a few assumptions about the audience and open-source usage at the organizations they worked at. The first assumption was that audience members were in some way responsible for the use of open source in their organization. The second was that those organizations had a five- to seven-digit number of open-source packages in use, spread across a three- to five-digit number of internal projects. Many of the packages in use at those organizations are direct dependencies, the software the organization's developers actively chose to use, but the majority are indirect dependencies pulled in by that software.

Understanding risk

Those working with open source know that there are potential risks inherent in open-source projects. A project might have security vulnerabilities, or it might change licenses at some point and no longer be considered safe for the organization to use. Projects might also have sustainability issues, he said, which could take the form of an inactive maintainer, a completely dead project, or "other things that indicate that the longevity is not there".

Naturally, those responsible for open-source use need ways to measure the risk and communicate it to the organization. Mehl noted that there are many frameworks to choose from when assessing risk, but he chose to talk specifically about four methodologies: the Cybersecurity and Infrastructure Security Agency's (CISA) Framework for Measuring Trustworthiness, the OpenSSF Scorecard, the Community Health Analytics in Open Source Software (CHAOSS) framework metrics, and DB Systel's Open Source Red Flag Checker, which examines repositories for both red flags as well as activity and licensing conditions the group considers good.

Mehl put up a slide that highlighted a few quotes from the CISA blog post about its framework that, he said, helped to "understand the mindsets at play" in trying to codify risk around open-source software. For example, CISA claims that it is more complex to judge the trustworthiness of open-source software than proprietary software because there is "no direct relationship between the authors of software and those who use that software". Mehl said that the CISA framework is a bad framework for measuring risk, with a very narrow view on trust. "It puts me in a very passive relationship with open source, [it assumes] I have no direct relationship; I cannot change anything about this."

Passive metrics are not sufficient

Mehl said the problem with relying exclusively on the frameworks is that they only measure what can be measured, and that passive metrics are not enough. He had a hot take on health metrics for open-source projects: "the people in this room can better assess the health of an open-source project than all the metrics". Metrics cannot replace an experienced "gut feeling" about a project. He did not say that organizations should avoid them entirely, but that they should not be the sole authority.

He brought up a paper from 2023, "Do Software Security Practices Yield Fewer Vulnerabilities?", by Nusrat Zahan, Shohanuzzaman Shohan, Dan Harris, and Laurie Williams. The OpenSSF Scorecard is an automated tool that assesses various factors in open-source projects hosted on GitHub and assigns each project a score from 0 to 10. It is meant to be used by projects to assess and improve their security posture, or by organizations to make decisions about the projects they use or may want to use. The paper found that a project's OpenSSF score "only explains 12% of vulnerabilities"; in other words, the scorecard may be missing other factors that predict vulnerabilities.
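To make the scoring model concrete: Scorecard publishes its results as JSON, with an aggregate score plus per-check scores. The following sketch parses a made-up sample result rather than calling the live service; the field names follow the published format, but the repository name and values are invented for illustration.

```python
import json

# Hypothetical excerpt of an OpenSSF Scorecard JSON result; the field
# names mirror the published output format (aggregate "score" plus a
# list of per-check entries), but the values here are invented.
sample = json.loads("""
{
  "repo": {"name": "github.com/example/project"},
  "score": 4.6,
  "checks": [
    {"name": "Maintained", "score": 10},
    {"name": "Code-Review", "score": 3},
    {"name": "Signed-Releases", "score": -1}
  ]
}
""")

def failing_checks(result, threshold=5):
    """Return check names scoring below the threshold; a score of -1
    means the check could not be evaluated, so report it separately."""
    low = [c["name"] for c in result["checks"] if 0 <= c["score"] < threshold]
    unknown = [c["name"] for c in result["checks"] if c["score"] < 0]
    return low, unknown

low, unknown = failing_checks(sample)
print(sample["score"], low, unknown)  # 4.6 ['Code-Review'] ['Signed-Releases']
```

The distinction between a low score and an inconclusive check matters in practice: treating "could not evaluate" as zero would skew exactly the kind of aggregate threshold Mehl warned about.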

The OpenSSF Scorecard and other metrics simply do not take into account many factors that are important when assessing an open-source project. For example, how is the maintainer behaving? How do they react to bug reports or pull requests? Is there a connection to an open-source foundation, or is the project a single-vendor effort? If a project is funded by venture-capital money, Mehl argued, that is not sustainable and tends to presage license changes that make the software non-free. He pointed out that the CHAOSS framework covers some of these things, but it does not weight them.

Most importantly, Mehl said, "those passive metrics do not make us active". A huge problem inherent in using open source is that an organization does not have an alternative if one of these frameworks finds that an open-source package scores badly. "Most of the open source we're using is not controlled by ourselves." He said that he had software bills of materials (SBOMs) generated for software being used by DB Systel; the software had about 125 dependencies on average, and many packages had more than 1,000 dependencies. In total, the project found 117,000 individual packages in use, and that is without considering versions of packages. If versions were taken into account, Mehl said, the number of packages would increase dramatically. Even worse, nine out of ten packages in use score worse than an aggregate five on the OpenSSF Scorecard. "If our rule were that you can only use open-source projects that score better than five out of ten, we would have a huge problem."
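The distinction Mehl drew between counting packages and counting package versions falls out of an SBOM directly. As a minimal sketch, the snippet below deduplicates a CycloneDX-style component list by name; the file layout follows the CycloneDX JSON format, but the components themselves are invented, not taken from DB Systel's data.

```python
import json
from collections import defaultdict

# Invented CycloneDX-style SBOM fragment: the same package can show up
# several times at different versions across an organization's projects.
sbom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "components": [
    {"name": "left-pad", "version": "1.3.0"},
    {"name": "left-pad", "version": "1.1.0"},
    {"name": "lodash",   "version": "4.17.21"},
    {"name": "lodash",   "version": "4.17.20"},
    {"name": "express",  "version": "4.18.2"}
  ]
}
""")

versions = defaultdict(set)
for comp in sbom["components"]:
    versions[comp["name"]].add(comp["version"])

# Counting unique names gives the "without considering versions" figure;
# counting (name, version) pairs gives the much larger one.
unique_packages = len(versions)
unique_package_versions = sum(len(v) for v in versions.values())
print(unique_packages, unique_package_versions)  # 3 5
```

On this toy input the gap is small, but across 117,000 packages the same arithmetic is what makes the versioned count balloon.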

What do we do?

The frameworks can provide a lot of insight and knowledge, he said, which in some ways can be a burden. Now the organization has information to act on, but without clear recommendations. Using the frameworks to assess risks may grant an understanding that a project is flawed, but they do not provide the answers for addressing that risk. Replacing dependencies is not easy or economical. In organizations that are not experienced with open-source use, there may be a temptation to downplay the risks. "I wouldn't recommend that", he said. Another path is to become active and ask developers to replace dependencies, or even contact maintainers and ask them to update. "Or we could force them to fill out forms. You're laughing, but it has happened. Projects have been bombarded with forms to fill out." (Curl creator and maintainer Daniel Stenberg has written about exactly this scenario.)

Instead, Mehl suggested, "let's get more honest". Start with assessing the organization's risk profile and what is specifically important to it. Maybe the mere existence of a contributor license agreement is a risk, or dependence on a single vendor is a risk. "Risk profiles and risk assessment can be very individual. You should identify qualitative and quantitative data that matters to you" and group assets into different risk classes. For example, he said that DB Systel had some software that could be deployed for up to 60 years in critical infrastructure. Organizations could create risk "themes" instead of measuring all software use equally.
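One way to make the idea of risk classes concrete is to express each class as a set of acceptance criteria and check a project's qualitative attributes against it. This is purely an illustration; the class names, criteria, and thresholds below are invented, not taken from DB Systel's actual guidance.

```python
from dataclasses import dataclass

# Invented risk classes: the criteria echo examples from the talk
# (CLAs, single-vendor projects, long deployment lifetimes).
@dataclass
class RiskClass:
    name: str
    max_lifetime_years: int   # how long deployments may stay in the field
    single_vendor_allowed: bool
    cla_allowed: bool         # is a contributor license agreement acceptable?

CLASSES = [
    RiskClass("critical-infrastructure", 60, False, False),
    RiskClass("internal-tooling", 5, True, True),
]

def acceptable(project, risk_class):
    """Check a project's qualitative attributes against one risk class."""
    if project["single_vendor"] and not risk_class.single_vendor_allowed:
        return False
    if project["has_cla"] and not risk_class.cla_allowed:
        return False
    return True

proj = {"single_vendor": True, "has_cla": False}
print(acceptable(proj, CLASSES[0]), acceptable(proj, CLASSES[1]))  # False True
```

The point of such a structure is the one Mehl made: the same project can be acceptable for short-lived internal tooling and unacceptable for software deployed in critical infrastructure for decades.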

Organizations might come to the conclusion that they want to fund some projects financially by supporting maintainers or projects directly. He suggested that it would be wiser to do that through entities like Germany's Sovereign Tech Agency or to nudge software foundations to support specific projects and for organizations to come together and fund things collaboratively rather than one-off funding attempts. Money, however, does not solve everything. Mehl observed that some developers are not looking for or motivated by funding. Money can complicate things in open-source projects, and companies usually want something in return for their funding, which can be off-putting.

Another option is for organizations to contribute code to projects, perhaps even having employees become co-maintainers of projects. He also recommended that organizations could set up teams that provide support for open source and coordinate contributions to external projects, or even partner up with other organizations.

Recommended toolset

All of those options, though, are on the reactive side: responses made once it is clear that an open-source project already in use carries elevated risk. Mehl encouraged organizations to be more proactive. He proposed coming up with criteria for selecting projects based on risk assessments beforehand, so that developers could make educated choices and choose open-source projects more wisely. Since he had knocked the CISA framework earlier, he highlighted what he considered a good counter-example: the Biden administration's 2024 Cybersecurity Priorities memo, which, in part, recommends that agencies improve open-source-software security and sustainability by directly contributing to the maintenance of open-source projects.

Organizations should have open-source program offices (OSPOs), he said, not for the sake of having an OSPO, but to help define roles and responsibilities and methods of engagement with open-source projects. "The opposite of CISA, this provides an active framework and mindset." Each organization should have tools in its toolbox to allow it to do four things: assess, sponsor, select, and engage with open-source projects that are important to the organization.

His final thoughts for the audience were that organizations need to collaborate more on assessment criteria and on how to share those efforts with other organizations. Too often, Mehl complained, that work is duplicated across thousands of organizations that could benefit from sharing it.

Get active. Time's over for passive consumption of open source. We see in the world we can't rely on others to fix issues for us. We have to collaborate. We have to get active. We should do this in general but especially in open source.

Questions

The first question was, since Mehl was encouraging collaboration and reuse, whether DB Systel's internal guide on choosing open source was available. Mehl admitted that it wasn't, "but we should publish it, we should do that. It's a sensitive issue and it gets complicated, but yeah, I think we should share more."

Another audience member wanted to know how to tell internal developers "no" if a project doesn't meet the criteria, and asked if Mehl had seen pushback when telling someone that a developer's choice doesn't meet guidelines. Mehl said that DB Systel does not centralize that choice; it gives guidance, but the teams themselves make risk assessments. "This is a matter of an organization's risk assessment. It makes sense to centralize this in some organizations, but we have a different attack surface."

One member of the audience said that he was happy to hear someone say you can't judge risk solely by the numbers. He wanted to know if Mehl could publish the guidance that these decisions "have to be from the gut" somewhere that he could link to. Mehl didn't respond directly to the question of publishing the guidance but reiterated that "sometimes it makes sense to use something that scores badly" and then plan for the organization to engage with the project to improve it. "There is no other way around than getting active" in open source.

Some projects die over time, an audience member observed; he wanted to know what experience Mehl had with that and how to spot a project that is dying. Mehl said that there are a number of indicators that can hint that a project is in trouble. That might include how many commits and active contributors a project has over time, or some of the OpenSSF metrics. But "sometimes software is just complete". Some projects do not need daily activity to sustain themselves; it might be good enough to have an update once per year. It depends on the project: Kubernetes "should be maintained more than once every 90 days", but other projects do not need that kind of activity. It also depends on where a project is being used. He said that an organization had to consider its strategy, identify the software it depends on most, and look at various things, such as the health of the community, the behavior of the maintainer, and CHAOSS metrics, as "a starting point of what you look at". Ultimately, it depends on each organization's risk profile.

While many of Mehl's guidelines can be boiled down to the unsatisfying answer no one likes ("it depends"), it's refreshing to see someone telling organizations they require more in-depth analysis to assess risk than can be had with one-size-fits-all frameworks and scorecards. It is even more encouraging that Mehl pushes organizations to be active in participating in open source, rather than treating projects like another link in the supply chain that can be managed like any other commodity.


Index entries for this article
Conference: FOSS Backstage/2025