Python is the most widely used programming language worldwide. Consequently, many programs, often built on top of other FOSS Python modules, are freely available on the Python Package Index (PyPI.org).
Python security is gaining attention due to its rising usage. Python can be considered a secure language, yet Python applications are also susceptible to common security flaws. Researchers, especially security researchers, often exaggerate security risks.
AI generated summaries of scientific papers are often of very limited use.
The value of reviews should be to discuss claims, opinions and quality aspects of research work. People are biased. That is a good thing, so always read the scientific paper and various human reviews to get a broad view and strengthen your knowledge. I like Python, Free and Open ML/AI, and I am working on a FOSS Python SAST scanner. So read my review remarks with that in mind!
I like making security simple and more effective, and I love playing with new technology, so I was attracted to the paper “DySec: A Machine Learning-based Dynamic Analysis for Detecting Malicious Packages in PyPI Ecosystem” by Sk Tanzir Mehedi, Chadni Islam, Gowri Ramachandran and Raja Jurdak, 2025, https://arxiv.org/abs/2503.00324. The paper was written by researchers at Queensland University of Technology, Brisbane.
This paper focuses solely on FOSS security by analyzing packages on PyPI.org. This is harmful and fuels the inaccurate perception that open-source software is inherently insecure, a view still promoted by commercial vendors and managers worldwide.
This paper proposes a solution for the security problem of packages on PyPI.org.
Personally, I’m not yet a fan of AI solutions for security, but this paper doesn’t use LLM technology. Instead, it focuses on the advantages of using plain machine learning algorithms as part of the solution.
My question when reading this paper is: how big is the problem that the researchers want to solve, and is the proposed solution really a cost-effective and simple approach?
Every Python package that is able to dynamically load code is suspicious by default! Many packages on PyPI.org have this capability. In my opinion you should never trust these packages by default. A very easy way to check for dynamic imports is to use a static security analyser, such as Python Code Audit.
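As a minimal sketch of what such a static check can look like, the standard library `ast` module is enough to flag the most common dynamic code-loading constructs. The set of flagged names below is my own illustrative selection, not the rule set of Python Code Audit or any other tool:

```python
# A minimal sketch: statically flag calls that can load or run code at runtime.
# The set of flagged names is an illustrative selection, not any tool's rule set.
import ast
import sys

SUSPICIOUS_CALLS = {"__import__", "import_module", "exec", "eval", "compile"}

def flag_dynamic_code(source: str, filename: str = "<unknown>") -> list[str]:
    """Return findings for calls that can dynamically load or execute code."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Call):
            func = node.func
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name in SUSPICIOUS_CALLS:
                findings.append(f"{filename}:{node.lineno}: call to {name}()")
    return findings

if __name__ == "__main__":
    path = sys.argv[1]
    with open(path, encoding="utf-8") as fh:
        print("\n".join(flag_dynamic_code(fh.read(), path)))
```

Running something like this over the files in a package's source distribution already shows which packages are able to pull in code at runtime.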
Reading this paper, I questioned some of the numbers presented. The authors focus on mitigating attacks on the software supply chain, which is valid. But the numbers the authors refer to are not valid for Python. The authors mix numbers and non-scientific research from general FOSS reports and claim that Python has the same large issues. This is simply not true, as other research papers show.
Stating that “a significant manifestation” of general FOSS security challenges is equally valid for Python is misleading at best. Actual numbers matter, and within the security ecosystem we are continuously confronted with claims without any real evidence that can be inspected.
But later in the paper I noticed an interesting phrase:
“Recent reports up to July 2024 have identified 7,127 PyPI packages (1.2%) as malicious [11, 12, 13, 14]. The malicious packages caused different types of attacks such as data exfiltration using AWS keys or remote API, credential theft through typo squatting, and remote code execution via dependency confusion attacks [15, 16, 17]. For instance, the ‘Zebo-0.1.0’ package periodically captures screenshots and uploads them to an attacker-controlled server using a remote API [18].”
So finally, more solid numbers in this paper that make more sense to me, and that are also more in line with other research findings.
With only 1.2% of packages labelled as malicious, the problem is put into perspective. For a very large open repository with no validation of uploaded packages, that number is rather small.
The researchers created DySec. Reading the paper, my impression is that the goals are mainly to protect PyPI against malicious Python packages and to create a great scientific paper on Python security with machine learning, using open data sets and eBPF.
The authors clearly explain why they think static security testing, such as SAST scanners for Python, is not enough. This is a strong opinion, but I agree that dynamic scanning of Python programs is certainly crucial. For example, running a fuzzer is essential to check for risks such as SQL injection or other unwanted behaviour, like an unintentional DDoS attack.
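As a toy illustration of what dynamic testing adds on top of static checks, a property-based fuzzer can throw hostile strings at an input-handling function. The `build_query` function below is a hypothetical, deliberately naive example of mine, not code from the paper:

```python
# A toy illustration of dynamic testing: property-based fuzzing of a
# deliberately naive, hypothetical query builder (not code from the paper).
import sqlite3

from hypothesis import given, strategies as st

def build_query(username: str) -> str:
    # Naive on purpose: concatenates user input straight into SQL.
    return f"SELECT * FROM users WHERE name = '{username}'"

@given(st.text())
def test_input_stays_inside_the_string_literal(username):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    # The fuzzer quickly finds inputs (e.g. containing a quote) that change
    # the structure of the statement instead of staying inside the literal.
    conn.execute(build_query(username))

if __name__ == "__main__":
    test_input_stays_inside_the_string_literal()
```

Hypothesis quickly reports a falsifying input containing a quote character, demonstrating the injection risk at runtime.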
This paper is about creating a framework for detecting malicious packages on PyPI.org. Common flawed solutions for scanning packages on PyPI.org are:
- Scanning packages for known vulnerabilities. This will never work well, since most vulnerabilities are never reported. Also, the fact that a package has a known vulnerability does not mean it is insecure to use in every context.
- Bureaucratic procedures: developers who want to upload packages would have to pass various procedures and tools before a package becomes publicly visible. This too gives a false sense of security, mainly due to the dynamic nature of package updates and the fact that modules can import code at runtime.
The authors nicely summarise the limitations of detecting malware in PyPI packages by listing techniques that can be used in Python code but are hard to detect:
- typo squatting
- remote access
- dynamic payloads
Personally I would also add “code weaknesses”, since many Python packages use `assert` in production code, among other worrying known weaknesses such as `compile`, `exec` and various operating system methods without proper checks. Code weaknesses are easily detectable using a simple SAST security checker; see for example how Python Code Audit does this.
The authors claim that static security tools are inadequate against typo squatting, remote access activation and dynamic payload generation. I do not fully agree with this claim. Even my simple Python Code Audit SAST tool can detect these patterns.
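For instance, typo squatting candidates can already be flagged statically with a simple name-similarity check against popular package names. A minimal sketch (the list of popular names is a hand-picked illustration, not how any specific tool works):

```python
# A minimal typosquatting check: flag candidate names that look suspiciously
# similar to popular packages. The "popular" set is a hand-picked illustration.
from difflib import SequenceMatcher

POPULAR = {"requests", "numpy", "pandas", "boto3", "urllib3", "setuptools"}

def possible_typosquat(candidate: str, threshold: float = 0.85) -> str | None:
    """Return the popular name a candidate resembles too closely, if any."""
    candidate = candidate.lower()
    if candidate in POPULAR:
        return None  # the real package, not a squat
    for name in POPULAR:
        if SequenceMatcher(None, candidate, name).ratio() >= threshold:
            return name
    return None

for pkg in ("request", "reqeusts", "numppy", "flask"):
    print(f"{pkg} -> {possible_typosquat(pkg)}")
```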
The DySec Framework presented in the paper is interesting. The visual helps to understand the architecture. I always think good architecture means explaining things, preferably with many visuals. And if you want all kinds of people to look at your architecture view: do not use ArchiMate views. That will never work and is at best occupational therapy.
At its core, I see DySec as a dynamic security analysis framework with ML capabilities.

A first look shows that this is a complex setup. This brings the following disadvantages:
- Hard to maintain.
- Security and complex security tools are a proven bad marriage. You do not want to become more vulnerable through the use of vulnerable security products.
- Various manual tasks are involved.
The setup the researchers used ran on an HPC cluster with 16-core CPUs, NVIDIA A100 GPUs, and 128 GB RAM. Evaluating and training different machine learning models is far from easy.
Some architecture choices are surprisingly smart, like:
- Using eBPF for network analysis. eBPF is a Linux kernel technology for real-time system monitoring; see https://ebpf.io/. eBPF is used for malware detection, network security, and performance tracing, enabling deep visibility into system activities without significant overhead.
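To give a feel for what eBPF-based behavioural monitoring looks like in practice, here is a rough sketch using the bcc Python bindings that logs every `execve()` call, for example while a package is being installed or imported. It assumes bcc is installed and root privileges are available; DySec's actual probes are of course more elaborate.

```python
# A rough sketch of eBPF-based behavioural monitoring with the bcc Python
# bindings: log every execve() on the system, e.g. while a package is being
# installed or imported. Requires bcc and root; DySec's actual probes differ.
from bcc import BPF

PROGRAM = r"""
TRACEPOINT_PROBE(syscalls, sys_enter_execve) {
    char fname[256];
    bpf_probe_read_user_str(&fname, sizeof(fname), args->filename);
    bpf_trace_printk("execve: %s\n", fname);
    return 0;
}
"""

b = BPF(text=PROGRAM)
print("Tracing execve() calls... Ctrl-C to stop")
b.trace_print()
```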
Some summarising remarks
The paper addresses the problem of malicious Python packages on PyPI.org as a purely technical problem and tries to solve it with machine learning for detecting malware at the network level. I love machine learning, and for some use cases it can really work. However, I doubt whether using this rather complex DySec dynamic framework will lead to a sustainable solution.
To further minimise the issue of malware being present in PyPI packages, I think other approaches would lead to more success, such as:
- Educating developers on securing the complete supply chain.
- Validating packages using reproducible builds. For simple packages, this is easily possible by default, but validation becomes much harder for complex packages with many dependencies, especially those in other languages. Finding a way to automate the process of validating a build, without sacrificing the capability to rapidly release new package updates, requires serious work and consideration (see the sketch after this list).
- Educating users of Python programs, e.g. teaching people security by design. By default, Python programs are not insecure, but vulnerabilities can be used to disrupt services or steal valuable data. A zero-trust environment for running Python programs seems obvious, but is seldom implemented. A zero-trust environment makes the required connectivity to and from the Python program explicit.
- eBPF solutions for detecting malware on networks can really benefit from machine learning approaches. This field has been evolving for more than 15 years. But it is good to know the limitations of host and network intrusion detection systems (HIDS/NIDS) when it comes to malware patterns.
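As a small building block for the reproducible-builds point above, the digest that PyPI publishes for a release file can be compared against a locally built artefact. A sketch using PyPI's public JSON API; the package name, version and wheel path are placeholders:

```python
# A small building block for build validation: compare the SHA-256 of a
# locally built wheel with the digest PyPI publishes for that release.
# Package name, version and wheel path below are placeholders.
import hashlib
import json
import urllib.request

def pypi_sha256(name: str, version: str, filename: str) -> str | None:
    """Fetch the published SHA-256 for one release file from PyPI's JSON API."""
    url = f"https://pypi.org/pypi/{name}/{version}/json"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    for release_file in data["urls"]:
        if release_file["filename"] == filename:
            return release_file["digests"]["sha256"]
    return None

def local_sha256(path: str) -> str:
    """Hash a locally built artefact."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

wheel_path = "dist/example_pkg-1.0.0-py3-none-any.whl"  # placeholder
published = pypi_sha256("example-pkg", "1.0.0", wheel_path.rsplit("/", 1)[-1])
print("match" if published == local_sha256(wheel_path) else "MISMATCH or not found")
```

A reproducible build goes further, of course: the point is that an independently built wheel should hash to exactly what is published.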
Scanning PyPI packages for malware before they are published has been a long-standing debate and faces many complex problems. To name a few:
- Security is always context dependent.
- Every package that loads code dynamically is suspicious. This would directly ban e.g. the highly popular `boto3` package from AWS (https://pypistats.org/packages/boto3). Of course, if you take security very seriously, you never trust software you cannot inspect.
- Every Python package that makes use of telemetry or remote APIs for retrieving data is suspect. Packages that have functions to dynamically load modules should be banned. Note: Python Code Audit warns you about this kind of package, since the tool is built from a paranoid perspective when it comes to security.
- The process of security checks and follow-ups is very hard to automate 100%, so it will remain a major volunteer task and responsibility. If you have solid ideas on how to organise this, send me an email to discuss.
- The researchers note the issue that indirect dependencies remain hidden within packages. I am in the process of adding this functionality to Python Code Audit (ref), since I think it is vital for better static analysis tools.
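To make indirect dependencies visible, a recursive walk over package metadata is a reasonable starting point. A minimal sketch using `importlib.metadata`; it only sees packages that are installed locally, while resolving the full tree of an arbitrary PyPI package needs index metadata instead:

```python
# A minimal sketch of walking a package's dependency tree to make indirect
# dependencies visible. Only sees packages installed in the local environment.
import re
from importlib import metadata

def direct_deps(name: str) -> set[str]:
    """Names of the direct dependencies a locally installed package declares."""
    try:
        requires = metadata.requires(name) or []
    except metadata.PackageNotFoundError:
        return set()
    # Keep only the project name; drop extras, version specifiers and markers.
    return {re.split(r"[\s\[<>=!~;]", req, maxsplit=1)[0] for req in requires}

def all_deps(name: str, seen: set[str] | None = None) -> set[str]:
    """Recursively collect direct and indirect dependencies."""
    seen = set() if seen is None else seen
    for dep in direct_deps(name):
        if dep not in seen:
            seen.add(dep)
            all_deps(dep, seen)
    return seen

if __name__ == "__main__":
    print(sorted(all_deps("requests")))
```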
Regarding the FAIR principles (https://www.go-fair.org/fair-principles/), the results presented in the paper cannot be verified because code and data are missing.
This is a missed opportunity. Since PyPI is open and a lot of FOSS tools are used, I would minimally expect an appendix with instructions to reproduce the results and inspect the data and scripts used.
The authors did great work analysing and comparing different ML algorithms for DySec. However, a look at the website created for DySec, https://dysec.io, immediately shows an HTTP error. This suggests that no one will do anything practical with the work done on DySec in the future.
