The Full Story of Large Language Models and RLHF (assemblyai.com)
> LLMs with coding abilities could be employed to create sophisticated malware with unprecedented ease.
If that is possible then shouldn't it also be possible to ask the AI to find and code remediation to the vulnerabilities it found?
So AI could be used to find all possible code vulnerabilities and then figure out how to neutralize them? This would advance software security in general.
In other words, AI could be used like a microscope, discovering tiny defects in our software that are not visible to the naked eye. Like a microscope that detects viruses and thus allows us to guard against them. Like a COVID test.
> If that is possible then shouldn't it also be possible to ask the AI to find and code remediation to the vulnerabilities it found?
It's not that simple. Mal-AI will probably operate orders of magnitude faster than Bene-AI. Patching affected software/hardware is a bureaucratic or formalized process that can take time.
Time dilates for AI relative to humans because it can accomplish so much work in our time horizons. A month to us might be like years to an AI.
So good AI would always be orders of magnitude behind while cleaning up after bad AI.
This perspective doesn't even account for un-patchable online devices, which constitute the majority of today's bot armies.
> Mal-AI will probably operate orders of magnitude faster than Bene-AI. Patching affected software/hardware is a bureaucratic or formalized process that can take time.
“Bureaucracy” misstates the fundamental issue: errors in patching software against malware (e.g., breaking its primary function in the name of security) are more costly than errors in attempts to break in, so the former fundamentally demands more costly verification (even if it, too, is fully automated).
Malware has to find one flaw.
Bene-ware has to avoid screwing up any one of thousands (millions) of calls in a system.
'Undocumented but valid' is also a nightmare here. Recently the software I support broke for a bunch of clients in the field because they were using an undocumented method that worked, but that we did not actually test internally. This came as a surprise to our testing and development team, as our customer base is averse to any kind of data collection that shows how customers use the system.
If hackers can use AI to find the vulnerabilities then surely the good guys can find the same vulnerabilities with the same AI. And once you are aware of them you can protect against them.
I don't think the bad guys have any special advantage that would allow them to produce AI better than the good guys. Hackers are good at finding vulnerabilities because as you say they only need to find one vulnerability. But you can't IMPROVE the state of the art in AI with just a single discovery. So in the AI-race the hackers don't have the same advantage as they have in finding vulnerabilities. I think.
I work in the vuln finding/remediation industry and we have tooling that finds holes and tells the users to fix them, and the users don't. Maybe giving AI access to the code to scan, exploit, develop a fix for, push to dev, run QA tests on, then push to production might help, but that is a much bigger, more energy-intensive, more expensive, and easier-to-break chain than an AI that just finds exploits and dumps out exploit code.
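As a rough illustration of why that chain is longer and more fragile, here is a deliberately simplified Python sketch of the scan → exploit → fix → dev → QA → production flow; every function is a hypothetical stand-in, and each extra link is a place the defensive pipeline can fail that a pure exploit generator never has to worry about.

```python
# Deliberately simplified sketch of the remediation chain described above.
# Every function is a hypothetical stand-in; the point is only that the
# defensive path has many links that can fail, while an attacker's pipeline
# effectively ends at "dump out exploit code".

def scan_for_vulns(repo_path):
    """Stand-in for an AI/SAST scanner; returns a list of findings."""
    return [{"id": "VULN-1", "file": "auth.py", "line": 42}]

def write_poc_exploit(finding):
    """Stand-in for generating a proof-of-concept exploit to confirm the hole."""
    return f"poc for {finding['id']}"

def generate_patch(finding):
    """Stand-in for an AI-drafted fix; the fix itself may be wrong or breaking."""
    return {"finding": finding["id"], "diff": "--- a/auth.py\n+++ b/auth.py\n"}

def push_to_dev(patch):
    return True  # could fail: merge conflicts, review policy, branch protection

def run_qa(patch):
    return True  # could fail: flaky tests, no coverage for the patched path

def deploy(patch):
    return True  # could fail: change windows, rollbacks, unpatchable devices

def remediation_chain(repo_path):
    for finding in scan_for_vulns(repo_path):
        write_poc_exploit(finding)  # the attacker's pipeline stops here
        patch = generate_patch(finding)
        if not (push_to_dev(patch) and run_qa(patch) and deploy(patch)):
            continue  # any broken link leaves the hole open
        print(f"remediated {finding['id']}")

if __name__ == "__main__":
    remediation_chain(".")
```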
I'd hope that people would run a good AI before pushing, or a white-hat vuln finder before adding packages. Not sure people will, or whether it will be as good, but it will probably be available.
We already do that in a general sense (static and dynamic scanning) in our build pipelines, but all of this will surely accelerate.
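For concreteness, a hedged sketch of the kind of build-pipeline gate that comment refers to: a small Python wrapper that runs a static analyzer over the source tree and fails the build when it reports findings. Bandit is used here as one real example of a Python SAST tool; the directory layout and gating policy are assumptions.

```python
import subprocess
import sys

def run_static_scan(source_dir: str = "src") -> int:
    """Run a static security scanner as a build-pipeline gate.

    Shells out to bandit (a real Python SAST tool) as an example;
    swap in whatever scanner your pipeline actually uses.
    """
    result = subprocess.run(
        ["bandit", "-r", source_dir],
        capture_output=True,
        text=True,
    )
    print(result.stdout)
    # bandit exits non-zero when it reports issues; propagate that
    # so the CI job fails instead of silently shipping the findings.
    return result.returncode

if __name__ == "__main__":
    sys.exit(run_static_scan())
```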
At some point, every FAANG will have entire pods designated for RED/BLUE AI pentesting/vuln hunting, if they don't already...
So will every other major country.
I'd bet China already has RED/BLUE challenges going on internally...
It might be that finding vulnerabilities is easier than fixing them (think AI-driven fuzz testing).
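To make that asymmetry concrete, here is a minimal random-fuzzing loop in Python. The `parse_record` target, the mutation strategy, and the crash bookkeeping are all made up for illustration; real (and AI-driven) fuzzers are far more sophisticated, but the shape is the same: finding a crashing input is cheap, while deciding how to fix the parser without breaking valid inputs is not.

```python
import random
import string

def parse_record(data: str) -> dict:
    """Hypothetical target: a parser with a latent bug."""
    key, _, value = data.partition("=")
    return {"key": key, "value": int(value)}  # blows up on non-numeric values

def mutate(seed: str) -> str:
    """Randomly insert, delete, or replace one character."""
    chars = list(seed)
    op = random.choice(["insert", "delete", "replace"])
    pos = random.randrange(len(chars) or 1)
    rand_char = random.choice(string.printable)
    if op == "insert":
        chars.insert(pos, rand_char)
    elif op == "delete" and chars:
        del chars[pos]
    elif chars:
        chars[pos] = rand_char
    return "".join(chars)

def fuzz(seed: str, iterations: int = 10_000) -> list:
    """Collect inputs that crash the target."""
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed)
        try:
            parse_record(candidate)
        except Exception as exc:
            crashes.append((candidate, exc))
    return crashes

if __name__ == "__main__":
    found = fuzz("temperature=42")
    print(f"{len(found)} crashing inputs found")
```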
Right, it is probably us humans that will have to fix the vulnerabilities :-)
AI can definitely help remediate the bugs, but it is all about incentives. One would hope browser and OS vendors would use AI to remediate vulnerabilities, but the vast majority of software vendors won't ever use it.
Also, automated vulnerability finding is very much real and already used today. This isn't something that has just become viable via LLMs, but I guess LLMs can enhance it.
> vast majority of software vendors won't ever use it.
It will likely become part of SOC2, and all the major cloud and IDE vendors will offer it as a service. Potentially just bundled in at no additional cost.
Does anyone know what happens if you do transfer learning in addition to scaling? It feels like people used to use transfer learning in lieu of scaling and I haven't wrapped my head around how they work together.
When you scale, you've probably already included the data you were going to transfer-learn on in the dataset, right?
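For context on what "transfer learning on top of a scaled model" usually looks like today, here is a minimal PyTorch sketch: a stand-in pretrained backbone is frozen and only a small task head is trained. The `PretrainedBackbone` class, its dimensions, and the checkpoint path are illustrative assumptions, not anything from the article.

```python
import torch
import torch.nn as nn

class PretrainedBackbone(nn.Module):
    """Stand-in for a large, scaled-up language model trained on a broad corpus."""
    def __init__(self, vocab_size: int = 50_000, d_model: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=6,
        )

    def forward(self, token_ids):
        return self.encoder(self.embed(token_ids))  # (batch, seq, d_model)

class FineTunedClassifier(nn.Module):
    """Transfer learning: reuse the pretrained backbone, train only a small head."""
    def __init__(self, backbone: PretrainedBackbone, num_classes: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # freeze the expensive, pretrained part
        self.head = nn.Linear(512, num_classes)  # 512 matches the backbone's d_model

    def forward(self, token_ids):
        features = self.backbone(token_ids).mean(dim=1)  # pool over the sequence
        return self.head(features)

backbone = PretrainedBackbone()
# backbone.load_state_dict(torch.load("pretrained.pt"))  # hypothetical checkpoint
model = FineTunedClassifier(backbone, num_classes=2)
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```

In this setup scaling and transfer learning compose rather than compete: the bigger the pretrained backbone, the better the representations the fine-tuning step starts from. As the question above notes, though, overlap between the pretraining corpus and the target-domain data can blur how much "transfer" is really happening.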
One important bit that is often left out is that ChatGPT is not the first model to come out using RLHF to train LLMs.
As is typical in the AI field, DeepMind was key in the development of the process. DeepMind's Sparrow came out just before ChatGPT (regarding language modeling with RLHF), and much of the RLHF work was explored in their robotics/agent work just prior to its application in language.
OpenAI was integral in developing PPO, but it's important to understand that it isn't ChatGPT or OpenAI alone that is leading these advancements.
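Since PPO comes up here, a minimal sketch of the clipped PPO objective (with a KL penalty toward the frozen reference model) that RLHF-style fine-tuning typically optimizes. The function and argument names are illustrative, not from any particular library, and real implementations differ in details such as where the KL term enters.

```python
import torch

def ppo_rlhf_loss(logprobs_new, logprobs_old, logprobs_ref,
                  advantages, clip_eps=0.2, kl_coef=0.1):
    """Clipped PPO surrogate plus a KL penalty toward the reference policy.

    All inputs are per-token tensors of equal shape.
    """
    # Probability ratio between the updated policy and the policy that sampled the data.
    ratio = torch.exp(logprobs_new - logprobs_old)

    # Standard clipped surrogate: take the more pessimistic of the two terms.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    policy_loss = -torch.min(unclipped, clipped).mean()

    # Rough estimate of the KL divergence from the frozen (pre-RLHF) reference
    # model, penalizing the policy for drifting too far from it.
    kl_penalty = (logprobs_new - logprobs_ref).mean()

    return policy_loss + kl_coef * kl_penalty
```

In practice the KL term is often folded into the reward signal rather than the loss, and the advantages come from rewards produced by a separately trained preference (reward) model; this sketch only shows the shape of the policy update.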
I found this to be a particularly lucid writeup of the past 5 years of advancement in LLMs. I sent it to some undergrads to read.
I have been meaning to get a better overview of LLMs, and this was a useful article.
"From Giant Stochastic Parrots to Preference-Tuned Models"
I found this subtitle quite interesting.
Word to the wise: the RLHF part comes 80% of the way down.