AI Cybersecurity After Mythos: The Jagged Frontier

aisle.com

9 points by akavel 25 days ago · 1 comment

coder543 25 days ago

That is an extremely strange article, in my opinion. They test Gemma 4 31B, but alongside it they use Qwen3 32B, DeepSeek R1, and Kimi K2, all outdated models whose replacements were released long before Gemma 4. Qwen3.5 27B would have done far better on these tests than Qwen3 32B, and the same goes for DeepSeek V3.2 and Kimi K2.5. Not to mention the obvious absence of GLM-5.1, which is the leading open-weight model right now.

The article also seems to gloss over the discovery phase, which seems very important. If discovery were as easy as they say, the models should have been let loose so we could see whether they actually found these bugs, and how many false positives they marked as critical. Instead, they pointed the models directly at the flawed code.
