Ask HN: AI-generated spam pull requests?

65 points by sudo_navendu 3 years ago · 26 comments · 1 min read

Are you seeing pull requests in open source projects that are clearly AI-generated?

We recently had someone open a pull request to our open source project, and the code and the explanation of the code were clearly AI generated. It was obvious that the code didn't work and that the person had not tested it. We do not know what the end goal of the person was but we confronted the person and closed the pull requests.

Have any other open source projects experienced this? What did you do?

fishtoaster 3 years ago

Relatedly, the Rails codebase recently received a (clearly marked) AI-generated pull request: https://github.com/rails/rails/pull/47708

At a glance, it looks like it's been mostly well-received and has not yet been immediately closed as spam.

sudo_navenduOP 3 years ago

This is how you use AI tools to write code.
In our project, the user just copy-pasted the output from the AI tool and called it a day. They did not even bother to build the project and test it.
I have also started using AI tools and they have made me much more efficient. I could have done a task without AI, but with it, it is much faster.
- badloginagain 3 years ago
  
  That makes no sense. The one thing ChatGPT does _really_ well is set up unit tests, which is the part of unit testing I hate.
  My ChatGPT workflow is give requirements -> have it create unit tests -> give it test results until it passes.
  Been playing around with a generated-code-only project: https://github.com/JerkyTreats/scrivr/
  In that workflow I don't really look closely at the code. In most cases I've found it isn't really necessary.
not_your_vase 3 years ago

Oh... I already see this commit being used by MS in the Copilot lawsuit.

sudo_navenduOP 3 years ago

I wrote more about what actually happened here: https://navendu.me/posts/ai-generated-spam-prs/

It can help set some context to the discussions.

TLDR:

Recently, a person has been using AI tools to generate code and open pull requests to open source projects I contribute to.

The code is entirely wrong and doesn’t work, and it is evident that the person making these pull requests doesn’t understand the code.

The person also copied explanations (which was an obvious giveaway as it sounded like a typical <popular AI tool> response) into the pull request and attempted to explain the code and answer questions from the reviewers.

We were polite and when it didn’t work, reported the person to GitHub.

I don’t want to shame the person publicly. But I want to make other open source maintainers aware that this is a thing and prevent them from wasting time and effort chasing such people down.

enumjorge 3 years ago

> We do not know what the end goal of the person was but we confronted the person and closed the pull requests.

Maybe this was a naïve attempt at inflating their GitHub numbers? Some people use those as a credibility measure when applying to jobs or getting clients.

alephxyz 3 years ago

Hacktoberfest should be interesting this year
- joshxyz 3 years ago
  
  oh god not again haha
sudo_navenduOP 3 years ago

Now they clearly won't because we will be reporting them to GitHub.

sergiotapia 3 years ago

We've seen a guy create thousands of spam NPM packages for is-even, is-odd, is-red, etc. I don't doubt we'll see many people try to "contribute" to many projects via AI to make a name for themselves.

tonsky 3 years ago

I got one too https://github.com/tonsky/FiraCode/pull/1518

kypro 3 years ago

I've said this before, but in the very near future AI will consume your Jira board, write code, compile it, test it, then raise a PR, all without you. In fact, it might not even bother with a PR because the output will be so good that reviewing it will be comparable to a human trying to critique the moves of Stockfish in chess.

Devs who think they're "optimising their workflow" with AI are in lala land. Understand that soon you're not going to be needed to prompt ChatGPT to do your work for you. You're not going to be a 10x engineer, you're going to be an unemployed one. Your current workflow of copying ACs into ChatGPT then copying the output into VSCode will also be automated soon, obviously.

Learn hard skills now as it will buy you a few years. We have all been given a death sentence and most people are still yet to understand this. It's unlikely that, in the future we're building, humans will be needed, let alone our inefficient labour.

liopun 3 years ago

As a developer, it's important to pay attention to how we use these tools so that we co-exist with them rather than get replaced by them. In addition to using these tools, problem solving skills and creativity are going to be the most important parts of software development jobs in the future. Keep your problem solving skills sharp with ChatGPT help and guidance, folks: https://github.com/Liopun/leet-chatgpt-extension

Shindi 3 years ago

This is a good example of bad use of AI. If your developer wrote code and tried to open a PR without talking to anyone, without testing, without an observability plan, without even running the code to make sure there are no errors, that would be crazy!

Good AI systems will do all the above.

Sorry to plug, but if you're a developer interested in building on top of langchain and building similar tech, please email me (in my profile). I'm a senior developer looking to collab.

quickthrower2 3 years ago

Call it “810 for code”!
https://xkcd.com/810/

georgel 3 years ago

Don't other projects have automated tests that check that the PR will actually compile/work? And if the tests fail, the PR is rejected.

styren 3 years ago

Yes? That doesn't make this discovery any less interesting/peculiar.
sudo_navenduOP 3 years ago

Actually, we were sure that it was spam. GitHub gives the option to "approve workflow run for first-time contributors". I guess none of the maintainers thought to approve it because they thought it might be spam. Still, a lot of time and effort spent to review it.
- ynik 3 years ago
  
  That button was added to GitHub to protect against new bot accounts creating PRs against random projects, adding a CI step that runs a cryptominer. Now that the CI doesn't run automatically for new users without a button click, these attackers have a much harder time.
  So tell your maintainers to use that button more liberally -- it mostly just exists to save GitHub money / discourage these attacks. It doesn't hurt to click it for these "CV improvement" spam PRs, and it makes rejecting the PR a lot simpler if there's a red X.
  I usually just scan the list of files changed by the PR, and if it isn't changing CI stuff, I just let the actions run prior to the actual code review.
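
The file-list check ynik describes could be sketched roughly as below. This is only an illustration of the heuristic, not anything GitHub provides: the function names and the list of CI paths are hypothetical and would need adjusting per project. It also does not prove a PR is safe, since build scripts or test code that CI invokes can still be changed by the PR.

```python
from typing import Iterable, List

# Hypothetical set of paths treated as "CI stuff"; adjust for your project.
CI_PATH_PREFIXES = (".github/workflows/", ".github/actions/", ".circleci/")
CI_FILE_NAMES = {".gitlab-ci.yml", ".travis.yml", "Jenkinsfile", "azure-pipelines.yml"}


def ci_files_touched(changed_paths: Iterable[str]) -> List[str]:
    """Return the changed paths that look like CI configuration."""
    flagged = []
    for path in changed_paths:
        # Normalize Windows separators and a leading "./" before matching.
        normalized = path.replace("\\", "/")
        if normalized.startswith("./"):
            normalized = normalized[2:]
        file_name = normalized.rsplit("/", 1)[-1]
        if normalized.startswith(CI_PATH_PREFIXES) or file_name in CI_FILE_NAMES:
            flagged.append(path)
    return flagged


def safe_to_run_ci(changed_paths: Iterable[str]) -> bool:
    """True when no changed path looks like CI configuration."""
    return not ci_files_touched(changed_paths)
```

A maintainer would feed it the PR's changed-file list and only click "approve workflow run" without a closer look when `safe_to_run_ci` returns `True`.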

xpe 3 years ago

If the AI PR covered the bases within one standard deviation of your other (human) PRs, would you care?

Formerly "hand wavy" questions about humanity, cognition, awareness are now showing up right in front of us. They are transforming into things like (a) is this PR worth my time? (b) does it introduce legal / license risk? (c) what principles were considered during its creation and so on.

ye-olde-sysrq 3 years ago

The dose makes the poison. Do you remember when DigitalOcean (or someone similar) did that promotion thing for opening PRs/contributing to open source projects? It was a nightmare for maintainers because they were inundated with outright spam and also well-intentioned but poorly executed PR attempts by hapless newbies.
I think one reason that open source even kind of works is because "people contributing to open source" has a shit-ton of positive selection bias "baked in". The type of person liable to open a PR to an open source project is probably several standard deviations (or, at least one, right?) above the average developer. So that probably has the general effect of making reviewing a random cold-open PR a less onerous task.
But if we cross into a world where the average value of a PR opened against a project drops substantially - either due to AI or due to a permanent advertising campaign from some company rewarding badges for opening open-source PRs - I wouldn't be surprised to see lots of open projects close/ignore GitHub PRs and start doing something that looks more like how Linux handles it, where it's a lot more social-based and puts some of that positive-selection-bias filter back in place.
- LorenDB 3 years ago
  
  Indeed, when I saw this, I thought "Hacktoberfest is going to have problems this year."

crop_rotation 3 years ago

It is the new reality though. Soon AI generated PRs will look human enough to require a good amount of time of the repo maintainers to review.

sudo_navenduOP 3 years ago

It is perfectly fine if the code actually works. At least for now, getting the exact code you want from an AI is a legit skill. This was pure spam.
I wrote more about what actually happened here: https://navendu.me/posts/ai-generated-spam-prs/
It can help set some context to the discussions.
debesyla 3 years ago

I doubt that AI generated PRs will ever be as bad as (some) human generated ones.
- crop_rotation 3 years ago
  
  Correct. But they will be a. much higher in volume and b. would not look bad at a glance.
