Fork Freshness: Project lifespans in the Ruby ecosystem

gilesbowkett.com

159 points by aazaa 4 years ago · 37 comments

joshuanapoli 4 years ago

Fork Freshness is poking at an important problem. If the author of an open source project stops responding, then there is usually no obvious way for the project's community to reorganize or recognize a new leader or a replacement for the project.

I agree with other commenters; I really don't want to talk about my dependencies on Twitter to this bot.

  • bullfightonmars 4 years ago

    This might also be a great way for the original creator to identify that people still find their software useful. It might help in handing off the project to an active community.

    • tessierashpool 4 years ago

      yeah, that's an explicit goal: to surface the implicit/nascent open source communities that are already coalescing around projects.

      if a bunch of people all agree that it's worthwhile to keep a particular project alive, and they're doing the work to make it happen, then they have something in common, and it should be easy for them to meet each other.

  • tessierashpool 4 years ago

    author here! maybe I'll cave and just set up a regular UI. email might work also.

phreack 4 years ago

I usually just use https://techgaun.github.io/active-forks/index.html

Any chance this could be made not to rely on Twitter?

  • unityByFreedom 4 years ago

    I like this bookmarklet [1] that shows how many commits ahead/behind each fork is. There's also this extension [2], but you need to give it your own GitHub access token.

    [1] https://stackoverflow.com/questions/54868988/how-to-determin...

    [2] https://github.com/dragongling/Better-Github-Forks

  • tessierashpool 4 years ago

    hi! I made Fork Freshness. that alternative is much faster, but it relies on the `pushed_at` attribute from the GitHub REST API. I was unable to find documentation for that attribute, but I rejected the GraphQL equivalent, `pushedAt`, because if Dependabot pushes to an otherwise dead repo with a PR to auto-update some dependency, `pushedAt` treats that as recent work. I didn't want to write a robot which chased another robot around in a circle, so Fork Freshness instead uses its own much more labor-intensive system. it's much slower, but I believe it's also more accurate.

    I wrote that up here, in a fairly gigantic blog post:

    https://gilesbowkett.com/blog/2021/08/15/fork-freshness-proj...

    that blog post also explains why I based the UI around Twitter. TLDR: fun experiment. re the question of making the UI work in a different way, TLDR: maybe.

    part of the experiment was just to see how far I could get without creating a User model. but since Fork Freshness does a relatively slow analysis, I wanted to use an asynchronous UI. I'm not married to it, though, I could see good arguments for setting it up to work differently.

    edit: btw, thanks for the discussion re my project! I'm late for a concert and travelling tomorrow morning but I can't wait to dig into these comments some more.
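
The `pushedAt` concern in the comment above can be illustrated with a small sketch. This is not Fork Freshness's actual analysis (the linked blog post describes that); it only shows the general idea of ignoring bot-authored commits when dating a repo's last real activity. The commit records, the `author_login` and `committed_at` field names, and the `BOT_LOGINS` set are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional

# Hypothetical set of bot logins to ignore. "dependabot[bot]" is the login
# Dependabot commits usually appear under, but treat the set as an assumption.
BOT_LOGINS = {"dependabot[bot]"}


def last_human_commit(commits: List[Dict]) -> Optional[datetime]:
    """Return the newest commit time among commits not authored by a bot."""
    human_times = [
        c["committed_at"]
        for c in commits
        if c["author_login"] not in BOT_LOGINS
    ]
    # max() with default=None handles repos that only have bot commits.
    return max(human_times, default=None)


# Made-up sample data: a human commit from 2019 and a bot commit from 2021.
sample_commits = [
    {"author_login": "alice",
     "committed_at": datetime(2019, 3, 1, tzinfo=timezone.utc)},
    {"author_login": "dependabot[bot]",
     "committed_at": datetime(2021, 8, 1, tzinfo=timezone.utc)},
]

print(last_human_commit(sample_commits))
```

With this sample, a plain "latest push" check would report 2021, while the filtered version reports the 2019 human commit.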

    • RileyJames 4 years ago

      Cool project. Going to fire a tweet now, as I’m interested in one particular repo.

      Same problem as stated here, repo owner moved on. Thankfully a few people have contributor access on the main repo so it hasn’t died yet. But I’m likely to go awol on it in the next few months and there’s no clear second in command to hand it to.

      I’ve also been working with the GitHub API this weekend, and I was wondering how pushed_at and updated_at were differentiated. Good to know re: Dependabot.

    • unityByFreedom 4 years ago

      > I wrote that up here, in a fairly gigantic blog post:

      That blog post is the same link for this thread.

      > without creating a User model

      A User model isn't necessary if you don't require login to use your tool.

      • pmontra 4 years ago

        > My theory now is that whichever ActiveRecord model is closest to the UI will grow huge as it comes to incorporate a comprehensive catalog of user interactions. It's an interesting question, but it's also a topic for another time.

        User preferences, settings, interaction history, whatever can be stored in different models. The User model can be slim.

        • unityByFreedom 4 years ago

          Good to know. It seemed odd to include details about the framework. The tool could've been written in Django, Angular, React, etc.

exciteabletom 4 years ago

This problem could be solved if GitHub simply sorted the list of forks by stars instead of alphabetically.

  • zxcvbn4038 4 years ago

    More active repos don't necessarily have more stars. Maybe a better way to sort would be to list the main repo first, then the forks by number of commits ahead, then by number of commits behind. Most often what I find is that a repo will have 3-5 dozen forks, and the vast majority will either be far behind or have one or two localizations. It is very rare that I find something that someone has really forked and started doing active development on.
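
The ordering suggested above (most commits ahead first, ties broken by fewest commits behind) can be sketched in a few lines. The `ahead_by` and `behind_by` keys are borrowed from the field names in GitHub's compare-commits REST response; the `rank_forks` helper and the sample fork data are hypothetical, and fetching the counts is left out.

```python
from typing import Dict, List


def rank_forks(forks: List[Dict]) -> List[Dict]:
    """Sort forks: most commits ahead first, ties broken by fewest behind."""
    return sorted(forks, key=lambda f: (-f["ahead_by"], f["behind_by"]))


# Made-up sample data for four forks of one project.
sample_forks = [
    {"name": "a/proj", "ahead_by": 0, "behind_by": 40},
    {"name": "b/proj", "ahead_by": 12, "behind_by": 3},
    {"name": "c/proj", "ahead_by": 12, "behind_by": 0},
    {"name": "d/proj", "ahead_by": 2, "behind_by": 0},
]

ranked_names = [f["name"] for f in rank_forks(sample_forks)]
print(ranked_names)  # ['c/proj', 'b/proj', 'd/proj', 'a/proj']
```

The main repo would simply be shown first, ahead of the ranked forks.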

    • trinovantes 4 years ago

      I think sorting forks by most recent commit would be good enough. Not sure why GitHub doesn't do more to help with discoverability.

      • tessierashpool 4 years ago

        I wholeheartedly agree with this. I think it's just a blind spot they have, although back in the day a GitHubber told me it was also about maintaining an agnostic approach to governance.

        btw, iirc GitLab does give you the option of sorting by most recent commit.

  • oezi 4 years ago

    That is a good idea, but they should also:

    - hide any repo that hasn't seen any commits

    - show commits ahead and behind as well as number of tags

    - include issue tracker activity

    - highlight those forks that were renamed, which is often an indicator of a new package/gem being released from this fork

    - show a link from the main page of a repository to the most active fork, to make it clear that there is an active fork at all.
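
Two items from the list above, hiding forks with no commits of their own and highlighting renamed forks, can be sketched as a simple filter. This is an illustration under assumptions: `annotate_forks` is a hypothetical helper, each fork is a plain dict with a `name` (the repo name without the owner) and an `ahead_by` count, and "renamed" just means the fork's repo name differs from the parent's.

```python
from typing import Dict, List


def annotate_forks(parent_name: str, forks: List[Dict]) -> List[Dict]:
    """Drop forks with no commits of their own and flag renamed ones."""
    result = []
    for fork in forks:
        if fork["ahead_by"] == 0:
            continue  # nothing beyond the parent: hide it
        annotated = dict(fork)  # copy so the input is not mutated
        annotated["renamed"] = fork["name"] != parent_name
        result.append(annotated)
    return result


# Made-up sample data: three forks of a repo named "murder".
candidate_forks = [
    {"owner": "x", "name": "murder", "ahead_by": 0},
    {"owner": "y", "name": "murder", "ahead_by": 7},
    {"owner": "z", "name": "murder-ng", "ahead_by": 30},
]

visible = annotate_forks("murder", candidate_forks)
for fork in visible:
    print(fork["owner"], fork["name"], "renamed" if fork["renamed"] else "")
```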

TheFreim 4 years ago

Is there a reason this requires a Twitter account? Seems like something I'd like to use but I don't have Twitter.

  • unityByFreedom 4 years ago

    The author says it's too slow and works better as an asynchronous request. They also write something about Rails' "User" object being unwieldy, whatever that is.

    That still doesn't explain the Twitter dependency. It could just be an attempt to get popular and share the tool at the same time.

    • strken 4 years ago

      I sympathise with this a lot. It's tempting to say "oh just use ActiveJob, and a CAPTCHA to stop bots, and Devise for auth, and MailChimp to let users know it's completed, and build a UI for submitting new jobs, and also for viewing pending and completed jobs", but all those little things add up.

      • unityByFreedom 4 years ago

        Solve the bot problem when you get too popular, don't preemptively make it harder for people to try out your site.

  • FastEatSlow 4 years ago

    The article mentions it in the section "How It Works: UI"

    • pmontra 4 years ago

      I do have a Twitter account, basically so nobody else can create one with my usual handle, and never use it.

      I don't see why I should let Twitter know that I'm using this service. Furthermore, I would have to log into Twitter in a browser tab and go there to check whether the result arrived. On the other hand, I understand that it's hard to find a convenient asynchronous delivery mechanism. Email is what everybody has, but it's hard to send thousands of emails for free without ending up on a spam list.

jccalhoun 4 years ago

I ran across this site a few years ago and I've used it since then to find active forks: https://techgaun.github.io/active-forks/index.html

bredren 4 years ago

Repo abandonment is a problem compounded by gaps in assignment of package publishing rights.

Recently, I helped a maintainer get a PEP 541 request done after a year of people intermittently pleading with the owner to do a release. It took PyPI’s direct communication of potential reassignment for the owner to respond, and they did so within two hours.

Not every package has a willing maintainer to back up an owner like this. So finding forks that have sufficiently merged PRs or have even gone off to do new work can be valuable to avoid duplication.

I’ve done this kind of girl research manually before, searching for something that goes the furthest and seems the most professional.

I’d like to see this tool integrate directly into the GitHub forks page, though, ideally as a browser extension.

  • Igelau 4 years ago

    > I’ve done this kind of girl research manually before, searching for something that goes the furthest and seems the most professional.

    You done what now?

    • ronsor 4 years ago

      Presumably the poster means "grunt"

      • paulryanrogers 4 years ago

        Swipe typos have made detection more complex.

        • bredren 4 years ago

          I think it was “fork,” which is very close to “girl” in swipe.

          You’re right it was likely a swipe typo. I’ve transitioned to swipe typing about 70% of the time. Hadn’t done it at all until a few months ago.

          Not sure how I waited so long to start using it, or if I’m more productive typing using it now.

          • Igelau 4 years ago

            I had a feeling it was, but as swipe typos go, it's... shall we say ducking hilarious? :)

thomzane 4 years ago

Is there a repository for Fork Freshness? I could see the Twitter account ignoring requests in the future, and the same fate could befall this project. I would recommend releasing the project under AGPL-3.0-or-later to partially solve this issue, so the project can continue in the event of abandonment. I could see people contributing code to search for projects in other known forges such as GitLab, SourceForge, Savannah, Gitea, Pagure, and sourcehut, as projects are sometimes forked outside of the original forge.

I have noticed the issue that Fork Freshness tries to solve. My example is Twitter's project murder https://github.com/lg/murder When a project becomes unmaintained, whether officially or unofficially, the future home is often lost unless the original points to the new home at the top of the README file. You can dig within GitHub in the Insights > Network section to get a visual glimpse of what has changed since. https://github.com/lg/murder/network

The original repository put up a notice that the project is unmaintained and archived the project, which effectively ends the project in practice. In this case, ervinb's fork seems to have had the most active commits before being abandoned. https://github.com/ervinb/murder Other forks also had independent commits that were never pulled into other projects.

The network view fails to differentiate 30 grammar fixes from 30 new features without digging into each promising-looking fork. Even then, you may miss a single commit that included more work than the entirety of the other commits. Disclosure: I have not worked on murder.

This is a serious problem and I hope we solve it.

blendergeek 4 years ago

Does this product find forks that don't use GitHub's "fork" feature?

throwaway81523 4 years ago

This should be called something involving "farm to fork".
