Hugging Face and Google partner for AI collaboration
huggingface.co

What will each party gain from the contract they signed? The first few sections left me with a lot of questions.
From what I can tell, based on the later sections:
> new experiences for Google Cloud customers to easily train and deploy Hugging Face models within Google Kubernetes Engine (GKE) and Vertex AI
I assume that means a new API field like `huggingface_model: "google/flan-t5-base"`?
> Models will be easily deployed for production on Google Cloud with Inference Endpoints
That seems to mean the GCP button, which is currently disabled on the Inference "Create a new endpoint" page (https://ui.endpoints.huggingface.co), will now be enabled. That's the clearest part of the announcement.
From Vertex AI inside GCP, there's a whole bunch of models that aren't just Google's (e.g. YOLO, Llama) available in their model "garden"[1] that can be deployed relatively easily in GCP. It sounds like what's available in the model garden will be extended with what Hugging Face has to offer?
[1] https://cloud.google.com/vertex-ai/docs/start/explore-models
Google is terrified of open source AI and the only thing the people in charge understand is how to manipulate people with money, they lack other ideas or methods. So yeah, that would explain it.
They can’t be that terrified of it or they never would have open sourced TensorFlow.
They're even more terrified of Torch or another framework taking over the industry. As ML is a research-heavy space, getting ML researchers to use your platform has benefits.
Rational smart people would think so; but the group in charge at Google now just sees something that doesn’t have ads on it and kills it. They hit OSS hard in the first round of layoffs.
But as people who only understand money they give HF a pile of cash to hedge against a FOSS risk to the balance sheet. It’s cheap insurance. Same with Claude.
Actuarial tables and bank statements are all they understand. There is no technical leadership at Google; accounting took over, and the company only makes sense once you realize all decisions are made by the CFO.
This partnership is way too vague. I have to assume Google is just investing money into HF and getting some marketing and integrations in return. When deciding where to host models or which services to use for my startup, though, I have a hard time trusting GCP services to be around for the long term. On the flip side, you know Azure and AWS will have support and staying power for as long as their respective companies are around.
+1, I bet it amounts to the same as with AWS/Microsoft: they get contact info for biz dev, a button in the Inference Endpoints deployment flow for GCP, and mayyyyybe some custom integration to let them offer models without having a literal HF repo with the weights.
GCP sometimes deprecates its own services, still supporting them for a few years while encouraging customers to switch to partner products.
One such example is the human-in-the-loop feature of Document AI.
Maybe it’s a naive question, but here we go: having followed HF in the industry for a while, UI/UX aside, what’s the business case for using them instead of an “S3” to distribute models, or some kind of torrent for decentralised model distribution?
After the whole debacle around GPT-4chan[1] and the gating mechanism for models, it’s hard for me to see how any entity can trust that they won’t shut down or gate models due to some ToS shenanigans. In other words: if you’re the man in the middle between models and clients, isn’t it better to treat yourself as a “dumb pipe”?
N.B.: I think the company has a great culture, seen from the outside, and I accept that I may be misinformed about their business model.
It's not naive and it's hard to answer, because the answer is equally naive:
It's sort of like asking what the business case is to have repos on GitHub instead of having a private git server / GitLab.
The value is that "that's where everything is happening": e.g. I just did a 4-day hackathon wrapping llama.cpp on _all_ platforms for my as-yet-unreleased app. If you need a local AI / llama.cpp model, you go to HuggingFace, full stop.
Then, I wanted to host these models on my own: I don't want to rely on 3rd parties' HF repos being stable. A few clicks later, I'd started my own repo and uploaded the models. Then I translated a Python function to Dart, and the app can download these models, ranging from 2 GB to 28 GB, for free, without an API key.
That's much easier than S3, both in cost and integration time.
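For what it's worth, the "no API key" download flow works because every file in a public HF repo resolves to a predictable URL. A minimal sketch of that URL scheme (the repo and file names below are illustrative, not from this thread):

```python
# Public Hugging Face repos expose every file at a stable "resolve" URL:
#   https://huggingface.co/<repo_id>/resolve/<revision>/<filename>
# No API key is needed for public models, which is why a plain HTTP client
# in any language (Python, Dart, ...) can fetch the weights directly.

def hf_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct download URL for a file in a public HF repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Hypothetical repo/file, just to show the shape of the URL:
url = hf_file_url("someuser/some-model-GGUF", "model.Q4_K_M.gguf")
```

In practice you'd hand that URL to whatever HTTP stack your platform has; on the Python side, huggingface_hub's `hf_hub_download` wraps the same scheme with caching.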
But still, the answer sounds naive and marginal I'm sure.
But in this case, GitHub had 3 major tailwinds: (1) the popularization of VCS, (2) the network effect due to the number of users, and (3) it's more or less a mandatory part of modern software engineering.
I agree the use case you mentioned is important, and I understand it. Still, for the folks who aren't relying on LLMs, or are doing vanilla/traditional ML in some laggard industry, I have a hard time believing they are going to HF.
(1) popularization of pytorch
(2) network effect due to the number of users
(3) uploading weights, distributing weights, and running demos on GPUs are a mandatory part of ML engineering
=== interlude ===
I hope it doesn't sound like I'm being argumentative; discussion is especially interesting to me because it often weighs on me how hard it is to explain HuggingFace. So I enjoy trying and improving at it.
=== longer analogy ===
Imagine if all mobile developers in 2008* needed to host demos on iPhones captive in a server farm somewhere.** Some company offered that for free. On top of that, apps were 30 GB, but the company hosted downloads for free. So everyone put their stuff on there. Then that feedback loop continued while the field took a historic spike in interest, and now it's 4 years later.
* AI developers in 2020.
** GPUs captive in a server farm somewhere.
== Musings ==
This sort of highlights a thread of discussion for startups: the unreasonable effectiveness of specialization. Data scientists in 2020 use Python because they can; they're not really familiar with GitHub as a VCS, so their mental model of it is more like Dropbox. All of a sudden there's an $X billion (so far) opportunity to clone GitHub, make it marginally easier to use by hiding stuff that's necessary for all other software, and then light money on fire hosting GPUs and S3.
My experience with HF is that it's incredibly useful for exploring models, and for making prototypes, demos, and MVPs.
Once you want to scale to production, you're right: it doesn't make sense to pull from the HF repository. It makes more sense to clone it into S3 or something else that you have more ownership over.
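A rough sketch of that "clone into your own storage" step, assuming you first fetched a local snapshot (e.g. with huggingface_hub's `snapshot_download`) and that `upload` wraps your storage client (e.g. boto3's `upload_file`); both are assumptions about the stack, not anything from the thread:

```python
import os

def mirror_snapshot(local_dir: str, upload) -> list:
    """Walk a locally downloaded model snapshot and push every file to
    your own storage. `upload(path, key)` is caller-supplied, e.g. with
    boto3: lambda p, k: s3.upload_file(p, "my-bucket", f"models/{k}").
    Returns the object keys so you can record what was mirrored."""
    keys = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            path = os.path.join(root, name)
            # Keep the repo's relative layout as the object key.
            key = os.path.relpath(path, local_dir).replace(os.sep, "/")
            upload(path, key)
            keys.append(key)
    return sorted(keys)
```

Injecting the `upload` callable keeps the walk logic independent of whether the target is S3, GCS, or a plain file server.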
It sounds like SketchUp for LLMs.
To be honest, I do not care about this GPT-4chan controversy, and I doubt most people who use them do. They have built up a community and ecosystem around their offerings.
I am also surprised this has its own wiki page, although it looks like it was put together rather quickly, with not the most fluid writing.
I do care in the sense that I don’t want to live in some sterilized clean room Disney world.
See also the neutering of all the big commercial models. No one is running / giving access to a high quality virtually (or completely) uncensored model.
Horses for courses...
Big Corp wanting to replace their call center with the robots will certainly not want them to be offensive to someone who just wants to pay their bill.
> have built up a community and ecosystem around their offerings
The community is great and I have been a user for a while. My doubt is whether there are many use cases where companies and/or MLEs/DSs will do a “git pull model_v0.1” from the HF model store.
I work at a major tech company and we do this.
Congrats to HF on their pathway to being acquired by Google.
Most efficient route to shut it down.
It would presumably replace https://cloud.google.com/model-garden, like how YouTube replaced Google Video.
Please no. Oh god please no.
Meh, HF already has similar or better relations with Microsoft. If anyone, Microsoft would be the one acquiring HF. And it makes sense: it aligns with GitHub, in a way.
Soooo, ai.azure.com coming soon for GCP?
Idk what this means but, yes, HF does have identical partnerships with AWS and MS Azure.
I still remember Google partnering up with XMPP. Could be a lead-and-consume strategy (or lead, make confusing, and slow down development).
Though TensorFlow was made by Google, and it is a pretty good library.
Embrace, extend, and extinguish
Can you elaborate on how the internal strategy at Microsoft from 1996 applies here?
More AI bad news
How so?
Because the sides of the partnership are disproportionately unequal and this imbalance creates incentives that are in the opposite direction of more openness.
Ahh the old "We've partnered with one of the largest proprietary software developers in the world to somehow improve open source things" trope never gets old.
Google is also one of the largest open source developers in the world, too.
So partnering with Google on Open Source can make a lot of sense.
But it's in vogue to hate everything Google does on HN at the moment, so oh well.
While the comment you are replying to does sound a bit cynical, the reality is HF already signed similar agreements with AWS and Azure several months ago.
I imagine that there’s not much Hugging Face can do about it if they’re getting pressure to make money by their investors.
"We've partnered with $company, but we don't really know what that means, so stay tuned" is the oldest PR stunt ever.
if google is largest proprietary software developer, which company is not?
all the other ones, hence "largest".
How sure are you that Google ranks lowest by contributions to open source projects?
Your original question leaves no other answer, rephrase perhaps.
Can't edit, it is locked; I should have put more thought in before typing. Others already pointed out what I intended to say.
Look, I know this is going to paint me as the stereotypical "Pedantic HN commentor", but sincerely, the only response to
"if google is largest proprietary software developer, which company is not?"
is "all the other ones".
I promise I'm not being purposefully obtuse, I really can't find any other meaning to that comment.
I really don't know how Google "ranks" w.r.t OSS contributions.
I think there is a language issue here (it doesn’t seem like English is their first language?), so I suspect the original comment’s intent was different from what it actually says.
I mean pedantically, both things could be simultaneously true:
Company XYZ is the largest developer of proprietary software
Company XYZ is the largest contributor to open source
If we're talking about raw amounts, not %-of-company's-output
A lot of folks here are being critical of the partnership because Google hasn’t open sourced all of their models, etc.
Not to defend Google, but they have arguably the deepest AI knowledge in the industry and released many of the fundamental building blocks for today’s AI boom (transformers, TensorFlow, etc.)
Yet their models are always deemed second class (see, recently, Gemini), so I think they are trying to pretend to be "catching up" with vague announcements like this.
Good for shareholders, that's all. Not really sure I believe their "open science" argument.
If Google has such "deep AI knowledge" why do 100% of their AI products suck?
They have the best unreleased model, for a few years already.
They certainly “had” a lot of the building blocks and creators, but most of those people seem to have moved on, and libraries like TF have become less used. I don’t think you can say today’s Google, with its hand-wavy AI products (Gemini with fake demos and still unreleased) and lack of core open-source ML tools, has the deepest AI knowledge anymore (see Meta for who has overtaken them among the FAANGs, plus startups like OpenAI who have taken a lot of the other talent).
The proof is in the pudding as the saying goes, and right now Google's llm pudding is neither the tastiest nor the most popular around.