Nvidia releases Alias-Free GAN code and pre-trained models, naming it StyleGAN3

243 points by polisteps 5 years ago · 62 comments

mkl 5 years ago

This produces a kind of artefact I haven't seen before, involving little chains of circles and diamonds, e.g. https://nvlabs-fi-cdn.nvidia.com/stylegan3/images/stylegan3-..., https://nvlabs-fi-cdn.nvidia.com/stylegan3/images/stylegan3-... (hair). I think they follow those glowing coordinate-ish lines from the internal representation.

It also seems to have given some faces contact lenses! https://nvlabs-fi-cdn.nvidia.com/stylegan3/images/stylegan3-..., https://nvlabs-fi-cdn.nvidia.com/stylegan3/images/stylegan3-...

captainmuon 5 years ago

Looks like lizard people...
I wonder if you can't just put a bunch of results with and without artifacts into two different bins, and do another round of training on them. But I don't know enough about how style transfer and retraining of these nets and all that modern stuff works to tell if that is feasible.
- WithinReason 5 years ago
  
  It already works that way in some sense
twofornone 5 years ago

I suspect it's because there's a lot of entropy in hair and because of the shape of the optimization function (which might even have a spatial term): a regular pattern in such a noisy and hard-to-learn region falls into a local minimum while the rest of the image converges to the true minimum. There's a little meat left to optimize here, but you need to do it cleverly because there's no reason for a neural network to learn all the many combinations of hair pixels in this application. That could require as many parameters as all the neurons involved in generating the faces, I'd bet.
- twofornone 5 years ago
  
  Thinking more about it, the shape of the solution space is sufficiently different for hair vs faces that any given combination of {optimization function, hyperparameters, training data} is unlikely to optimize for both. You probably need some other sort of special tuning, like a spatially local adaptive gradient for regions of hair.
smusamashah 5 years ago

Someone on Twitter pointed out that it has larger repeating patterns too https://twitter.com/Zergfriend/status/1408184663420510209?s=...
opwieurposiu 5 years ago

Also, some teeth are getting drawn in front of the lips and the pupils are not round. I think the not-round pupils make the eyes point in two different directions, kinda unsettling.
https://nvlabs-fi-cdn.nvidia.com/stylegan3/images/stylegan3-...
https://nvlabs-fi-cdn.nvidia.com/stylegan3/images/stylegan3-...

avivo 5 years ago

I appreciate the section on "Synthetic image detection":

"While new generator approaches enable new media synthesis capabilities, they may also present a new challenge for AI forensics algorithms for detection and attribution of synthetic media. In collaboration with digital forensic researchers participating in DARPA's SemaFor program, we curated a synthetic image dataset that allowed the researchers to test and validate the performance of their image detectors in advance of the public release. Please see here for more details on detection" https://github.com/NVlabs/stylegan3-detector

It's important to see this sort of thing happening more and more.

captainmuon 5 years ago

> It's important to see this sort of thing happening more and more.
Why? If we insist on the authenticity of images, this is holding on to the old status quo in the same way we apply book and record copyright to digital content. We don't allow what the tech enables to the fullest, but we restrict it by pressing it into the old mold (e.g. by using DRM to make music a commodity).
I think "photographic proof" is a historical accident of the 20th century (and it was never perfect, those with resources could always manipulate pictures to some extent).
As a thought experiment, it might be interesting to imagine what happens when you "open up the dams" and are able to synthesize any image you can imagine! In the beginning, this will cause a lot of trouble (say with harassment and fake news), but I believe society will adapt quickly. I think right now there is a real problem with the internet remembering too much (pervasive surveillance on the one hand, and constant risk of moral outrages for stupid things you did in your past). It would be an antidote if nobody could believe in any picture anymore.
- xg15 5 years ago
  
  > but I believe society will adapt quickly
  What gives you this impression? How exactly do you believe society would adapt?
  - captainmuon 5 years ago
    
    Well, I mean we coped before we had cameras, right? In a sense, we would go back to that. We would have to rely more on witnesses we trust than on evidence.
    One point of postmodernism is that often the facts (as in what happened when exactly) don't matter as much as the narrative (Unless you are a historian or a scientist). I think this is true to some extent, but we delude ourselves that only the facts matter. And then we can be manipulated by the narrative. It might be interesting to make the narrative explicit.
    Imagine a politician that is exposed to a scandal with an old sex tape. This could ruin their career, no matter how good their work is otherwise. But if the tape becomes degraded to mere hearsay - maybe it happened, maybe it didn't - then it is about which image we want to believe. Then the sex tape is one narrative, the campaign is another narrative, and the only concrete thing we really have to judge them on is their actual policies, and their actual recent work. All the "image" will be drowned out in noise.
    Same if you think about people posting stupid stuff on social media when they are young, and then having trouble when they try to find a job. If it is trivial and ubiquitous to fake drinking pictures and dumb old tweets, you can just shake it off with "oh yeah that is fake". The only thing that will count is your impression and your performance in the moment (and accounts from other trusted people).
    I'm not saying this would be a good development, or a bad one, just that I think it is a possible interesting consequence of current tech developments...
    
    ImprobableTruth 5 years ago
    
    >and the only concrete thing we really have to judge them on is their actual policies, and their actual recent work. All the "image" will be drowned out in noise.
    Completely disagree. What will instead matter is purely someone's image and their ability to fool people. Those who will benefit most from this aren't good people affected by smear campaigns, it's bad people who can easily avoid real criticism of actual wrong doings.
    Most political scandals I know of are not something inconsequential, but rather due to corrupt or truly immoral behavior, like the recent corruption affair in Austria. Making politicians immune to this seems like a drastic step backwards.
    >Well, I mean we coped before we had cameras, right?
    Well, yeah, humanity also coped with frequent famines. I'm not sure if going back to something like that would count as 'adapting' to crop failures.
    
    shmel 5 years ago
    
    From what I've seen online in the last 5 years, just a mere accusation is sometimes enough to ruin someone's life or career.
    At the same time Trump and Johnson both showed that you can literally lie about something you said on live TV a week ago and people will suck it up.
  - feanaro 5 years ago
    
    The same way it worked before it had the ability to take pictures.

metagame 5 years ago

You can't use any of it commercially. Nothing within is under an acceptable software license (nor an open source license, nor a free software license). Advance warning.

urthor 5 years ago

Just run the whole codebase through OpenAI Codex, then regenerate the source code.
Can't copyright an inferred artifact ;)
natch 5 years ago

You are talking about the models, right? If you train your own model on your own data without transfer learning (or with transfer learning from a liberally licensed third party model once those exist) then you can do whatever you want to with your model, no?
- metagame 5 years ago
  
  I'm talking about the code. Models are distributed separately.
  - natch 5 years ago
    
    But you can create models with the code. And then take those models you created and use them commercially. So I don't see a problem.
throwaway889900 5 years ago

Nobody will be able to tell if you stick it in a tiny thumbnail and give it enough jpeg artifacts.
- metagame 5 years ago
  
  Computer-generated artifacts are non-copyrightable. This is not the problem. (That said, as the law is written, no binary should be, but we already threw that baby out with the bathwater.)
  The problem is that the software is not free software, but encourages you to stop using its free predecessors and competition sneakily.
  - wruza 5 years ago
    
    Or it’s just paranoia and they plan to license this for non-free use after a period of testing, leaving non-commercial use for those who want to play with it. That is basically what every non-free software does except the commercial support is not yet provided.
    Also, I second the question about free predecessors/competition, not to argue or compare, but out of pure curiosity.
  - dahart 5 years ago
    
    I’m not sure I understand your claim, why is this encouraging anyone to stop using any competition? The license says plainly it’s not for commercial use, what’s so sneaky?
    What do you define as free software? This software is open source, always free as in beer, and free as in freedom for research and evaluation purposes (and seems fairly permissive to researchers…)
    
    metagame 5 years ago
    
    It by definition is not open source. The term has a definition. This breaks literally the first rule.
    https://opensource.org/osd
    
    dahart 5 years ago
    
    I used, or maybe misused, the term open source. You used “free”. The license & project used neither, and made no claim to align with opensource.org’s philosophy or definition. Whatever you call it, the source code has been released for anyone to read and “evaluate”, that’s what I meant by ‘open’.
    You didn’t answer the question - how is this sneaky, and how does it prevent using previous projects?
    
    metagame 5 years ago
    
    The Open Source Initiative coined the term to begin with. Using it incorrectly is harmful, and is how we've ended up with "literally" meaning "figuratively" in modern English. By insisting on the correct definition, I'm trying to prevent the same from happening to open source. It's pretty offensive to act like it's not a big deal to use something so essential to computing freedom in a cavalier way to intentionally lessen freedom.
    
    kube-system 5 years ago
    
    OSI was not the first to use the phrase "open source". This phraseology was in commonplace use to refer to other types of publicly available material for decades prior to 1998, when OSI decided to use the term to describe software licenses.
    One example from 1971: https://www.google.com/books/edition/United_States_Code/3j2P...
    There are also other (quite valid) authorities on software licensing other than OSI which have differing opinions on which licenses specifically qualify.
    For example: most people would probably agree that BSD was open source, despite OSI's lack of approval on its original license. And I hardly think that's 'harmful' in any way.
    
    dahart 5 years ago
    
    Another problem with assuming that a non-commerce clause in the license automatically means software is not open source is that the US government defines commercial software as any software that is licensed to the public, which includes most open source software, even by OSI’s standards.
    “in nearly all cases, open source software is considered "commercial software" by U.S. law, the FAR, and the DFARS. DFARS 252.227-7014 specifically defines "commercial computer software" in a way that includes nearly all OSS”
    https://dodcio.defense.gov/open-source-software-faq/#Q:_Is_o...
    
    dahart 5 years ago
    
    Literally has meant figuratively for hundreds of years, Dickens used it that way. https://www.merriam-webster.com/words-at-play/misuse-of-lite...
    There is no “correct” definition of the term “open source”. People use it to mean many things. If it has any license other than “public domain”, then it limits some freedoms in some ways.
    You still didn’t back up your claims: what is sneaky, what predecessor does this license prevent use of?
    
    geofft 5 years ago
    
    I agree with you, but I don't think anyone from Nvidia called it "open source" (I agree that 'dahart incorrectly did so). It's a shame that GitHub allows non-open-source code, but it does, and nothing else about it implies that it's open-source.
    
    boulos 5 years ago
    
    Fwiw, Dave (dahart) currently works for NVIDIA :).
    
    dahart 5 years ago
    
    Hang on, that’s purely incidental in this context. I don’t represent this project in any way, and I only called it open source on accident here. Nobody associated with the project has suggested that it’s open source by OSI’s standards.
    
    mistrial9 5 years ago
    
    The word 'free' in English to describe the software has been, and is, problematic. It leads to a lot of heat minus light in conversation, it seems. I support direct GPL software, and it's important to make sure the person you are talking to is using the same terms to mean the same thing, right away.
  - nimih 5 years ago
    
    Just use the code to build a source-code auto-completion model, then wire the model up to your text editor and write new source code.
- smlacy 5 years ago
  
  Oh really? https://nvlabs-fi-cdn.nvidia.com/stylegan3/images/stylegan3-...
  - throwaway889900 5 years ago
    
    Obviously if you're using the GAN equivalent of a Megamind meme you won't get away with it, but more reasonable ones you can.

timzaman 5 years ago

I worked with these guys at Nvidia's Helsinki office. They are super chill and just somehow crank out super research. Very interesting bunch.

anon012012 5 years ago

I know HN doesn't like hype, but as an AI neophyte, I find this incredible. Nvidia is doing it again. This is likely going to help with 3D generation, the next cornerstone. Imagine that we are solving the problems so fast.

foxfired 5 years ago

There are videos that show what they mean by "details glued to image coordinates" in StyleGAN2: https://nvlabs-fi-cdn.nvidia.com/stylegan3/videos/

tmabraham 5 years ago

The visualizer looks fun: https://twitter.com/minimaxir/status/1447679798822649856

minimaxir 5 years ago

I was joking, but it appears that StyleGAN3's new approach allowed it to develop unsupervised 3D maps of faces within the deeper layers, which might result in interesting things when hacked by researchers.

maCDzP 5 years ago

It really bums me out that they didn’t name it GANnamStyle.

fuzzythinker 5 years ago

And it fits pretty good too. GAN Nvidia Alias-free Models Style

atty 5 years ago

From the license file:

> 3.4 Patent Claims. If you bring or threaten to bring a patent claim against any Licensor (including any claim, cross-claim or counterclaim in a lawsuit) to enforce any patents that you allege are infringed by any Work, then your rights under this License from such Licensor (including the grant in Section 2.1) will terminate immediately.

Is such a clause legal? I have basically zero knowledge of such things, but it seems like it should be illegal to punish someone for a good faith patent claim.

teraflop 5 years ago

As defined in the license, the capitalized term "Work" means only the StyleGAN3 software and derivatives. So it means you can't use StyleGAN3 while simultaneously claiming it infringes one of your patents, but it doesn't mean Nvidia can use StyleGAN3 against you as leverage in an unrelated patent suit.
I'm not a lawyer, and I won't comment on whether this is legal, but I'll note that it's quite similar to the patent clause in section 3 of the Apache License 2.0.
https://www.apache.org/licenses/LICENSE-2.0
nerdponx 5 years ago

Apache 2.0 has a very similar clause. But there might be some subtle differences in the wording that makes this broader or stricter.
DannyBee 5 years ago

Yes they are legal, and I'm not sure I follow the argument that they shouldn't be
- metagame 5 years ago
  
  The argument against it, presumably, is that "If you try and make us pay for committing crime, you won't get access to our toys anymore" is very strange and seems illegal, since the ability to play with toys should not stop anyone from reporting violations of the law.
  But at the same time, it's definitely legal, for better or for worse, as is pretty much any stunt you pull with the joke that is US IP law.
  - DannyBee 5 years ago
    
    This is not illegal in any country I can think of. Your take around what it exists to do is, IMHO, over the top.
    I'm not aware of a country that requires you let people sue you in this sort of situation, or requires that you not terminate their contract if you do.
  - trhway 5 years ago
    
    Is patent violation a crime? My understanding is that it is a civil issue, like: you kick me out of your property, and I'll kick you out too.
    
    metagame 5 years ago
    
    It is a civil issue rather than a criminal one, albeit still violation of law, which is why I used "crime" in the same way that the US government calls piracy and copyright infringement a crime despite it generally being a civil offense.
metagame 5 years ago

It is legal, yes. That license is awful and proprietary (3.3), but it's most definitely legal.

knownjorbist 5 years ago

Everyone is nitpicking the licenses involved in this thread. Is this the right thing to do?

intricatedetail 5 years ago

Is the minimum 12GB limit on purpose to make people buy new GPUs? It's sad that this growing area is becoming something for privileged people only.

qayxc 5 years ago

It's not on purpose but a direct consequence of the computational requirements of the model.
I say this often and I cannot stress this enough: this is not magic! High quality results require a lot of computation and with it a lot of other resources like VRAM.
This area of research has always been resource intensive and no one was surprised by this back in the 1990s when some research required hardware that wasn't available to mere mortals (looking at you, SGI Indigo and Octane).
You should look at it the other way around: it's astonishing that you can use off-the-shelf consumer hardware (here: mid-range gfx cards) to participate in and benefit from cutting edge research.
Even better yet, services like Google CoLab even allow you to meddle with this for free if you don't own the required hardware.
BTW, it's not the researchers' fault that the market is f'ed up again due to shortages and crypto mining. Otherwise you'd be able to buy a 12 GB card for around $330 (e.g. the RTX 3060), which doesn't sound particularly outrageous to me.
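To make the "a lot of computation means a lot of VRAM" point concrete, here is a back-of-the-envelope sketch. The layer shapes and batch size are made-up illustrative assumptions, not StyleGAN3's actual architecture, and it only counts the forward activations of three feature maps stored as float32:

```python
def activation_bytes(batch, channels, height, width, bytes_per_value=4):
    """Bytes needed to hold one feature-map tensor (float32 by default)."""
    return batch * channels * height * width * bytes_per_value

# Hypothetical (channels, height, width) shapes; NOT StyleGAN3's real layout.
layers = [(512, 64, 64), (256, 256, 256), (64, 1024, 1024)]
batch = 4  # assumed per-GPU batch size

total = sum(activation_bytes(batch, c, h, w) for c, h, w in layers)
total_gib = total / 2**30
print(f"{total_gib:.2f} GiB for the forward activations of {len(layers)} layers")
```

A single 1024x1024 feature map with 64 channels at batch size 4 already takes 1 GiB under these assumptions. Training also has to keep gradients, optimizer state and the discriminator around, so the real footprint is a multiple of a number like this.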
sydthrowaway 5 years ago

We need FOSS GPUs + compute!
- anaganisk 5 years ago
  
  Unless you discover a novel process to make a chip, the economics of semiconductors can never make sense in FOSS. You will end up paying way more than the current market price.

RNCTX 5 years ago

> This material is based upon work supported by the US Defense Advanced Research Projects Agency (DARPA) under Contracts No.R00112030005, HR001120C0123, HR001120C0124 and FA8750-20-2-1004 and the Air Force Research Laboratory (AFRL) under Contract No. FA8750-20-2-1004.

This is why AI is just a marketing term with no real future.

There isn't room for corporations to profit from it out of the gate + into the future indefinitely. No one is going to pay an AWS tax to use their models on every single API hit forever. No one is going to pay nVidia a license fee to use their image recognition tools forever. If the creators of HTML, CSS, and Javascript wanted license fees we wouldn't be using them right now either.

There are two groups of people, off the top of my head, who care about all of this:

1) The US Military, because the budget for their murder robots is theoretically infinite.

2) Google and Facebook because the budget for their spyware is theoretically infinite.

To everyone else, it's much ado about nothing.

Scaevolus 5 years ago

Nvidia makes most of their money selling GPUs as computational accelerators-- both generic CUDA and neural network applications.
They don't _need_ to profit off their ML models. It's a value added service. Ensuring their hardware has marketshare at the leading edge of the ecosystem is the main point.
- RNCTX 5 years ago
  
  They didn't profit off of ML models, they profited from the DARPA contract. See point above about murder robots.
