I’ve been building image-processing apps for roughly 7 years now. I’m the CEO of Moonlighting Apps, a 10-person startup in Argentina that develops photo-editing apps for iOS, OSX, Android, and Windows. Our biggest success so far, SuperPhoto, has been downloaded millions of times. I want to share some thoughts about neural style apps (like Prisma), which are becoming really popular these days.
A (really short) history of Neural Style
Kicked off by Google’s DeepDream and Gatys et al. in mid-2015, neural style uses deep learning neural networks to process images. Just as deep learning revolutionized object recognition, neural style filters transform images in completely unprecedented ways. The results are truly awesome. Soon, web services sprang up to process images for the public: Deepart, Dreamscope, and Pikazo are some early examples that have been around for the past year.
The drawback is that huge computations are needed, since the actual filtering is carried out by running an optimization that resembles training a neural network: for a ~1M-pixel image it takes tens of minutes on a last-gen GPU.
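To see why this is slow, here is a minimal sketch of the idea in Python/NumPy: the "filter" is really gradient descent on the image pixels themselves. The `loss_grad` below is a toy stand-in I made up for illustration; a real implementation derives this gradient from content and style losses computed by a pretrained CNN.

```python
import numpy as np

def stylize_by_optimization(image, loss_grad, steps=100, lr=0.1):
    # Gatys-style stylization: instead of pushing the image through a
    # fixed filter, run many gradient-descent steps on the pixels.
    # This per-image optimization loop is what costs minutes on a GPU.
    x = image.copy()
    for _ in range(steps):
        x -= lr * loss_grad(x)  # one optimization step on the pixels
    return x

# Toy stand-in "style loss": pull every pixel toward a flat target.
style_target = np.full((4, 4), 0.5)
grad = lambda x: 2.0 * (x - style_target)  # gradient of ||x - target||^2

out = stylize_by_optimization(np.zeros((4, 4)), grad, steps=200, lr=0.05)
```

The structure is the point: hundreds of full forward/backward passes per image, every time, for every user.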
Faster (but lower quality) approaches
A few months ago, new methods (Johnson et al., Ulyanov et al., Li et al.) emerged that can filter images much faster. From the original neural network (call it the sampler) they derive a second network (the generator) that is trained to emulate the sampler. The generator is a common feed-forward network and can filter an image in a single pass, so filtering drops from minutes to less than a second. The catch is that the quality is worse than with the original approach (but this research is really new and expected to improve!). The new app Prisma is surely built on one of these methods, since the results look similar.
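The contrast with the optimization approach is easy to see in code: at inference time a trained generator is just one forward pass through a stack of convolutions. This is a toy NumPy sketch, not any real architecture; the kernels here are hypothetical placeholders for weights that would be learned by imitating the sampler.

```python
import numpy as np

def conv2d(image, kernel):
    # Single "valid" 2D convolution -- the basic building block of a
    # feed-forward generator network.
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def generator_pass(image, kernels):
    # One forward pass through conv + ReLU layers: this is all a
    # trained generator does per image, hence sub-second filtering.
    x = image
    for k in kernels:
        x = np.maximum(conv2d(x, k), 0.0)
    return x

# Hypothetical 3x3 averaging kernels standing in for learned weights.
kernels = [np.full((3, 3), 1 / 9.0), np.full((3, 3), 1 / 9.0)]
styled = generator_pass(np.ones((8, 8)), kernels)
```

No per-image optimization loop remains; all the expensive work was moved into training the generator once, offline.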
What’s wrong with Cloud-based neural style apps like Prisma
Even though the results Prisma provides are awesome, there’s a big issue that I haven’t seen discussed anywhere so far, and it motivated me to write this blog post. You see, image processing is a creative, interactive, and private process. Photo-editing apps (both mobile and desktop, like Photoshop) are tools users operate interactively, with previews of the processed image, tweaking parameters and sliders until the result is satisfactory. But if your image is edited in the cloud, as with Prisma, in a backend on the other side of the world, it becomes impossible to update the image each time you move a slider. And of course your images are no longer private (worse, have a look at Prisma’s terms of use: you won’t even own the filtered images).
Other drawbacks of keeping the actual photo-editing engine in the cloud are bandwidth usage, since photos need to be uploaded and downloaded continuously (not nice on mobile or metered connections), and the fact that you share the backend with everyone: the more users there are, the slower the processing gets (eventually collapsing the whole thing, as happened to Prisma after its Android launch). Nor can they provide HD output, since that consumes too many resources as well.
Prisma isn’t the product, you are
You see, Prisma, just like MSQRD before it, is a product engineered to be bought by Facebook or some other big company, so its makers only care about rapidly amassing a huge base of casual Instagram-style users, not about providing (and selling) enthusiast image-processing users a powerful tool for the long term. That’s why all the filters are free and there are no ads or any other business strategy. (There’s nothing wrong with that strategy per se, but it means indie developers like us, who actually make a living selling photo-editing apps, are now criticized by users who expect “everything for free, no ads, and super-fast rendering.”)
Native Neural Style
The right way, in my opinion, is to develop neural style filters that run right on your device, just like regular algorithms. Computing neural networks isn’t that big a deal: in the end, it’s all about multiplying matrices, something that can be done efficiently (see Apple’s Accelerate framework and BLAS’s GEMM routines). Yes, it’ll be slower on your iPhone than on a GPU in the cloud, but not that much slower. Why didn’t Prisma do it? Because coding it natively is harder and would have taken more time and effort than the easy backend route.
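The "it’s all matrix multiplies" point can be sketched in a few lines: a network layer at inference time is a GEMM plus a bias and a nonlinearity, and that same multiply is exactly what Accelerate’s BLAS (e.g. `cblas_sgemm`) provides on-device. The shapes below are illustrative, not taken from any real model.

```python
import numpy as np

def dense_layer(x, weights, bias):
    # GEMM + bias + ReLU: the inner loop of neural-net inference.
    # On iOS the matrix product maps onto a single BLAS GEMM call.
    return np.maximum(x @ weights + bias, 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 256))   # one flattened input patch
w = rng.standard_normal((256, 128)) # hypothetical learned weights
b = np.zeros(128)
y = dense_layer(x, w, b)
```

Nothing here requires a server farm; it only requires shipping the trained weights with the app and calling an optimized GEMM.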
But it can be done! We actually just launched an app called Painnt that runs neural style natively (shameless plug!), and it’s reasonably fast on a last-gen iPhone 6S or iPad Pro. And you get previews, tweaking, HD, and privacy.
Since we care about serving our users in the long term, we have a sane business model: our app isn’t free, users have to pay a subscription to access all the features. I know that people will complain, but developing quality software takes time and effort, and there’s no way around that.
Comments? Agree/disagree on my thoughts? Write below! And if you want to try our app, here’s the iOS version and here’s the OSX version. Thanks for reading!
Update (8/27/2016): Prisma is launching offline filters, which is in line with my post here: image processing needs to happen on-device. Now, will they develop tweaking and previewing like our Painnt app does?