Interactive Dynamic Video
interactivedynamicvideo.com
Pokemon GO and Interactive Dynamic Video:
Thanks for this link. Really cool use case, and maybe Niantic will check it out, if they haven't already.
Video analysis is usually very slow. I'm curious what their runtime is.
Yes, my battery doesn't drain nearly fast enough when I'm playing. This would definitely fix that.
Funny comment. :)
But actually this is unlikely to make a big difference on battery drain. Pokemon GO appears to be built on Unity and Unity is pretty notorious for having poor battery management. Whether you're pegging CPU or not, Unity burns lots of power. So, the incremental battery life hit is likely to be low in doing image processing on top of a frame.
Seriously, fuck Unity. It's the indie game equivalent of vendor lock-in, with everything negative about it.
They can't be arsed to fix a simple VSync bug on Linux, meaning even simple sprite games, or even more insulting: TIS-100, a game about optimising assembly on fake retro-hardware, deliberately made to look like a terminal, always burns 25% CPU power on my laptop.
More importantly, as Paolo Pedercini[0] mentioned on Twitter recently:
> I can open my Flash crap from 15 years ago no problem, but a project I made last year in Unity 5.1 crashes silently if rebuilt in Unity 5.3. I'm telling you: today's indie game production is a Unity monoculture that will disappear at the next hardware architecture cycle
[0] https://twitter.com/molleindustria/status/760904169444274176
That's an awesome "real world" example of where this could be used.
When I first saw this (extracting audio from visual vibrations): http://news.mit.edu/2014/algorithm-recovers-speech-from-vibr... it seemed incredible that the signal the cameras could capture was strong enough. This work is even cooler, even more non-obvious, and seems closer to wizardry.
What exactly are the implications of this for construction engineering, which must already have some way of measuring vibrations?
I.e. does this measure something previously unmeasurable, or is it just that the visualization provides an extra channel for interpreting the data?
Or is it that it would make this kind of tool available to anyone with a digital camera?
The vibrational modes of a structure are a sort of signature. When it changes, that means something in the structure has changed. Perhaps a beam has rusted or buckled, or bolts have come loose, or concrete has cracked. This is traditionally tested by placing accelerometers in the structure, but it might be much cheaper if you could do it with a camera.
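To make the "signature" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the method from the paper: the signals are synthetic sine waves standing in for a measured displacement, the function names are made up for this example, and the 5% tolerance is an arbitrary placeholder. It finds the strongest frequency in a signal with a naive DFT and flags a change when that frequency drifts from a baseline.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):  # only bins up to the Nyquist frequency
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(centered))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


def mode_shifted(baseline_hz, current_hz, tolerance=0.05):
    """True if the relative change in frequency exceeds the (assumed) tolerance."""
    return abs(current_hz - baseline_hz) / baseline_hz > tolerance


def synthetic_vibration(freq_hz, sample_rate=100, seconds=2):
    """Hypothetical stand-in for a measured displacement signal."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


# Pretend the structure's mode dropped from 12 Hz to 10 Hz between inspections.
baseline = dominant_frequency(synthetic_vibration(12.0), 100)
current = dominant_frequency(synthetic_vibration(10.0), 100)
print(baseline, current, mode_shifted(baseline, current))
```

A real structure has many modes and noisy measurements, so a practical system would track several peaks and use a proper FFT, but the comparison against a baseline is the same idea.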
A big part of it is that measuring the vibrations of large structures, like buildings or bridges, can be inconvenient and expensive. There are limited cases where you can measure new stuff with video simply because you don't have to touch the object or shine a laser at it, but the big win is that by making vibration measurement cheap and simple/convenient, it means you can do a lot more of it. We can start taking more measurements on more structures, which will tell us where we should put resources to rebuild infrastructure.
It might also make quality assurance/ product testing cheaper but that's less save-the-world :-p
It's kind of the last option.
This kind of analysis is already done and important, but usually it takes installing a lot of sensors and giving an impulse to the structure. I have actually worked at a company that did this, using strain gauges.
The fact that this is done using a commercial camera is kind of impressive. I'll take a look at the papers someday, but the obvious way to do this would be to have a frame rate at least double the frequencies being measured.
Being able to determine frequencies much higher than the sample rate is not only non-trivial, but would also greatly reduce the cost of these measurements.
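A small Python sketch of why the frame rate matters. This is just the textbook aliasing formula for a pure tone, not anything taken from the papers, and the function names are made up for illustration. A vibration above half the frame rate does not disappear; it shows up at a lower, folded frequency, which is why recovering the true value is not straightforward.

```python
def nyquist_limit(frame_rate):
    """Highest frequency (Hz) that can be sampled without aliasing."""
    return frame_rate / 2.0


def apparent_frequency(true_hz, frame_rate):
    """Frequency (Hz) at which a pure tone appears after sampling at frame_rate."""
    folded = true_hz % frame_rate
    # Frequencies fold back around multiples of the frame rate.
    return min(folded, frame_rate - folded)


# With a 60 fps camera, anything above 30 Hz is aliased.
for hz in (25, 40, 70):
    print(hz, "Hz appears as", apparent_frequency(hz, 60), "Hz")
```

So with a 60 fps camera, a 70 Hz vibration and a 10 Hz vibration look the same in the sampled signal unless you have some other information to tell them apart.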
Here's a TED talk about the technology: http://www.ted.com/talks/abe_davis_new_video_technology_that...
The implications of this technology are staggering. He doesn't even talk about online use cases, but imagine (just as an example) if YouTube applied this algorithm to its entire video library, and started allowing you to interact with videos in very different ways. Crazy stuff.
Yes, so many applications, stop motion would benefit from this incredibly.
The consumer use cases are interesting but the propagandist use cases are terrifying. Along with this (Real-time Face Capture and Reenactment): https://www.youtube.com/watch?v=ohmajJTcpNk
Which country (or organization) do you think will be (or has already been) the first to implement a program of creating physical mimics of humans for military and intelligence purposes? With technologies like this and CRISPR, video, testimony, or genetic evidence could be thrown out the window in criminal or military cases.
Kids would go bonkers if they could record their toys and dance with them like at the end of that video.
Signal processing was the subject I found difficult, and I saw no reason to be interested in audio filtering, so I had no incentive to work through that difficulty. Then I see applications of signal processing like this, which go beyond simple audio filtering, and they make me want to learn it again. Honestly, this looks like wizardry! It is no surprise that MIT is on the bleeding edge of signal processing, and that is mainly due to Dr. Oppenheim. He wrote the textbook on DSP and runs the DSP Group at MIT.
Papers of their findings can be found at http://www.interactivedynamicvideo.com/publications.html
I wonder if you could use this on car commercials and thereby deduce the actual comparative quality of construction, assuming that better built cars would vibrate less at speed. It would be interesting to understand where the vibrations occur.
A question for the authors. All the video is with vibrating but otherwise stationary objects. Must the object be stationary for these tools to work?
Probably not because in a lot of car videos there's no real car. Here's one simple example: https://www.youtube.com/watch?v=b7vTM4_rjhs
There's a company that makes a rig that's basically just an engine and 4 wheels. You film your entire commercial using it and they replace it with computer imagery of your car in post. I can't find the video at the moment, though.
Not one of the authors, but I had a look at their paper http://www.interactivedynamicvideo.com/ISMB_Davis_2015.pdf
They use optical flow to get displacement values for each pixel. This works well when you don't have very large movements between frames, no sudden illumination changes and no rotations.
Car commercials will probably have all of those and on top of that be highly edited (e.g. you have to deal with cuts in the video)
You could try to use some other approach for motion estimation, like identifying and tracking salient points, but that would require quite a lot of work on top of what this demo shows.
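To show why small movements matter, here is a minimal 1D sketch in Python of a gradient-based displacement estimate. It is my own simplified illustration and not the paper's implementation: real optical flow works on 2D images, the "frames" here are synthetic sine patterns, and all names are made up. It assumes brightness constancy and a first-order approximation, which only holds when the shift is small compared to the image structure.

```python
import math


def estimate_shift(frame_a, frame_b):
    """Least-squares estimate of a global 1D shift (in pixels) from frame_a to frame_b.

    Uses the linearized brightness constancy constraint: grad_x * d + grad_t = 0.
    """
    num = 0.0
    den = 0.0
    for x in range(1, len(frame_a) - 1):
        grad_x = (frame_a[x + 1] - frame_a[x - 1]) / 2.0  # spatial gradient
        grad_t = frame_b[x] - frame_a[x]                  # temporal difference
        num += grad_x * grad_t
        den += grad_x * grad_x
    return -num / den


def make_frame(shift, n=100, wavelength=50.0):
    """Hypothetical 1D 'image': a sine pattern moved right by `shift` pixels."""
    return [math.sin(2 * math.pi * (x - shift) / wavelength) for x in range(n)]


small = estimate_shift(make_frame(0.0), make_frame(0.3))   # sub-pixel motion
large = estimate_shift(make_frame(0.0), make_frame(30.0))  # large motion
print("true 0.3 px, estimated", small)
print("true 30 px, estimated", large)
```

The sub-pixel shift is recovered closely, while the 30 pixel shift comes out badly wrong because the linear approximation no longer holds. That is the kind of failure you would expect on footage with fast motion or cuts.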
Not the author, but I surmise it would be much more difficult to assign vibration modes to subparts of the image without being able to track them across frames.
That's the next best thing to magic that I've ever seen.
This does look fascinating. The video is a bit showbusiness-like (for lack of a better expression).
A few questions if the author is on HN: I wonder how long it takes to analyse a 5 second video? Also, it seems the algorithm only works with static images after the initial video analysis, or am I wrong? Also, how long does it take to render the "virtual" state of the object?
Hey, first author here. Good questions.
The simulation runs in realtime, and is interactive. Generating the simulation can take a while though.
The 5 second video took about a minute to process on my laptop. But longer videos, and higher-resolution videos, can take much longer. For instance, we've used the technique on high speed video to recover audible vibration modes, and this can take hours to process because the video itself is so big.
I wrote the processing code in MATLAB, and the simulation in C++ and GLSL. The simulation is pretty well engineered because I wanted it to run in realtime. I didn't put a lot of time or effort into optimizing the video analysis part though, so it probably could be made faster.
Hi!
Thanks for responding to my questions! The whole project is amazing and I hope you keep working on this (and future projects) with the same enthusiasm as you do now! :D
Wow, more incredible image processing stuff from MIT. What's really cool is how it shows how much hidden information is out there in the world to be captured with a normal camera. We just need to be aware of it and analyze it properly. For me, it just re-injects a bit of wonder into the everyday world we live in.
This might make the VRML included in MPEG4 useful. Could possibly use it to pass along the physical constraints and other data needed to make it interactive. ref: https://en.wikipedia.org/wiki/MPEG-4_Part_11
This would be fantastic for getting a feel for the tactility of a product you've not yet seen in person.
I would love to see what this could do when combined with light field technology https://www.youtube.com/watch?v=xNJZHFZEkYQ
So if we can use this algorithm for Structural Health Monitoring (like the bridge shown in the video), what's the major obstacle to replacing all destructive testing in mechanical engineering with this kind of algorithm?
Dedicated host name for a single algorithm, TED talk, high-production value video with professional animations, narrator talking excitedly, more kitschy examples of applications than description of the algorithm itself.
Yup, MIT.
I am a grad student. If you consider how many hours I work, I probably make less than minimum wage.
I am also the first author of the paper described in these videos, which my colleagues and I published in the top academic journal in our field.
I also made all of the videos, did the voiceovers, and hand drew the animations (though I'm flattered you think they were professionally done). I wrote and presented the TED talk. I purchased the dedicated host name for a single algorithm (using money from my own grad student stipend no less) and created the webpage myself.
I am an academic first, and I take academic integrity very seriously. I also take education very seriously. I consider educating the public about research to be part of my job, and this is done best when people are excited about the research.
Also, this work IS exciting. If it weren't we wouldn't have spent so much time working on it and it wouldn't have been published in ACM TOG. But back to flashy videos and press...
Consider that in the past three days, nearly 100,000 people (and counting) sat through a video where I explain what vibration modes are. VIBRATION MODES. They may have been lured in by pokemon, but kids who remember that video won't have to ask their physics teachers "but why should I care about this? what is it useful for?" Hell, if it gets people excited I hope teachers show it to their students before they teach the topic. I'm not making money off of these videos, I'm just stoked this many people are getting excited about research. Our paper has 17 numbered equations in it - it's not exactly a page turner.
When scientists don't make an effort to communicate their work to the public, that responsibility falls on people outside of the academic community - people who often don't understand the work. When we make things harder for the press, we only encourage them to bastardize the work to make it more palatable to the general public. By taking an active role in how we present academic work to general audiences, we can better shape the message, manage expectations, and help prevent content from being sacrificed for click-bait.
Ok, I'm going to get off my soap box now. Cheers! -Abe Davis
I should also mention that the people at MIT CSAIL news are very good. They put a lot more attention into fact checking and making sure things are accurate than your average reporter. They set a pretty good example.
One advantage of being at MIT is that they are very good about preparing a press release and contacting other news outlets to get exposure. But as a broader point, when it comes to reporting on science, I think MIT CSAIL sets a good example.
I think your work (and your promotion of that work!) is awesome! Please take heart in all the appreciation and encouragement in this thread (and elsewhere)!
Thanks so much! I really appreciate that :) And, in fact, I'm not at all upset about the original comment - I totally get where the skepticism comes from. Polish and substance are too often uncorrelated when it comes to science in the media. I think it's important not to assume they are anti-correlated though, because then we start discouraging scientists from trying to reach people beyond the academic community.
Sorry buddy! It was intended as a bit of dry humor. Didn't mean to hit a nerve. :) You're doing good work.
No worries. I totally understand (and invite) the skepticism. It's good to question this kind of press when you see it, and try to understand where the technology really is. I just think it's important not to assume that anything presented with polish has to be compromising on content.
Besides sounding bitter and petty, this comment is bizarre — the relevant papers are available right on the site (http://www.interactivedynamicvideo.com/publications.html).
We should encourage people to make their research accessible to a broader audience, not engage in catty sniping and whinging when they get recognized for their work.
Ah, come on, it's pretty cool. :) And they do have a publications page: http://www.interactivedynamicvideo.com/publications.html
I'm all for science being promoted/marketed, even if it's in a(n endearingly) dorky way. And there are some fun use cases for this to boot (see the Pokemon GO / AR vid below).
Jazzing up science seems to have more upside than not, as long as the quality and substance are maintained.
I recently got access to some hailed MIT technology with its own shiny videos, professional photos, patent pending and slick copy. It turned out to be a barely working prototype built on standard technology.
To me, misleading the public is not the worst thing about it. Pretending to have scientific accomplishments can deter and demotivate other, 'competing' scientists in the field.
Which tech was it?
Sarcastic whining, complaining about genuinely innovative research, missing the point, nitpicking, trying to make self seem more important by criticising everything.
Yup, Hacker News Commenter.
;)
Well, it's the top comment, so it's apparently something that resonated with people. There's something quite grating about the way they try to make it... cool, I guess. Steals the focus from the content.
You know, I'm somewhat skeptical of the idea that "the best" actually does always float to the top around here.
Can't edit the comment so I'll just put this here: didn't expect this to get upvoted to be honest ;) I was trying to be bitter and petty as a kind of dark humour, but for the record I actually think it's pretty cool that students at MIT have that kind of support behind them.
That said, it's always sort of amusing to see the MIT PR-machine at work, because it's so _obvious_ once you get familiar with how they work. At most universities the idea of hiring someone to make a professional animation with perfect sound-studio recorded narration for a single project/paper would just... not happen. On the one hand I can see why they do it, because it works: it really changes how people perceive the work, and the importance of it. On the other hand, yes, it can be a little frustrating to see something sort of publicized as completely game-changing, as if the work leaped forth from the vacuum of space and directly into the minds of MIT graduate students, when plenty of other people at other universities are doing related work. (How many citations does the video have, for example?) Not saying that's the case here, but I'll admit that it's sometimes been how I've initially reacted to things from MIT that were closer to my research domain. Kind of like, "hey that's pretty similar to what I'm doing, how come I don't have a dedicated web page and professional video and thousands of hits coming from the top of hacker news and reddit?" And then it hits me: "oh right, because I didn't make one.."
tldr; MIT has PR down to a science. They take publicity seriously, and it makes a huge difference for them. That is not necessarily a bad thing, although it does raise the bar for everyone else, which can be kind of annoying when you have papers and theses to write.
This is ridiculous. The amount of time that went into this project is probably fairly large. In a more competitive world, making some short videos and a basic page seems quite obvious. If the people involved spent, I dunno, 8 weeks on the project, then spent another few days putting together some materials to explain their work, that seems totally obvious and helpful.
Why do you think professionals were hired? A basic microphone+blanket to damp echoes, and a passing familiarity (perhaps a friend) with a video editor should be enough to put that together. (No offense to the team.) If you wanted to hire someone, I'm betting you could get it done for $50 on Fiverr. Another $10 for a domain, and a few bucks for hosting.
You're making it sound like this is some huge professional outfit with a coordinated marketing plan. It looks more like someone trying to show off their project that they spent a lot of time on.
As the primary author stated, MIT PR does not make these videos. I've seen plenty of dry, poorly-produced videos; what you're seeing reflects the dedication, talent, and charisma of the researcher more than anything else. Source: I was a grad student at MIT.