Physical Intelligence is building a brain for robots (bloomberg.com)
A 4 year old child has 16k wake hours x 3600 s/hour x 1^6 optical nerve fibers x 2 eyes x 10 bytes/s = 1^15 bytes (approximation by Yann LeCun).
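For anyone sanity-checking that estimate, a minimal sketch in Python (using the corrected powers of ten discussed downthread; the ~10 bytes/s per optic nerve fiber is LeCun's approximation, not a measured figure):

    # Back-of-the-envelope check of LeCun's estimate.
    wake_hours = 16_000           # awake time of a four-year-old
    seconds = wake_hours * 3600
    fibers_per_eye = 10**6        # ~1 million optic nerve fibers per eye
    bytes_per_sec = 10            # per fiber; LeCun's approximation

    total = seconds * fibers_per_eye * 2 * bytes_per_sec
    print(f"{total:.2e} bytes")   # ~1.15e+15, i.e. on the order of 10^15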
Processing visual input is the current bottleneck for robots that want to make sense of the physical world. Glad somebody's looking into it (no pun intended). I just hope their plan is more sophisticated than throwing more computational power at the problem.
You need much less if you are fine with horse-level understanding of 3D environments. Those get to a working level much faster (hours) and are still good enough to navigate complex environments safely and not step on children.
Then you realize the limitation isn't the training data but the base model that was trained from hundreds of millions of years of evolution, and you start to see the real potential hurdle we have to clear.
Something like half a billion years of pretraining if you start counting from the first brain.
And 10 or so billion years of preparing the training environment
Well, we get to use most of that for free.
> A 4 year old child has 16k wake hours x 3600 s/hour x 1^6 optical nerve fibers x 2 eyes x 10 bytes/s = 1^15 bytes (approximation by Yann LeCun).
Unfortunate typo. You meant 10^15 bytes at the end.
Thanks to your citation I was able to find a podcast transcript [1] with Yann LeCun's explanation:
> If you talk to developmental psychologists and they tell you a four-year-old has been awake for 16,000 hours in his or her life, and the amount of information that has reached the visual cortex of that child in four years is about 10 to 15 bytes.
The transcript is missing "the" (10 to the 15 bytes). The corresponding timestamp in the podcast on YouTube is 4:48.
How is it determined that the optic nerve transmits 20 MB/sec of data?
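For reference, the 20 MB/s figure looks like straight arithmetic on the numbers upthread rather than a measurement: 2 eyes x 10^6 fibers per eye x 10 bytes/s per fiber = 2 x 10^7 bytes/s = 20 MB/s, where the ~10 bytes/s per fiber is the load-bearing approximation.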
Is it? What if that 4-year-old child were blind? Obviously their concept of the physical world would be different, but is it any less accurate? If we remove the need for visual perception, thereby removing that bottleneck, how much faster would we be able to make progress?
I think it would be significantly less accurate. Their error rates for performing physical tasks would be different because they lack the sensors to accurately train decent world models. For instance, I don't think they could catch a ball at the same skill level as a sighted child no matter how hard they tried.
So the lack of that sensor will cause the brain to develop poor representations of motion in 3D space.
How the lack of those representations would affect other representations is less clear, because the fusion between the LLM (which similarly doesn't have an embodied world-model representation) and the robot AI (which presumably does) obviously works really well.
Now, it's possible that the two models are just inter-communicating between their own features (apple the concept and apple the image/object) and then connecting those together. The point being that there could be benefits from separate training followed by a post-training connection that bridges any gaps in learned representations.
However, I'd think that ultimately a model that can train simultaneously on more sensory input rather than less will have a better, more efficient world model, with more useful and interesting cross-connections between that space and applied uses in non-physical domains.
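A toy illustration of that separate-then-bridge idea, with random frozen projections standing in for the two pretrained models (everything here, including the dimensions, is a hypothetical stand-in, not anyone's actual architecture):

    import numpy as np

    rng = np.random.default_rng(0)

    # Two frozen, independently "pretrained" encoders: both map a shared
    # 16-d underlying state of the world into their own embedding spaces.
    W_text = rng.normal(size=(16, 32))     # concept/"LLM" side
    W_vision = rng.normal(size=(16, 48))   # image/"robot" side

    # Paired observations of the same things (apple seen and apple named).
    u = rng.normal(size=(1000, 16))
    z_text = u @ W_text
    z_vision = u @ W_vision

    # Post-hoc bridge: least-squares linear map from vision space to text
    # space, fit after both encoders are done training.
    bridge, *_ = np.linalg.lstsq(z_vision, z_text, rcond=None)

    err = np.linalg.norm(z_vision @ bridge - z_text) / np.linalg.norm(z_text)
    print(f"relative alignment error: {err:.2e}")  # ~0 in this linear toy

In this linear toy the bridge aligns the two spaces almost exactly; with real models the bridge can only connect what both sides already represent, which is the comment's point about joint training.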
Blind people are still wildly capable people. If your goal is to build a "wildly capable digital brain akin to a person", then the lower bound is much less than the 10^15 bytes proposed by LeCun's reasoning.
Clearly you have never known a blind person.
So maybe we should start with building a "pinball wizard": a "deaf, dumb, blind" system that plays by sense of touch, or in this case some accelerometers and pressure transducers. Radically reduced-bandwidth inputs...
If it's just bandwidth reduction you're after, Atari pixels?
Even for real world use — I found I could get a lot out of an even smaller resolution for an unrelated (non-AI) real-world task.
To keep the analogy going, we should be concerned about unregulated companies creating robots with superhuman capabilities and a four-year-old’s sense of the world.
Regulators need to get ahead of this and establish a federal framework for safe robotic entrepreneurship.
For example…does the second amendment give me the right to have a drone which is capable of autonomously shooting a deer? There will be tens of millions of people who disagree on that point alone.
And then we need international agreements - much like nuclear - governing what is “fair game” for the public to have access to.
We must pursue a robot-enhanced future, carefully.
> For example…does the second amendment give me the right to have a drone which is capable of autonomously shooting a deer? There will be tens of millions of people who disagree on that point alone.
IANAL, but it seems this would fall under running a human-controlled robot with a gun, which I believe is illegal.
Sure, I’m squarely in the “private killer robots are bad” camp, but my point is that a lot of people will disagree there.
My state passed a law last week saying 18-year-old children can open carry a gun without a license. It's very possible that the 2A people get energized and the Supreme Court is sufficiently out of touch to grant this right.
There are no 18 year old children. Those are adults, with all the rights and responsibilities that come with that title.
Not all of them. They can't drink, be a senator, or (in some states) rent a car.
This is America!
18 years old + white -> child
12 years old + Black -> adult
I don't make the rules. I don't understand them either.
In my state (Idaho), and many others, simply using a camera drone to spot game is illegal (even using it to locate wounded/downed game, because it provides an easy excuse). Drone with a gun is an entirely different level and to my knowledge is broadly banned across the country.
The three things mentioned so far are governed by very different regulatory regimes:
- Using a camera drone to spot game will fall under the purview of both federal and state-level Fish and Game/Wildlife departments, e.g. https://en.wikipedia.org/wiki/California_Department_of_Fish_... AFAIK this is not federally illegal, so there should be plenty of places where you can do this in the US.
- Drone (I assume you mean some quadcopter UAV) with a gun falls under FAA guidelines. You can't intentionally destroy them in flight or attach weapons to them for the same reason that you can't do that with planes: they are aircraft and the FAA doesn't want to deal with you shooting down aircraft. Since the FAA is federal, you can't do this anywhere in the US.
- Robot with a gun falls under ATF guidelines, specifically ATF letters that indicate certain classes of electronic trigger are effectively a machine gun and fall under the purview of the NFA. Same as point 2, the ATF rules apply federally. If you have the relevant licensing, which I think would fall under a Class 2 SOT FFL, you can hook a firearm up to a robot, since you are legally allowed by the ATF to manufacture machine guns. Most (?) of the YouTubers who have given guns to those off-brand Spots are doing it legally under the supervision of a Class 2 SOT.
- The autonomous robot with a gun would fall under the third point, as I am unaware of any rulings about specifically autonomous stuff, though someone could potentially make the argument that past rulings on booby traps could apply.
None of this directly answers the OP question about whether the 2nd amendment applies, but broadly federal regulation has moved past "shall not be infringed", so what the relevant federal agencies actually de facto allow is more to the point.
Well you seemed prone to sensationalism to begin with, but this:
> 18-year-old children
just confirmed it. An 18-year-old is an adult with a right to own and bear a firearm by the 2nd amendment. A person is not a drone, and an 18-year-old is not a child.
I would say conceptual awareness is a far bigger bottleneck than visual perception data.
I think the state of the art is still bottlenecked on visual perception performance, even if there is sufficient data, and regardless of any further questions about conceptual awareness.
If we could model visual streams accurately, fast, and at low compute cost, I think self-driving cars and autonomous mobile robots would be much more widely available.
LLMs arguably cracked conceptual awareness already, not to mention demonstrated that you can bootstrap it unsupervised by throwing enough data at it.
There are only a few thousand concepts worthy of thought, but there are way more potential pixels even in the current room that I am in.
1^15 bytes huh?
Non-paywalled version: https://archive.is/ZtfNh
Why is OpenAI investing in startups? They invested $100 million in another as well. I thought OpenAI was building their own things as a non-profit; why are they investing? Is it normal for non-profits to invest like this? Do they lack confidence in their own stuff, or does OpenAI have too much money they don't know what to do with?
"Norwegian humanoid robot startup 1X Technologies recently raised $100 million with backing from OpenAI"
https://www.cnbc.com/2024/02/29/robot-startup-figure-valued-...
Yes, it's quite common, see for example:
This is not even a fair comparison. Mozilla isn't using all of our data to build a product that will be metered.
Apparently it is a separate entity and fund: they don't invest money from OpenAI, but money raised specifically for investing. So this doesn't say much about OpenAI; they aren't investing money they could use themselves.
https://www.businessinsider.com/openai-vc-fund-raised-10-mil...
1. They're not a nonprofit.
2. They had shut down their robotics wing, so now they're using investment to get back into the game.
3. What's with this negative tone?
OpenAI no longer operates as a non-profit. The "OpenAI" name has been a joke for years in tech circles for this reason.
I thought the joke was that OpenAI doesn't grant anything even resembling an "open" license (whether free as in freedom according to the FSF, open as in open source according to the OSI, or open as in Open Knowledge's definition [1]) to use OpenAI's models. (Tangential reminder: open is not merely access to the code and data but also the legal freedom to use and modify it for any purpose.)
Asking this question on a platform pretty much aligned with Sam Altman and expecting a real, nuanced answer is unrealistic.
Remember they shut down the thread regarding Sam Altman's sister alleging she was molested by him.
It's just incredibly sad to see how quick society is to overlook someone's transgressions if it stands to benefit from that individual. Artists, CEOs, politicians, celebrities.
Just one sick world, with this blatant disregard for "non-profit", because a bunch of men feel they were chosen.