83% of browser features are used by under 1% of top websites
There are web sites, and there are web apps. Most websites (think news sites, Wikipedia, etc.) just deliver static content and a whole bunch of ads, as someone noted here. Web applications are the ones that need all this functionality.
Why don't we do the following for HTML6: introduce one profile for sites and one for apps?
- Sites: HTML + CSS, with JavaScript (if at all) only for presentation purposes (like DHTML over a decade ago). Can be viewed with a radically stripped-down web browser. All you need is the layout engine and components for display and networking. No WebGL, no sound API, and no shenanigans like ambient light sensors or vibration (wtf!). Think of Google's AMP.
- Apps: The whole package that is offered nowadays. We can even go past this and rethink the division between web and native apps. Why can't a web app use sockets? Why can't a native app use the HTML layout engine or live in a tab? Google is planning to blur the gap between web and native with their new "instant apps".
This just keeps coming up on HN time and again: [1] [2] [3]
Whichever side you're on, browser vs. native, the sheer frequency of this discussion proves that at least a clear distinction IS needed. Continuing the status quo of web/browser/standards bloat cannot be good.
Somebody really needs to set down some global rules of thumb. I think that when your webpage starts needing sidebars and subwindows and popups and notifications (yes, looking at you, Facebook) then at that point it should just be a native app.
Let the web and its browsers focus on "sites" and "pages" and let the OS do "apps." After that it's up to the operating systems to make discovering and accessing apps as easy as typing in a website's address.
As a user, I want to sign in just once on each of my devices (iCloud/Apple ID lets me do that), and just type in an app's name (say Cmd+Space and "facebook") and then start using it right away as the OS begins downloading it incrementally, just as a browser does a website, except with full access to the OS's features, efficient use of my hardware and battery, and instant access to all my data without a separate login.
[1] https://news.ycombinator.com/item?id=11735770
> No WebGL
One problem is that some interactive news articles can make very impressive use of WebGL, like the interactive climbing map that accompanied an article about the Dawn Wall freeclimbing record: http://www.nytimes.com/interactive/2015/01/09/sports/the-daw...
> Can be viewed with a radically stripped-down web browser.
Like Lynx?
In terms of sites versus apps, I think the browser vendors are responsible for making this happen. Adobe AIR was the closest effort I saw at marrying web technologies with apps.
Sadly, AIR never took off, and whilst I appreciate the intention, it left a huge looming question: what do we actually do with all this new web technology?
One answer I came up with in recent years was what you suggested: partitioning off the people who want to work with the new technologies and keeping them separate from the text+image+CSS-based web we've all grown to love.
Similar to how the demo-scene was an offshoot of game development...
Electron and eventually Positron give us the app side.
So we're mostly lacking the site-only aspect. To some extent that can be achieved with addons that strip or block certain APIs.
I'm really happy to see that I'm not the only one with this dream. It would be good for developers and good for end users. Everybody wins.
Kind of a silly article. Most top websites just serve static content with tons of ad-network crap; they would never need 83% of the features. This is like saying "the most sold vehicles in the world don't use 90% of the horsepower they have". No shit, the most sold vehicles in the world are Corollas and Civics, i.e. "sit in traffic and commute" type cars.
It doesn't cost anything to keep these features around, why kill them off? Code is cheap; it doesn't cost you, the user, anything to have the features in your browser...
It does cost you something if that code has security holes.
Most of these features also rely on user approvals, so it is not as if every website is going to jump in and start using the Geolocation API just because they can.
It is a nice study showing how far these niche new features have reached, but that shouldn't affect how we are working on the new ones (except perhaps improving the security models of these).
That's a big if. Anything could have security holes, seems pedantic to kill a feature because it might have a security hole at some point.
> That's a big if.
Reminder that we issued so many CVEs last year that we had to upgrade the numbering spec to allow for more than ten thousand a year.
With fewer features, new engines like Servo could be built faster.
And features add attack surface. But since people seem to want web rendering engines to replace operating systems, there's no way around it. You need WebBluetooth, WebRTC, WebAssembly and every other feature.
There are some good reasons to want this, in fact.
Don't forget WebBIOS and WebHPC
Edit: those were meant to be satirical, but apparently exist.
:/
Or stuff each origin into a container and provide access to libc.
This is why the Linux desktop failed to penetrate the market. You can't say, "Well, it does 80% of what most people need." Most people just surf the web, view photos, and do other basic things. That 20% it can't do is a deal breaker. Not to mention the things most people never do, but businesses rely on, like legacy/proprietary software support.
> It doesn't cost anything to keep these features around, why kill them off?
This is also why the features list of a basic Windows or Office install is miles long. Once developed, there's no cost other than maintenance of those features (updates, security, etc).
I think it's hard to argue against complexity in software. The ultra-complex usually win for rational market reasons.
I don't know this to be true, but my assumption as someone not building the interpreters/VMs for JS is that if you eliminated all of that shit you could probably get much better optimization from your engine. Also, security was touched on above.
APIs have nothing to do with the interpreting of the language.
Sure, but presumably a browser's codebase has to support all of the APIs and the entire specification. Would it not be possible to optimize a browser if, say, as the article claims, only ~17% of the features/APIs needed to be supported? Or would this totally not matter?
There is no way to be certain without digging in to the particular code in question. But "optimization" isn't just some magic pixie dust that you sprinkle on to a project and it makes things faster. Some things can't be optimized because they are already optimal. Some things shouldn't be "optimized" (e.g. for speed) because it would make the code less optimal (e.g. for understanding, maintaining, etc.).
The number of APIs available should be orthogonal to the issue of system performance. If it is not in particular cases, it's via bad overall system design, not the existence of the API in general.
> But "optimization" isn't just some magic pixie dust that you sprinkle on to a project and it makes things faster.
Correct, however periodically adding features by committee, and backwards-compatibly processing and parsing tags and internal API implementations that go unused, surely must hamper performance. To some extent, I would assume the V8 or ChakraCore engines are developed around interpreting JavaScript, which includes language pieces that developed because of its close ties with the DOM. So between the underlying interpreter engines and all of the parsing tooling for DOM rendering, tag identification and CSS application, I can only assume that if there were 80% less overhead to expect, it could be faster.
So, to the point:
> The number of APIs available should be orthogonal to the issue of system performance. If it is not in particular cases, it's via bad overall system design, not the existence of the API in general.
That was the question I was asking.
1) Is it possible to design a better system with far fewer considerations?
2) Is it a bad overall design?
Again, I have to assume that if we raided the codebase and rewrote it minus 80% of the legacy shit, we could do a better job, but maybe not.
This is something I've noticed a lot when developing Servo. The vast majority of the time, when a site is broken in Servo, it's due to some CSS 2.1 bug or another (CSS2 has existed since 1998), or a broken DOM API that's been in the platform for years and years. Attention is disproportionately focused on the new stuff when the reality is that old standards still rule.
I wouldn't necessarily agree that the conclusion is to just rip stuff out of the Web platform, though (although there is plenty of stuff I'd love to drop). Rather, we need to implement the features in a secure way. This isn't rocket science. Notice, as usual, that the majority of these security issues are straightforward memory safety issues† in C++: e.g. https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=firefox+svg
† Food for thought for those claiming that "modern C++" solves these problems.
How many sites are on the internet again?
Around a billion.
1% would mean 10 million websites are using each possible feature. Okay, it's not quite that even, and a lot of popular sites (like many news sites) stick to the most basic features, but all these browser features are there for the other various cases. Like aforementioned web apps. Various tech demos. Sites with very specific use cases.
In other words, it's because the internet is massive and varied.
90% of words in the English language are not used by the majority of people. Maybe we should just get rid of those words?
99% of the world doesn't use Calculus. Maybe we should stop teaching it in school.
Lame article.
And the websites that do leverage browser features are the ones that stand out and shine.
It reminds me of Sturgeon's law https://en.wikipedia.org/wiki/Sturgeon's_law
If 99% of everything is crap, you can bet a sizeable wager that the 1% (the wheat among the chaff) is using HTML5 and JavaScript APIs.
Also, if a content silo counts as a 'top website', then it's an outlier and shouldn't be included. Facebook is a walled garden and not indexable; Facebook is parasitical to the web. Twitter and Google are not the web either.
You mean like Hacker News? Because I don't think this site leverages HTML5.
The hard part of any design is knowing what features can be removed, not added. HTML5 and JavaScript are fluff for most websites.
PS: HN even uses <table id="hnmain"... oh the horror.
> HTML5 and JavaScript are fluff for most websites
Yeah but HN can be progressively enhanced and it's also one of the few exceptions to the rule.
Indeed. <table id="hnmain"... Worth noting the small ecosystem of HN redesigns, which all leverage HTML5+JS in some way :)
All sites should be like electric stairs (an escalator): they still work when the power is cut off, just to the inconvenience of the users.
To propose removing these features instead of fixing them fundamentally misunderstands the modern purpose of the web. It's not just for document distribution anymore. It is and has been for many years an application deployment system. To get rid of these features would kill whatever chance we have of getting out of walled garden app stores.
I mean, a similar study would probably find "top 10,000 apps don't use 80% of OS features." Just because not a lot of people use a feature doesn't mean it shouldn't exist.
Right. And then when those features aren't available in the browser, that's used as a justification for going back to proprietary (aka native) platforms. It's a circular argument made by people determined to return to the days before we had an Open Web, for whatever reasons. If you don't want to use the features, nobody is forcing you to do so.
> ALS, “ambient light events”, would let browsers respond to the light level the laptop, phone or desktop is exposed to if anybody used it.
Can someone give me a good, non-gimmicky example of what this would be used for?
Provided one can access sensor data fast enough, cross device tracking[0]. Display an ad that lights up the room in such a way that you can read it with the sensor. Communication across an airgap. If you have two devices with screens and said sensors in a dark room, they might be able to communicate by turning screens on and off.
Most speculatively, imaging the environment with compressive imaging[1]. One might be able to flash some patterns on the screen and look at light sensor output to take a picture.
Giving web browsers access to sensors on our devices is sort of scary.
[0] http://arstechnica.com/tech-policy/2015/11/beware-of-ads-tha... [1] http://arxiv.org/pdf/1305.7181.pdf
I like how you ripped out any positivity around the topic :D You are my hero
Hmm, perhaps a text-heavy site could switch between "night" and "day" modes for ease of reading.
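Something like this would be enough; a rough sketch using the "ambient light events" API the article mentions (Firefox exposed it as a `devicelight` event; the 50-lux cutoff and the CSS class name are invented for illustration):

    // Sketch only: toggle a dark theme based on ambient light.
    // event.value is the measured illuminance in lux; 50 is an arbitrary "dim room" guess.
    window.addEventListener('devicelight', function (event) {
      document.body.classList.toggle('night-mode', event.value < 50);
    });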
Oh god. No no no. I cannot hate this idea hard enough to overcome my fear that someday this kind of thing might actually happen! This would be the browser's job, under user control, not ever the site's job. My computer, my choice, no you don't get to decide that for me just because you are a web page designer.
It's difficult to pull this off with every website and keep it readable, without any hint from the site. This is why we have CSS for different screen sizes, rather than relying on mobile browsers to make all the decisions, along with an option to view the site in "desktop mode". I do agree that it would be annoying if sites made this decision for users.
But I think the more promising use for this could be enabling aesthetically pleasing color palettes that are power consumption optimized. Picture a HN that switches to darker tones of the current colors when you're on a mobile device.
I'm already constantly using "readability mode" to strip away as much of the custom styling as I can, because letting designers run riot - and building browsers that obey their wishes - has turned the web into such a mess. Tighter limits and more end-user control would be better. Different people have different preferences about their browsing experience and should be able to control that without having to rely on every last web designer to make helpful choices, because they really just don't.
My vapourware web browser is tentatively titled "fuck your web designer".
It will ignore site-provided CSS.
Interested?
Ha! I would like to give that a try.
Just last night I was happy to find that Safari Books Online had a 'night' feature; automatic switching would have saved me thirty minutes of burned retinas.
Do you need an example right now or can you wait until someone does something cool with it that's only enabled because the option exists?
If we stopped adding new capabilities to JavaScript even as new sensors and w/e go mainstream we'd start needing Flash all over again.
Before we add any new APIs, we need a permissions interface for the web. Many new APIs add creepy data points that the user should be aware they are providing.
Yeah. Spotify would stop ads if the audio got too low. In the same way, ambient light could be used to make sure an ad is watched.
Not saying it's the most important case ever, but it would be important if you needed it
Essentially something like f.lux, presumably.
More advanced analytics, tracking your battery + ambient lights + other capabilities...
Use a white-text-on-black-background theme in case the user is browsing your site in bed with the lights off? ^_^
This seems pretty obvious to me. And at the same time, entirely not a problem.
A great example that comes to mind is the DRM component that Netflix requires to deliver HTML5 video instead of that Silverlight... thing. I cannot think of any other site that has needed it - or at least, any other site I visited before it was available, that suffered for it.
And still, I consider it a requirement. That feature that I require for exactly one site.
this was very interesting: "SVG, for example, has a problem however you look at it: on one hand more than 15 per cent of the sites use it, on the other hand, nearly 87 per cent of blockers block it, but it's had 14 security warnings (CVEs, Common Vulnerabilities and Exposures) in the last three years."
I am deeply suspicious of that number. I've never encountered a plugin or anything of the sort that provides for "blocking SVG", and it's fully supported in all recent browsers.
Paper author here. The blocking % referenced, and discussed in more detail in the paper, is the % of times a feature is used when someone visits the page, but ISN'T used when you visit the page with AdBlock Plus and Ghostery installed.
In other words, it's how often these popular blocking extensions prevent the JS APIs from firing.
It's not blocking SVG; it's blocking (mostly fingerprinting) JS libraries from running SVG JS methods.
Aha, thank you for the clarification. That's a lot less worrying.
^ Yeah. I use SVG heavily in all my web projects, and I've never seen a single user access one of my sites and have the SVG blocked. I wouldn't buy '87%' at all.
Just a note on this (I'm the lead author on the paper): the blocking rate has to do with the reduction in JS usage when you install popular blocking extensions.
So it's not that extensions block SVG directly; it's that AdBlock Plus and Ghostery block a bunch of libraries, and those libraries use the SVG methods to fingerprint (and do other stuff).
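Concretely, the per-feature rate is just the relative drop in observed API calls, something like this illustrative sketch (the numbers are made up, and this is not the paper's actual code):

    // Illustrative only: how often a feature's JS calls disappear once blockers are installed.
    var observedCalls = {
      vanilla:      { 'SVGGraphicsElement.getBBox': 412 },  // visits with a stock browser
      withBlockers: { 'SVGGraphicsElement.getBBox': 54 }    // visits with AdBlock Plus + Ghostery
    };

    function blockingRate(feature) {
      var base = observedCalls.vanilla[feature];
      var blocked = observedCalls.withBlockers[feature];
      return (base - blocked) / base;  // fraction of uses prevented by the extensions
    }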
Ok, this seems interesting. However, unless I am misunderstanding this, of the large table they used for the bulk of the article's content, they only showed 6 underutilized features.
Obviously, we could prune the execution of some of the JS that is backwards compatible to the early '90s, and some of the HTML that is rooted in IBM's markup work of the 1960s.
My super-high-level first-pass optimization rec for the W3C: if a feature has existed for 3 years and a random sampling of 100,000 websites shows usage of less than 1%, it is automatically deprecated. If usage is above 1% but less than 5%, it is automatically phased out of the spec within 2 years.
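Expressed as code, the proposed rule would be nothing more than this (purely an illustration of the thresholds above; the function and its inputs are invented, not any real W3C process):

    // Illustration of the proposed rule only: classify a feature by its age
    // and by its usage share in a 100,000-site sample.
    function deprecationStatus(yearsSinceShipping, usageShare) {
      if (yearsSinceShipping < 3) return 'keep (too new to judge)';
      if (usageShare < 0.01)      return 'deprecate immediately';
      if (usageShare < 0.05)      return 'phase out of the spec within 2 years';
      return 'keep';
    }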
Unfortunately this would create a huge disincentive for developers to use any new features no matter how compelling. Would you learn any new features if there was a chance in 2 years all of your users would drop support for it?
That was the point. If a feature wasn't great enough to inspire 1 out of 20 developers to experiment with it, then I don't think it makes sense.
1. Backwards compatibility. It's impossible to assume that 99% of users are on a modern browser that is under 3 years old.
2. We are not talking about experimentation; we are talking about deployment into production.
I agree with you. I was certainly too aggressive on the time frame, but I still agree with the sentiment. We shouldn't keep failed, decade-old features in a browser. As to point 1, you are correct and this is certainly true, however I wish it weren't. I assume everyone on Hacker News has either gone to college or is aware that a presidential term is 4 years long. Consider how much you as a person changed in those 4 years; consider how much the world changed during a 4-year presidential term. China as a country sets guidance for itself, directly for its billion people, and indirectly for the rest of the world, on a rolling 5-year basis. I would like to challenge users (and enterprises) to upgrade their browsers over this timeframe and spend the multitude of seconds this takes them to not only increase their own well-being, but the well-being of developers writing the software they consume.
Consider jQuery, whose 1.x line is still maintained alongside the jQuery 3.0 that is launching now and supports Internet Explorer 8, a browser released over 7 years ago [1]. Supporting a browser that old means, for example, you won't be using FormData for submitting forms. Heck, you won't even be using `.forEach()` on arrays.
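For concreteness, the kind of plain-DOM code that an IE8 target rules out looks roughly like this (the form id and the /submit endpoint are placeholders, not anything from the comment above):

    // Sketch of what dropping IE8 buys you: FormData (IE10+) and Array.prototype.forEach (ES5, IE9+).
    var form = document.getElementById('signup');           // placeholder id
    form.addEventListener('submit', function (event) {      // IE8 only has attachEvent
      event.preventDefault();
      var data = new FormData(form);                        // unusable if IE8 must work
      ['name', 'email'].forEach(function (field) {          // .forEach is missing in IE8
        console.log(field, data.get(field));                // FormData.get() is newer still
      });
      var xhr = new XMLHttpRequest();
      xhr.open('POST', '/submit');                          // placeholder endpoint
      xhr.send(data);
    });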
Can anyone recommend an updated browser that skips most unused features to run faster, use less memory or be more secure?
Brave Software. This was created by Brendan Eich, the ex-CEO of Mozilla. Eich does seem to have some problems with equality[0], which may have contributed to his ouster at Mozilla, but Brave is a top-shelf browser. Super great team, pretty security conscious, and given that Eich developed JS, well, they have a pretty good working knowledge of it. Brian Bondy is a super awesome guy (thanks for adding DuckDuckGo BTW), Yan Zhu is a pretty well-known dev & security blogger, and they had the woman who wrote a couple of encrypted chat clients working there (apologies, her name escapes me) but she isn't on the site. I assume the rest of the team is talented. So in essence, I love Brave. I still wish they would make a goddamn search engine, but it is an awesome browser.
[0] NaN === NaN returns false? 0.0001 + 0.0002 !== 0.0003? Weird.
In addition to the reply with the pointer to the standards, sure NaN !== NaN.
NaN means "no idea what we've got here". When you get to say that phrase, would you expect it to be used for exactly one thing and one thing only? To me "I have no idea" means the possibilities are endless (infinite). We can only compare for equality when we know what we are looking at, otherwise it's just "status unknown". Yes, it might be equality - the chances are infinitely small though.
If you take the argument a step further and say "but I got the two NaNs that I'm comparing by doing the exact same operation, so even if I don't know what I've got from a mathematical point of view, whatever it is should be equal", then in that case you are not actually comparing the NaNs but the path(s) that got you there.
I must say I find the whole NaN, null, 0 vs. undefined thing interesting on so many levels. There is a world of difference between knowing you've got nothing (null or 0) and not knowing what you've got at all.
"null": I have no bank account. "0": my balance is 0. "undefined" or "NaN": I lost my memory after yesterday's binge drinking and don't know who I am or whether I have a bank account. Knowledge vs. no knowledge.
Again, I am aware of what the behavior is, although this was a nice refresher and I appreciate it ;)
The joke was, of course, about the Mozilla incident where Brendan Eich stepped down from his position as CEO, apparently because of a perception that he was promoting inequality. This characterization was due to his support of Prop 8 (a ballot proposition about marriage equality, or the lack thereof).
To completely ruin the joke's punchline [1]: the "problems he had with equality" were actually his understanding of types and language design, as evident in JavaScript.
Of course that is untrue, was a joke, and Brendan Eich is a legendary programmer whose contributions I am grateful for.
[1] The joke's punchline could be ruined by it being a bad joke needing an explanation, being in poor taste, and generally just not being a funny joke.
> and generally just not being a funny joke.
I thought it was damn funny. There was definitely an unintended outburst at my desk.
But then...... I also thought it was spectacularly hypocritical of Mozilla to fire him for having what amounted to... ideals. Which is what Mozilla supposedly works to protect.
> But then...... I also thought it was spectacularly hypocritical of Mozilla to fire him for having what amounted to... ideals.
They didn't fire him for having ideals. One, because they didn't fire him, and, more significantly, because the problem was not him having ideals, but him being unable to determine, or unwilling to take, the steps necessary to effectively manage a major PR incident affecting Mozilla's relationship with users, employees, and other important stakeholders. (Or, given that he actually resigned, maybe he was quite able to determine and willing to take the necessary steps, but those steps were inconsistent with him remaining as CEO.)
These are standard IEEE754 floating point behaviors.
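For reference, the behaviors in question, all per IEEE 754 and the ECMAScript spec (the ES2015 helpers are shown alongside for contrast):

    NaN === NaN;                // false: NaN compares unequal to everything, including itself
    Number.isNaN(NaN);          // true: the reliable way to test for it (ES2015)
    Object.is(NaN, NaN);        // true: "same value" semantics differ from === (ES2015)

    0.0001 + 0.0002 === 0.0003; // false: none of these decimals is exactly representable in binary
    0.0001 + 0.0002;            // 0.00030000000000000003
    Math.abs(0.0001 + 0.0002 - 0.0003) < Number.EPSILON;  // true: compare with a tolerance instead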
I am aware. It was a bad joke.
The ultimate goal for Brave (I don't know if it's doing this yet) is to replace blocked ads with ones sold by the Brave company.
I cannot support this.
Couple of things:
1. Brave won't "sell ads", that's not how it works. Marketers buy space for ads and spend to create the ads themselves. Websites or publishers sell space for ads, sometimes directly to brands/agencies. If ads use no tracking and host the ads' images and other assets on a non-blocked domain, no problem.
2. Where we propose to do better is with "indirect" or programmatic ads. These are so out of control that publishers make money from them but disclaim responsibility when a piece of ransomware malvertising lands in the offered space. We aim to put tracker-free, sandboxed ads in these spaces and share the revenue, with equal shares to the user and to Brave (15% each) and the bulk of the revenue (55% direct, 70% in total per the default settings) going to the website.
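(Rough arithmetic on those default numbers, as I read them; where the remaining 15% slice goes is an assumption, not something stated here:)

    // My reading of the default revenue split described above (not an official breakdown):
    var split = {
      user:  0.15,   // "equal shares to the user and to Brave: 15%"
      brave: 0.15,
      site:  0.55,   // "55% direct"
      other: 0.15    // remainder; presumably an ad-matching partner (assumption)
    };
    // "70% in total per the default settings": by default the user's share appears to be
    // passed on to the website as well, so 0.55 + 0.15 = 0.70.
    var siteTotalByDefault = split.site + split.user;  // 0.70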
We're thus working to align incentives in the system that currently work against the user's interest.
We also support pure ad/tracker blocking, if you prefer.
On top of either mode, we're building an anonymous system for micropayments as well as aggregated ad metrics, so no one can re-identify a Brave user from a revenue stream.
Of course, we're not yet doing anything more than blocking all third party / indirect ads and trackers. To get better user-aligned revenue models in place, we'll need to scale up and win over some top publishers for trial/testing purposes.
So if you want the fastest (native code wins in perf and mem use) browser that blocks ads by default, I hope you will give Brave a try.
Honest question: If Brave doesn't sell ads, where do the sandboxed ads come from that are under #2? Presumably your software is stripping out the bad stuff and putting in something, which marketers have agreed to pay you for. Or how is it determined what goes there?
Short answer: we hope to sell blocked ad spaces on behalf of our users, but only a few per page, and always with par-with-Brave revenue share to our users.
Longer answer: we would be selling anonymous ad spaces based on matching tags (like standardized keywords) on-device/in-browser against a set of tags for ad deals whose buyers sign up to pay us only when those ads perform (have anonymously confirmed viewable impressions).
The tag matching is private -- no cookie or other identifier comes out of the device/browser, no tags out either.
The (few, three or four) tags for each ad space come from local inference, weak AI, running on-device. The goal of this AI is to work for you on your device, studying your data, to pick the best small set of tags for the ad space. Only a few ad spaces per page, maybe even just one, might be filled. The ad serving is async as well as sandboxed, so it doesn't delay page load.
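(As described, the matching step is conceptually something like the sketch below; every tag, id and score is invented, and the only point is that the profile and its tags never leave the browser:)

    // Conceptual sketch of on-device ad matching; all data here is invented.
    const localTags = ['travel', 'photography', 'hiking'];   // inferred on-device from browsing
    const adCatalog = [                                       // same catalog shipped to every client
      { id: 'ad-1', tags: ['travel', 'hotels'] },
      { id: 'ad-2', tags: ['gaming'] }
    ];

    function bestMatch(catalog, tags) {
      let best = null, bestScore = 0;
      for (const ad of catalog) {
        const score = ad.tags.filter(function (t) { return tags.includes(t); }).length;
        if (score > bestScore) { best = ad; bestScore = score; }
      }
      return best;  // chosen locally; no tags or identifiers are reported back
    }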
This private and device-local inference uses the sum of your browser state, which is a superset of what remote trackers see via cookies and fingerprinting. Think of tab opener relationships, absolute viewability/visibility, evolution from search and social interactions to browsing sites to buying. As with browser history, you can clear the summaries inferred from your data.
We are building cross-device, client-private-key-encrypted sync (as other browsers offer, although not always client-encrypted). To handle inevitable key loss, as with user multisig wallets, we will rely on a separate Key Recovery Service. So we won't have keys for your data or the results of inferences made from it. But you will have the benefit of inference studying all your browsing on all your devices, as they sync and the inference runs incrementally.
If we pull all of this off, we'll have built a private data platform where our users own their own data cross-device, along with valuable results from analyzing their data on their devices. Kind of an upside-down, individuated, private Google.
You're seeing movement toward "segment of one" marketing tech (companies who track you as an individual, across apps and even offline credit card purchases). These trackers may run afoul of regulators, as they zero in on PII.
We think Brave's way of defending your data on device is better: more precise and of course more private. Clustering among users and sync'ing with non-browser data can come later, and via ZKP protocols like the one we're using for the Brave Ledger (see https://brave.com/blogpost_3.html).
We also see IoT (home, car, internal/wearable computing) as a field in need of our "anti-cloud" approach. Of course backup is important, and some transacting with cloud services wins, but giving all your data away from the start puts you at a permanent disadvantage. It also accrues rising privacy and security risks.
Users won't get paid fairly for their attention and time (never mind their location and health data!) until they defend all such data from trackers. We're building a system for doing this, using a browser as first line of defense. (There will be other lines of defense.) Anonymous ads are just one way to get users their fair share.
Our principle of users getting their fair share means we take the same ad revenue cut as our users get, and pass the bulk on to the publisher or website (or account, if YouTube or the like). We'll follow this principle for other possible revenue including from search, if we can work that out.
Your data, on your devices, with a fair share to you, are key points of the brand value we hope to build.
@Brave has no future in the sense of gathering people just because they hate ads; the ads market is a very noisy world. This market will find answers to any question, no matter whether someone's software is blocking ads or not, because the majority of Internet users don't really hate those clunky "click-on-me" links at all. Ignorance is a blessing.
If you can give those users faster performance and lower data usage and opportunities to cash in on the value of their eyeballs, where/when they are willing, it doesn't seem like you even have to 'hate' ads; you just have to prefer something with more positive features.
Honestly, I think Brave is good for everyone - including advertisers. There are interesting new patterns to develop here, I'd bet. With every change in tech comes new possibilities; Brave is opening the door to a lot of potentially new ideas/strategies here, IMO.
Brave isn't relying on people who hate ads, but you should not underestimate that population size.
The big problem is how much of the digital advertising "rent" goes to Google and Facebook alone, for so little. Perhaps 80% of $70B/year in the US alone. This will not continue. What's next? We have an idea.
I use Pale Moon, a firefox fork. I currently have ~200 tabs loaded (in a session with ~500 tabs) and it's only using 2.7 GB of RAM.
> ... even though fairly close to Gecko-based browsers like Mozilla Firefox in the way it works, is based on a different layout engine and offers a different set of features. It aims to provide close adherence to official web standards and specifications in its implementation (with minimal compromise), and purposefully excludes a number of features to strike a balance between general use, performance, and technical advancements on the Web.
Not Chromodium/Chromodo Web Browser[0]
[0] https://thestack.com/security/2016/02/03/chromodo-browser-di...