Cloudflare R2 IA storage tier
blog.cloudflare.com
So pricing is 1c / GB-month, compared to S3 IA at 1.25c / GB-month: a decent saving but not massive. No archive or deep archive options though; I wonder if / when these will come.
What sort of negotiated rates can you get from AWS for bandwidth, I wonder? At the moment, that seems like the only real benefit from CF, I think.
Backblaze B2 is cheaper than Cloudflare R2 IA. Hmmm
Backblaze is unprofitable and publicly traded, a combination which can't last forever. They raised B2 prices 20% last year, and I wouldn't be surprised to see more increases if they continue to burn through cash.
> Backblaze is unprofitable and publicly traded
So is Cloudflare
Cloudflare practically has a stranglehold on the modern Internet. I would bet money they would be immediately profitable if they killed their free tier. There aren't that many competitors, and I don't know if they've got the spare capacity to absorb all the exodus.
I'm not so sure about Backblaze. I don't even think they're the biggest player in that space (AWS is, I would guess). I would guess most people could migrate off if Backblaze turned south.
if they simply streamlined their sales pipeline and created a mid tier (somewhere between $500-$2500 a month for example) that unlocks some of the features that are behind "contact sales" banner they could boost revenue without changing any existing tiers, I'd wager.
I think the platform has a ton of potential and it already shows signs of real progress, but much like fly.io, its rough edges are incredibly rough.
I don't work in enterprise sales or nothing, but it seems to me that businesses whose only price tag is "call us" are the ones with the most revenue. Transparent pricing is great for SMBs, but the big bucks are in making yourself entrenched in giant enterprises.
We sell a B2B application. We've had "call us" prices since the start as far as I know.
Yet even though we're now dominant in our sector, we've got about 50% of the revenue from a couple of dozen very large customers, and the remaining from many hundred medium and small businesses, including many single-person shops.
A key ingredient is that we have a usage-based pricing element, so what we charge a customer monthly varies with their activity. And it's primarily this element that is tweaked between customers, so that it's affordable to both small and large, while still making it profitable for us to provide the software and support.
Having such a varied income stream has been quite good for us, and has allowed us to turn down potentially lucrative customers which had unreasonable demands that could have killed us, or be flexible when certain customers really struggled under corona say, so they didn't have to go to a competitor.
I used to be quite negative to "call us" pricing, but got a new perspective after I started here. That said, I prefer transparent pricing when shopping software on my own.
This exists but isn’t really documented, we pay ~$1k/mo for a “light” enterprise version of Cloudflare.
A few jobs ago I was looking at switching from CloudFront to CloudFlare. I was basically told to come back when I have real money to spend. They essentially said that they don't even want to work with me and my $3k/month cloudfront bill, they start at $5k.
Who told you this? I'd love to see that email exchange (jgc @ cloudflare).
Unfortunately that email exchange is in the corporate archive of a company I have not worked for since 2021.
I'm not sure about Cloudflare. Their free tier vastly reduces the barrier to entry for setting up a DDoS as a service business (without it you'd need to have very expensive hardware for circumventing DDoS attacks, as otherwise you'd get DDoS'd by your competitors). This in turn increases the demand for Cloudflare's services to protect against DDoS attacks.
I'm not saying they're good for the internet, just that if I was going to make a bet on which one is more likely to survive a decade, it'd be Cloudflare.
I don't see how Cloudflare has a stranglehold. They've captured the bottom of the market by having low, low introductory prices and turnkey security. They have tons of huge and small competitors though. They have far less revenue than Akamai.
I went to go look at Akamai's site, and I don't think either party is interested in that transaction.
Akamai doesn't look like the kind of company that wants to deal with 3,000,000 tiny accounts, and I don't think the customers will be happy with the service they get.
I guess to put it another way, do you use Cloudflare currently? If they made the free tier $5-$10/month for as many sites as you want, would you pay them or put in the effort to migrate?
I think I've got 2 sites I actually care about enough to want a CDN and DDoS protection. I would probably just pay up. I'm sure I could go somewhere else for free, but my Cloudflare setup works and I don't want to have to redo my Let's Encrypt wildcard.
The amount of profit isn't always the most important number anyway. A lot of companies choose to not be profitable while they can spend their bank account growing their business (Cloudflare is one, and Backblaze may be another but I have no idea about their finances, historically Amazon and Salesforce both did this too).
If qoq and yoy revenue keeps going up, and cost of revenue stays the same or decreases (as a percentage) in the same time period, it makes sense to spend the bank account on growth. If the growth stops, that's when you start cutting expenses like R&D and operations to get the profit. Reasoning being: getting x% of a bigger revenue is better than getting x% of a smaller revenue.
Cloudflare is cash flow positive and profitable on a non-GAAP basis, while being unprofitable on a GAAP basis. https://www.cloudflare.com/press-releases/2024/cloudflare-an...
Cloudflare is cash-flow positive because a big chunk of employee compensation is paid thru the issuance of new shares (eg constantly raising more money and diluting existing owners / investors). If you include that comp as a cost they are not making money https://stockanalysis.com/stocks/net/financials/cash-flow-st...
Cloudflare is profitable. Whales subsidise retail.
You clearly haven't read the earnings report. In December 2023, their net income was -27.86M. They've been losing 100M a year for a few years now. To be clear, I think this is the right move from a business perspective, I'm just saying it's a little unfair to knock Backblaze without mentioning this nuance.
B2 can be great but it is missing a lot of features when compared to other object stores so it isn't a good solution for every scenario.
As an example I investigated, to put a custom domain in front of a B2 bucket they suggest using Cloudflare and CNAME-ing a bucket subdomain (eg f000.backblazeb2.com) https://www.backblaze.com/docs/cloud-storage-deliver-public-...
Well if f000.backblazeb2.com is used for any other people's buckets too, which appears to be the case, I guess I am now able to serve other people's files from my domain? This seems terrible.
I'm not sure I understand all of the nuances here (I'm no webmaster), but this is covered in the documentation you linked:
> You must configure page rules to allow Cloudflare to fetch only your Backblaze B2 bucket from your domain. ... Otherwise, someone could use your domain to fetch content from another customer's public bucket. To ensure this does not happen, Cloudflare lets you use page rules to scope requests to your bucket.
The example shows leaving your bucket name in the url as a way to filter out requests to other bucket names. If you want your static site to have http://mysite.com/bucketname/index.html then I guess that's ok. But again, careful configuration and still not for every situation.
I'm sure you can layer more rules to get it exactly right but I'd not be eager to layer on complex configuration through multiple service providers when it is avoidable, unless there is some very compelling overriding reason.
As far as I know, bucket names must be unique at other providers like AWS as well. [0]
I'm no expert but to try and protect my own domain, I use a transform rule to match a subdomain and append "/file/$MY_BUCKET_NAME" to each request. This should return a 404 for anybody who tries to inject their own bucket in the path. I could be wrong of course.
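A rough model of what that transform rule does, for anyone trying to picture it. This is plain Python rather than Cloudflare's rule syntax, and `BUCKET_NAME`, `rewrite_to_bucket` and `cdn.example.com` are made-up names; only the `/file/<bucket>` prefix and the `f000.backblazeb2.com` host come from the comments above.

```python
from urllib.parse import urlsplit, urlunsplit

BUCKET_NAME = "my-bucket"  # hypothetical bucket name


def rewrite_to_bucket(url: str, origin_host: str = "f000.backblazeb2.com") -> str:
    """Pin a request to one bucket by prefixing /file/<bucket> to its path."""
    parts = urlsplit(url)
    # Normalise an empty path ("https://host") to "/".
    path = parts.path if parts.path.startswith("/") else "/" + parts.path
    new_path = f"/file/{BUCKET_NAME}{path}"
    # Keep the query string, drop any fragment, always talk HTTPS to the origin.
    return urlunsplit(("https", origin_host, new_path, parts.query, ""))


print(rewrite_to_bucket("https://cdn.example.com/index.html"))
print(rewrite_to_bucket("https://cdn.example.com/file/other-bucket/secret.txt"))
```

Because the prefix is always added, a request that tries to name another bucket ends up under `/file/my-bucket/file/other-bucket/...`, a key that should not exist in your bucket, hence the expected 404.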
[0]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket...
Bolting a Cloudfront distribution onto a S3 bucket is pretty well-trod territory, though, and doesn't have these sharp edges. (Has a couple other ones, but they're less common.)
This is an easily solved problem. Backblaze has an example here: https://github.com/backblaze-b2-samples/cloudflare-b2
Does the solution involve using Cloudflare workers? Because, as I said, I'm sure it is possible but maybe we've gone off the deep end a bit. Just how crazy of a configuration do you want just to serve files from an object store?
This looks like an awful lot of setup for "easily solved". Easily solved is what S3 does where this isn't even a problem.
yes, it's a worker
    $ curl -sL https://github.com/backblaze-b2-samples/cloudflare-b2/raw/main/README.md | head -n 1
    # Cloudflare Worker for Backblaze B2
and iDrive e2 is even cheaper. https://www.idrive.com/s3-storage-e2/
Isn't iDrive a BMW trademark for decades?
Trademarks only apply to specific classes of things. https://www.uspto.gov/trademarks/basics/goods-and-services
That's useful: thanks.
Downvoted for a question? How laughably degenerate.
Yeah it can get pretty ridiculous. Fortunately over time things tend to even out, but for whatever reason a lot of early HN participants downvote everything. My theory is that the people who would downvote a question like this one are the people who go around reading passive aggression in every little thing. Possibly people that assume every question is secretly an agenda to propagate an opinion they disagree with, and that there are hidden implications everywhere.
Yep, but I have been here a long time, and the non-contributing downvote phenomenon seemed to me to start when HN got more widespread participation.
How did you figure out the conversation sniping to be earlier members?
Ah thank you, I must clarify! I was terribly and unintentionally ambiguous earlier.
I didn't mean earlier members as in people who joined HN in the early years, rather I meant the people who tend to react and comment within maybe the first 30 minutes after a post/comment.
I actually find the early members (as in the early 2010s and before) to be among the best. I have no idea if they are downvoters or not, but some of my favorite conversations on HN have been with these users.
As far as sourcing, definitely take with a big grain of salt because this is also purely anecdata that I've noticed from spending way too much time on HN, both repeatedly having my own comments downvoted initially while (usually) rising up over time, and observing the same phenomenon on many other people's comments.
Oh, yes, I have noticed the same.
iDrive is slow and charges extremely high fees for usage above provisioned. They also have a history of increasing prices for individual contracts only a few weeks before renewal, so you can't possibly have enough time to move your data.
Hetzner Storage Boxes (2.50-3 EUR per TB) is probably the sweet spot. B2 if you need an object storage API.
Do storage boxes have any availability risk? “Storage box” sounds like you have an actual VM with an attached spinning disk, which doesn’t seem tolerant to hardware failures. I couldn’t find any details on their website about this.
They're on triple redundancy Ceph as far as I know, not geo redundant, but at that price you can buy one in Germany and one in Finland and still come out cheaper than B2.
You also only get a very locked down shell.
It's not Ceph. Some type of RAID: https://docs.hetzner.com/robot/storage-box/general/#reliabil...
From Hetzner's snapshot data loss email two years ago:
"The snapshot contents are distributed over multiple internal servers and data is stored in a way that allows up to two separate disks to fail without impacting data integrity. This means the snapshot can still be accessed, even if two disks fail at the same time."
> iDrive is slow and charges extremely high fees for usage above provisioned.
What is the fee?
Rounded to the nearest TB rather than metered.
This is expected. Like cloud providers, Cloudflare is intentionally not aggressive on pricing as it is a race to the bottom.
There are other ways to compete.
IIRC Backblaze B2 charges for egress, while CloudFlare does not.
They do not charge for egress either.
https://www.backblaze.com/cloud-storage/landing/ad/use-cases...
*Up to 3x of average monthly data stored, then $0.01/GB for additional egress.
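To make that footnote concrete, here is a small sketch of the egress rule as quoted (free up to 3x the average monthly data stored, then $0.01/GB). The function name and the sample volumes are made up for illustration; real billing may differ in details such as rounding.

```python
def b2_egress_cost(avg_stored_gb: float, egress_gb: float,
                   free_multiple: float = 3.0,
                   rate_per_gb: float = 0.01) -> float:
    """Egress charge in dollars under the quoted '3x free, then $0.01/GB' rule."""
    free_gb = free_multiple * avg_stored_gb       # free allowance for the month
    billable_gb = max(0.0, egress_gb - free_gb)   # only the excess is charged
    return billable_gb * rate_per_gb


# 1 TB stored, 2.5 TB egress: inside the free allowance.
print(b2_egress_cost(1000, 2500))
# 1 TB stored, 5 TB egress: 2 TB of it is billable.
print(b2_egress_cost(1000, 5000))
```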
Or free if you go through Cloudflare, since they have the Bandwidth Alliance.
I can’t get over the fact how storage is still so expensive in 2024. Lowest you can get is probably $5 per tb a month from any of these companies. A new tb hdd is probably $25 today. Where does the money go, into c suite car payments?
Backblaze have written many blog posts on how they go from a few thousand hard drives to a business. For me, the most interesting part was they went from six generations of self designed storage pods to "fuck it, just buy Dell". Long story short: these businesses are surprisingly complicated.
You're comparing apples and oranges here....
A $25/TB drive is not the only expense that $5 goes towards:
* there's actually probably 2 or more HDs holding that TB, since the business is promising that the data won't be lost
* there's the computer(s) that hold that HD.
* there's the electricity, bandwidth and space rental costs for those computers
* there's the cost of employees to make sure that the computers keep running.
* there's the cost of the marketing so that you know that the service is available
* there's all the book-keeping, taxes, cc fees, etc that need to be paid on the recurring charge
* there's (hopefully) profit for the investors/owners
and so on.
Also, on your side you should consider several of those factors yourself to do the comparison:
* how much do you consider the time spent managing your hdds to be worth? (if you're a business this is employee-hrs, if you're talking about for yourself privately, there's still a value you should attach to your own time)
* do you have backups? If so, what does it cost to put them offsite? (In terms of space rental or favors traded, and your time)
* electricity, etc
* how much is it going to cost you to learn to reliably store your data (in terms of up-front cost, time spent, etc)
* and of course hard drive costs
I am aware they are hosted on servers with mirrors or parity drives. It still makes no sense how these services seem to raise prices over time instead of seeing the savings from depreciating storage and hardware costs passed to the consumer, unless you realize they are squeezing you.
I really believe there's a missing market. I think there would be big business in building servers for homes. Where you prioritize low power usage and low noise, and do not need blazing speeds or frequent/high access. Cloud is great, but some things should still be stored at home. Home NAS systems are a bit odd and difficult to expand (nowhere near what a rack is).
My argument would be that this would be helpful given the high adoption of things like Ring doorbells and other camera systems at home. People could store their own data, which provides better security & privacy given you need not rely on a data connection to store that footage. It also would be extremely worthwhile if we are to see personalized LLMs become common and tools like home assistant. You wouldn't want that running off-site. In fact, I'd rather call home from my mobile LLM than call FAANG (or anyone else with teeth).
I just think buying used servers on ebay or trying to throw together a home rig is harder than it needs to be. I'm confident the demand exists but it is unfortunately a field of dreams scenario. Many people will not know they want it until it exists (I can say my parents would love this but they don't understand the first thing about technology so all they can do is complain about Google/Apple having all their data rather than express how they want to store their own).
The problem is tech illiteracy and CGNAT.
The product must be "a router" so people can access it outside of home. Or it doesn't have to, but then you'll have to proxy traffic through you and charge for it.
And your "router server" must have a decent AP, because the likelihood people know how to bridge their "routeraps" is pretty low.
IPv6 would help for sure, but there's still "allow 443 to this box", static registrations.
This is before even building the product
I don't think you're exactly wrong, but I think it is too narrow.
You're perfectly right that there is far too little tech literacy. Even with the example of my parents. But they're an example of someone who I think would especially benefit from this. Because they wouldn't get it out of their own desire, but because I their child would install it for them. Because I don't want to build and piece together everything. Because I'm used to the general tech support of them calling me up, and having to figure out literally everything on the fly because the only time I touch a Windows system is my yearly Christmas visit.
I've run a NAS in their home before and the reason they stopped is because they got a new router and "it broke." Prior to that I was able to ssh into their network because I had a pi lying around.
But the problem you specify is not the problem you think it is. It is UI/UX. Many of these things can be set up automatically. The reason PGP is a disaster is because it's cumbersome to use. Google making it default and not having to think about it solved that. Signal, iMessage, and WhatsApp made encryption trivial for people who wouldn't have done it before because "it is too hard." I'm unconvinced this is anything different. Where if you take a family member only basically tech literate, can help them do the initial setup, and away you go. You just have to make it as easy as WhatsApp (or even a lot less), and I believe you could.
I say this as someone who is a researcher and does a lot of backend programming. I know we give UI/UX people a lot of shit (and quite often they do deserve it; there are a lot of annoying, useless changes), but they do also play a huge role in making technology accessible. Really, that is their main role. And truth be told, the environment has dramatically changed: nowadays there are many custom distros that make things easier, and even my Grandma can use Linux. There's definitely a hardware and backend problem here, but I'm actually convinced the biggest issue is design. Which, let's be real, is what made computers prolific in the first place.
Edit: misunderstood your premise. You meant bespoke solely single household servers at home. Like homelabs but without the hassle.
Wuala[1][2] did something similar more than a decade ago, in that users become distributed storage for other users which made the service free for those participating (otherwise was a paid subscription). They were then acquired and stopped their most unique feature before closing for good.
[1] https://en.wikipedia.org/wiki/Wuala
[2] https://arstechnica.com/uncategorized/2008/08/first-look-wua...
Servers still fail, regardless of how much redundancy you build in.
Especially in a home, where kids spill a gallon of fruit juice and don’t think to tell anyone until 2 days later, pets knock things off tables, fires happen, power outages happen, theft happens, and so on.
There still needs to be a plan for when the server is gone. So, buy two home servers and run them in different locations? Back to cloud? Or what’s the plan?
I'm not sure what your argument is. Yes? Is not the standard suggestion 3 copies and one off site? And consider that the post is about low egress storage. And what, you're going to tell me that business people don't want to sell more things?
Well, spinning rust HDDs stuck in your server have no actual parallelism and aren't highly available, replicated, etc.
How much do you think 1TB of storage should cost?
Even factoring in parity drives it's still absurd. $55 a year per tb per life and we somehow never see depreciation in spinning rust storage prices hit these plans. If anything they get more costly over time. Why? Their overhead literally goes down every year with depreciating costs of drives and all the other hardware they currently use for their storage arrays.
It’s availability that’s expensive.
Availability, durability, etc.
All those nines cost extra.
Does anybody know which consistency model Cloudflare offers compared to AWS S3?
Here are Cloudflare's docs on R2 consistency model https://developers.cloudflare.com/r2/reference/consistency/
We're onboarding to Cloudflare MagicWan and want to use them for logging, which they do to 's3-compliant' buckets.... on Google or Amazon.
I was pretty surprised at the lack of dogfooding, wondered if it's an oversight, on somebody's Gantt, or just not something R2 can handle for some reason.
Yeah, the integration and production readiness of their non-core offerings is not perfect. I'm dealing with R2 and another service and you can tell they feel more like... specifically integrated features, rather than fully modular services you can choose to use as you want. Like the workers have possible R2 bindings, but you can't use those in a fetch() call - you have to use the S3-compatible endpoint instead.
AWS has its own issues, but the push to have everything talking over API did wonders for the ability to use them as you want.
> Like the workers have possible R2 bindings, but you can't use those in a fetch() call - you have to use S3 compatible endpoint instead.
Sorry, could you please elaborate? Why can you not use a binding to an R2 bucket – and perform operations on its objects – in a `fetch()` handler of a worker? Or did I misunderstand this statement?
I meant that in your worker handler, you can only run fetch(s3-endpoint-for-bucket) rather than something like fetch(env.MY_BUCKET...)
This matters for their image resizing which needs to be used as options on fetch().
been waiting for that event triggering for a loooong time. I'll give it a go
How is data stored in this tier? Is it just on big slow SMR disks?
Anecdotally, we have found R2 to be nearly half as slow to respond as the same request to S3 _proxied through Cloudflare_.
So... something isn't right here. Maybe a mechanical turk where a live human is fetching the object using Windows Explorer behind the scenes?
Did you mean "half as fast"? The way you worded it sounds like you mean to say that s3 is faster, but it says that r2 is twice as fast as s3.
Apologies, yes. Half as fast. S3 was typically around 20-40ms, R2 was typically around 60-80ms iirc
Any chance you could convert ‘nearly half as slow’ to a percentage of the original response time? This reads like a reading comprehension puzzle and I don’t even know if my hunch is correct.
Is the “data processing fee” any different from an egress fee in practice? Seems a little deceptive.
At least magnetic disks are iops constrained, lower iops loads conceivably allow higher density, or packing different load patterns to the same devices. Say an 8 TB / 100 iops disk reserves 90 iops for a 1 TB database service, that's 87% of the disk's capacity sitting free but only 10 iops to serve it with. Adding what is effectively an iops tax to discourage frequent reads is one way to make a mixture like this work (or another way to think of it - subtracting an iops discount)
Obviously the example above is contrived, but the same principle applies to a pool of 1000 disks as it would to 1. You also don't escape this issue with regular hot storage either, there is still a (((iops * replication count) / average traffic) / max latency) type problem lurking, which would still necessitate either limiting density or increasing redundancy according to expected IO rate. This is one reason why some S3 alternatives with weaker latency bounds (not naming names, they're great but it's just not the same service) can often be made substantially cheaper, and why at least one of S3's storage classes may be implemented entirely as an accounting trick with no data movement or hardware changes at all
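The contrived 8 TB / 100 iops disk can be put into numbers. A minimal sketch; the function name is made up, and the figures are the ones from the example rather than measurements of any real drive.

```python
def leftover_budget(disk_tb: float, disk_iops: float,
                    used_tb: float, reserved_iops: float):
    """Return (free TB, free iops, free capacity fraction, iops per free TB)."""
    free_tb = disk_tb - used_tb
    free_iops = disk_iops - reserved_iops
    return free_tb, free_iops, free_tb / disk_tb, free_iops / free_tb


free_tb, free_iops, free_fraction, iops_per_free_tb = leftover_budget(8, 100, 1, 90)
print(f"{free_tb} TB free ({free_fraction:.1%} of the disk)")
print(f"{free_iops} iops left, i.e. {iops_per_free_tb:.2f} iops per free TB")
```

With well under 2 iops available per free terabyte, the remaining space is only sellable to data that is rarely read, which is where a per-read charge on the cold tier comes in.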
Yes. You can process it once to the standard tier, and egress as much as you want for free.
The differences stack up for say, a 1GB video that becomes viral and triggers terabytes in egress. You pay for 1GB, not terabytes.
It’s also an optional tier.
> The differences stack up for say, a 1GB video that becomes viral and triggers terabytes in egress. You pay for 1GB, not terabytes.
Under the condition that you actively monitor the usage and manage to "process it once" on time (and then "process it back"). Because otherwise you pay for terabytes - not in egress fees, but in processing fees. Or am I missing something?
The whole point of IA is cheaper storage that is infrequently accessed, and there is a price to accessing it. If you need / want frequent access just use the regular storage class.
All object stores out there have a flavor of IA class with an access fee that should be far lower than the storage class savings for scenarios where you would even consider using this. If you don't want or understand this cost optimization you simply don't use it.
Yes, because in a well-designed setup files that are frequently accessed would be restored to standard tier. Ideally you'd only pay the data processing fee once when files transition from infrequently accessed to frequently accessed. There's a breakeven point at a data access rate of once every two months.
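The two-month figure falls out of simple arithmetic. A sketch, assuming R2 Standard storage at $0.015/GB-month and an IA retrieval fee of $0.01/GB (both are assumptions taken from Cloudflare's published pricing, not numbers stated in this thread; only the 1c IA storage price is quoted above), and ignoring per-operation charges and any minimum storage duration:

```python
IA_STORAGE = 0.010        # $/GB-month, as quoted at the top of the thread
STANDARD_STORAGE = 0.015  # $/GB-month, assumed
IA_RETRIEVAL = 0.010      # $/GB retrieved, assumed


def monthly_cost_ia(reads_per_month: float) -> float:
    """Cost per GB-month in IA when each GB is read `reads_per_month` times."""
    return IA_STORAGE + reads_per_month * IA_RETRIEVAL


def breakeven_months_between_reads() -> float:
    """Months between full reads at which IA and Standard cost the same."""
    saving_per_month = STANDARD_STORAGE - IA_STORAGE
    return IA_RETRIEVAL / saving_per_month


print(round(breakeven_months_between_reads(), 6))
print(monthly_cost_ia(1.0) > STANDARD_STORAGE)   # read monthly: IA costs more
print(monthly_cost_ia(0.25) < STANDARD_STORAGE)  # read every 4 months: IA wins
```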
Maybe the cold-to-hot migration "tax" is partially to prevent abuse?
> "Data retrieval is charged per GB when data in the Infrequent Access storage class is retrieved and is what allows us to provide storage at a lower price. It reflects the additional computational resources required to fetch data from underlying storage optimized for less frequent access."
I like the "automatic storage classes" idea as well.
> "…you can define an object lifecycle policy to move data to Infrequent Access after a period of time goes by and you no longer need to access your data as often. In the future, we plan to automatically optimize storage classes for data so you can avoid manually creating rules and better adapt to changing data access patterns."
AWS already give you intelligent tiering for this, it's a very nice product but it's also just a nice way of hiding the same fees. Your $0.004/GB becomes $0.023/GB on first read for 1 month then $0.0125/GB for 2 months, so the average cost of storing it over those 3 months becomes $0.016/GB, and that's before considering monitoring fees
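The $0.016 average can be checked as a time-weighted mean of the rates quoted in that comment. A quick sketch; the helper name is made up and monitoring fees are left out, as in the comment.

```python
def average_rate(schedule):
    """Time-weighted average $/GB-month over (rate, months) pairs."""
    total_months = sum(months for _, months in schedule)
    return sum(rate * months for rate, months in schedule) / total_months


# One month at the frequent-access rate, then two months at the IA rate.
after_read = [(0.023, 1), (0.0125, 2)]
print(round(average_rate(after_read), 4))
```

So a single read of an object sitting at $0.004/GB roughly quadruples its storage cost for the following three months.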
You could also implement tiering yourself, depending on your workload of course. If you know you're storing objects for long-term archival reasons (or backups), you could opt for using S3 Glacier Instant Retrieval at $0.004/GB.
does anyone know which products of cloudflare have the most revenue?
probably the CFO of Cloudflare inc knows
A fairly unrelated point, but it's so strange how companies that underpin a lot of the internet struggle in the stock market. While we all wish we had sold our tech stocks in 2021, Cloudflare still hasn't recovered.
Cloudflare has a very dysfunctional sales pipeline. Their free, premium and self-serve offerings might underpin the internet, but the highly profitable offerings that are gated behind their sales teams are not getting sold. Too many of the clients that they should be selling to.
Magic Transit (bring your own ASN), classic website DDoS protection (above the Business $200 tier, which has low, undisclosed data limits in regions like New Zealand) and their ilk all require interacting with a sales rep, and unless you're paying 5 figures a month they are uninterested.
There is a whole market out there between $300 to $2000 a month that Cloudflare could tap without making new infrastructure but is actively being ignored.
This.
They lock a lot of features behind an Enterprise plan where they could allow them to be added to a lower plan.
In general, I just hate working with sales reps and would rather avoid a company altogether if I can’t sign up without talking to them.
Not to mention they have on multiple occasions made significant internal changes (including layoffs) to their sales organization. I have a feeling if the public were to get an introspection into their sales pipeline it would be eye opening, and not in a good way
> undisclosed data limits in regions like New Zealand
Can you please explain what this means?
Hit the nail on the head.
Wanted to buy their SASE DLP & Remote Browser Isolation as a startup. Sales wouldn't even talk to us
I believe Cloudflare (and many other cos like it) have never produced operating income. They are growing and obviously important and potentially very profitable in the future, but when discount rates are much higher and you add in some uncertainty, one could argue they don't look as hot as they used to.
It is bizarre. All the old guard foundations of society type companies that the world relies on for modernity have stocks that barely budge but pay out decent dividends. Maybe tech stocks that have grown to such a position should consider paying out dividends instead of failing to chase exponential stock price growth while still clearly doing a lot of productive things. I expect the shareholder boards prefer the chance of exponential wealth over steady returns and prevent this mindset from emerging.