Cloudflare’s Zero Egress Fee Object Storage, R2, Is Now GA
(blog.cloudflare.com)

One thing I really like about Cloudflare is that they seem to have people who can correctly identify friction points for developers and have a solid plan for how to solve them. Looking forward to messing with this!
Thanks for the compliment. Keep the pain points coming. I monitor Discord pretty regularly for feedback.
I wish we had the same pricing model for Cloudflare Images. I have never understood the CF Images pricing model [1]
1. https://www.cloudflare.com/en-gb/products/cloudflare-images/....
I find it much easier to reason about pure storage-based pricing than about combined storage-and-egress pricing. I can limit how much people store in my application far more easily than I can add something harder to understand, like transfer quotas. So independent of how R2 compares purely on price, I think having a big entrant with a much simpler pricing scheme is already a win.
While conceptually I love the idea of not having to explicitly set the region of an object I'm storing, I feel like (especially in a distributed team or product) this could end up with a mishmash of data distributed all over the place, with a bunch of different and unpredictable access-time and latency characteristics.
Maybe the solution here is "just make sure the asset is cached on the edge", but for the first access there still has to be some impact, no?
I'd love to see some tests/benchmarks on access latency for stuff uploaded by, say, a colleague or an app hosted in the EU or Asia, with me in the US.
If you can cache your assets then the region typically doesn't matter too much (depending on workload).
That being said, I put up a spec to let you provide hints. We won't necessarily honor it today in some cases, and in the future we may ignore it altogether, but the thinking is that you provide the hint and can retrieve which geographic region the bucket is in. (A rough sketch of what that could look like is below.)
We also have latency improvements coming down the pipe.
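To make that concrete, here is a rough sketch of what passing a hint could look like through the S3 API's CreateBucket LocationConstraint field. This is illustrative only: the account ID, credentials, and the "weur" value are placeholders, and whether R2 honors the field at all depends on the spec mentioned above.

```ts
import { S3Client, CreateBucketCommand } from "@aws-sdk/client-s3";

// Point the standard S3 client at the R2 endpoint for the account.
const s3 = new S3Client({
  region: "auto",
  endpoint: "https://<ACCOUNT_ID>.r2.cloudflarestorage.com", // placeholder account ID
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

// Hypothetical hint: ask for the bucket to be placed in Western Europe.
// Per the comment above, R2 may ignore the hint entirely.
await s3.send(
  new CreateBucketCommand({
    Bucket: "my-bucket",
    CreateBucketConfiguration: { LocationConstraint: "weur" },
  })
);
```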
Vitali, how does serving directly from R2 via https work right now under the hood?
Is the data replicated across geos?
Does all or a portion of it get cached at a local edge once requested?
Does it basically behave as if it was handled by CF cache?
These parts are really confusing for me right now.
Loving R2. I'm having an issue uploading larger files though, like 100 MB+. The error I get is:
Unable to write file at location: JrF3FnkA9W.webm. An exception occurred while uploading parts to a multipart upload. The following parts had errors: - Part 17: Error executing "UploadPart" on {URL}
with the message:
"Reduce your concurrent request rate for the same object."
Is this an issue on my end or Cloudflare's? I'm not doing anything aggressive, just trying to upload one video at a time using Laravel's S3 filesystem driver. It works great on smaller files.
Known issue. Currently multipart uploads can only be uploaded at 2 parts per upload ID concurrently. We have a fix pending that should remove that bottleneck within the next month or so (maybe sooner). The change will show up on https://developers.cloudflare.com/r2/platform/changelog/
For now there are typically settings you can configure in whatever client you're using to lower the concurrency for uploads (see the sketch below).
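As an example of what that setting looks like: with the AWS SDK for JavaScript, the multipart upload helper exposes a queueSize option that caps how many parts are in flight at once. The bucket name, key, and account ID below are placeholders, and other clients (including Laravel's S3 driver) often expose something similar under a different name.

```ts
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";
import { createReadStream } from "node:fs";

const s3 = new S3Client({
  region: "auto",
  endpoint: "https://<ACCOUNT_ID>.r2.cloudflarestorage.com", // placeholder account ID
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

const upload = new Upload({
  client: s3,
  params: {
    Bucket: "videos",                            // placeholder bucket
    Key: "JrF3FnkA9W.webm",
    Body: createReadStream("./JrF3FnkA9W.webm"),
  },
  queueSize: 2,                // at most 2 parts in flight, matching the current R2 limit
  partSize: 100 * 1024 * 1024, // 100 MiB parts
});

await upload.done();
```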
Hey does this mean there's currently a max upload size of 1GB since each part can only be 500MB and you can only upload in 2 parts?
To answer my own question: no, this is about concurrent requests. It's 2 parts at once, but each part can be up to 500 MB and it will upload as many parts as it needs to, so a 20 GB file still works.
Awesome. Thank you for the answer :)
Using the same price for read requests regardless of size feels weird to me (S3 is the same for internal use). The cost to the provider of serving a 100 kB file and a 100 GB file must be quite different, so why price them the same to the user?
The cost for the egress/ingress link is fixed by how large the pipe is, not by how much of the pipe you use.
I’m storing 87TB on https://wasabi.com/ for ~$515 a month
If you're using it for hot object storage and not as a cold backup, Wasabi has by far the worst egress situation of any object storage provider. They have "free" egress because egress is capped at 1 full retrieval per month, and they'll just ban you if you use more. R2 has actual, legitimately free egress. I think there's almost no overlap in use cases between Wasabi and R2 even though they're ostensibly both object storage providers.
The trick with Wasabi is that you are generally expected to not retrieve more than you’ve input in the same month, no?
"Less egress" is essentially the trick with Wasabi, whereas zero egress cost is the defining feature of R2. Of course, there must be limits to it, but it is interesting.
I wonder if at this point teams will start to consider different S3-compatible providers for different workloads.
> The trick with Wasabi is that you are generally expected to not retrieve more than you’ve input in the same month, no?
No. It's basically one download of all your data in 1 month [1].
> For example, if you store 100 TB with Wasabi and download (egress) 100 TB or less within a monthly billing cycle, then your storage use case is a good fit for our policy. If your monthly downloads exceed 100 TB, then your use case is not a good fit.
In my experience it's an excellent fit for backups if you run disaster recovery tests quarterly on each set and have enough sets to run on a rotating, monthly schedule. You're only downloading about 25% per month at that point.
I think it shows a market failure when someone can offer exactly the same service as a competitor (an S3 API) at an 80% lower price and not almost immediately take over the whole market.
I think governments need to step in and require that compute platforms like AWS be split up into their constituent parts, so that there is no cost disadvantage to mixing and matching between suppliers. E.g. VMs on Azure and storage across the road in AWS should not require egress fees that wouldn't be payable within either provider's network.
There's a natural stickiness to cloud infra and SaaS which lends providers pseudo-monopolistic pricing power, even when competitors are present.
Some regulation requiring a common API and a one-click way to transfer between providers would help solve this. It needs to be implemented intelligently, though.
One simple way to do it would be if the FTC announced:
> From January 1st 2023, we will consider it anti-competitive for cloud providers to price internal service bandwidth at a rate lower than internet bandwidth to a competing service.
Big companies like Amazon, Google, or Microsoft could set the price to zero, and their smaller competitors would be losing a lot of money each month. Basically an easy way to get rid of the competition.
Sounds like it would mostly benefit the large companies
It would be better if the internet were considered basic infrastructure and funded by taxes, like roads.
Yeah, a lot of thought needs to be put into the actual rule, but something along these lines, while accounting for unintended consequences.
For storage you need trust; an S3 competitor doesn't help me if they lose my data or are unreliable. It takes time to earn that trust, and it's far from easy for small competitors to do that.
There are also performance, availability, and capacity considerations. The $/storage may be enough for some, but it typically isn't the whole story.
Oh wow, upvotes keep coming in, but this fell off the front page really fast. Fell out of the HN algorithm's grace really fast.
The automatic region thing is problematic for many companies.
I would much rather be able to explicitly choose this and know that customers' data is where I told them it would be.
From the blog:
> ... we know that data locality is important for a good deal of compliance use cases. Jurisdictional restrictions will allow developers to set a jurisdiction like the ‘EU’ that would prevent data from leaving the jurisdiction.
It's coming: we understand the importance of data locality & residency requirements.
(I work at CF)
Does anyone from the R2 team happen to know if there's a roadmap ETA on this one yet?
https://community.cloudflare.com/t/r2-per-bucket-token/41105...
The fact that you can't separate data for prod and dev with a product that's in GA now is kind of nuts.
I really want to use this, but sadly the one thing that's missing is any sort of bucket access logging.
Unless I'm missing something with how this fits in with Cloudflare's other services.
I'm excited to see more details on how R2 data is going to be replicated across different data centers in the future. I had assumed this was already operational based on previous blog posts, so I'm a little disappointed to learn that it's still TBD. It's a major reason I chose R2 over S3, as I don't want to manage moving data around for different tenants myself.
Strongly consistent replication is a harder problem. Working on it but it'll take a while. You can get replication today if you put a custom domain in front of your bucket and turn on cache.
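For anyone wondering what the "custom domain + cache" approach can look like in practice, here is a minimal Worker-style sketch that serves objects from an R2 binding and caches them at the responding colo. It is just one way to get edge-cached reads, not necessarily how the custom-domain setup works internally; the binding name MY_BUCKET, the cache TTL, and the assumption of simple GET requests are all illustrative (types come from @cloudflare/workers-types).

```ts
// Minimal sketch: serve R2 objects through a Worker and cache them at the edge.
// Assumes GET requests and a bucket binding named MY_BUCKET in wrangler.toml.
export default {
  async fetch(request: Request, env: { MY_BUCKET: R2Bucket }): Promise<Response> {
    const key = new URL(request.url).pathname.slice(1);
    const cache = caches.default;

    // Serve from this colo's cache if we've already fetched the object here.
    const cached = await cache.match(request);
    if (cached) return cached;

    const object = await env.MY_BUCKET.get(key);
    if (!object) return new Response("Not found", { status: 404 });

    const headers = new Headers();
    object.writeHttpMetadata(headers);
    headers.set("etag", object.httpEtag);
    headers.set("cache-control", "public, max-age=86400"); // illustrative TTL

    const response = new Response(object.body, { headers });
    // Populate the colo-local cache for subsequent requests.
    await cache.put(request, response.clone());
    return response;
  },
};
```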
With R2 you can also use a bucket, or a few buckets, per tenant, whereas with S3 that's not possible (even if you have a fat ENT contract with them, from what I've heard). We've extended the S3 spec to make it possible to list more than 1,000 buckets [1]. Currently we still ask that if you need more than 1k buckets, you open a customer support request so we can discuss your use case.
[1] https://developers.cloudflare.com/r2/data-access/s3-api/exte...
Still no word on durability guarantees even for GA?