Ask HN: Suggestions to host 10TB data with a monthly +100TB bandwidth

203 points by magikstm 3 years ago · 198 comments


I'm looking for suggestions to possibly replace dedicated servers that would be cost-effective considering the bandwidth.

kens 3 years ago

That reminds me of the entertaining "I just want to serve 5 terabytes. Why is this so difficult?" video that someone made inside Google. It satirizes the difficulty of getting things done at production scale.

https://www.youtube.com/watch?v=3t6L-FlfeaI

  • trhr 3 years ago

    Nothing in that video is about scale. Or the difficulty of serving 5TB. It's about the difficulty of implementing n+1 redundancy with graceful failover inside cloud providers.

    User: "I want to serve 5TB."

    Guru: "Throw it in a GKE PV and put nginx in front of it."

    Congratulations, you are already serving 5TB at production scale.
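
    In case the nginx half sounds like hand-waving, a minimal sketch (every path and name here is an assumption, not anything from the video):

      # sketch: serve a read-only volume of files over HTTP
      server {
          listen 80;
          root /data;               # wherever the PV is mounted (assumption)
          location / {
              try_files $uri =404;  # static files only, no scripting
              sendfile on;          # let the kernel stream the file bytes
          }
      }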

  • ergocoder 3 years ago

    I thought that was making fun of how difficult it is to implement anything at Google.

  • sacnoradhq 3 years ago

    The interesting thing is that there are also paradoxes of large scale: things that get more difficult with increasing size.

    Medium- and smaller-scale operations can often be more flexible because they don't incur the pain of enforcing uniformity as scale increases. While they may not get the optimizations or discounts that come with larger, standardized purchases, they can provide personalized service that large-scale operators cannot hope to match.

    • hedora 3 years ago

      On a related note, providers that run independent instances for each customer (so no multi-tenancy) typically get about three more nines than, say, AWS. On-prem enterprise software is a typical example of this, and it is still used in safety-critical industries for this reason.

      Eventually, all outages are black swan events. If you have 1000 independent instances (i.e., 1000 customers), when the unexpected thing hits, you’re still 99.9% available during the time when the impacted instance is down.

      Also, you can probably put a permanent fix in place before the same black swan hits again.

  • foobarbecue 3 years ago

    This video has always made me wonder: "Why would someone cut open a wet bag of groceries?"

  • hardware2win 3 years ago

    It satirizes corporate ways of doing stuff

wongarsu 3 years ago

Depends on what exactly you want to do with it. Hetzner has very cheap Storage boxes (10TB for $20/month with unlimited traffic) but those are closer to FTP boxes with a 10 connection limit. They are also down semi-regularly for maintenance.

For rock-solid public hosting, Cloudflare is probably a much better bet, but you're also paying seven times the price: more than a dedicated server to host the files, though you get more on other metrics.

  • KomoD 3 years ago

    > Hetzner has very cheap Storage boxes (10TB for $20/month with unlimited traffic)

    * based on fair use

    at 250 TB/mo:

    > In order to continue hosting your servers with us, the traffic use will need to be drastically reduced. Please check your servers and confirm what is using so much traffic, making sure it is nothing abusive, and then find ways of reducing it.

    • coverband 3 years ago

      Thanks. It’s important to be very much aware of this when being enticed by the promise of unlimited bandwidth.

    • j45 3 years ago

      Thanks for finding this.

      Unlimited rarely is.

      Looks like Backblaze (see the other post below) has a free-bandwidth, cheap-storage solution.

  • fragmede 3 years ago

    That's if you use their CDN. Cloudflare R2 doesn't charge for egress bandwidth. If you have 100TB/mo to serve, try it and see what happens. I haven't heard of anyone being kicked off of R2 for using too much egress bandwidth yet.

    At scale, you'll pay a couple thousand dollars for Class B operations on R2, and another bunch for storing the 10 TB in the first place, but that's relatively cheap compared to other offerings where you'd pay for metered egress bandwidth.

    https://developers.cloudflare.com/r2/pricing/
    https://r2-calculator.cloudflare.com/

    • hatf0 3 years ago

      CF is not particularly fond of non-Enterprise customers serving more than a few TB/mo. Source: $corp serves 150 TB/mo via CF and pays somewhere north of $50k/yr for it.

  • yakubin 3 years ago

    Wow! It's the first time I'm hearing of this Hetzner offering. It's ideal for my offsite backup needs. Thanks!

    • stavros 3 years ago

      I think rsync.net is much cheaper. They frequently run offers; I think I pay $40/yr for 2 TB or something.

      • yakubin 3 years ago

        I don't know about any special offers, but looking at standard pricing on rsync.net it would cost me $15/month for 1TB, while on Hetzner the same would cost me €3.94/month.

      • qingcharles 3 years ago

        2TB is currently $324/yr if paid yearly at rsync.net.

        I would love to use them since they hang out here, but their prices are way too high for my needs.

        • dano 3 years ago

          I use Wasabi at $4.99 per month for 1TB and backup using borg and rclone. I serve only a few files, so I don't know the egress limitations.

      • kiwijamo 3 years ago

        I back up to both rsync.net and Hetzner. Both have reasonable costs, and rsync.net has given me good long-term discounts without my even asking.

  • codersfocus 3 years ago

    I wouldn't call Cloudflare rock-solid, assuming you mean their R2 offering. It goes down pretty regularly.

  • throwaway2990 3 years ago

    Been looking at Hetzner for their ARM cloud offering, but I can't figure out whether instances can be upgraded in place or whether you need to terminate and rebuild.

    Support seems nonexistent; no one answers emails or web chat…

    • chwzr 3 years ago

      You can rescale cloud servers at Hetzner, but you will need to shut down the server in the process.

  • novok 3 years ago

    Hetzner is also fairly slow, network-bandwidth-wise, unless you're in Europe.

psychphysic 3 years ago

I'd suggest looking into "seedboxes" which are intended for torrenting.

I suspect the storage will be a bigger concern.

Seedhost.eu has dedicated boxes with 8TB storage and 100TB bandwidth for €30/month. Perhaps you could pair that with a lower-spec one to make up the space.

Prices are negotiable so you can always see if they can meet your needs for cheaper than two separate boxes.

  • dspillett 3 years ago

    > I'd suggest looking into "seedboxes" which are intended for torrenting.

    Though be aware that many (most?) seedbox arrangements have no redundancy; in fact, some run off RAID0 arrays or similar. The host has a problem like a dead drive: bang goes your data. Some are very open about this (after all, for the main use case cheap space is worth the risk), some far less so…

    Of course, if the data is well backed up elsewhere, or otherwise easy to reproduce or reobtain, this may not be a massive issue, and you've just got restore time to worry about (unless one of your backups can quickly be made primary, in which case restore time is as little as a bit of DNS and other configuration work).

  • GOATS- 3 years ago

    Yep, resellers of dedicated machines rent servers in bulk so you can often get boxes for way cheaper than you would directly from the host. Take a look at https://hostingby.design as an example.

    • nuclearsugar 3 years ago

      I've been using a HostingBy.Design (formerly Seedbox.io) seedbox to distribute content to my patrons for 3 years. They have excellent uptime and their customer service is knowledgeable.

  • KomoD 3 years ago

    Ultra.cc is pretty great too.

jedberg 3 years ago

It's impossible to answer this question without more information. What is the use profile of your system? How many clients, how often, what's the burst rate, what kind of reliability do you need? These all change the answer.

  • aledalgrande 3 years ago

    And what kind of latency is needed, what geo areas are involved? Budget? Engineers available?

  • omniglottal 3 years ago

    "Impossible", yet many others have succeeded commendably... explore what they can do but you cannot. Or else offer examples wherein your constraints exist and drive another solution. "No solution without more info" is a cop-out.

    • jedberg 3 years ago

      I'm sorry, let me clarify since you seem to be very pedantic. It's impossible to answer well without a bunch more information. Yes, there are other answers in this thread, but I would argue they aren't particularly helpful to either OP or any other reader.

      It's kind of like someone going to a group of doctors and saying "I'm in pain", and then the doctors start throwing out reasons the person may be in pain and solutions to that pain.

      Sure, there may be some interesting ideas there, but it doesn't really do OP any good without describing where the pain is, when it started, if they have any other known conditions, etc. etc.

      I know you think you were helping with this comment, but you really weren't.

      • halJordan 3 years ago

        The comment that is unhelpful is the one that has to be voiced but refuses to participate. You're just creating a clamor where a conversation used to be by adding your noise. If you aren't going to participate in the answer beyond saying "I'm not going to answer", then just don't.

qeternity 3 years ago

Hetzner auctions have a 4x 6TB server for €39.70/mo.

Throw that in RAID10 and you'll have 12TB usable space with > 300TB bandwidth.
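
Assembling that is about two commands (a sketch; device names and filesystem are assumptions):

  # sketch: 4x 6TB -> ~12TB usable RAID10 (device names assumed)
  mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]
  mkfs.ext4 /dev/md0 && mount /dev/md0 /srv/files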

  • dinvlad 3 years ago

    Storage Boxes are even cheaper, 20 EUR (or less if you’re outside Europe) for 10TB + unlimited bandwidth.

    • wongarsu 3 years ago

      If we are talking about serving files publicly, I'd go with the €40 server for flexibility (the Storage Boxes are kind of limited), but still get a €20 Storage Box to have a backup of the data. Then add more servers as bandwidth and redundancy require.

      But if splitting your traffic across multiple servers is possible, you can also get the €20 Storage Box and put a couple of Hetzner Cloud servers with a caching reverse proxy in front (that's like 10 lines of Nginx config; see the sketch below). The cheapest Hetzner Cloud option is the CAX11 with 4GB RAM, 40GB SSD and 20TB traffic for €3.79. Six of those plus the Storage Box give you the traffic you need, lots of bandwidth for usage peaks, an SSD cache for frequently requested files, and easily upgradable storage in the Storage Box, all for €42. It also scales well, at €3.79 for every additional 20TB of traffic, or about €1/TB if you forget and pay the fees for excess traffic instead.
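
      Roughly those 10 lines (a sketch; the upstream hostname, cache zone and sizes are placeholders):

        proxy_cache_path /var/cache/nginx keys_zone=files:50m
                         max_size=30g inactive=7d use_temp_path=off;

        server {
            listen 80;
            location / {
                # Storage Box WebDAV endpoint (placeholder hostname)
                proxy_pass https://uXXXXX.your-storagebox.de;
                proxy_cache files;
                proxy_cache_valid 200 7d;
                add_header X-Cache-Status $upstream_cache_status;
            }
        }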

      You will be babysitting this more than the $150/month Cloudflare solution, but even if you factor in the cost of your time, you should come out ahead.

      • dinvlad 3 years ago

        Exactly, and also you get to actually understand how it all works together, unlike with a bunch of proprietary APIs that only tie you to their particular platform.

        (for those not on the same page, I’m talking from a position of substantial experience with all 3 major clouds)

        Plus, these days the maintenance burden of the OS layer is really heavily overstated. With certain self-updating open-source container OSes one doesn’t even really have to think about patches and all that ancient crap.

        The real appeal of the big players, in my mind, is only one use case: scale. If you need 10k servers for heavy "big data" processing, as in genomics or 'AI' (whatever that means), only then do they start to be indispensable. Otherwise, the considerable burden of training all personnel on proprietary APIs is just not worth it - it literally costs less to buy and configure your own system (or a traditional VPS or dedicated server). Cloud architects ain't cheap!

      • aledalgrande 3 years ago

        > even if you factor in the cost of your time you should come out ahead

        There is always the hidden cost of not spending time on activities core to your business (if this is indeed for a business), which could make multiples of the money CF costs you.

        • midasuni 3 years ago

          Yup. Managing the CF account is a hidden cost you need to account for.

        • j45 3 years ago

          There is more and more out there, like Proxmox, to help make self-hosting as much of an appliance as anything else.

          • dinvlad 3 years ago

            That, and also NixOS - I'm discovering it for myself now, and it's been a revelation! Configuring absolutely everything declaratively from scratch, even the disk partitions - a dream for reproducibility. It even has configurable "micro-VMs", which would not be as easy to do via Proxmox (not counting LXC), since they would have to be built manually. Though Proxmox does have some nice benefits over it as well, especially considering its ecosystem with PBS, the mail server, etc.

            • j45 3 years ago

              Thanks for the intro to NixOS. I was trying to remember one I had seen & forgotten and I think this may have been it.

              I have been playing more and more with UTM in the Mac world, and it's encouraging how mature it already seems; hopefully it can be picked up for NixOS, Sandstorm, etc.

              I like Proxmox more personally, but I've recently changed my stance: Nix and Sandstorm could just run in a Proxmox VM, with Proxmox providing more of an IaaS role. The newer versions of Proxmox are even easier, and they were pretty OK for the past 5-7 years.

    • midasuni 3 years ago

      1 Gbit is ~300TB a month; 10 Gbit is ~3,000TB a month.

      There's always a limit. It might be measured in TB, PB, or EB, and it may or may not be what you'd call practical, but it's there.
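
      (Back of the envelope: 1 Gbit/s ≈ 0.125 GB/s, and a 30-day month is ~2,592,000 seconds, so a saturated line tops out around 0.125 × 2,592,000 ≈ 324 TB/month; ~300 TB is that ceiling with a little headroom.)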

      • dinvlad 3 years ago

        I think it's more about peace of mind: "unlimited" really means I won't wake up tomorrow to a $10k bill, as has happened many times (not to me) on AWS and the like. That is the practice the big cloud providers like to impose, for no apparent reason but to keep you in their roach motel, paying up. Disgusting!

      • abigail95 3 years ago

        I'm wondering if anyone even gets close to 1 Gbps from a Hetzner storage box.

      • j45 3 years ago

        Really good way of putting it.

  • princevegeta89 3 years ago

    Isn't that enterprise HDD going to be slower for a cloud instance though?

    • Hamuko 3 years ago

      SSDs are definitely not "cost-effective" for 10 TB if that's what you're suggesting.

      • dehrmann 3 years ago

        It looks like HDD prices are at ~40% of SSD prices (new drives, not cloud-hosted). SSDs are starting to make sense for more things now.

      • princevegeta89 3 years ago

        I mean that for general self-hosted services and apps, HDDs have the performance and latency problems that could lead to a negative experience.

      • yread 3 years ago

        Unless you need the performance. If you compare a single SSD to a RAID 6 with 5+ drives to lower the seek times, the SSD will always come out on top.

feifan 3 years ago

Consider storing the data on Backblaze B2 ($0.005/GB/month) and serving content via Cloudflare (egress from B2 to Cloudflare is free through their Bandwidth Alliance).
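
(At those rates, 10 TB of storage is roughly 10,000 GB × $0.005 = $50/month, and the egress side costs nothing as long as it flows through Cloudflare.)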

(No affiliation with either; just a happy customer for a tiny personal project)

  • rewgs 3 years ago

    Man, thanks so much for this. I’m using Wasabi with a Yarkon front end right now and it’s great, but Backblaze/Cloudflare is looking like a serious contender.

    • hardcopy 3 years ago

      FYI, video files (and potentially anything that isn't text/HTML) are prohibited through Cloudflare per their TOS (some services like R2 excluded).

      • graton 3 years ago

        They updated their TOS a few days ago:

        https://blog.cloudflare.com/updated-tos/

        The prohibition on non-HTML content seems to only apply to CDN usage now.

        • tgtweak 3 years ago

          That is exactly the use case here: hosting the files on B2 (not CDN-capable) and caching+serving from Cloudflare. Unless the files in question are webpages or static webpage content (doubtful), it would definitely be exactly the target of these new TOS updates.

        • sdfhbdf 3 years ago

          AFAIK serving content from Backblaze via Cloudflare is exactly CDN usage.

indigodaddy 3 years ago

BuyVM has been around a long time and have a good reputation. I’ve used them on and off for quite a while.

They have very reasonably priced KVM instances with unmetered 1G bandwidth (10G for long-standing customers), to which you can attach "storage slabs" of up to 10TB ($5 per TB/mo). I doubt you will find better value than this for block storage.

https://buyvm.net/block-storage-slabs/

noja 3 years ago

To host it for what? A backup? Downloading to a single client? Millions of globally distributed clients uploading and downloading traffic? Bittorrent?

wwwtyro 3 years ago

If it fits your model, WebTorrent[0] can offload a lot of bandwidth to peers.

[0] https://github.com/webtorrent/webtorrent

  • giantrobot 3 years ago

    At some point you still need a seed for that 10TB of data with some level of reliability. WebTorrent only solves the monthly bandwidth iff you've got some high-capacity seeds (your servers or long-term peers).

bosch_mind 3 years ago

Cloudflare R2 has free egress, cheap storage

  • ttul 3 years ago

    And they just added TCP client sockets in Workers. We are just one step away from being able to serve literally anything on their amazing platform (listener sockets).

    • itake 3 years ago

      Does this mean you can self-host ngrok?

      • ttul 3 years ago

        Only client sockets are available. So what you can do is build a worker that receives HTTP requests and then uses TCP sockets to fetch data from wherever, returning it over HTTP somehow.

  • slashdev 3 years ago

    Cloudflare has free bandwidth up to a point and then they will charge you. That’s not really that surprising though.

    • FBISurveillance 3 years ago

      PSA: the informal point at which they'll reach out was about 300TB/month as of 2 years ago.

      • Atlas22 3 years ago

        It may depend on the makeup of the data or something. They "requested" one of my prior projects go on the enterprise plan after about 50TB; granted, the overwhelming majority of the transfer was distributing binary executables, so I was in pretty blatant violation of their policy. This was 2015-ish, so the limit could also have gone up over time as bandwidth gets cheaper.

      • slashdev 3 years ago

        Thanks, I’ve heard about this, but never actually with numbers. It’s good to know.

    • ozr 3 years ago

      I think that's the case with their free CDN/DNS/proxy offering, not R2.

    • camhart 3 years ago

      R2 egress is free. You pay per request and per GB-month stored.

  • gok 3 years ago

    Looks like it would be $150/month for OP's needs?

bombcar 3 years ago

That's not even saturating a Gb/s line. Many places offer dedicated servers with that kind of bandwidth.

andai 3 years ago

If it's for internal use, I have had good results with Resilio Sync (formerly BitTorrent Sync).

It's like Dropbox, except peer-to-peer. So it's free, limited only by your client-side storage.

The catch is it's only peer to peer (unless they added a managed option), so at least one other peer must be online for sync to take place.

  • microtonal 3 years ago

    They don't really maintain the regular Sync client anymore, only the expensive enterprise Connect option. My wife and I used Resilio Sync for years but had to migrate away: it had bugs and issues with newer OS versions that they didn't care to fix, let alone develop new features.

callamdelaney 3 years ago

Wasabi: unlimited bandwidth at $5.99/TB/month. Though it is object storage.

See: https://wasabi.com/cloud-storage-pricing/#cost-estimates

They could really do with making the bandwidth option on this calculator better.

johnklos 3 years ago

If price is a consideration, you might consider two 10 TB hard drives on machines on two home gigabit Internet connections. It's highly unlikely that both would go down at the same time, unless they were in the same area or on the same ISP.

  • sireat 3 years ago

    How do you set up load balancing for those two connections?

    That is yourdomain.com -> IP_ISP1, IP_ISP2

    Going the other way from yourserver -> outside would indicate some sort of bonding setup.

    It is not trivial for a home lab.

    I use 3 ISPs at home and just keep each network separate (different hardware on each) even though in theory the redundancy would be nice.

    • johnklos 3 years ago

      Just use two A records for the one DNS name, and let the clients choose.
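
      As a sketch (the name and addresses are placeholders):

        ; one name, two A records; clients pick one and retry the other
        dl.example.com.  300  IN  A  203.0.113.10   ; house/ISP 1
        dl.example.com.  300  IN  A  198.51.100.20  ; house/ISP 2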

      The other way is to have two names, like dl1 and dl2, and have your download web page offer alternating links, depending on how the downloads are handled.

      You can very rarely do multi-ISP bonding; often not even with multiple lines from the same ISP, unfortunately.

walthamstow 3 years ago

I would also like to ask everyone about suggestions for deep storage of personal data, media, etc.: 10TB with no need for access except in case of emergency data loss. I'm currently using S3 Intelligent-Tiering.

  • Atlas22 3 years ago

    I like to use rsync.net for backups. You can use something like borg, rsync, or just an sftp/sshfs mount. It's not as cheap as something like S3 Deep Archive (in terms of storage), but it is pretty convenient. The owner is an absolute machine and frequently visits HN too.

  • ericpauley 3 years ago

    S3 is tough to beat on storage price. Another plus is that the business model is transparent, i.e., you don't need to worry about the pricing being a teaser rate or something.

    Of course, the downside is that if you need to download that 10TB, you'll be out $900! If you're only worried about recovering specific files, this isn't as big an issue.

  • NilsIRL 3 years ago

    OVH Cloud Archive seems to have very attractive prices if you're not accessing the data often: https://www.ovhcloud.com/en-gb/public-cloud/prices/#473

  • msh 3 years ago

    A Hetzner Storage Box or Backblaze B2 are the cheapest options.

  • harrymit907 3 years ago

    Wasabi is the best option for you. 10TB would be around $60/month, and they offer free egress up to the amount of your storage, so you can download up to 10TB per month.

  • seized 3 years ago

    Glacier Deep Archive is exactly what you want for this; it would be something like $11/month ongoing, then about $90/TB in the event of retrieval and download. Works well except for tiny (<150KB) files.

    Note that there are both Glacier and Glacier Deep Archive. The latter is cheaper but has longer minimum storage periods. You can apply it via a lifecycle rule, as in the sketch below.
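
    The rule itself is a small piece of JSON, applied with aws s3api put-bucket-lifecycle-configuration (a sketch; the rule ID is arbitrary, and the empty prefix means the whole bucket):

      {
        "Rules": [{
          "ID": "everything-to-deep-archive",
          "Status": "Enabled",
          "Filter": {"Prefix": ""},
          "Transitions": [{"Days": 0, "StorageClass": "DEEP_ARCHIVE"}]
        }]
      }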

  • jedberg 3 years ago

    AWS Glacier. That's where I keep (one copy of) my wedding video, all the important family photos, etc.

  • Hamuko 3 years ago

    I do my archiving to S3 Glacier Deep Archive but my data volume is still so low that Amazon doesn't bother charging my card.

    • hossbeast 3 years ago

      Does it accumulate month to month and eventually they charge you once a threshold is reached?

      • Hamuko 3 years ago

        I think they'll charge me only once my current monthly statement is enough to be worth charging. Pretty sure I've never been charged so far, with my monthly statement being like €0.02.

        • kuratkull 3 years ago

          I have the same situation with Backblaze B2: a bill of a few cents a month, which they roll into one roughly yearly charge.

    • xrisk 3 years ago

      How do you achieve deduplication with S3?

    • m-p-3 3 years ago

      how much data are we talking about?

      • Hamuko 3 years ago

        Some tens of gigabytes at this point? It's definitely not a lot. Mostly just some stuff that doesn't make sense to keep locally but I still want to have a copy in case a disaster strikes.

  • NotYourLawyer 3 years ago

    Tarsnap.

    • sacnoradhq 3 years ago

      Too expensive for all but critical use-cases. MEGA and Backblaze are way, way cheaper.

  • oefrha 3 years ago

    GSuite (now Google Workspace) business plans are still essentially unlimited, for 10TB anyway.

rapjr9 3 years ago

I helped run a wireless research data archive for a while. We made smaller data sets available via internet download but for the larger data sets we asked people to send us a hard drive to get a copy. Sneakernet can be faster and cheaper than using the internet. Even if you wanted to distribute 10TB of _new_ data every month, mailing hard drives would probably be faster and cheaper, unless all your customers are on Internet2 or unlimited fiber.

bakugo 3 years ago

The answer to this question depends entirely on the details of the use case. For example, if we're talking about an HTTP server where a small number of files are significantly more popular and more frequently accessed than the rest, you can get a bunch of cheap VPSes with low storage/specs but a lot of cheap bandwidth to use as cache servers, significantly reducing the bandwidth usage on your backend.

ez_mmk 3 years ago

I've had 300TB of traffic, up and down, on a Hetzner server with plenty of storage, no problem.

chaxor 3 years ago

I always assumed a Raspberry Pi with a couple of HDDs in RAID 1, plus IPFS or torrents, would be the best way to do this.

Giving another one of these RAID 1 RPis to a friend could make it reasonably available.

I am very interested to know if there are good tools around this, though, such as a good way to serve a filesystem (NFS-like, for example) via torrent/IPFS, and whether directories could be password-protected in different ways, like with an ACL. That would be the revolutionary tech to replace huggingface/dockerhub, or Dropbox, etc.

Anyone know of or are working on such tech?

  • Voklen 3 years ago

    If you just want to sync a directory between multiple devices with encryption options, I'd recommend Syncthing. It's dead easy to set up; I've currently got it on an RPi backing up all the photos from my phone while syncing my Obsidian vault between my phone and desktop.

    • chaxor 3 years ago

      Yeah, that's a good suggestion for that use case. I was thinking a bit more along the lines of 2 other use cases: 1) You have a file locally, whether 1TB or 1MB, and want to send a friend/family member (or yourself, or even some random person on the internet) a link to download it, optionally password-protected. 2) You want to set up a package/script that automatically downloads a file when started (NN weights, for example), with the download retrieved, IPFS/torrent style, from everyone who has that file (i.e. is running the package).

      The system in (2) works OK for a Dockerfile that points to an IPFS file, if you put the link there; however, a considerable number of things don't fit (2), such as not automatically becoming a seeder of a file when downloading it or running the package. There is also a great amount of opportunity in making the process of uploading files to IPFS much simpler. One example for the code idea would be something like git hooks, such that any time a major version of a git commit was made, a set of files would be added to IPFS for this type of distribution (see the sketch below). Ultimately, a plug-n-play package added in a specified way, e.g. via setup.py, would be the best way to get something like that going. Then perhaps a simple program like Syncthing or miniserve operating on top of that functionality would allow for something more like (1).
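
      The git-hook half is nearly there already if you run a local IPFS daemon; a sketch (the paths are made up):

        #!/bin/sh
        # .git/hooks/post-commit (sketch): publish release artifacts to IPFS
        CID=$(ipfs add -r -Q ./artifacts)   # -Q prints only the root CID
        echo "published ./artifacts as /ipfs/$CID"

      The missing piece is exactly what you describe: downstream installs becoming seeders by default.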

    • lostmsu 3 years ago

      I use Syncthing, but how prone is it to data loss of any kind? Bitrot?

      • chaxor 3 years ago

        The biggest problem I'm aware of is sync conflicts, which just make things a little difficult. If they're text files it's not so bad, since vimdiff can easily merge them. But if they're encrypted or more complex formats... :/

trhr 3 years ago

Hell, make me a fair offer and I'll throw it up on ye olde garage cluster. That thing has battery backup, a dedicated 5 Gbps pipe, and about 40 TB free space on Ceph. I'll even toss in free incident response if your URL fails to resolve. But it'll probably be your fault, cause I haven't needed a maintenance window on that thing in like three years.

jayonsoftware 3 years ago

Spend some time on https://www.webhostingtalk.com/ and you will find a lot of info. For example, https://www.fdcservers.net/ can give you 10TB storage and 100GB bw for around $300... but keep in mind that the lower the price you pay, the lower the quality, just like with any other product.

tgtweak 3 years ago

OVH is probably your best bet and should be the cheapest for both hosting and serving the files. You'd be hard-pressed to beat the value there without buying your own servers and colocating in eastern Europe.

Most of their storage servers have 1Gbps unmetered public bandwidth options, and that should be sufficient to serve ~4TB per day, reliably.

bicijay 3 years ago

Backblaze B2 + CDN on top of it

  • gravitronic 3 years ago

    Cloudflare on top would give free bandwidth (Bandwidth Alliance).

    • Atlas22 3 years ago

      Unless it's 100TB/mo of pure HTML/CSS/JS (lol), Cloudflare will demand you be on an enterprise plan long before 100TB/mo. The fine print makes it near-useless for any significant volume.

scottmas 3 years ago

Surprised no one has said Cloudflare Pages. It might not work depending on your requirements, since there's a max of 20,000 files of no more than 25 MB each per project, but if you can fit under that, it's basically free. If your requirements let you break the data up by domain, you can split it across multiple projects too. Latency is amazing as well, since all the data is on their CDN.

jacob019 3 years ago

Smaller VPS providers are a good value for this. I'm currently using ServaRICA for a 2TB box, $7/mo. I use it for some hosting, but mostly for incremental ZFS backups. Storage speed isn't amazing, but it suits my use case.

I'm using Cloudflare R2 for a couple hundred GB where I needed something faster.

bullen 3 years ago

I think 2x 1 Gb/s symmetric home fiber + a SuperMicro 12x SATA Atom Mini-ITX with Samsung drives can solve this fairly cheaply and durably, depending on write intensity.

That said, above 80 TB it starts looking hard to sustain forever, unless you can provide backup power and endure the noise of spinning drives.

ignoramous 3 years ago

One way I can think of is to serve the files over BitTorrent, with an HTTP web seed stored across blob stores at Cloudflare R2, Backblaze B2, and Wasabi.

I briefly looked at services selling storage on FileCoin / IPFS and Chia, but couldn't find anything that inspired confidence.

justinclift 3 years ago

Any idea on how many files/objects, and how often they change?

Also, any idea on the number of users (both average, and peak) you'd expect to be downloading at once?

Does latency of their downloads matter? E.g. do downloads need to start quickly like a CDN, or is "as long as they work" good enough?

roetlich 3 years ago

As a former employee there I'm very biased, but I think bunny.net has pretty good pricing :)

cheeseprocedure 3 years ago

Datapacket has a number of dedicated server configurations in various locations and offers unmetered connections:

https://www.datapacket.com/pricing

j45 3 years ago

The OP's request would benefit from more details; the solution depends on what format the data is in and how it's to be shared.

Assuming the simplest need is making files available:

1) Sync.com provides unlimited hosting and file sharing.

Sync is a decent Dropbox replacement with a few more bells and whistles.

2) Backblaze business lets you deliver files for free via their CDN: $5/TB per month for storage plus free egress via the CDN.

https://www.backblaze.com/b2/solutions/developers.html

Backblaze seems to be, as it claims, 70-80% cheaper than S3.

Traditional "best practice" cloud paths are optimized to generate profit for the cloud provider.

Luckily, you're rarely alone or the first to have a need.

ericlewis 3 years ago

Cloudflare R2, egress is free. Storing 10TB would be about $150 a month.
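
(That lines up with R2's posted storage rate of $0.015/GB-month: ~10,000 GB × $0.015 ≈ $150/month. Requests are billed separately; egress is $0.)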

mindcrash 3 years ago

Several huge infrastructure providers offer decent VPS servers and bare metal with free bandwidth for pretty reasonable prices nowadays.

You might want to check out OVH or - as mentioned before - Hetzner.

api 3 years ago

Bare metal hosting with bandwidth priced by pipe size rather than gigabyte. Check out DataPacket, FDCServers, Reliablesite, Hivelocity, etc.

Cloud bandwidth is enormously overpriced.

FBISurveillance 3 years ago

Hetzner would work too.

sacnoradhq 3 years ago

Personally, at home, I have ~600 TiB and 2 Gbps without a data cap.

I can't justify colo unless I can get 10U for $300/month with 2kW of PDU, 1500 kWh, and 1 GbE uncapped.

winrid 3 years ago

You could do this for about $1k/mo with Linode and Wasabi.

For FastComments we store assets in Wasabi and have services in Linode that act as an in-memory + on-disk LRU cache.

We have terabytes of data but only pay $6/mo for Wasabi, because the cache hit ratio is high and Wasabi doesn't charge for egress until your egress is more than your storage or something like that.

The rest of the cost is egress on Linode.

The nice thing about this is we get lots of storage and downloads are fairly fast - most assets are served from memory in userspace.

Following this thread to look for even cheaper options without using Cloudflare lol

  • qeternity 3 years ago

    > You could do this for about $1k/mo with Linode and Wasabi.

    This is still crazy expensive. Cloud providers have really warped people’s expectations.

    • winrid 3 years ago

      Well, for us it's actually really cheap because we really just want the compute. The bandwidth is just a bonus.

      Actually, since the Akamai acquisition it would be even cheaper.

      $800/mo to serve 100TB with fairly high bandwidth and low latency from cold storage is a good deal IMO. I know companies paying millions a year to serve less than a third of that through AWS when you include compute, DB, and storage.

      • qeternity 3 years ago

        Fine, but now you're changing the comparison. Spending millions on compute with low bandwidth requirements doesn't automatically make it stupid. It probably still is, but that's a different conversation.

        • winrid 3 years ago

          lol no, it sure is. But OP didn't give a price range or durability requirements...

    • winrid 3 years ago

      You could do it through Interserver for $495/mo (5× 20TB SATA disks, 150TB free bandwidth), with a 10Gbps link and 128GB RAM for page cache.

      Backups probably wouldn't be much more.

  • winrid 3 years ago

    Actually, with Interserver and Wasabi you could probably get it under $100/mo.

    You could just run Varnish with the S3 backend. Popular files will be cached locally on the server, and you'll pay a lot less for egress from Wasabi.
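
    A minimal sketch of the Varnish side (the backend endpoint and TTL are assumptions; note that stock Varnish speaks plain HTTP to backends, so TLS to Wasabi needs a proxy in between):

      vcl 4.1;

      backend wasabi {
          .host = "s3.wasabisys.com";   # placeholder endpoint
          .port = "80";
      }

      sub vcl_backend_response {
          # keep popular objects cached locally for a week
          set beresp.ttl = 7d;
      }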

  • winrid 3 years ago

    EDIT: it would be ~$500 through Linode + Wasabi. I was thinking of included bandwidth, but if you just pay overages it's cheaper.

k8sToGo 3 years ago

People here suggest Hetzner. Just be aware that their routing is maybe not as good as what you get with more expensive bandwidth.

  • GC_tris 3 years ago

    Can you be more specific?

    Hetzner has excellent connectivity: https://www.hetzner.com/unternehmen/rechenzentrum/ They are always working to increase it. I'd even go so far as to claim that in many parts of the world they outperform certain hyperscalers.

    • k8sToGo 3 years ago

      I used to have a dedicated server there, and what happened to me is that my uploads were fast but my downloads were slow. Looking at an MTR trace, it was clear that the route back to me was different (perhaps cheaper?). With Google Drive, for example, I could always max out my gigabit connection. Same with rsync.net.

      Also, I know that some cheaper home ISPs cheap out on peering.

      Now, this was some time ago, so things might have changed, just as you suggested.

pierat 3 years ago

Sounds like you could find someone with a 1Gbps symmetric fiber connection and pay them for it and colo. I have 1Gbps and push that much bandwidth every month. You know, for yar har har.

And that's only 309Mbits/s (or 39MB/s).
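
(That's 100 TB/month spread over ~2,592,000 seconds: 100,000,000 MB ÷ 2,592,000 s ≈ 38.6 MB/s, i.e. ~309 Mbit/s sustained.)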

And with a used refurbished server you can easily get loads of RAM, cores out the wazoo, and dozens of TBs for under $1,000. You'll need a rack, router, switch, and battery backup. Shouldn't cost much more than $2,000 for all of this.

jstx1 3 years ago

Is there a reason people aren’t suggesting some cloud object storage service like S3, GCS or Azure storage?

  • Hamuko 3 years ago

    I once had a Hetzner dedicated server that held about 1 TB of content and did some terabytes of traffic per month (the record being 1 TB/24 hours). Hetzner charged me €25/month for that server, and S3 would've been like $90/day at peak traffic.

  • abatilo 3 years ago

    Those bandwidth costs are going to be so expensive

  • hdjjhhvvhga 3 years ago

    Because they are an order of magnitude more expensive.

KaiserPro 3 years ago

What's your budget?

Who are you serving it to?

How often does the data change?

Is it read-only?

What are you optimizing for: speed, cost, or availability? (Pick two.)

dark-star 3 years ago

You can definitely do this at home on the cheap, as long as you have a decent internet connection, that is ;) 10TB+ hard disks are not expensive; you can put them in an old enclosure together with a small industrial or NUC PC in your basement.

  • sacnoradhq 3 years ago

    I currently have 45 WUH721414ALE6L4 drives in a Supermicro SC847E26 JBOD (SAS2 is way cheaper than SAS3) connected to an LSI 9206-16e controller (HCL reasons) via hybrid Mini-SAS2-to-Mini-SAS3 cables. The SAS expanders in the JBOD are also LSI and qualified for the card. The hard drives are in turn qualified for the SAS expanders.

    I tried using Pine ROCKPro64 boards to possibly run Ceph across 2-5 RAID1 NAS enclosures. The problem is I can't get any of their dusty Linux forks to recognize the storage controller, so they're $200 paperweights.

    I wrote a SATA HDD "top" utility that brings in data from SMART, mdadm, lvm, xfs, and the Linux SCSI layer. I set monitoring to look for elevated temperature, seek errors, scan errors, reallocation counts, offline reallocation, and probational count.

    • InvaderFizz 3 years ago

      What's your redundancy setup in the 45-drive configuration? I would guess 20-22 mirrors with 1-5 hot spares, but it's not clear.

jszymborski 3 years ago

Wasabi.com is usually a good bet when your primary cost is bandwidth.

  • slashdev 3 years ago

    > If your monthly egress data transfer is less than or equal to your active storage volume, then your storage use case is a good fit for Wasabi’s free egress policy

    > If your monthly egress data transfer is greater than your active storage volume, then your storage use case is not a good fit for Wasabi’s free egress policy.

    https://wasabi.com/paygo-pricing-faq/

risyachka 3 years ago

Find a bare-metal server with a 1Gbit connection and you are all set.

snihalani 3 years ago

Cloudflare R2 or Oracle Cloud Infra or Hetzner

bluedino 3 years ago

What are you using now and what does it cost?

influx 3 years ago

I would do it with S3 + CloudFront if your budget supported it, mostly because I don't want to maintain a server somewhere.

  • ignoramous 3 years ago

    This is one way, but it wouldn't be cost-effective bandwidth-wise (which is OP's main concern).

sacnoradhq 3 years ago

Backblaze, MEGA, or S3 RRS.

pollux1997 3 years ago

Copy the data to disks and ship them to the place they need to be used.

rozenmd 3 years ago

Cloudflare R2?

wahnfrieden 3 years ago

bittorrent and some home servers

kokizzu5 3 years ago

contabo

subhro 3 years ago

tarsnap

delduca 3 years ago

I would go with BunnyCDN in front of some storage.

aborsy 3 years ago

S3 is probably the highest quality. It's enterprise-grade: fast and secure, with a lot of tiers and controls.

If you only retrieve small amounts of data, it's also not expensive. The only problem is if you retrieve large amounts; that would be a major problem.

  • capableweb 3 years ago

    10TB storage + 100TB bandwidth on S3 will easily be $1,000+ per month, while there are solutions out there that are fast and secure, with unrestricted bandwidth, for less than $100 per month. An order of magnitude cheaper, with the same grade of "enterprisey".

    • aborsy 3 years ago

      Well, I said: if you store small data. For large data, sure, it's prohibitively expensive!

      I don't think many other solutions are equally fast and secure.

      AWS's operation is pretty transparent: documented, audited, and used by governments. You can lock it down heavily with IAM and a CMK KMS key, and audit the repository. The physical security is also pretty tight, and there is location redundancy.

      Even Hetzner doesn't have proper redundancy in place. Other major providers in France burned down (apparently with data loss) or had security problems with hard drives stolen in transport.

      I don’t work for AWS, don’t have much data in there, just saying. GCP and Azure are probably also good.

      • charcircuit 3 years ago

        >Well, I said: if you store small data.

        Well, the OP said he would be using >100 TB a month.

        >GCP and Azure are probably also good.

        They similarly charge 100x for bandwidth. No, they are not a good option either.

      • KomoD 3 years ago

        OP's post is literally about 10TB storage and 100+TB/mo.

      • capableweb 3 years ago

        > Ask HN: Suggestions to host 10TB data with a monthly +100TB bandwidth

  • sacnoradhq 3 years ago

    Too expensive. RRS would be the only consideration, and it's not a for-sure thing compared to other options.
