Ask HN: Does a dynamically scaling cloud workstation exist somewhere?

7 points by ekns 4 years ago · 2 comments


I frequently work on 'data science' projects whose data ranges from a few MB to hundreds of GB.

Ideally I'd have a cloud terminal I'd connect to which could scale its RAM to fit my processes' RAM usage (and possibly scale up CPUs transparently too).

I know that you can scale up various cloud instances, but managing the runtime state is a problem. I'd like to avoid ever having to kill whatever processes I have running.

Something like Google's Live Migration would also be a good match here, if it enabled migrating to a bigger machine type without rebooting, or without otherwise losing process state.

Ideally I'm looking for something that I could transparently scale up and down, and which I could always SSH into without having to manually start/shutdown the instances.

Bonus points if GPUs could be added/removed in the same manner.

tgdn 4 years ago

Have you looked into Spark? There are managed Spark options on AWS/GCP (Databricks, for example). Spark lets you do exactly what you're describing.

Define the minimum/maximum number of nodes and the machine capacity (RAM/CPU), and let Spark handle the scaling for you.
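
To make that concrete, here's a minimal PySpark sketch with dynamic allocation enabled (the executor bounds and app name are illustrative assumptions, not anything specific to Databricks):

    # Sketch: let Spark grow/shrink the executor pool between the bounds below.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("elastic-workstation")  # illustrative name
        .config("spark.dynamicAllocation.enabled", "true")
        # Track shuffle files so executors can be released without an
        # external shuffle service (Spark 3.0+).
        .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
        .config("spark.dynamicAllocation.minExecutors", "1")   # lower bound
        .config("spark.dynamicAllocation.maxExecutors", "20")  # upper bound
        .getOrCreate()
    )

    # Spark then adds executors when tasks queue up and releases idle ones.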

It gives you a Jupyter-like runtime for working on potentially massive datasets. Spark may be more than you need, though. Kubernetes could also be used with Airflow/DBT, for example for ETL/ELT pipelines.

  • ekns (OP) 4 years ago

    Ideally I'd like to extend at least the illusion of an ad hoc PC/workstation into the cloud. That seems like less effort, at least until I reach some ridiculous scale that requires more engineering and setup anyway.
