Scaling Big Data Processing in the Cloud: Proven Practices with Spark & Ray
How to build shared, elastic and isolated data pipelines on the cloud using Python, Spark and Ray
Nov 19, 2025 · 8 min read

To explore large-scale data systems, from storage to processing, from the interface to implementation, I'll be writing a variety of articles. This is the first one -- Dataset. Logically, we treat all the data to be processed as a dataset. Representa...