Latency-Tolerant Software Distributed Shared Memory
We present Grappa, a modern take on software distributed shared memory (DSM) for in-memory data-intensive applications. Grappa enables users to program a cluster as if it were a single, large, non-uniform memory access (NUMA) machine. Performance scales up even for applications that have poor locality and input-dependent load distribution. Grappa addresses deficiencies of previous DSM systems by exploiting application parallelism, trading off latency for throughput. We evaluate Grappa with an in-memory MapReduce framework (10× faster than Spark), a vertex-centric framework inspired by GraphLab (1.33× faster than native GraphLab), and a relational query execution engine (12.5× faster than Shark). All these frameworks required only 60-690 lines of Grappa code.
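To make the "single, large NUMA machine" abstraction concrete, the following is a toy sketch (not Grappa's actual API) of how a DSM can present one flat global address space over a cluster by mapping each global address to a home node and a local offset. The block size, node count, and function name here are illustrative assumptions.

```python
# Toy partitioned-global-address-space sketch -- NOT Grappa's real API.
# A flat "global address" is split into (home node, local offset) using a
# block-cyclic layout, so every node owns an interleaved slice of memory.

BLOCK = 64   # bytes per block (hypothetical choice)
NODES = 8    # cluster size (hypothetical choice)

def locate(global_addr):
    """Map a flat global address to (home node, local byte offset)."""
    block = global_addr // BLOCK
    node = block % NODES            # blocks are dealt out round-robin
    local_block = block // NODES    # position within that node's slice
    offset = local_block * BLOCK + global_addr % BLOCK
    return node, offset

# Example: the first block lives on node 0, the next on node 1, and the
# layout wraps around after NODES blocks.
print(locate(0))    # first byte of block 0
print(locate(64))   # first byte of block 1, on the next node
print(locate(512))  # block 8 wraps back to node 0
```

Under such a layout, a remote read becomes a message to the home node; Grappa's contribution, per the abstract, is hiding that message latency by running many other application tasks while each request is in flight.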