Execution Repair for Spark Programs by Active Maintenance of Partition Dependency

Research output: Journal Publications and Reviews (RGC: 21, 22, 62); 21_Publication in refereed journal; peer-review




Original language: English
Pages (from-to): 101555-101573
Journal / Publication: IEEE Access
Online published: 16 Jul 2021
Publication status: Published - 2021



Spark programs are typically written to reuse some of their generated datasets, called partition instances, so that subsequent computations complete in a reasonable time. At runtime, however, the underlying Spark platform may independently delete such instances or accidentally make them inaccessible to the program executions. Either event invalidates the assumption, made when writing these programs, that computation is efficient because the depended-on instances are present. In this paper, we present FAR, a novel execution repair framework that effectively maintains the partition instance dependencies for Spark program executions. FAR monitors the partition instance lifecycle activities at all levels and determines, from the execution plan of the current Spark action in the current program execution, whether a partition instance will have a dependency relation with a later one underlying the computation of that action. The experimental results showed that, with the active execution repair mechanism of FAR, programs achieved 7.3x to 67.0x speedup when some dependency partition instances were inaccessible. Interestingly, the results also revealed that program executions actively repaired by FAR can run to successful completion in environments with 1.7x to 2.0x less available memory.
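
To make the cost of a lost partition instance concrete, the following is a minimal toy sketch in plain Python. It is not Spark code and it is not FAR's algorithm; the class ToyLineage, its methods, and the dataset names are all hypothetical, and each dataset is assumed to cost one unit of work to compute from its parent. It only illustrates why an action becomes more expensive once a cached ancestor is no longer available.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLineage:
    """Hypothetical linear lineage: each dataset is derived from one parent.

    Assumption: computing any dataset from its parent costs one unit of work.
    """
    parents: dict                              # dataset name -> parent name (None for the source)
    cached: set = field(default_factory=set)   # datasets currently held in memory

    def cost(self, name):
        """Units of work needed to materialize `name` given the current cache."""
        if name in self.cached:
            return 0
        parent = self.parents[name]
        parent_cost = 0 if parent is None else self.cost(parent)
        return 1 + parent_cost

    def first_cached_ancestor(self, name):
        """Nearest cached ancestor that computing `name` would read, if any."""
        node = self.parents[name]
        while node is not None:
            if node in self.cached:
                return node
            node = self.parents[node]
        return None

    def evict(self, name):
        """Model the platform dropping a cached dataset."""
        self.cached.discard(name)


lineage = ToyLineage(
    parents={"raw": None, "parsed": "raw", "joined": "parsed", "report": "joined"},
    cached={"joined"},
)
cost_with_cache = lineage.cost("report")      # only the last step is recomputed
lineage.evict("joined")
cost_after_eviction = lineage.cost("report")  # the whole chain is recomputed
print(cost_with_cache, cost_after_eviction)
```

In this toy model the cost of producing "report" rises from 1 unit to 4 units once "joined" is evicted, because every ancestor must be recomputed. A repair mechanism in the spirit described above would notice that "joined" is a dependency of the pending action and keep or restore it before the action runs.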

Research Area(s)

  • Big Data, Dataset Dependency, Debugging, Distributed databases, Execution Repair, Licenses, Maintenance engineering, Runtime, Sparks, Urban areas
