Efficient Deterministic Record and Replay of Multithreaded Programs

多線程程序的高效確定性記錄和重放

Student thesis: Doctoral Thesis

View graph of relations

Related Research Unit(s)

Detail(s)

Awarding Institution
Supervisors/Advisors
Award date26 May 2021

Abstract

Deterministic replay enables the record of execution traces of programs and the subsequent reproduction of the traces usually for testing and debugging purposes. Existing software-based deterministic replay techniques in the literature record varying information levels during the record phase such as shared memory access interleavings, values of read and write events, system call events and program control flow statements. All replay techniques should address the main technical issues which determine their effectiveness or their practicality including log size, replay probability, record overhead, replay overhead, implementation cost and probe effect. In this thesis, we address three technical challenges by presenting four novel techniques. Some existing techniques record a sub-category of shared memory access interleavings to address record slowdown and log size. This often results in less determinism during the replay phase due to less information being available for replay. Other techniques also record the full set of shared memory access interleavings which negatively affects record slowdown and log size, representing a trade-off between log size, record slowdown and deterministic replay.

The first contribution seeks to alleviate the log size challenge by formulating a novel technique TPLAY which is based on the division of an execution trace into segments. TPLAY segments execution traces of threads into sequential code blocks called transactions. Our insight was that there are usually few to no atomicity violations reported during program executions. Based on this insight, TPLAY records thread access interleavings on shared memory locations at the transactional level. TPLAY also generates an artificial pair of interleavings when an atomicity violation is reported on a transaction. TPLAY achieved the lowest log size during the experiment compared to the existing state of the art.

The second contribution is a novel deterministic replay technique called AggrePlay. AggrePlay segregates a trace using shared write events, and aggregates multiple read-write memory access interleavings into a vector clocks to reduce the record log size whilst maintaining the full set of memory access interleavings. In the replay phase, AggrePlay uses a thread-local scheduling constraint strategy to reduce run-time. In our experiment, AggrePlay outperformed existing state-of-the-art replay techniques in replay probability and delivered average performance in record log size and slowdown.

The third contribution extends the concept of AggrePlay by presenting AggrePlay+, a novel deterministic replay technique. AggrePlay+ reduce the record log size and memory overhead by recording read-write interleavings using variable read count vector clocks. In the experiments, AggrePlay+ incurred less overhead costs and improved replay probability compared to AggrePlay.

The fourth contribution addresses replay run-time overhead with ConPlay, a probabilistic replay technique. ConPlay transforms a set of interleavings into a data structure suitable for concurrent replay, then uses the interleavings as thread-local scheduling constraints. ConPlay decreases replay run-time overhead by enabling threads to run independent of each other and outperformed existing techniques during our experiments in terms of replay slowdown.

In summary, this body of work presents four major contributions. The first contribution, TPLAY, addresses log size overhead in the record phase by proposing a transaction-based recording strategy based on the segmentation of a trace using shared write events. The second contribution, AggrePlay, aggregates read events in read-write access orders to create vector-based read-write interleavings. AggrePlay also increases replay probability by using write-read, write-write and the vector-based read-write interleavings as scheduling constraints. The third contribution presents AggrePlay+, an extension of AggrePlay which improves overhead costs and increases replay probability over AggrePlay. The fourth contribution of this thesis is ConPlay, a novel probabilistic replay technique which reduces replay run-time by scheduling threads using thread-local constraints.

    Research areas

  • Atomicity, Concurrency, Deterministic Replay, Multi-threading, Transactions