Scheduling and sampling technologies for sensor data
Student thesis: Master's Thesis
Related Research Unit(s)
This thesis addresses three important problems related to sensor data processing with the purpose to improve the correctness of results in execution of sensor queries. The first problem focuses on how to schedule updates to maintain the temporal validity of sensor data with minimal workload. The second problem is how to select the right set of sensors for sensor data aggregation to obtain data values that are precise enough to meet the probabilistic requirements of sensor queries. The third problem is how to guarantee the accuracy of the query results without incurring significant update cost in the context of Location Dependent Continuous Query (LDCQ). In the first part, we study the real-time scheduling algorithms for update transactions associated with sensor data to minimize CPU utilization. Different from the traditional real-time scheduling approaches which adopt periodic update transaction model, we propose a novel algorithm, namely deferrable scheduling algorithm for fixed priority transactions (DS-FP), in which update transactions are scheduled following a sporadic task model. DS-FP exploits the semantics of temporal validity constraint of sensor data by judiciously deferring the sampling times of update transaction jobs as late as possible. The schedulability of the algorithm is examined and a sufficient condition is presented in this thesis. We also present a theoretical analysis of its CPU utilization and prove that DS-FP is optimal for fixed priority schedules in terms of minimizing processor workload. To reduce the on-line scheduling overhead of DS-FP, we further propose two hyperperiod-based approaches with lower scheduling overhead and also satisfying the validity constraint. The first algorithm, namely DEferrable Scheduling with Hyperperiod by Schedule Construction (DESH-SC), searches for a hyperperiod in the DS-FP schedule, and repeats the hyperperiod infinitely. The second algorithm, namely DEferrable Scheduling with Hyperperiod by Schedule Adjustment (DESHSA), adjusts the DS-FP schedule in an interval so that the adjusted schedule in the interval can be repeated infinitely. In this manner, we reduce the on-line scheduling complexity to constant while achieving the CPU utilization close to that of DS-FP. In the second part, we study the problem to select an appropriate set of sensors to guarantee the accuracy requirement of the probabilistic queries in the sensor network. With the aid of continuous probabilistic query (CPQ), we first derive different forms of probabilistic requirements for several common aggregate queries. After analyzing the impact of data on accuracy of query results, we propose algorithms to determine the accuracy requirements of individual data items being queried. The algorithms are based on the partitioning of the sensor network into regions with techniques developed to determine (1) the sampling period for each region adaptively; (2) the sample size and the set of sensors for multiple sensor aggregations within a region at a certain sampling time. Experimental results demonstrate that our statistics-based sensor selection scheme is a very effective approach and it can provide accurate and robust query results. In the third part, we study the problem to maintain the fidelity of the query results of Location Dependent Continuous Query (LDCQ) without incurring significant update cost. We propose a probabilistic update method which makes use of two types of updates, one to keep the uncertainty of the mobile object’s position, and the other using probability that the moving object’s location uncertainty will affect the query result as the threshold to decide whether a location update will be issued. In this part, we derive the required update rate to support the user specified query accuracy based on the motion information of the moving objects. Experimental results demonstrate the effectiveness of our approach compared to traditional deadreckoning based techniques as well as adaptive monitoring schemes.
- Real-time data processing, Detectors