Artificial Intelligence Through Convex Optimization and Crowdsourced Computation
通過優化理論和眾籌計算實現人工智能
Student thesis: Doctoral Thesis
Author(s)
Related Research Unit(s)
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors |
|
Award date | 17 Sept 2021 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(9ae57a90-8719-4c57-96d4-941ddce12168).html |
---|---|
Other link(s) | Links |
Abstract
Today computers and human minds are working together closer than ever. We deploy specialized algorithms to facilitate and assist humans in complex tasks, and we are also harnessing the human intelligence to improve computational models and solve problems that are not yet possible to be solved by computers alone. This close and reciprocal relationship allows humans and computers to empower and complement each other in achieving greatness. Inspired by this, in this thesis we present three closely related works that try to fill the gap between human intelligence and the raw computational power of computers.
Proving and disproving information inequalities is a fundamental task in information theory. When an inequality involves more than a few random variables, it is often tedious or impossible to prove it manually. In the first topic, we propose a linear programming framework to automate the process. By inspecting the Lagrange duality of the optimization problem, we can obtain hints to construct an analytic proof or a counterexample to disprove the inequality. We then develop a scalable iterative algorithm to efficiently solve the specific optimization problem. Our experiments demonstrate that our algorithm gives superior performance (up to ten times faster) than any existing state of the art. We also develop a web service that allows users to prove their inequalities without installing any software packages or maintaining the dependencies. Utilizing the crowd-sourced computation techniques, the service can partially share the computation requested by different users to minimize the running time.
In the second topic, we develop a statistical machine learning method to automatically grade multiple-choice questions (MCQs) without having access to the correct answers beforehand. The unsupervised algorithm can learn each student’s accuracy in answering the questions and tune the parameters over time, and using the learned parameters it can accurately recover the correct MCQ answers given the answers from a sufficient number of students. We also demonstrate that the same technique can be applied to grade other question types that are seemingly unrelated to MCQs.
Rank aggregation is the process of uncovering the underlying "true" ranking of items from a set of pairwise comparison results. In the last topic, we apply the technique to peer-grading and develop a rank aggregation method that can reliably estimate the true ranking even with noisy crowd-sourced comparison data. This method complements the MCQ grading method in the second topic and handles more challenging question types. Our theoretical analysis and numerical simulation confirm that the accuracy of our method increases with the problem size, which makes it the perfect solution for peer-grading in Massive Open Online Courses (MOOC).
Proving and disproving information inequalities is a fundamental task in information theory. When an inequality involves more than a few random variables, it is often tedious or impossible to prove it manually. In the first topic, we propose a linear programming framework to automate the process. By inspecting the Lagrange duality of the optimization problem, we can obtain hints to construct an analytic proof or a counterexample to disprove the inequality. We then develop a scalable iterative algorithm to efficiently solve the specific optimization problem. Our experiments demonstrate that our algorithm gives superior performance (up to ten times faster) than any existing state of the art. We also develop a web service that allows users to prove their inequalities without installing any software packages or maintaining the dependencies. Utilizing the crowd-sourced computation techniques, the service can partially share the computation requested by different users to minimize the running time.
In the second topic, we develop a statistical machine learning method to automatically grade multiple-choice questions (MCQs) without having access to the correct answers beforehand. The unsupervised algorithm can learn each student’s accuracy in answering the questions and tune the parameters over time, and using the learned parameters it can accurately recover the correct MCQ answers given the answers from a sufficient number of students. We also demonstrate that the same technique can be applied to grade other question types that are seemingly unrelated to MCQs.
Rank aggregation is the process of uncovering the underlying "true" ranking of items from a set of pairwise comparison results. In the last topic, we apply the technique to peer-grading and develop a rank aggregation method that can reliably estimate the true ranking even with noisy crowd-sourced comparison data. This method complements the MCQ grading method in the second topic and handles more challenging question types. Our theoretical analysis and numerical simulation confirm that the accuracy of our method increases with the problem size, which makes it the perfect solution for peer-grading in Massive Open Online Courses (MOOC).