3D motion data recognition and its application on interactive dancing game
三維人體動作識別及其在交互舞蹈遊戲上的應用
Student thesis: Doctoral Thesis
Author(s)
Related Research Unit(s)
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors | |
Award date | 3 Oct 2012 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(acc32b99-c594-4677-bf39-b8b18dcc432d).html |
---|---|
Abstract
Human motion capture (mocap) has gained increasing attention in many applications,
such as advanced human-machine interaction, computer animation, digital films, and
interactive games. A motion capture system records and digitizes a sequence of human
postures, represented by the 3D coordinates of a set of body joints over time. However,
because motions are usually captured continuously, segmenting and labeling the data
manually is time-consuming and labor-intensive. Moreover, some collaborative
applications, such as human-machine interaction, require that input motions be
recognized automatically and in real time, so that the computer can respond with a
corresponding reaction. Hence, efficient methods for recognizing motion data are
necessary. In this thesis, we develop new methods to address the mocap data
recognition problem, which comprises three sub-problems: isolated motion pattern
recognition, sequential motion pattern recognition, and real-time motion stream
recognition.
For isolated motion pattern recognition, the main challenges are the high
dimensionality and large variation of the data. Principal component analysis (PCA) is
an efficient tool for reducing dimensionality and extracting features. However, when
applied to time series such as mocap data, it cannot retain the temporal information
of the data points within a sample. Motivated by this, we propose two singular value
decomposition (SVD) based methods, named segmental SVD (SegSVD) and bidirectional
segmental SVD (Bi-SegSVD). They first divide the motion data into a given number of
sub-segments and then process them with SVD in an accumulative manner, along the
forward direction for SegSVD and along both the forward and backward directions for
Bi-SegSVD. Based on the segmental features, we calculate the similarity of two samples
using a weighted dynamic time warping (DTW) based measure. The measure is further
extended into a kernel function for a support vector machine (SVM) classifier to
classify the motion patterns.
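The segment-then-compare idea above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not specify how the sub-segments are delimited, how the DTW weights are chosen, or the exact kernel form. The sketch assumes equal-length sub-segments, per-dimension weights, and a Gaussian-style kernel `exp(-gamma * distance)`; the names `accumulative_ranges`, `weighted_dtw`, and `dtw_kernel` are hypothetical. The SVD step that would turn each accumulated range into a feature vector is omitted.

```python
import math

def accumulative_ranges(n_frames, n_segments, backward=False):
    """Frame ranges (start, end) of accumulated sub-segments.

    Assumption: equal-length sub-segments. Forward ranges grow from the
    first frame; backward ranges grow from the last frame.
    """
    bounds = [i * n_frames // n_segments for i in range(n_segments + 1)]
    if backward:
        return [(bounds[n_segments - k], n_frames) for k in range(1, n_segments + 1)]
    return [(0, bounds[k]) for k in range(1, n_segments + 1)]

def weighted_dtw(a, b, weights):
    """DTW distance between two sequences of feature vectors.

    Assumption: the local cost is a weighted Euclidean distance with one
    weight per feature dimension.
    """
    def cost(u, v):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, u, v)))

    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = cost(a[i - 1], b[j - 1]) + best_prev
    return d[n][m]

def dtw_kernel(a, b, weights, gamma=1.0):
    """Gaussian-style similarity built on the DTW distance (assumed form)."""
    return math.exp(-gamma * weighted_dtw(a, b, weights))
```

In a bidirectional variant, the forward and backward ranges would each yield one feature sequence per sample. Kernels built from DTW distances are not guaranteed to be positive semi-definite in general, so the thesis presumably handles this in its own kernel construction.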
In sequential motion pattern recognition, an input motion is composed of multiple
motion patterns whose categories and boundaries are unknown in advance. This task
therefore poses an additional challenge: detecting the start and end points of the
embedded patterns. To address this problem, two new approaches are proposed. First,
we exploit an open-end DTW (OE-DTW) based scheme, motivated by the fact that OE-DTW
is efficient at matching complete patterns with incomplete ones. By regarding the
input motion as a complete pattern and each template pattern as an incomplete
pattern, we apply OE-DTW to find their optimally matched part, according to which the
embedded patterns are detected and recognized sequentially. Second, we take advantage
of the multi-level structure of SegSVD and detect the end points of the embedded
patterns by referring to the top levels of the template patterns, using a new
penalty-based level matching scheme.
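The first approach can be sketched as follows, under stated assumptions. The whole template is aligned against every prefix of the remaining input, and the prefix with the lowest length-normalized cost gives the detected end point. The greedy left-to-right loop, the normalization by `n + j`, and the names `open_end_dtw` and `segment_sequence` are illustrative choices, not details taken from the thesis.

```python
import math

def open_end_dtw(template, stream):
    """Align the whole template with the best-matching prefix of stream.

    Returns (normalized cost, end index), where stream[:end] is the
    matched part. Frames are equal-length tuples of floats.
    """
    n, m = len(template), len(stream)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = math.dist(template[i - 1], stream[j - 1]) + best_prev
    # Open end: the template must be consumed fully, the stream need not be.
    end = min(range(1, m + 1), key=lambda j: d[n][j] / (n + j))
    return d[n][end] / (n + end), end

def segment_sequence(stream, templates):
    """Greedily label consecutive parts of stream with the best template."""
    segments, pos = [], 0
    while pos < len(stream):
        label, (cost, end) = min(
            ((name, open_end_dtw(t, stream[pos:])) for name, t in templates.items()),
            key=lambda item: item[1][0],
        )
        segments.append((label, pos, pos + end))
        pos += end
    return segments
```

For example, with one-dimensional toy frames, a rising template and a falling template split a rise-then-fall input at the turning point:

```python
templates = {
    "up": [(0.0,), (1.0,), (2.0,), (3.0,)],
    "down": [(3.0,), (2.0,), (1.0,), (0.0,)],
}
stream = [(0.0,), (1.0,), (1.0,), (2.0,), (3.0,), (3.0,), (2.0,), (1.0,), (0.0,)]
print(segment_sequence(stream, templates))
```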
Real-time motion stream recognition requires not only identifying and recognizing the
embedded patterns in the input motions, but also detecting unwanted motions, and the
task must be completed in real time. Motivated by the speed and efficiency of
content-based indexing techniques, we introduce a body partition index map based
approach to this problem. Noting that human motions are composed of the sub-motions
of the upper limbs, legs, and torso, we partition the motions into five parts
according to the five body partitions and process the sub-motions separately with
standard clustering techniques. A generalized model for each motion class is trained
by integrating the projected cluster node strings of the training trials, and five
body partition index maps are constructed. During recognition, the input frames are
projected onto the clusters and then used to look up the index maps. With a flexible
voting scheme and a set of end point detection conditions, the input motions are
segmented and recognized as legal patterns or unwanted motions in real time.
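The lookup-and-vote idea can be illustrated with a much simplified sketch. It assumes that cluster centroids per body part are already available, that each index map simply records which motion classes visited each cluster during training, and that a window is rejected as unwanted when the winning class collects too small a share of the votes. The cluster node strings and end point detection conditions of the thesis are not modeled. The function names and the `min_share` threshold are hypothetical.

```python
import math
from collections import Counter, defaultdict

def nearest_cluster(vec, centroids):
    """Index of the centroid closest to vec (projection onto the clusters)."""
    return min(range(len(centroids)), key=lambda k: math.dist(vec, centroids[k]))

def build_index_maps(training, centroids):
    """Build one index map per body part: cluster id -> set of class labels.

    training: {label: [trial, ...]}, trial = list of frames,
    frame = {part name: feature vector}.
    """
    maps = {part: defaultdict(set) for part in centroids}
    for label, trials in training.items():
        for trial in trials:
            for frame in trial:
                for part, vec in frame.items():
                    maps[part][nearest_cluster(vec, centroids[part])].add(label)
    return maps

def classify_window(frames, maps, centroids, min_share=0.6):
    """Vote over all frames and body parts; reject weak winners as unwanted."""
    votes = Counter()
    for frame in frames:
        for part, vec in frame.items():
            cluster = nearest_cluster(vec, centroids[part])
            for label in maps[part].get(cluster, ()):
                votes[label] += 1
    if not votes:
        return "unwanted"
    label, count = votes.most_common(1)[0]
    total = len(frames) * len(centroids)
    return label if count / total >= min_share else "unwanted"
```

The thesis uses five body partitions; a toy run with two parts keeps the example short:

```python
centroids = {"arm": [(0.0,), (10.0,)], "leg": [(0.0,), (10.0,)]}
training = {
    "wave": [[{"arm": (9.0,), "leg": (1.0,)}]],
    "kick": [[{"arm": (1.0,), "leg": (9.0,)}]],
}
maps = build_index_maps(training, centroids)
print(classify_window([{"arm": (10.0,), "leg": (0.0,)}] * 3, maps, centroids))
```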
Finally, we apply the real-time recognition approach to develop an interactive
dancing game system, in which users' dance motions are captured live and recognized.
According to the recognition result, the corresponding interactive motions are
determined and used to drive the avatar's animation. The system thus provides an
immersive environment in which users can dance collaboratively with avatars (i.e.,
virtual partners).
- Digital techniques, Computer games, Computer simulation, Image processing, Three-dimensional imaging, Computer vision, Design, Human locomotion