Dr. PO Lai Man (布禮文)


Phone: +852 34427779

Willing to take PhD students: yes


Dr. Lai-Man Po received his BSc degree with First Class Honors and his PhD degree from City University of Hong Kong in 1988 and 1991, respectively. In 1988, he won the First Prize in the Paper Contest for Students and Non-corporate Members organized by the Institute of Electronics and Radio Engineers of Hong Kong. In the same year, he also obtained a 3-year Postgraduate Fellowship from the Sir Edward Youde Memorial Council for his postgraduate studies at City University of Hong Kong. After obtaining his PhD degree, he joined the Department of Electronic Engineering, City University of Hong Kong. Currently, Dr. Po is an Associate Professor and the Lab Director of the Texas Instruments Educational Training Centre in the Electronic Engineering Department. He has published over 160 technical journal and conference papers with more than 4,500 citations.


Dr. Po was the chairman of the IEEE Hong Kong Chapter of Signal Processing in 2011 and 2012. He is a member of the Technical Committee on Multimedia Systems and Applications, IEEE Circuits and Systems Society, and an Associate Editor of the HKIE Transactions. He also served on the organizing committees of the IEEE International Conference on Acoustics, Speech and Signal Processing in 2003, the IEEE International Conference on Image Processing in 2010, and other conferences.


Research Interests/Areas

My recent research interests mainly focus on:

  • Video Coding
  • Content-based Video Copy Detection
  • Image/Video Quality Assessment
  • Biomedical Signal Processing
  • Face Liveness Detection

Recent Research Projects

Biomedical Signal Processing

Remote PPG Signal Estimation from Human Face

Remote photoplethysmography (rPPG) enables contactless monitoring of human vital signs. However, robustness to subject motion remains a challenging problem for rPPG, especially for facial video-based rPPG. Based on the optical properties of human skin, we build an optical rPPG signal model in which the origins of the rPPG signal and of motion artifacts can be clearly described. The region of interest (ROI) of the skin is regarded as a Lambertian radiator. By considering a digital color camera as a simple spectrometer, we propose an adaptive color difference operation between the green and red channels to reduce motion artifacts. Based on the spectral characteristics of PPG signals, we propose an adaptive bandpass filter to remove residual motion artifacts from the rPPG signal. We also combine ROI selection on the subject's cheeks with speeded-up robust features (SURF) point tracking to improve the rPPG signal quality.
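
As a rough illustration of the two signal-processing steps named above, the following Python sketch applies a green-minus-red difference and a narrow bandpass to per-frame mean ROI intensities. It is not the project's implementation. The least-squares weight on the red channel, the 0.7 to 4 Hz heart-rate band, the 0.3 Hz half-width, and all function names are assumptions made for this example, and the traces come from a hypothetical synthetic generator rather than real video.

```python
import cmath
import math


def normalize(trace):
    """Convert a raw channel trace to zero-mean relative variation (AC/DC)."""
    mean = sum(trace) / len(trace)
    return [v / mean - 1.0 for v in trace]


def color_difference(green, red):
    """Green minus weighted red. The weight is a least-squares fit (assumption)."""
    g, r = normalize(green), normalize(red)
    energy = sum(v * v for v in r)
    weight = sum(a * b for a, b in zip(g, r)) / energy if energy > 0 else 0.0
    return [a - weight * b for a, b in zip(g, r)]


def dft_bin(signal, k):
    """Single DFT coefficient computed directly (adequate for short traces)."""
    n = len(signal)
    return sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(signal))


def dominant_frequency(signal, fps, low_hz=0.7, high_hz=4.0):
    """Frequency in Hz of the strongest DFT bin inside the assumed pulse band."""
    n = len(signal)
    best_k, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        if low_hz <= k * fps / n <= high_hz:
            power = abs(dft_bin(signal, k)) ** 2
            if power > best_power:
                best_k, best_power = k, power
    return best_k * fps / n


def adaptive_bandpass(signal, fps, half_width_hz=0.3):
    """Keep only the DFT bins within half_width_hz of the dominant frequency."""
    n = len(signal)
    f0 = dominant_frequency(signal, fps)
    kept = [k for k in range(n) if abs(min(k, n - k) * fps / n - f0) <= half_width_hz]
    coeffs = {k: dft_bin(signal, k) for k in kept}
    filtered = [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in coeffs.items()).real / n
        for t in range(n)
    ]
    return filtered, f0


def synthetic_traces(fps=30.0, seconds=10, pulse_hz=1.2, motion_hz=2.0):
    """Hypothetical test data: pulse is strong in green, motion is equal in both."""
    n = int(fps * seconds)
    green, red = [], []
    for i in range(n):
        t = i / fps
        pulse = math.sin(2 * math.pi * pulse_hz * t)
        motion = math.sin(2 * math.pi * motion_hz * t)
        green.append(100.0 * (1 + 0.02 * pulse + 0.05 * motion))
        red.append(100.0 * (1 + 0.002 * pulse + 0.05 * motion))
    return green, red


green, red = synthetic_traces()
raw_peak = dominant_frequency(normalize(green), 30.0)
filtered, pulse_peak = adaptive_bandpass(color_difference(green, red), 30.0)
print("green only:", raw_peak, "Hz; after difference and filtering:", pulse_peak, "Hz")
```

In the synthetic data the motion component is larger than the pulse in the green channel, so the raw green trace peaks at the motion frequency. Subtracting the weighted red channel cancels most of the shared motion term, and the bandpass then centres on the pulse frequency.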


Image/Video Quality Assessment

No-Reference Video Quality Assessment with Shearlet Transform and Neural Networks

In this work, we propose an efficient general-purpose no-reference (NR) video quality assessment (VQA) framework based on the 3D shearlet transform and a Convolutional Neural Network (CNN). Taking video blocks as input, the 3D shearlet transform extracts simple and efficient primary spatiotemporal features that capture Natural Scene Statistics (NSS) properties. Then, a CNN and logistic regression are concatenated to exaggerate the discriminative parts of the primary features and predict a perceptual quality score. The resulting algorithm, which we name SACONVA (SheArlet and COnvolutional neural network based No-reference Video quality Assessment), is tested on the well-known LIVE, IVPL and CSIQ VQA databases. The test results demonstrate that SACONVA performs well in predicting video quality and is competitive with current state-of-the-art full-reference VQA methods and general-purpose NR-VQA algorithms. In addition, SACONVA is extended to classify different video distortion types in these three databases and achieves excellent classification accuracy.

Video Coding

Motion Compensation Prediction Algorithms for H.264/AVC

In multiview video coding (MVC), disparity-compensated prediction (DCP) exploits the correlation among different views. A common approach is to use block-based motion-compensated prediction (MCP) tools to predict the disparity effect among different views. However, some regions in different views may have various deformations due to nonconstant depth. Thus, the performance of DCP is not satisfactory with the simple translational model assumed in conventional block-based MCP tools. Previous attempts to achieve better disparity prediction were usually too complex for practical use. In this project, horizontal scaling and shearing (HSS) effects are investigated to increase inter-view prediction accuracy for stereo video. HSS deformations are common among images of horizontally aligned views, due to horizontal and vertical flat surfaces that are not parallel with the projection image planes. To achieve HSS-based DCP with minimal complexity, an efficient subsampled block-matching technique is adopted and integrated into the MVC extension of H.264/AVC in the stereo profile. Affine parameter estimation and additional frame buffers are not required, and the overall increase in computational complexity and memory requirements is moderate. Experimental results show that the new technique can achieve up to 5.25% bitrate reduction in inter-view prediction using the JM17.0 reference software implementation.
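
To make the idea of block matching with horizontal scaling and shearing concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions, not the technique integrated into the JM software: it samples the reference frame with nearest-neighbour rounding at integer-pel resolution, searches a small hypothetical set of scale and shear factors exhaustively, and uses the sum of absolute differences (SAD) as the cost. All function names and parameter values are invented for the example.

```python
import math
import random


def warp_block(ref, x0, y0, size, scale, shear, dx):
    """Sample a size x size block from ref with horizontal scaling and shearing.

    Source column = x0 + dx + round(scale * x + shear * y). Rows are unchanged,
    so only horizontal deformation is modelled (nearest-neighbour sampling).
    """
    width = len(ref[0])
    block = []
    for y in range(size):
        row = []
        for x in range(size):
            sx = x0 + dx + math.floor(scale * x + shear * y + 0.5)
            sx = min(max(sx, 0), width - 1)  # clamp at the frame borders
            row.append(ref[y0 + y][sx])
        block.append(row)
    return block


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def hss_search(target, ref, x0, y0,
               scales=(0.75, 1.0, 1.25), shears=(-0.25, 0.0, 0.25), search_range=4):
    """Exhaustive search; returns (cost, scale, shear, dx) of the best candidate."""
    size = len(target)
    best = None
    for scale in scales:
        for shear in shears:
            for dx in range(-search_range, search_range + 1):
                candidate = warp_block(ref, x0, y0, size, scale, shear, dx)
                cost = sad(target, candidate)
                if best is None or cost < best[0]:
                    best = (cost, scale, shear, dx)
    return best


def random_frame(width=64, height=32, seed=0):
    """Hypothetical textured reference frame for the demonstration."""
    rng = random.Random(seed)
    return [[rng.randrange(256) for _ in range(width)] for _ in range(height)]


ref = random_frame()
# Simulate a block in the other view that is scaled and sheared horizontally.
target = warp_block(ref, 24, 8, 8, scale=1.25, shear=0.25, dx=2)
print("HSS search:", hss_search(target, ref, 24, 8))
print("translation only:", hss_search(target, ref, 24, 8, scales=(1.0,), shears=(0.0,)))
```

Restricting the search to scale 1.0 and shear 0.0 reduces it to conventional translational matching, which gives a direct way to compare the residual cost of the two models on the same block.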


Face Liveness Detection

Face Liveness Detection Using Shearlet Based Feature Descriptors

Face recognition is a widely used biometric technology due to its convenience, but it is vulnerable to spoofing attacks made with non-real faces, such as a photograph or video of a valid user. The anti-spoofing problem must be resolved before face recognition can be widely applied in daily life. Face liveness detection is a core technology for this purpose, ensuring that the input face belongs to a live person. However, this is still very challenging using conventional liveness detection approaches based on texture analysis and motion detection. The aim of this paper is to propose a feature descriptor and an efficient framework that can effectively deal with the face liveness detection problem. In this framework, new feature descriptors are defined using a multiscale directional transform (the shearlet transform). Then, stacked autoencoders and a softmax classifier are concatenated to detect face liveness. We evaluated this approach using the CASIA Face Anti-Spoofing Database and the Replay-Attack database. The experimental results show that our approach performs better than state-of-the-art techniques following the provided evaluation protocols of these databases, and can significantly enhance the security of face recognition biometric systems. In addition, the experimental results demonstrate that this framework can be easily extended to classify different spoofing attacks.


Undergraduate and Postgraduate Courses


Undergraduate Final Year Projects (2016/17)

  1. Video-Sense: Content-based Video Copy Detection System
    • LAU Kin Wai, Steven


  2. Video Photoplethysmography based Heart Monitoring Mobile Apps
    • WONG Ka Kit, David


  3. Fuzzy Systems Theory based Stock Trading Algorithms
    • CHOY Cheuk Piu, Richard


  4. Stroke-based Chinese Input Method for Microsoft Windows 10 Devices
    • Man Chi Chun, Thomas


  5. An Interactive Vocabulary Mobile Learning Game
    • CHEUNG Ho Lun, Lewis
    • LEE Cheuk Tung



MSc Dissertation Projects (2015/16)

  1. Spatial-Transform Image Hashing Algorithms for Image Retrieval and Video Copy Detection Applications
    • YU Wenye


  2. Remote PPG based Automatic Facial Spirit and Complexion Classification for Traditional Chinese Medicine Applications
    • SONG Wenyi
    • RUI Sun


  3. Portfolio Trading Strategy for Automatic Stock Trading System
    • Shen Ling