Leveraging Heterogeneous Research Social Network for Dynamic Science Classification Schemes Mapping


Student thesis: Doctoral Thesis



  • Wei DU



Award date: 26 Aug 2016


Science classification schemes (SCSs, often referred to as foundational ontologies) play an important role in categorizing scientific resources such as research projects, publications and patents. The coexistence of different SCSs creates barriers to information exchange across scientific databases. In research management (e.g. evaluating overall research performance in terms of research projects, publications and patents), inconsistencies arise from uncertain mappings between SCSs. Several methods have been proposed to map inconsistent classification schemes, e.g. content-based and structure-based methods, but they can hardly meet practical needs in the big data era, where mapping relations change dynamically.
User-generated content on research social network platforms provides rich and dynamic relations among various scientific resources, which makes dynamic mapping among different science classification schemes feasible. This dissertation therefore proposes a Social Network Empowered Dynamic Mapping (SNEDM) method based on a research social network platform. On such a platform, researchers are connected with various scientific resources (e.g. research publications, projects and patents). SNEDM first constructs a heterogeneous research social network by connecting researchers and scientific resources through research activities and social activities. A structural similarity measure (W-T SimRank) is designed to calculate the temporal relatedness between two heterogeneous individuals in the network. A parallel computing algorithm based on enhanced-MapReduce is introduced to improve calculation efficiency in large networks. The method is implemented on ScholarMate, the largest research social network platform in China, to realize dynamic mapping among SCSs. A case study is provided of dynamic mapping between the classification scheme of the National Natural Science Foundation of China (NSFC), the largest government research funding agency in China, and that of Web of Science, the most popular bibliographic database in science, and experiments are conducted to assess the performance of the proposed method. The experiments show that the parallel SNEDM method performs well in terms of both effectiveness and efficiency. The dynamic mapping results show that the mapped (i.e. most relevant) classes are consistent across different time periods, and they also reveal changes in mapping patterns. The dynamic mapping results can also be used for trend analysis.
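To make the idea of structural similarity on a heterogeneous network concrete, the following is a minimal sketch of a weighted, time-decayed SimRank variant. It is not the W-T SimRank defined in the dissertation, whose exact formulation is not given in this abstract. The exponential decay, the `decay` and `c` parameters, the function names, and all node labels (`NSFC:class_A`, `WoS:class_X`, and so on) are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def build_graph(edges, now, decay=0.5):
    """Build an undirected weighted graph from (node_u, node_v, year) triples.

    Assumption: each link contributes exp(-decay * age), so recent
    research or social activities weigh more than old ones.
    """
    graph = defaultdict(dict)
    for u, v, year in edges:
        weight = math.exp(-decay * (now - year))
        graph[u][v] = graph[u].get(v, 0.0) + weight
        graph[v][u] = graph[v].get(u, 0.0) + weight
    return dict(graph)


def weighted_simrank(graph, c=0.8, iterations=10):
    """Iterate s(a, b) = c * sum_ij w(a, i) * w(b, j) * s(i, j), with s(a, a) = 1.

    w(a, i) is the weight of edge a-i divided by the total weight at a.
    """
    nodes = sorted(graph)
    norm = {}
    for u, nbrs in graph.items():
        total = sum(nbrs.values())
        norm[u] = {v: w / total for v, w in nbrs.items()}

    # Start from the identity: every node is only similar to itself.
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new = {}
        for a in nodes:
            new[a] = {}
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                else:
                    new[a][b] = c * sum(
                        wa * wb * sim[i][j]
                        for i, wa in norm[a].items()
                        for j, wb in norm[b].items()
                    )
        sim = new
    return sim


# Hypothetical toy network: researchers (r*), projects (proj*), publications (pub*)
# and classes from two schemes. None of these labels are real class codes.
edges = [
    ("r1", "proj1", 2015), ("proj1", "NSFC:class_A", 2015),
    ("r1", "pub1", 2016), ("pub1", "WoS:class_X", 2016),
    ("r2", "pub2", 2016), ("pub2", "WoS:class_Y", 2016),
]
scores = weighted_simrank(build_graph(edges, now=2016))
for wos_class in ("WoS:class_X", "WoS:class_Y"):
    print(wos_class, round(scores["NSFC:class_A"][wos_class], 4))
```

In this toy network the NSFC class reaches `WoS:class_X` through researcher `r1`, so the pair receives a positive score, while `WoS:class_Y` sits in a disconnected component and scores zero. Recomputing the scores with different values of `now` is one simple way to see how a mapping might shift between time periods.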
The proposed SNEDM method is a new attempt to perform dynamic mapping among SCSs on a research social network platform. It has been shown to be effective in scientific information sharing, research management and knowledge transfer applications. The method can be extended by implementing the parallel calculation with Spark, on top of Resilient Distributed Datasets (RDDs), to improve the efficiency of iterative calculation. Additional experiments will be conducted to further evaluate the performance of applications supported by SNEDM. An increasing number of companies are adopting research social network platforms for research and innovation activities, and SNEDM can be customized to map between science and industry classification schemes to facilitate academia-industry collaboration.
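The iterative calculation mentioned above splits naturally into a map step and a reduce step, which is what makes it a candidate for MapReduce or Spark. The sketch below emulates one such formulation in plain Python for ordinary, unweighted SimRank. It is not the enhanced-MapReduce algorithm of the dissertation and it does not use Spark. The function names and the toy graph are assumptions, and the adjacency lists are assumed to be symmetric (undirected).

```python
from collections import defaultdict
from itertools import product


def map_phase(sim, neighbors):
    """For each scored pair (i, j), emit a contribution to every pair (a, b)
    such that a is a neighbor of i and b is a neighbor of j."""
    for (i, j), score in sim.items():
        if score == 0.0:
            continue
        for a, b in product(neighbors[i], neighbors[j]):
            if a != b:  # self-similarity is fixed to 1 in the reduce step
                yield (a, b), score / (len(neighbors[a]) * len(neighbors[b]))


def reduce_phase(pairs, nodes, c):
    """Sum the contributions per key and apply the decay factor c."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    new_sim = {key: c * value for key, value in totals.items()}
    for n in nodes:
        new_sim[(n, n)] = 1.0
    return new_sim


def simrank_mapreduce(neighbors, c=0.8, iterations=5):
    """Run several map/reduce rounds, keeping only non-zero pairs."""
    nodes = list(neighbors)
    sim = {(n, n): 1.0 for n in nodes}
    for _ in range(iterations):
        sim = reduce_phase(map_phase(sim, neighbors), nodes, c)
    return sim


# Hypothetical toy graph: two classes linked only through one shared publication.
neighbors = {
    "class_A": ["pub"],
    "class_B": ["pub"],
    "pub": ["class_A", "class_B"],
}
result = simrank_mapreduce(neighbors)
print(result[("class_A", "class_B")])
```

Because each map output depends only on one scored pair and the adjacency lists, the map step can in principle be distributed across workers, with the reduce step grouping by node pair. A framework that keeps intermediate scores in memory between rounds avoids rereading them on every iteration, which is the efficiency concern raised above.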