Clustering is an important stage in the knowledge discovery process for identifying possible groupings of data into different subsets, each of which contains similar entities. Because the process is unsupervised, the clustering results are sensitive to the presence of noise and outliers, and there is strong motivation to leverage extrinsic information and/or auxiliary datasets to enhance the results.

To address the challenge of optimizing this cross-dataset clustering process, we propose to adopt an information retrieval perspective, in which the dataset at hand is regarded as a search query. Using this query, multiple auxiliary datasets with similar cluster structures are retrieved from an external archive. The structural information from these external data is then leveraged via an unsupervised transfer learning framework to enhance the clustering results.

A critical idea in facilitating the query/auxiliary dataset comparison during retrieval is the association of each dataset with a multi-dimensional feature vector. The vector components represent the conformance of that dataset to multiple exemplar clustering solutions, with each solution summarizing a different structural aspect of the auxiliary datasets in the archive. In this way, the cluster structures of the query and auxiliary datasets can be compared from multiple perspectives corresponding to the different solutions.

We propose to search for these exemplar solutions through a multi-objective evolutionary optimization process that considers both clustering quality and diversity. In this way, a population of representative yet complementary clustering solutions can be discovered.
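
As an illustration of the feature-vector comparison described above, the following is a minimal sketch rather than the proposed framework itself. It assumes, purely for illustration, that each exemplar clustering solution is given as a set of centroids, that conformance is scored as exp(-mean distance to the nearest centroid), and that datasets are ranked by the Euclidean distance between their feature vectors. The names conformance, feature_vector, retrieve, and blobs are hypothetical, and the exemplars are supplied by hand instead of being found by the evolutionary search.

```python
import numpy as np


def conformance(data, centroids):
    """Assumed conformance score in (0, 1]: higher when points lie near the
    exemplar's centroids. This scoring rule is an illustrative choice."""
    # Pairwise distances, shape (n_points, n_centroids).
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return float(np.exp(-dists.min(axis=1).mean()))


def feature_vector(data, exemplars):
    """One component per exemplar clustering solution."""
    return np.array([conformance(data, c) for c in exemplars])


def retrieve(query, archive, exemplars, k=1):
    """Return the names of the k archive datasets whose feature vectors
    are closest to the query's feature vector."""
    q = feature_vector(query, exemplars)
    gaps = {
        name: float(np.linalg.norm(q - feature_vector(data, exemplars)))
        for name, data in archive.items()
    }
    return sorted(gaps, key=gaps.get)[:k]


def blobs(centers, rng, n=50, scale=0.3):
    """Hypothetical helper: Gaussian blobs around the given centers."""
    centers = np.asarray(centers, dtype=float)
    return np.vstack([c + scale * rng.standard_normal((n, 2)) for c in centers])


rng = np.random.default_rng(0)
# Two hand-picked exemplar solutions, each summarizing a different layout.
exemplars = [
    np.array([[0.0, 0.0], [5.0, 5.0]]),  # clusters on the main diagonal
    np.array([[0.0, 5.0], [5.0, 0.0]]),  # clusters on the anti-diagonal
]
archive = {
    "diag": blobs([[0, 0], [5, 5]], rng),
    "anti": blobs([[0, 5], [5, 0]], rng),
}
query = blobs([[0, 0], [5, 5]], rng)

# "diag" shares the query's layout, so it is expected to rank first.
print(retrieve(query, archive, exemplars, k=1))
```

Because every dataset is reduced to a short vector, comparing a query against a large archive costs one vector distance per archived dataset once the archive's feature vectors have been computed.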
In view of the scarcity of work on cross-dataset clustering, and the lack of a systematic approach for leveraging extrinsic information to enhance clustering results, the proposed framework represents an important step toward automating data interpretation and understanding for knowledge discovery.

With the exponential rate of data growth in many areas, and the resulting availability of large data archives that contain, e.g., multiple spatiotemporal snapshots of an evolving process, the proposed framework represents a bridge between the previously disparate fields of information retrieval, cross-domain unsupervised learning, and multi-view clustering. This novel perspective on the clustering problem allows the potentially huge amount of information encapsulated in archived data to be unlocked. The framework can thus be applied to a broad range of areas, such as crowd movement pattern segmentation in surveillance video understanding, 3D point cloud segmentation in computer graphics, gene expression clustering in bioinformatics, and community detection in social media analysis.