The present invention discloses a computer program product for data clustering of a target domain that is guided by relevant data clustering of a source domain, and for evaluating the cross-domain clusterability of a target domain data set and a source domain data set. Conventional k-means data clustering generates clusters based only on the intrinsic nature of the data in the target domain. Because it lacks any guidance in clustering the target domain data, conventional k-means data clustering often produces clusters that are not useful to human users in devising text analytics solutions.
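
For reference, the conventional k-means procedure described above can be sketched as follows. This is a minimal illustration of the standard assignment and update iteration only, not the disclosed guided method. The names `kmeans` and `squared_distance`, the caller-supplied initial centroids, and the `max_iter` parameter are assumptions made for this example. The sketch uses nothing but the target-domain points themselves, which is the limitation the passage identifies.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: Sequence[Point],
    initial_centroids: Sequence[Point],
    max_iter: int = 100,
) -> Tuple[List[int], List[Point]]:
    """Conventional k-means: clusters depend only on the input points.

    Initial centroids are passed in explicitly (an assumption of this
    sketch) so that the result is deterministic.
    """
    centroids: List[Point] = [tuple(c) for c in initial_centroids]
    labels: List[int] = [-1] * len(points)

    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )

    return labels, centroids
```

Calling `kmeans` on four two-dimensional points forming two well-separated pairs, with one initial centroid placed in each pair, groups each pair into its own cluster. No source-domain information enters the computation at any step.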