Probabilistic Graphical Model Based Highly Scalable Directed Community Detection Algorithm
Community detection algorithms are essential for characterizing the statistical properties of complex networks, and they contribute to the study of real networks such as online social networks and logistics distribution networks. However, traditional community detection algorithms focus only on undirected networks and therefore cannot handle directionality, a significant characteristic of real networks. Based on the information transfer probability method from the classic Probabilistic Graphical Model (PGM) theory of Turing Award winner Judea Pearl, we propose an efficient local directed community detection method named Information Transfer Gain (ITG), which builds on the basic information transfer triangles that compose the core structure of a community. Then, to process large-scale directed social networks efficiently, we propose a scalable distributed algorithm, Distributed Information Transfer Gain (DITG), based on the GraphX model in Spark. Finally, through extensive experiments on artificial directed network datasets and real social network datasets, we show that our algorithm achieves good precision and efficiency in a distributed environment compared with classical directed community detection algorithms such as FastGN, OSLOM, and Infomap.