Discovering Community Structure of Complex Networks using Genetic Algorithms

Published:

Discovering Community Structure of Complex Networks using Genetic Algorithms by Anwar Saeed (2016)

In the real world, there are various kinds of complex systems around us. These systems can be represented as a networks, consisting of nodes and links. Nodes represent individuals while links represent the relationship between these individuals. Network science is the study of networks with respect to its theoretical foundation of dynamic behavior and its applications. Using network science methods, we can extract useful information from these networks. An important feature of these real world networks is called communities. Communities are the sub-groups of networks which are densely intra-connected and sparsely interconnected. Communities can reveal abundant hidden information of complex networks that are not easy to detect by simple observation. To analyze and understand the structure of these networks, community detection techniques are widely used by researchers. The researchers have introduced a number of techniques to discover communities from complex networks. These techniques are based on two major properties of the networks which are node community and link community techniques. Node community techniques are those which consider nodes attributes like internal and external degrees, labels and so on, while link community techniques consider links attributes like common neighbors, edge labels and so on. Some of these techniques force every node to be assigned to a single community while some of them find overlapping communities. Many community detection methods are based on Modularity Optimization Techniques (MOT). MOT is based on the optimization of the modularity of a community. One of the category of these techniques uses genetic algorithms for discovering community structure of complex networks. Existing genetic-based techniques based on the random initialization for generating initial population, which generate low fitness valued population. We observe that clustering coefficient is important to use for generating initial population to discover better community structure of complex networks, because clustering coefficient split the networks into bridges in the initial stage of the algorithm and divide networks into different communities in such a way that densely connected nodes are grouped together into a single community. In this dissertation, we present an approach for discovering community structure based on Genetic Algorithms and Clustering Coefficient i.e., community detection to overcome the problem associated with random initialization of population in existing genetic-based techniques. Genetic algorithms provide the advantage of competitive and evolutionary behavior to identify community structure, and clustering coefficient generates a better quality population. We represent the networks using locus-based adjacency representation which helps us to discover better community structure. We also use adaptive mutation to optimize our search space and extend a standard genetic algorithm to improve our results. We perform experiments on seventeen different real-world and synthetic networks to measure the effectiveness of our proposed algorithm using three well-known evaluation measures i.e., modularity, NMI and partitions density. We use modularity and partitions density for real-world networks and NMI for synthetic networks. We compare the results of our proposed algorithm against six widely used techniques. The results reveal that on both types of networks, our proposed approach outperforms other methods using modularity measure and performs competitively using NMI measure.

Paper published