An Optimized Document Information Retrieval Framework Using Clustering Techniques Integrated with Bacterial Foraging Optimization
Keywords: cluster, swarm intelligence, centroid, information retrieval, chemotaxis
Abstract
In this study, a new Document Information Retrieval (DIR) framework is designed that combines K-means clustering with the Bacterial Foraging Algorithm (BFA) to address the scalability and computational cost problems that arise with large document collections. The documents are first pre-processed into a vector space representation with TF-IDF weighting and partitioned into clusters to reduce the search space. BFA is then used for intelligent document exploration, relying on chemotaxis, swarming, reproduction, and elimination-dispersal to find the documents relevant to the user query. Integrating clustering with swarm intelligence can improve retrieval accuracy while reducing redundancy and retrieval time. In a performance evaluation based on standard metrics (precision, recall, and F-measure), the DIR-BFA model outperforms the existing approaches (EQS and Firefly-based retrieval). The experimental results show improved retrieval accuracy and reduced run time.
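As a rough illustration of the cluster-then-search idea described above, the following minimal sketch builds TF-IDF vectors, partitions them with a K-means variant, and answers a query by searching only the closest cluster. It is not the DIR-BFA implementation. Whitespace tokenization, the smoothed IDF formula, cosine similarity on unit-length vectors, the deterministic centroid initialization, and all function names are assumptions made for this example. The BFA exploration stage is deliberately replaced by a plain exhaustive cosine ranking inside the selected cluster.

```python
import math
from collections import Counter

def embed(tokens, vocab, idf):
    """Map a token list to a unit-length TF-IDF vector over vocab."""
    tf = Counter(tokens)
    vec = [tf[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def tfidf_vectors(docs):
    """Return (vocab, idf, vectors) for a list of document strings."""
    tokenised = [d.lower().split() for d in docs]  # naive tokenizer (assumption)
    vocab = sorted({t for toks in tokenised for t in toks})
    df = Counter(t for toks in tokenised for t in set(toks))
    n = len(docs)
    # Smoothed IDF; the source does not say which IDF variant it uses.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf, [embed(toks, vocab, idf) for toks in tokenised]

def cosine(a, b):
    """Cosine similarity; valid as a dot product because inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))

def kmeans(vectors, k, iters=20):
    """K-means with cosine similarity; first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(iters):
        new_labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean)) or 1.0
                centroids[c] = [x / norm for x in mean]
    return labels, centroids

def search(query, docs, vocab, idf, vectors, labels, centroids, top_n=1):
    """Rank only the documents in the cluster whose centroid best matches the query."""
    q = embed(query.lower().split(), vocab, idf)
    best = max(range(len(centroids)), key=lambda c: cosine(q, centroids[c]))
    candidates = [i for i, lab in enumerate(labels) if lab == best]
    # Exhaustive ranking stands in for the BFA exploration stage here.
    ranked = sorted(candidates, key=lambda i: cosine(q, vectors[i]), reverse=True)
    return [docs[i] for i in ranked[:top_n]]

# Toy corpus, purely illustrative.
docs = [
    "bacteria foraging optimization swarm",
    "kmeans clustering centroid documents",
    "swarm intelligence bacteria chemotaxis",
    "document clustering with kmeans centroid",
]
vocab, idf, vectors = tfidf_vectors(docs)
labels, centroids = kmeans(vectors, k=2)
hits = search("swarm chemotaxis", docs, vocab, idf, vectors, labels, centroids)
print(labels, hits)
```

Restricting the ranking to one cluster is what shrinks the search space: the query is compared against k centroids plus the members of a single cluster instead of every document. In the framework described in the abstract, a BFA-driven exploration would take the place of the sorted ranking step.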




