Improving Graph Edit Distance
Time | 2013 — 2016 |
Funding | Hasler Foundation |
Researchers | Kaspar Riesen |
Abstract: One of the most flexible methods for graph dissimilarity computation is based on the edit distance of graphs. Yet, the computation of graph edit distance is known to be exponential in the number of nodes. The applicant of the present project developed a novel approximate graph edit distance framework recently. The basic idea is that on optimal assignments of the nodes of both graphs, the edit distance is eventually derived. It has been empirically verified that our framework approximates small distances with a high precision while relatively large distances are often overestimated. The reason for this overestimation is that our procedure considers the nodes and their local structure only in the initial node assignment. The major objective of the present project is to substantially reduce this overestimation. Three lines of research are planned to meet this goal The first line of research considers the integration of topological descriptors in the node labels in order to improve the initial matching process. There is eligible confidence that if this information is considered in the initial node matching a reduction of overestimations can be expected. The second line of research pursues the idea of matching subgraphs rather than single nodes in order to prevent unnecessary edge operations in the derived edit distance computation. To this end, the graphs under consideration are partitioned with standard graph clustering algorithms. Given that the subgraphs are densely structured by means of edges, this procedure substantially increases the influence of edge structure in the initial node mapping. Thus, with this extension it is likely to improve our matching precision and reduce the overestimation. The third line of research consists of answering several theoretical questions about our framework which are still open. Answers to these question potentially lead to heuristics for further reductions of the overestimations.