Thursday, 15 November 2012

Multimatching and Clustering Techniques

In a bid to complete some older work before moving onto Dynamic graph drawing problems, I have focused  on the multimatching once again. As a form of clustering, there is much research already out there which identifies the most efficient methods, the most useful for graph preservation, most useful in given scenarios for each graph type. Unfortunately, there is far too much information to read through for such a small task in my research.

In regard to this, I had implemented a multimatching form of clustering which when given a matching number, m, the matching/coarsening will group m vertices to form a new vertex in a coarser representation of the graph (as opposed to edge contraction or identifying a MIVS). To recap, I was identifying a way in which to reduce the time required to draw sparse star-type graphs, and to achieve an approximation tree similar to that of the Octree (primarily the star graph thing though).

Having revisited the multimatching results; there is some relation between the matching number and the average degree of vertices within the graph, albeit, making only a small difference to run-time (a few seconds under the 30-40 second run-time for add32). Nowhere near the reduction I wanted.

Following these "bad" results, I investigated other approaches for clustering, with the aim of reducing the run-time further (and hopefully improve drawing quality as well). With multimatching being the first, the second was using brute clustering - where all adjacent vertices, if not matched, are coarsened to form super nodes  regardless of the degree. This is theoretically bad, as it will generate several heavy nodes, and many smaller nodes (composed of the left over vertices). The approach focuses mostly on star type graphs as opposed to sparse graphs.

The third is oddly named trail finders. Analyzing the degree of vertices within graphs which cause the MGF approximation scheme issue, shows that the average degree is 1 or 2 vertices (with very few having 3 or more). Therefore, it is assumed (and backed by analysis of drawings and chaco files) that trails of vertices exist, where one vertex is connected to only one or two others to form strings of vertices. Therefore, multimatching and brute clustering will make little difference to standard edge contraction ( when edge contraction contracts the graph to a star graph, it also becomes obselete). Finding these trails and coarsening them before other matching techniques, means [in theory], the algorithm finds a better matching.

Currently, I am awaiting results which test each of these in order to identify the best approach for the graphs add20 and add32 (with the aim of identifying similar graphs through analysis of the their degree when being read from the chaco file). It should be noted, the adaptive matching described in Hu's work (which changes from edge contraction to MIVS when edge contraction fails) has not be implemented or tested due to difficulty integrating it into the MGF approximation scheme. 

Initial drawings show an improvement in quality when using trail findings, but runtime has not been compared yet. As this is a relatively small topic in terms of thesis, further research is unlikely, and therefore the best of those matching techniques given above will be used (unless otherwise specified by my supervisors).

No comments:

Post a Comment