MPhillog: Carl's MPhil Progression: 2013

Friday, 20 December 2013

The End!

Sad to say I am coming to the end of my research!

This sad news means I will soon be discontinuing the blog, and as such, a goodbye will be soon.

Having reached my writing up period about two months ago, I have spent the past weeks writing and running my experiments in order to provide enough evidence to defend my theoretical works. It also means I am now on the search for a job! Apparently this is a lot harder than it seems as postgraduate studies isn't really experience, nor is it graduate level work due to its specialism...

Anyhow, I am currently writing up! and my goodness is it boring... Writing about everything you known ten times over is not exactly enthralling, on the other hand, pretty pictures/layouts make it all worth while. So instead of boring you any longer, I've chosen to upload a few of the best over the past three years!

Is it a swan, is it a thing, what is it? I choose swan because I like birds. (It's actually a graph called data representing an aeronautical body but the drawing method drew it peculiarly).

A drawing of the graph 3elt, representing the airflow around an aerofoil, the layout itself doesn't follow drawings by other algorithms but shows the ability of MGF to draw "arms" of data. If only that "tongue" was better placed.

The layout of add32, except different! The layout here was a result of my "primitive" research which tried to generate layout of star type graphs (one central vertex connected to all others). The layout didn't fare so well aesthetically but left an unusual look for the graph showing the connectivity of vertices. It seems one half is very connected, the other, very loosely connected.

Investigating unusual shapes of layouts, a 55x55 grid should be very square. This layout wasn't, and apparently the cause is the approximation of forces as part of the multilevel scheme (lay mans terms: the graph was split in two, and the different between the two caused the bulge you see as one side is bigger).

Not exactly a different layout, in fact its very much what you would expect from a drawing algorithm. I just like how the components of the graph are clustered, showing the parts that made up "add32".

The graph "finan521", normally visualised as a circle, the weakened repulsive forces generated a layout where the circle folded on itself. The smaller structures are also more flared than compared to other generated layouts.

What I would consider a perfect layout of "4elt", with compression around the white spaces, avoiding overlap (note the "hole" on the left, which would otherwise have "excess" layout overlapping the surrounding layout, more extensively than the top of the hole).

Pretty colours! A 55x55 grid with coloured partitioning. The layout itself is somehwhat warped as a result of the approximation of MGF, and the partitions are unequal (different sizes, see the red and pink partitions for difference). However, the ability to colour partitions is simple but important as a visualisation tool.

For now that is all! I will comment again with a final summary of all of my work once I am done with the writing up stage (or once I have completed my Viva). I hope you've enjoyed following my development from innocent undergraduate to specialised and scary postgraduate weirdo!

For old times sake, a kitty;

Friday, 12 July 2013

Some interesting stuff with VisualMGF (and other tales of bad research)

Another two weeks gone by and what to I have to show for it? Well for starters I'm rather embarrassed, for you see my implementation of "Visual MGF" does not quite work as planned, to such an extent I am genuinely surprised it does what it does at all.

So what disastrous discovery comes this week? For several months it has been explained to my supervisors that the value of k (ideal edge length) for different levels of the multilevel scheme is appropriate following the research in static graph drawing. Technically, this is true and it was expected that the movement between levels (when using the dynamic drawing algorithm) was fine due to the speed at which layout is generated in comparison to a pure multilevel approach.

Well... I am partially wrong. The value itself is fine, my interpolation of layout, not so much. During one iteration of the VisualMGF cycle, each level of the multilevel scheme has force directed placement applied to it once, passing the displacement of vertices to their children in finer graphs (with some small difference to avoid singularities), for the sake of names we call this Cycle A (for Awesome). When the finest graph has layout applied, the algorithm runs in reverse, updating the parent vertices in coarser levels with the averaged layout of their finer children (in order to overcome inaccurate approximations which cause compression in the layout), named Cycle B (for B*****ks). These two update cycles do not work well together, in fact, Cycle A is relatively useless and only aids in expansion of the graph. As Cycle B updates vertices, the layout of the finer graphs is completely re-written to form a coarser layout of the original graph (which at this point can be considered bad layout).

So how do I get "good" layouts? The MGF Approximation structure maintains approximation of vertex positions and continues to use these to overcome minima. Although this leads to problems with minima (both global and local), the layouts inevitably end up rather appreciable after some time. Removing the extra calculations of applying VisualMGF to the coarser levels (only applying to the original graph) proves that layout in coarser graphs is irrelevant - and provides a noticeable improvement in time required to find layout.

Initially I thought this would only be a problem with the VisualMGF algorithm but it appears to have a worse effect in the standard VisualML (visual multilevel only). Although displacement in the finer graphs is applied, and a good layout can be applied in coarser levels, the layout in the finer levels contradicts the coarser layout and so does not find a good layout so easily (if at all).

So... it seems Veldhuizen does have some tricks up his sleeve which requires yet another revisit to his publication in dynamic graph drawing. So the trick now is to once again identify a method of achieving global layout appropriately, but having used the multilevel scheme already for approximation, what comes next? a multi multilevel scheme? How deep does this rabbit hole go?

Friday, 28 June 2013

My assumptions were wrong? So much surprise...

So following on from the development of the Visual MGF framework, I have spent the past 3 days ~~playing with~~ investigating the abilities of the application and how much I can change, and inevitably, how does whatever I change affect the layouts of the graphs.

The biggest changes so far come from adding variables to give priority to one of the cooling schedules (to either repulsive or attractive forces). Changing this has allowed me to find some preliminary values which provide good global layout more often for most graphs (a problem described in previous posts). The bad global layout had to be for a reason though (treating the symptoms is only one thing), so investigation led me to think "lets spice the pretty pictures up with some colour".

So, a few hours later, here I am with the ability to colour the graphs, highlighting each of the matchings of vertices. So highlighting the two collections of vertices which make the coarsest graph, gives the following picture (for the graph 55grid/3025).

Notice that one of the "partitions" is much larger than the other, this sometimes happens due to the random nature of the matching process. You may also notice the crease in the layout between the two collections - interestingly this has been observed previously when I first implemented Walshaws multilevel with standard grid approximation, and shows that the global repulsive force is not pushing the two collections apart (and as such, the two collections overrun and crease). This suggests either changing the value of the spring length (k) to something more appropriate or copying the fix on the previous work and changing the strength of repulsive forces.

Either way, that is only slightly interesting. The more interesting, and sadly more disturbing news comes from observing some of the matchings that occur. The following image shows matchings on the same graph but in at a different (coarser) level in the multilevel scheme.

The image shows that some matchings include a large number of vertices stretching out across the graph. Previously I have made the assumption that matching connected vertices means the graph should be matched into reasonably uniform partitions. Apparently, this is not the case, and as such, the global layout is often skewed so that the larger partitions occupy more space than others.

This is due to abandoning the priority given to lesser connected vertices. Changing this should provide more appropriate partitions and generate a more uniform global layout. Shows the use of this blog, I only just thought of that as being a problem - might have ignored it otherwise, so thank you blogger!

Tuesday, 25 June 2013

Videos, just because.

Last post I discussed my progress with the visualisation of the MGF FDP algorithm. Now that some of the kinks have been ironed out - here's some videos of it in action.

3elt drawn using visual MGF FDP. Much of the layout is found relatively quickly in the first few seconds, however, global layout takes much longer, and due to the dual cooling schedules used to smooth vertex movement, the algorithm takes much longer than static drawing algorithms.

3025, the 55x55 grid given layout using MGF FDP. The video shows difficulty finding global layout due to the changes to the cooling schedule. This occurs in many of the drawings encountered suggesting a problem with the current algorithm as discussed further below.

3025 again, however, global layout is somewhat achieved. Unfortunately a group of vertices appears to get "stuck" in the layout. Normally, the forces would pull this grouping outwards and converge to a good layout, however, the vertices remain in this position, ignoring much of the forces. As mentioned above, global layout is not always achieved, and when coupled with this behaviour (which occurs in other graphs too), suggests a problem with the FDP (or interpolation of layout). The root of the cause is yet to be identified, but it wouldn't be fun if it were easy right?

Same algorithm, different graph. This time 4elt is the guinea pig, and with the problems of global layout given above, the entire drawing process is not visualised. Instead, the algorithm was left running for 10minutes in order to determine if larger graphs can be given layout that matches their static counterparts. The answer? very much a yes. As the video shows, the layout is very much comparable to 3D static drawings.

As the size of a graph becomes bigger however, the time taken to perform one iteration of FDP increases and so the visualisation is no longer smooth, with a noticeable delay between placement of vertices. However, in comparison to the videos presented in Veldhuizens work, the speed is equivalent, if not better. Further decrease in FDP runtime would therefore come from further optimisation techniques.

Visualisation of layout generation is good but not even half of the story. The other half is operations, which have previously been shown in other videos (which showed how terrible it is to apply the Fruchterman and Reingold algorithm to a layout after changes have been made - the oscillation from the cooling schedule is not a particularly pleasing attribute). So without further ado, the video below shows changes to a 20x20 grid having been given a layout (sorry, forgot to record the layout generation). The changes turn the grid into a tube like structure through the addition of edges (similar to an example given in Veldhuizens work), and then returning to its original structure by their removal.

The layout generated for the graph is initially quite good (some uniformity issues but nothing too drastic), however, as the edges are introduced, the middle of the graph/tube tends to be more affected by expansion, causing it to bow. Although not the best or smoothest layout, possibly due to no changes to the MGF structure, the operations perform their task in connecting distant parts of a graph and bringing them closer to one another in the layout.

Lots of stuff to refine and check but some progress appears to be made. Anyway, pretty videos, enjoy!

Friday, 21 June 2013

Multilevel Global Force Relative Motion Graph Drawing is a go

Last post I mentioned regarded my implementation of the Multilevel Global Force Relative Motion Graph Drawing algorithm (no idea why it's now called that but until a better name comes up, it'' have to do). This week I have improved upon the implementation and fixed the NaN bug which has plagued it.

The NaN errors, as expected, came from division by zero. Due to the way in which forces are calculated (repulsive forces are not calculated for vertices connected by springs), a graph with only one edge will inevitably have zero repulsive forces acting upon it. When calculating vertex displacement, this causes division by zero and so ruins the game.

Why hasn't this come up before? Well, previously we have only used one cooling schedule (that described by Davidson and Harel) and as such, spring and repulsive forces are combined (so even if one is zero, the other will inevitably have a value). Moving to the smooth FR algorithm splits them so if one of the forces is zero, NaN begins its takeover. It's stressed that this only occurs in graphs with only two vertices and one edge, meaning only one force is calculated.

Anyway, moving on, the algorithm now works and shows massive speed improvement from brute force and multilevel approaches. Some graphs exhibit the structures you would generally see in static graph drawing and global layout is often achieved after some period of time. Due to the dual cooling schedule and instant visualisation, this takes much longer than with static graph drawing.

Changing the algorithm to use single cooling schedule but still retain the real time visualisation, you would expect to see good layout in much shorter time (but with more erratic vertex movement). However, this is not the case, as each iteration finds layout for the entire graph, and inevitably, the cooling schedule finishes before a low energy layout is found.

In any case, there are some bugs still observed in the drawing (for example, the graph drifting out of the viewing area and some vertices unable to move from their position) which require fixing, and of course, combining the operations framework with the algorithm so that we can begin analysis of dynamic matching. But, we are now seeing a working algorithm similar to Veldhuizens using techniques already present in the graph drawing literature (no fancy dynamics as such), so yay for us.

That's all for now as its Friday and I need a drink... catch you next time!

Wednesday, 19 June 2013

"I feel a little better..."

Good news and bad news!

Good news, I'm in a happy mood. Also, I have managed to get the "smooth" Fuchterman and Reingold spring embedder to work with the multilevel scheme (mentioned previously and used to visualise the placement of vertices in a smooth fashion similar to Eades). Even luckier, I have managed to get it working with the MGF approximation - so much so that we can semi-visualise the layout being given to graphs as big as 4elt.

Now the bad news, there is only a 5% chance of it working due to some unknown issues generating positions which are NaN. Previously a problem with the distance between them being zero but it doesn't appear to be the cause here, so that's for me to investigate further. As always, there's some bug in there that is waiting for my attention.

Now the tricky part, the visualisations observed this far show the generation of good local layout but lacking in the global layout side of things (see the image below - when visualised this was moving but had slowed so much that it was unlikely to find a good global layout). The reason? the cooling schedule is great for smooth visualisation but makes generation of layout much longer as we wait for it to overcome all the local minima. This just gave me an idea to try with the original Frucherman and Reingold algorithm and see how that works out.

So the aim now? fix this damn NaN problem. After? play around with the visualisation, see if it can be improved and then get on with testing the dynamic matching stuff. For now, that's all I want to say until it works appropriately, so I'll keep you up to date, enjoy your Wednesday!

Saturday, 15 June 2013

Another paper, luck requested!

Another rather late post to add to the collection of sparser and sparser updates.

So what have I been up to? Another conference paper! This time for the prestigious Graph Drawing conference, which is a rather big and intimidating deal. Unfortunately, the work submitted is only an addition and not the dynamic stuff we've been working on - but I guess that's what happens when you leave the big bits to the last minute.

So the work - leaf processing! We have successfully proven that processing edges in a graph reduces runtime due to the smaller graph size, and when using MGF approximation, reduced depth of the coarsening scheme (resulting in less approximations). We also suggest a simple Leaf Placement Algorithm for the layout of these vertices without the use of iteration (no FDP for you!). Unfortunately the LPA does increase the number of edge crossings somewhat, as the placement only accounts for the centre of mass and not for other nearby vertices, but this can be changed by implementing different algorithms. The concept is that we can control the layout of leafy vertices with ease - and by doing some simple preprocessing, can reduce the runtime required to generate layout for leafy graphs.

For more information you'll have to wait and see if the paper is accepted (mostly because I dont want to jinx it).

Anyway, moving on, what's next? The Dynamic stuff!! hurrah! We have the changes and operations currently implements (may not be optimised but they do their duty) and the framework for altering the coarsening scheme after these changes exists (but has not been tested so who knows what bugs are waiting for me there). More importantly, the cooling schedule mentioned in the previous post is implemented, so now its time to port that to the multilevel algorithms, so we can speed up the drawing.

A plan has been developed on scraps of paper scattered across the 5 corners of the world, and the code has been prepared in a special ceremony - so I guess now its down to me. Let us begin. I will update again in the future with a progress report. So luck be with me!

Saturday, 1 June 2013

Dynamic stuff?

So another long delay to an update, so what's been going on?

Primitives

Sadly, no good news here. The primitives work, which intended to reduce the number of levels in a graph and as such, reduce the runtime required for approximating global repulsive forces has had little impact. This is primarily down to primitives only being identified too late on in the coarsening phase (when graphs only consist of 100 or fewer vertices). As a result, the time saved generating this layout is insignificant - and with limits on the extent of approximations, there is no real use for this approach unless a primitive with thousands of vertices is identified.

So, this is apparently a no-go.

Leaf Coarsening

A little better news but still not amazing. The leaf coarsening works as expected, with most graphs being unaffected by the approach, and those graphs which do have leaves in their structure, coming back with reduced runtime (for example, add32 is given a layout in 15% less time).

The layout however, although higher in the number of edge crossings, makes changes more available to users. Leaves are currently laid out using the centroid of the graph to push them outwards (separated by small differences if coincident), however, a technique using a "cone of freedom" allows for multiple coincident vertices to be displayed evenly around an anchoring vertex or between other edges, giving them more freedom to escape repulsive forces and to be displayed as the user requires (through changing the angle of separation between them). Though the technique is not perfected and is more specific to tree and star type graphs, the approach is relatively simple and can provide a more contextual layout.

But unfortunately again, this is no real benefit to general graph drawing and is just a mechanism than only optimises the MGF structure when applicable.

Multimatching

Another disappointment - grouping more than 2 vertices was expected to reduce the number of levels in the multilevel scheme (and MGF tree) making for a faster and more efficient approximation which closer resembles the structure of the quadtree. The results say... sometimes, but each graph has a "preferred" matching number which is cannot be identified without checking each matching number individually (a long process). This essentially rules out multi-matching as a standard mechanism to improve the algorithm.

Summary of MGF Optimisation through Coarsening

Only the leaf coarsening has any promise - and the promise is not particularly linked to general graphs. The primitives layout is only useful for smaller graphs that require perfect layout, and even then, FDP offers competitive speeds. Multimatching backs up the use of matching only singular edges (two vertices), higher matching numbers provide an increase in levels (in general) and an increase in runtime.

Spring Embedder without Cooling Schedule

Previously I have complained that the Fruchterman and Reingold cooling schedule does not meet my demands, and as such, does not bode well for dynamic graph drawing (thus everyone uses Eades spring embedder or similar). The problem comes from the large movement in the beginning of the cooling schedule, and the stopping when the tolerance is met (as the cooling schedule acts as a limit on the number of iterations). Removing such limitation and allowing the algorithm to run indefinitely does not provide good layout as the large movements are never restricted.

Well, finally some good-ish news. Thanks to implementations of the modified spring embedder on GPU's, there is a cooling schedule which allows for continious layout generation without stopping limits, without the erratic movement and with (relatively) good layout. The approach uses two cooling schedules and treats the movement of the two forces seperately (as opposed to one singular movement). After a quick implementation, the layout is achieved smoothly and nicely, albeit with much more noticeable peripheral effect. Hopefully, this discovery will aid my design of the relative dynamic algorithm.

The modified spring embedder without cooling schedule showing the erratic movement as repulsive and attractive forces continually fight each other. The graph drawing is a thee dimensional 7x7x7 vertex grid. No multilevel scheme is used.

The modified spring embedder with two cooling schemes. The erratic movement disappears and allows for continuous layout generation (a layout is achieved and does not change). Unfortunately, as the cooling schedule has not yet been fitted to my own algorithm, the graph expands to a much larger size than required. The same 7x7x7 grid is drawn. Unfortunately, this approach is much slower than the single cooling schedule as it takes longer for vertices to pass out of local minima.

Current Dynamic Framework

Some progress in the dynamic framework. Mechanisms and data structures have been prepared for changes to a graph and updating the layout periodically has been achieved. In the video below you can view the graph 55grid (3025, a 55x55 vertex grid) having vertices removed and the layout updating to show these changes. The structure/mental map of the graph remains until nearly 75% of all edges are removed.

The erratic movement after the layout is updated comes from the cooling schedule as mentioned above. I am currently implementing the new approach to smoothen this out and so the redrawn layout does not have to be requested but is visualised immediately. An alternate and possible faster approach has been suggested by my supervisor which would be to only show the stages of layout generation where vertex movement is below some tolerance - making the image and changes much smoother.

Saturday, 11 May 2013

Ouch, that's what happens when you're cocky

Big ouch this time.

You may or may not know that the implementation for testing my work on Primitives and Multimatching uses an alternate method of storing edges (storing them within each vertex as a collection of adjacent vertices as opposed to edge components) - so in short, they're different. For the purposes of making things easier, see the definitions below:

Algorithm A : Vertices are stored as "Component" objects with their coordinates, weight and position in the coarsening tree. Edges are also stored as "Component" objects, with pointers to their connecting vertices. Both collections of vertices and edges are stored as a "Graph" object.
Algorithm B: Vertices are stored as "Component" objects with their coordinates, weight and position in the coarsening tree. Edges are stored within each vertex as a collection of pointers to vertices in the vertices collection (storing edges twice in both connecting vertices)

For context, it should be known that the coarsening algorithm randomly shuffles the edges in order to provide random matching (this uses Java native code and has negligible effect on runtime). For Algorithm A, this only shuffles the collection of Edges, however for Algorithm B, the collection of Vertices is shuffled. After each test has run, the coordinates of the collection of vertices (in either algorithm) is output into a file, and foolishly, output in the order they appear in the collection.

For Algorithm A, this has no issue as the collection of vertices is in its correct order. For Algorithm B, however, the collection is shuffled for the coarsening scheme. So when analysing the layouts, the coordinates are fed in the order they are expected - but with no account of the randomisation considered - ergo, very bad layouts with lots of edge crossings (yet when drawn, have some shape to them). As such, many of the results collected for the multimatching and primitive work, have to be rerun with a mechanism to reorder the vertices (thankfully an easy job).

So once again, negligence does not pay. Heh, nevermind, onward to new results!

<EDIT>

In the first dynamic implementation, there was a problem of the graph "wandering" out of view, and eventually collapsing in on itself due to some issue with the forces, resulting in a bad layout after changes (number of change were indeterminate). The reason for this is due to the update of the positions of vertices in the approximation - without updating these, the graph uses old positions, which eventually overcome the remaining forces and push the graph in some direction. Simple PostOrder update fixes this (after some minor changes in the code).

The tree, in this case, still holds information relating to all vertices - therefore the structure is no different and repulsive force approximation still includes vertices which may not exist any longer. Roll on the experimentation!

</EDIT>

Friday, 10 May 2013

Skittish

Another few weeks since my last update (or it feels like). So what have I been doing?

Analysis of Runtime - much of the runtime analysis has been written into the draft chapters of my Thesis, with the general idea that my algorithm is faster than a quadtree and equivalent to a dual tree - with the hopeful benefit of having better quality (equal to that of the quadtree or better). Much of the runtime results for experimentation have also been written up.
Analysis of Quality - oddly enough the more time consuming task (due to my unnecessary and possibly wasteful time researching speedy methods for counting edge crossings). Thankfully, after some aid from Dr Chris Walshaw and some Googling, I have implemented a far more efficient method of counting edge crossings and reduced the time required by more than half of my previous methods. Unfortunately this still takes a lot of time for huge graphs but better 20minutes than an hour... With this done, I am not awaiting results so I can write them up and hopefully get my first chapter ready for checking.
Dynamic framework - having designed and implemented a chaco-style method of importing "operations" into the application, I now have a simplistic dynamic graph drawing algorithm ready and working - featuring changes based on the users input (through the operations file) and some adaptability in terms of when to update the layout etc. Still needs a lot of work though, however, the main concept is there - next I will be refining this and implementing a means of measuring the dynamic matching I've been waiting to complete. Also, there are some peculiar behaviours I would like to investigate - so this may inhibit plans. I would also like to implement a similar method of graph drawing to that published by Veldhuizen - so we will see how it goes.

Other various tasks have been completed but I cant remember them, so hopefully nothing important. I will update again when I have more information regarding anything of the above, for now, worky worky work work, lots to do and not enough time to do it - challenge accepted!

Thursday, 25 April 2013

Quantum Particle Layout Generation with Plasma Dynamics

If that has ever been a title, the writer probably cried. Unfortunately, this post is about nothing as interesting (though comes from some interesting research into forces between atoms). The main point of this post - put some of my ideas down so I don't have to keep thinking over them.

The problem: dynamic graph drawing is the changing and visualisation of a graph in interactive time (or as close to as possible)
Most techniques for smaller graphs are derived from Eades and Sugiyamas work (essentially a Spring Embedder which outputs vertex positions to a plotter after each iteration)
For small graphs, this works well because the vertex movement can be calculated between animation slides. Large graphs - not so lucky, the higher number of force calculations requires more time and cant be fit between animation slides
In practice, instant changes to a graph are bad as the reader does not have enough time to follow what is happening - for smaller graphs (9cube) a time dilation of about 200ms per iteration is required, for larger graphs (55grid) FDP takes too long (and without Multilevel, the layout is awful anyway)
Typical "easy" response from me would be to run MGF on a graph, and after a change (or collection of changes), reapply MGF, outputting the new layout to an additional time frame
This doesn't quite work for two reasons - 1. takes too long (essentially generating an entirely new layout per time slice - so not really a time slice, more of an iterative layout thingy) 2. the vertices do not gradually move into new positions, and so between frames, there could be a very different graph
You could alter the FDP technique to cut down the number of calculations required (as a good layout is already applied, it just needs to be altered), or even cut out all calculations in the multilevel scheme so only the final (original) graph is altered
This brings us to the dynamic matching - how many levels before layout suffers during the coarsening process, similarly, how many levels required change as a result of FDP before the layout suffers
This suggests that inevitably, a layout will have to be generated for a large graph through typical static graph drawing techniques. Only the original graph is then a concern - until you wish to start updating the approximation structure (again dynamic matching)
Can the levels still be used to approximate forces if the higher levels are not having their positions updated? How slow will this be to visualise?
The dreaded cooling schedule - everything above is all well and good, however, visualisation of the cooling schedule in action shows large amounts of movement (oscillation of vertex positions) - Veldhuizen uses "damping" to reduce this effect, so its assumed we could do the same (assumed...)
Considering the cooling schedule is so limiting, another approach would be to change this (that way we can really see whether MGF is better or worse than the Octree in terms of which converges first)

Cant really think of much more as I now have to change the printer ink (talk about interrupting a train of thought...) but I will no doubt update this list when I can think of more things. Good news though, this already answers a few of the questions I had for myself.

Tuesday, 16 April 2013

Progress Something Something

It's been a while since my last post but hopefully that means I've been working hard. I say hopefully, I know whether I've been working hard or not, but its fun to keep the mystery alive.

So to cover my tracks, what have I been doing?

Multilevel Global Force Approximation - this again? you betcha. Ever since the fiasco with the randomised matching I had problems with, I spent some time going back and checking everything was correct. I have now updated the model/algorithm and have begun testing various bits and pieces to ensure it works, and that the behaviour/results match my expectations from the theory. I also wanted to follow up on some old ideas while I had the chance.

Random matching/coarsening - completed. I've run the linear matching for a selection of graphs and ten randomised counterparts and found that there are significant differences in the number of levels generated in the multilevel scheme, due to the algorithm being dependant on the order vertices are stored in. Using the same algorithm but randomly shuffling the vertices presents a more universal number of levels, proving to be the more reliable option. Analysis of the layouts is forthcoming, however there are limited resources available to run the tests.
MGF Level Limit - during the generation of the Octree, the number of splits is capped at 13 following the implementation by Yifan Hu, making approximations faster by reducing the number of approximated force calculations (a shallower tree to traverse). Using similar logic, there is now a limit on the MGF depth (though this is controlled in FDP). Implemented, with testing of the value under-way.
Coarsener Tolerance - similar to using primitives and multimatching, coarsener tolerance aims to reduce the number of levels with little difference between them. As a new level is generated, the number of vertices is expected to be smaller, if not, the coarsening phase stops. If the difference is small, i.e. 1 vertex, there will be a larger number of levels and will require more time to generate layout. The tolerance is used to limit this. Implemented, with testing of the value is under-way.
Graph Theoretic Distance vs Weight - in MGF and Octree implementations, the weight of the approximated vertices are used to determine the strength of the repulsive force and provide a more accurate estimate. As an alternative, I wanted to provide a more realistic approximation using the graph theoretic distance between them, playing on a similar idea to the Kamada and Kawai method which states an imaginary spring exists between two distant vertices with a length equal to the graph theoretic distance between them. Testing has shown than a more uniform layout is generated which ignores much of the peripheral effect, suggesting the approximation is more realistic than using the weights (however, as discussed by Hu, this increases the "fluffiness" of a graph and so, increases the number of edge crossings). It is expected this matches a layout generated from an algorithm with no approximation but is yet to be proven.
Scale testing - once again, but only to ensure no changes to the originally calculated values as a result of recent changes. Results this far have shown the default values remain the better in terms of runtime and quality. Some issues have arisen with the octree implementation and require the grid structure to be made larger to encompass the graph (higher scale values generate larger layouts which have been shown to push vertices outside of the bounds of the octree - would like to point out, not a problem with MGF! *smug grin*)

Multimatching!

Levels analysis - to ensure no more panicking, much of the coarsening and multilevel has been tested with some short analysis. No FDP is applied, but the number of levels and the degree at each level has been analysed to show the behaviour of the algorithm. The plan was to identify why finan512 coarsened to hundreds if not thousands of level, which has now been shown to be caused by single matchings (it coarsens into a very large sparse graph with star like structure(s))
Testing under-way for M = 2,3,4,5,6,7,8 - with randomised matching. Following pre-analysis of the number of levels, those graphs which generate unreasonable number of levels (100+) are not fully tested due to the time taken to perform the tests, instead, only a limited number of tests will be used to prove my theories.
Primitives - Implemented for Stars, Rings and Chains in both 2 and 3 dimensions. Testing will begin when resources are available.

Dynamic!

Operations have been taken from previous implementations and are undergoing some early phase testing to ensure no bugs (and make my life easier by adding some other functions so I don't have to do it later)
Dynamic Chaco files are in the later stages of implementation (reading dynamic data) - though generating graphs with hundreds of thousands of vertices is a bit bias at the moment (sticking with very large grids, cube like structures, sierpinski triangles and primitives)
Measuring quality - implementation stage of mental/movement map and I've been playing with some edge crossing stuff (though so far this hasn't gone well...)
Dynamic coarsening - also under-way but more at the theoretical stage than anything, just to make sure I know what I'm doing and what I expect to implement (not something I just want to hack together given previous problems).

Other Bits!

Post Order efficiency - something brought up a long time ago, during the MGF approximation, positions of vertices are averaged to produce more accurate positions of vertices in the coarser graph when calculating approximated repulsive forces. But this undermines the layout previously generated - not a problem for static graphs as the layout is used only once, but for dynamic graphs it becomes a problem. Some theoretical work has been carried out similar to the "ideal spring length" work by Walshaw, but nothing concrete.
Warping of the graphs - using MGF has been shown to warp the layout of a graph (although using the graph theoretic distance instead of weights [mentioned above] has been shown to lower the effect greatly). The quadtree/octree has been shown to also have similar effect, especially on star type graphs, however, it is less noticeable to to the increased directions used (2 vs 4/8). Without using some radius to determine a cut off point, I wanted to see how much effect this has - but for now, this is not a concern.
Thesis! - I have been working on adding text to my Thesis; the MGF theory has now been added with some description of functions and parameters with only Results, Analysis of Drawings and Conclusions to go in, the Multi-matching and Primitive Theory has also been started but requires some more work/research, and brief explanations of the Dynamic coarsening/operations have been included.

That's about all I have to share for now (that's all that's on the piece of paper in front of me). I will attempt to update again soon but the current pattern is once every two to three weeks, so we will see how it goes. Have a good week!

Monday, 25 March 2013

Post-Holiday Cooldown

It's been a long while since I last posted anything - normally a cause for concern but a brief coverage of what I have been up to since my last post:

Last Week [18.03] - Holiday! I was away for a week to help with my parents project, nestled quietly in the middle of nowhere.

Two weeks ago [11.03] - Technical Report! Having had comments from my supervisors, making those changes, and then some further changes of my own. Hopefully the end is near for the technical report. In addition, meetings with supervisors to discuss these and my intentions when I return from holiday.

Four weeks ago [25.02] - Working on implementing the primitives functionality and moving to dynamic data structures (which are so slow that I will be providing support for both Collections and Arrays 5sec to 13sec for 4elt). As for this, all the primitive layout schemes appear to work and different primitive coarsening schemes - such as leaf and chain coarsening - appear to work (and work in combination with one another).

Since the last blog post [18.02] - I don't really remember this far back, but thanks to my notes, it appears I was working on some errors with random coarsening (over the linear method used previously). This has since been fixed and was due to a efficiency mechanism dependant on the order of vertices.

So this week, what are my intentions?

Renumbering of test graphs and ensuring results do not differ (the algorithm should treat identical graphs in the same way).
Collect runtime and quality results for the various preprocessing tasks and multi-matching tasks assuming previous problems with the matching are fixed
Dynamic graph time slices (frames of some dynamic graph after changes occur) with some quality analysis for each (and/or additionally - mental/movement map stuffs)
Collect results for the dynamic matching framework/stuff

Obviously not going to be completed in a week, especially when its the slow start after a break away, but these are my aims for the coming weeks. Time is running out, so we shall see how truly competent I am.

P.S. finally had my MPhil to PhD viva accepted by the Research Degrees Committee!

Thursday, 14 February 2013

Matching Problems II

A typically uninteresting week and a half since my last post, and not a lot of results to show for it unfortunately. I have spent the time attempting to resolve an issue with multi-matching algorithm, but to no avail.

Analysis of the processes (and when I say processes I mean the hundreds and thousands of lines of information output at each stage of the coarsening scheme) showed that the cause comes from minute decisions when choosing matchings. It appears as though these decisions come from the method of storing adjacent vertices within each vertex as opposed to a set of edges - and will only occur in large graphs with many levels (graphs over 20k vertices normally present with more levels in the multilevel scheme).

Attempted fixes did little to fix the issue (but did result in a more efficient coarsening algorithm - thank you hash maps!).

In any case, I suspect my methodology is flawed as I am iterating over vertices normally - it gives vertices at the beginning more chance of being matched whereas though near the end will have less available partners. For now, I need to check with my supervisor's previous work in order to determine whether the number of coarser levels is comparable, or whether it is some part of my code that is giving bad answers.

As a break away from problem solving, I intend to spend the rest of this week on the new matching and pre-processing techniques (using only standard edge contraction - known to work as expected). This will enable me to play around with some of the ideas and get some preliminary results.

Interesting thought for the day - it seems finan512, a ring of repeated groupings of vertices, does not actually coarsen to a ring in any of my coarsening. Instead, reducing to a sparse type graph. Looking at the structure makes me think that if I were to identify these repeated vertex groupings, I could provide layout for one and then just replicate the same layout around the circumference of the ring. This is a play on identifying automorphisms[?] within the graph and placing them in appropriate locations as suggested long ago by Lipton et al. It is a bit biased to just the finan512 graph, but if they could be identified with ease in any graph - would it help? not really my concern at the moment, but food for thought.

Wednesday, 6 February 2013

Back to MPhil? Oh, and the multi-matching is broken too

Interesting week this week, so most annoying and worrisome topic first: looks like I am still an MPhil student. It has become delightfully apparent that the examiners have not yet sent the Assessment Outcome form (of MPhil to PhD), and so as a result, I am still registered as an MPhil student. I just hope it doesn't interfere with my plans for finishing in 3 years (the Research Committee have been stubborn regarding forms in the past, and if they read the form as me having just completed the transfer, may dislike the idea of me trying to go for PhD with such small amount of time left).

Now that I have complained about it a little, I can move on to the next complaint: the multi-matching and its dislike of certain graphs. In the past few posts, I have reported good results regarding the reduction of runtime for many smaller graphs; however, tests on larger graphs took longer to run and so I was left waiting for those results.

It seems only a few of these tests came back with positive results, with some, specifically finan512, coming back with much higher runtimes (nearly 8x the original runtime measured in the Redosmium/MGF implementation). This extended runtime exists for all matching techniques, including the standard edge contraction.

The main difference in algorithms is the multi-matching, which relies on adjacent vertices being stored as a collection that is associated with individual vertices (similar to Walshaw, O(2E)), as opposed to the previous method of storing them as a separate collection (O(E)). Having received results for smaller graphs which were similar to the MGF, it was an assumption that the new coarsening algorithm and method of storing worked as it should of - these new results suggest differently.

As mentioned, finan512 has the unusually high runtime. Some analysis shows that the number of levels in the multilevel scheme has risen from 35 (on average) to over 600; reinforcing the notion that the problem lies in the coarsening scheme. Having investigated, it is difficult to see where the problem is coming from - the coarsening scheme runs as expected for smaller graphs.

Unable to see the problem, I have begun getting restless and irritated (perhaps why this MPhil thing has annoyed me more than it should). In any case, I plan to look at why finan512 and sierpinski10 have issues while others, such as dime20, do not, and if possible, fix this multi-matching/coarsening scheme so I can get some reliable results.

As a finishing note, it's been a while since I have shown any pictures of output, so here's two more recent layouts I have found particularly pleasing or interesting.

add20 drawn using a circular placement primitive (originally for smaller star type graphs but accidentally applied to the original graph - oddly enough).

dime20 drawn very quickly with a matching number of vertices (so any vertices with 7 or less adjacent vertices coarsened at once). The layout is very pleasing and seems not to show any issues with the local layout, however, the global layout appears tangled still. Investigation into the cause would be ideal (and would help global layout generation for huge graphs) but due to the above problem with multi-matching, is not yet a concern.

Friday, 1 February 2013

Primitive Coarsening

Primitive Coarsening - identifying primitive graph layouts during the coarsening phase, so to avoid any further coarsening deemed unnecessary and wasteful, and to provide expected positions to vertices.

Having presented my ideas and preliminary results to my primary supervisor, we discussed a long standing topic in my research - how to deal with graphs when they are coarsened to a primitive graph type such as a trail (chain/path/thing), ring or star of connected vertices. Previously, our implementation of the multilevel scheme has either ignored these and continued to coarsen to two vertices - or if the diffference between levels is less than some tolerance (normally 1 vertex), stop coarsening.

Trailing (new name pending) and Multi-matching has been shown to decrease runtime, with trailing being the more appropriate method (Multi-matching requires a matching number, and each graph may run better or worse with matching numbers, which have so far been unpredictable - should put some results on here soon I guess). Having shown that Trailing works so far, and copes with graphs when coarsened to a trail, we look at the star and ring primitives.

So some some definitions:

Trail/Chain/Path - two vertices with only 1 adjacent vertex (the ends), and all other vertices between having 2 adjacent vertices.
Star - one vertex (the center) having V-1 adjacent vertices, and V-1 vertices have 1 adjacent vertex.
Ring - every vertex having 2 adjacent vertices.

As these structures have a known layout, FDP is unnecessary. The layout can be computed quickly and applied to this level, providing the graph with an "initial layout". This is immediately passed to the next finest graph and does not require refinement.

So far, trailing has providing useful as a preprocessing for the coarsening phases, and results in large reductions in runtime for Add20 and Add32, with some small changes to other more regular structured graphs. Quality, on average, has some small improvement across the board, however, the range makes it rather unreliable - so we need some further investigation on this.

Stopping coarsening when a primitive structure is found further decreases this runtime; as an example, Add20 was previously 10seconds with MGF, this was decreased to 4.7 seconds with Trailing, and further decreased to 3.6seconds with the introduction of Primitive Identification/Coarsening. This is still slower than some of the methods used by Hu, but is a vast improvement on the multilevel runtimes.

Next, I have intention to try an alternate (or additonal) preprocessing method which identifies small star structures in a graph (which may be connected to other star structures), and coarsen those to simplify a graph when trails to not exist but when vertices loosely connected do.

For now, there is a lot of clearing up to do regarding the work and theory, and many many results to collect, analyse and write up. I shall update again and put everything into a less erratic form in the future. Back to work for me, have a good weekend!

Monday, 28 January 2013

One down, many to go!

A brief post this week - Major news: I have completed some draft version of my Technical Report. I had wished to include more results, specifically those regarding the C parameter and testing of the peripheral effect in drawings, but I would never finish if I continued.

Since then, I have begun managing the results of the multi-matching into a respectable format, highlighting some areas for investigation (why some parts behave the way they do). Work has continued with the paper, however, keeping to the bounds of graph drawing is difficult due to the similarity to clustering (and I don't want to venture into the depths there). So far, I have been quite content writing about the effect multi-matching has on the MGF Approximation with direct regard to sparse graphs, but for the fairest comparison I fear I may need to implement a version of MIVS coarsening (of which generating an approximation tree becomes very difficult).

Other news - Dynamic stuff is temporarily on hold until I have finished all old work, and I have been slowly adding more to my Thesis outline. Nothing else to share at the moment, but I will update when I have more made some more progress.

Wednesday, 16 January 2013

The Post Christmas Something

Although we started back at work two weeks ago, this is the first I've remember to update (and by the looks of it I haven't updated in a long time). So, what news do I have to share; following from the last post, I have been spending much of my time observing the cooling schedule, I have also been working on the Technical Report for the MGF algorithm (splitting that into a Technical Report and a separate paper for the Multimatching stuff due to crossovers in the text), and making some progress on the dynamic works - although progress is slow due to my dislike of the cooling schedule (which is arguably not designed for dynamic drawing). First things first;

Cooling Schedule

The cooling schedule has been a main aim of the work, and in preparation of questions that may arise in the Viva, I have continued my revisit to previous implementations of Octree and Multilevel Global Force approximation. The aim is to ensure I know how the algorithm works and why it works (achieved long ago but experience tells me to question whatever I think is right).

The cooling schedule is used alongside the placement forces modified by Fruchterman and Reingold, to limit the movement of vertices. The forces themselves provide a vector and the cooling schedule reduces the magnitude, stopping the algorithm when the magnitude (or average magnitude of all vectors) is below some tolerance. The forces themselves are too strong to ever be less than this tolerance, and so the algorithm is entirely based upon the cooling schedule.

The reason this becomes important is the crossover to dynamic graph drawing - if a cooling schedule is used, the algorithm will inevitably reach this tolerance and so will need to be reset after any change. How the cooling schedule should be altered after changes is yet to be identified (research has not highlighted this with most just resetting to the original value or not highlighting the issue).

The "original" graph drawing spring embedder implemented by Eades provide placement forces which do not require such cooling schedule, and instead uses an iterative approach. This may introduce a problem of layouts never being found - but I have yet to see a multilevel dynamic algorithm using these forces. May be worth looking at, maybe not. For now, the few attempts I have had at identifying a cooling schedule which relates to the convergence of a layout have hit brick walls, and so I am moving on with the dynamic work before I waste too much time.

Technical Report and Multimatching

The technical report, which has now been going for well over 4 or 5 months now, has finally reached some kind of end point - with only some checking and small editing required. I have spent a large amount of time checking the implementations used (when writing the paper - different to my more recent multimatching algorithms) so I can better discuss the results.

Specifically, an analysis of runtime - including a breakdown of how much time is spent within each approximation structure, how many levels exist in the approximation structure (octree is limited to 11 or 13 to avoid stack overflows, MGF is limited by coarsening - much much higher), and how the structures affect convergence (the answer is: screw the cooling schedule).

In addition, I have now split the report into a report and a paper specific to multimatching (as intended long ago). The multimatching approach has shown to decrease the runtime associated with the MGF approximation in comparison to standard edge contraction, and possible my own implementation of the Octree. However, it is still much slower than the algorithm used by Hu due to the coarsening technique (identification of MIVS is a much better method of tackling sparse graphs it seems, and has yet to be extended to work MGF, if it even can be).

Dynamic Drawing

So far, the only sure thing is that the operations implemented work, resetting the cooling schedule after each change (rather messy and a lot of work required to smooth this out). There is still a lot of work to go before I'm happy to call the algorithm dynamic, but have been limited by my problems with the cooling schedule. Unfortunately its come to a case where I have to bite the bullet and continue as is, and either improve or explain my choices later on.

The approach is much similar to other dynamic approaches which focus on smaller graphs (extensions of the spring embedder of Kamada and Kawai, and Eades), as opposed to the literal dynamics approach implemented by Veldhuizen. An implementation of the Veldhuizen algorithm was once attempted but due to problems understanding the next, was never completed, so there is nothing which can currently be compared (other than a user interface - in summary, he wins... for now).

Over the next few weeks I will be implementing the dynamic matching and collecting results for that work (with the intention of writing a paper), followed by some research into a type of "Mental Movement Map" for the movement of vertices at different levels of a dynamic multilevel scheme (as discussed in a previous post).

After this, or during this, I may also revisit the cooling schedule with the intention of making it more representative of the progress of a layout. But for now, I am running out of time and should focus on the more related subjects of my Thesis. I will hopefully update you again soon, depends how guilty I feel regarding my progress.

Until next time, cya!