Friday, May 8, 2009

Clustering Placemarks

Pam Fox has published a talk on avoiding red dot fever. In it she looks at the various ways people have dealt with displaying large numbers of placemarks.

Users First: Firstly, an observation: in this blog post Pam links to, the author tackles the problem in a techy way. The problem is viewed as 'plotting too much data slows down the computer' and the solutions offered are technical. IMHO our first question should be 'how can we best help the user?' before getting into technical solutions. I apply this frame of thought to some of the maps Pam mentions below.

Maps with User Experience (UX) Problems:

Clustering Experiment Map

Clustering experiment: placemarks are clustered into group placemarks (see above image). It works, but the user doesn't know the geographical bounds of the cluster. It's basically the same issue I talked about under 'Regions Functionality' here.

Pizza example: this solves that problem by linking the clusters to state boundaries, but I think a color-coded key related to polygon fill would work better here. In this context I want to know whether there are lots/some/no pizza restaurants, not that there are exactly 43 in California. Using size for the icon is good in that I suspect it is strongly related to the number of restaurants in the user's mind, but it has the problem that it obscures some of the states, notably California and the small eastern states. That's my opinion, but it's arguable.
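To make the lots/some/none idea concrete, here is a minimal sketch of a color-coded key for polygon fill. Everything in it is an assumption for illustration: the function names, the thresholds, the fill colors and the per-state counts are invented, not taken from the pizza map.

```python
# Hypothetical fill colors for the key; any ordered color ramp would do.
FILL = {"none": "#ffffff", "some": "#fdbb84", "lots": "#e34a33"}

def density_class(count, some_min=1, lots_min=20):
    """Reduce an exact count to the coarse class the user cares about.

    The thresholds are assumptions and would depend on the data set.
    """
    if count >= lots_min:
        return "lots"
    if count >= some_min:
        return "some"
    return "none"

def state_fills(counts_by_state):
    """Map each state name to the fill color for its density class."""
    return {state: FILL[density_class(n)] for state, n in counts_by_state.items()}

# Made-up counts, just to show the shape of the output.
fills = state_fills({"California": 43, "Ohio": 5, "Wyoming": 0})
```

The point of the design is that the exact count never reaches the map, only the class, so the key stays short enough to read at a glance.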

Heat map: this works well at a high level; it's essentially the polygon fill solution I suggest above. But when you zoom in, rather than resolving to placemarks, the heat map just becomes 'hot dots' clustered around individual points, which doesn't work that well.

Maps with Great UX:

Placemarks as Circles Example.

The Google Maps way of presenting the majority of placemarks as simple circles while selecting some 'best' placemarks as paddles is great: it allows many more placemarks to be rendered on screen without cluttering the view or slowing the computer down too much.

Is the Criticism Fair? Criticizing this work is a bit unfair: a lot of it is experimental, designed to show off some functionality rather than being a real published map. However, I think it's worth discussing the problem, and the maps form a nice illustration of the issues. I would be interested to hear of user testing on these maps, if any exists, as that would be the acid test.

My Solution: To deal with large numbers of placemarks I would do the following:
  • At altitude the map is a grid of colored squares or polygons, color indicates density of placemarks underneath (described in detail here)
  • With the colors laid out in a key (Pam mentions the importance of this)
  • As the user flies lower, fewer placemarks fall within the view and need to be rendered, so the grid dissolves to reveal the placemarks themselves
The choice of grid size, the colors in the key and the height at which the placemarks appear are all context dependent.
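The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the default key, the 1-degree cell size and the 50 km switch altitude are all hypothetical, and a real map would tune each of them to its context.

```python
import math
from collections import Counter

# Hypothetical key: (minimum count, color), sorted by ascending minimum count.
DEFAULT_KEY = [(1, "yellow"), (10, "orange"), (100, "red")]

def grid_density(placemarks, cell_deg):
    """Count placemarks per grid cell; cells are cell_deg degrees on a side."""
    counts = Counter()
    for lat, lon in placemarks:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts

def color_for(count, key):
    """Return the color of the highest key entry that the count reaches."""
    chosen = None
    for min_count, color in key:
        if count >= min_count:
            chosen = color
    return chosen

def render(placemarks, altitude_m, switch_altitude_m=50_000,
           cell_deg=1.0, key=DEFAULT_KEY):
    """Above the switch altitude return colored cells, below it the placemarks."""
    if altitude_m > switch_altitude_m:
        cells = grid_density(placemarks, cell_deg)
        return {"mode": "grid",
                "cells": {cell: color_for(n, key) for cell, n in cells.items()}}
    return {"mode": "placemarks", "placemarks": list(placemarks)}
```

Because the cells are fixed squares rather than proximity clusters, the user can always see the geographical bounds that a color refers to, and empty cells simply get no fill.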


Bob K said...

Nice posting - I particularly liked the observation about approaching this problem from the user's perspective.

And I think your observation about proximity clustering is spot on. I spent a week implementing a (very fast) server-side preclustering algorithm which handles hundreds of thousands of markers. Performance is great, but it suffers from the issue that the user really has no idea where the edges of the clusters lie on the map. They see odd behavior such as no markers showing up in Denver at one zoom level, and hey presto a hundred show up when you zoom in one level and the clustering changes. Fun to develop, but maybe not the best for users.

Heat maps look like an interesting alternative, thanks for the pointer.

p.s. I don't agree that the Google maps approach has "great UX". Users have no idea why certain markers are selected for display, it isn't clear if all markers are shown, and the approach doesn't work at low zoom levels without dropping many markers.

Chris said...

Hi, I just saw this blog. Great stuff!

I wanted to mention that I'm working on an open-source thematic cartography library at

One of the main features of the project is a grid-based clustering just like you outline here. I'm glad to see I'm not the only one thinking this is a good idea. :-)

Thanks for the great blog -- I look forward to reading more.