Tag clouds are a way of showing basic information about word frequencies within a text document. I have generated sample clouds for some famous books; the results are on this page, or you can jump straight to a CoolIris view of the images.
Visualization is the science of making pictures out of data so that they inform the viewer and allow them to understand the data and take action based on what can be seen. I create new methods of interacting with data using a computer interface and try to understand what tools help people model their data and find patterns and unusual features. I have a background in statistics and statistical graphics, and work with computer scientists as well as statisticians. My particular interests include research into:
At SPSS I am responsible for the design and application of all statistical graphics, tables, interactive graphics and statistical model visualization. In addition, I have worked with Lee Wilkinson to develop coarse-grained parallel implementations of complex chainable statistical models. I designed and built my third major general statistical graphics system, Vizml, now called nViZn, based on ideas and techniques outlined in The Grammar of Graphics. This system uses an XML specification to define a visualization. Vizml is designed to be a superset of Lee Wilkinson's GPL, and is heavily influenced by that language and Lee's book The Grammar of Graphics. It also incorporates elements from my previous two large systems, EDV and REGARD, as well as novel features designed to handle interactivity, multiple output formats, and an advanced set of tools for defining data domains.
I worked on visualization research with particular application to software analysis at Bell Labs. I created the EDV environment and led a research team in further developing it. EDV is a powerful exploratory statistical environment that allows high-dimensional, multivariate information in a number of forms (database, textual, time series, hierarchical, networks) to be visualized.
I built the NicheWorks system for the visual analysis of very large (~million node) weighted network graphs. As well as specialized layout techniques, I invented visual filtering and focusing techniques and methods for combining statistical graphics and network views. I worked on visualization of large time-based databases with heterogeneous data types.
I designed novel views which synthesized data mining and statistical data analysis techniques with visualization methodology in many areas, including: Conditional Independence Graphs, Decision Trees, Multi-way table analysis, Parallel coordinates, Lowess regression, Neural Nets and time-stamped data.
My Ph.D. was titled Spatial Data: Exploration and Modelling via Distance-based and Exploratory Graphics Techniques. In this thesis I described a visual environment for analyzing GIS with statistical data and also proved theorems that allow new and useful classes of distance force functions to be used in pairwise interaction point processes. These processes model individuals changing position over time under the influence of a general interactive force. I also demonstrated how traditional measures of pairwise association fail to spot important processes that can be easily seen visually.
I created a visual environment for viewing statistical graphics and geographical information called REGARD, and another time series visualization system named Diamond Fast, which was capable of working with non-gridded and regular time series data in the same environment. Aspects of this package surfaced in REGARD, allowing for the visual analysis of space-time data.
As an undergraduate I won the prestigious Foundation Scholarship (one of 14 awarded across all disciplines) and was awarded a Gold Medal for excellence in my finals. I entered Trinity College as the highest ranked mathematics student in the country.
Apparatus and method for use in collaboration services United States Patent #7,299,257 (2007)
Apparatus and method for use in a data/conference call system for automatically collecting participant information and providing all participants with that information for use in collaboration services United States Patent #7,107,312 (2006)
Method and apparatus for generating and displaying views of hierarchically clustered data United States Patent #6,304,260 (2001)
Apparatus for visualizing program slices United States Patent #6,125,375 (2000)
Graphical display of relationships United States Patent #5,835,085 (1998)
Apparatus for visualizing program slices United States Patent #5,793,369 (1998)
Using symbols whose appearance varies to show characteristics of a result of a query United States Patent #5,636,350 (1997)
Graphical display of relationships United States Patent #5,596,703 (1997)
Object-oriented functionality class library for use in graphics programming United States Patent #5,564,048 (1996)
Visualizing Time Springer, 2010
Visualizing Network Data Encyclopedia of Database Systems, Ozsu, M. Tamer; Liu, Ling (Eds.); chapter ???
Visualizing Hierarchies Encyclopedia of Database Systems, Ozsu, M. Tamer; Liu, Ling (Eds.); chapter ???
Scagnostics Distributions Journal of Computational and Graphical Statistics, Volume 17; pp.473-491
Linked Data Views Handbook of Data Visualization; Chen, Härdle and Unwin (eds); chapter II-2; pp.216-241
Networks Graphics of Large DataSets; Unwin, Theus and Hofmann (eds); Springer; chapter 8; pp157-176
Visualization Handbook of Data Mining and Knowledge Discovery. Oxford University Press, 2002; pp.707-714
Data visualization for domain exploration: interactive statistical graphics Handbook of Data Mining and Knowledge Discovery. Oxford University Press, 2002; pp.226-232
Natural Selection: Interactive Subset Creation Journal of Computational and Graphical Statistics, Volume 9, #3, September 2000; pp 544-557
Building Information Visualizations: A Commonality Analysis Information Visualization 2000 Conference Proceedings
Interactive Statistical Graphics Handbook of Data Mining and Knowledge Discovery. Oxford University Press, 1999
NicheWorks-Interactive Visualization of Very Large Graphs (1997) Journal of Computational and Graphical Statistics, vol.8 #2, pp 190-212
An Interactive View for Hierarchical Clustering Information Visualization '98 Conference Proceedings, 1998. Raleigh, North Carolina
High Interaction Graphics European Journal of Operational Research:445-459, 1995.
Navigating Large Networks with Hierarchies Visualization '93 Conference Proceedings, pages 204-210, 25-29 October 1993. San Jose, California.
Dynamic Graphics for Exploring Spatial Data, with Application to Locating Global and Local Anomalies American Statistician Vol.45 no.3 pp234-242
Dynamic Interactive Graphics for Spatially Referenced Data Softstat '89 Fortschritte der Statistik-Software 2, Gustav Fischer Verlag, Stuttgart, pp278-287
SPIDER - An Interactive Statistical Tool for the Analysis of Spatial Data Int.J Geographical Information Systems 4 no3, pp285-296
Eyeballing Time Series Proceedings of 1988 ASA Stat. Computing Section. pp263-268
The following links give fuller lists and more details:
Visualizing Time is the title of the book I am writing for Springer-Verlag, to appear in early-mid 2010. Here are some sample pages from an early draft PDF version, which is vector art and fully zoomable: