Visualization is the science of making pictures out of data so that they inform the viewer and allow them to understand the data and take action based on what can be seen. I create new methods of interacting with data using a computer interface and try to understand what tools help people model their data and find patterns and unusual features. I have a background in statistics and statistical graphics, and work with computer scientists as well as statisticians. My particular interests include research into:
- Fundamental methods for interaction with data views
- Statistical methods to improve or motivate visualization design
- The interface between statistical models and statistical graphics
- Visualization of large weighted graphs
- Visualization of textual data
- Ways to use knowledge discovery techniques with visualization
Over the last 25 years I have built five different frameworks for visualization systems, first working alone, then leading small teams, and now leading a team of forty people. I have published multiple articles on topics related to visualization frameworks, many of which are review articles and summaries for encyclopedic books.
These systems had different foci, languages and requirements, but consistently involved:
- multiple coordinated views, especially domain-specific views (networks, geography, text, etc.),
- high interaction,
- a focus on making information intelligible to domain experts rather than data and statistics experts.
Visualization Architect and Technical Leader, IBM: 2009 – present
I hold the position of Senior Technical Staff Member at IBM. I lead a small cognitive visualization team focused on adding intelligence to visualization systems, and am the Chief Architect for the RAVe IBM visualization team. This work involves both high-level and low-level design, as well as reviews, education and technical advice.
Visualization Architect and Principal Software Engineer, SPSS: 2001 – 2009
At SPSS I was responsible for the design and application of all statistical graphics, tables, interactive graphics and statistical model visualization. In addition, I worked with Lee Wilkinson to develop coarse-grained parallel implementations of complex chainable statistical models. I designed and built my third major general statistical graphics system, VizML, based on ideas and techniques outlined in The Grammar of Graphics. The VizML system has been distributed and used as part of all SPSS offerings for a decade, and has millions of users in over 150 different countries.
Principal Investigator, Bell Laboratories: 1992 – 2001
I worked on visualization research with particular application to software analysis at Bell Labs. I created the EDV environment and led a research team in further developing it. EDV is a powerful exploratory statistical environment that allows high-dimensional, multivariate information in a number of forms (database, textual, time series, hierarchical, networks) to be visualized. This work formed the basis of AT&T’s Visual Insights start-up company (now Advizor Solutions, Inc.).
I built the Nicheworks system for the visual analysis of very large (~million node) weighted network graphs. In addition to specialized layout techniques, I invented visual filtering and focusing techniques, as well as methods for combining statistical graphics and network views. I also worked on visualization of large time-based databases with heterogeneous data types.
I designed novel views which synthesized data mining and statistical data analysis techniques with visualization methodology in many areas, including: Conditional Independence Graphs, Decision Trees, Multi-way table analysis, Parallel coordinates, Lowess regression, Neural Nets and time-stamped data.
Assistant Lecturer, Trinity College, Dublin: 1988 – 1992
My Ph.D. was titled Spatial Data: Exploration and Modelling via Distance-based and Exploratory Graphics Techniques. In this thesis I described a visual environment for analyzing GIS with statistical data and also proved theorems that allow new and useful classes of distance force functions to be used in pairwise interaction point processes. These processes model individuals changing position over time under the influence of a general interactive force. I also demonstrated how traditional measures of pairwise association fail to spot important processes that can be easily seen visually.
I created a visual environment for viewing statistical graphics and geographical information called REGARD, and another time series visualization system named Diamond Fast which was capable of working with non-gridded and regular time series data in the same environment. Aspects of this package surfaced in REGARD, allowing for the visual analysis of space-time data.
As an undergraduate I won the Foundation Scholarship (one of 14 awarded across all disciplines) and was awarded a Gold Medal for excellence in my finals. I entered Trinity College as the highest ranked mathematics student in the country.
Publications
A relatively complete list is here.