Research Projects Listing and Overview

This page lists our major research projects, past and present.

Gaining big insight from big data requires big analytics, which poses big usability problems. Analyses of big data often rely on several computational and statistical models that operate on multiple levels of data scale to discover and characterize latent data structure. The models work jointly or in sequence to filter, group, summarize, and visualize big data so that analysts may assess the data.


Interactive visual analysis has been a key component of gaining insights in the field of information visualization. However, the amount of data has increased exponentially in the past few years. Existing information visualization techniques lack the scalability to deal with big data, such as graphs with millions of nodes or millions of multidimensional data records.


Along with members of the BaVA Group and statistics students, we are working with General Dynamics to create a new data analytics system.


Identifying coordinated relationships is an important task in data analytics. For example, an intelligence analyst might want to find three suspicious people who visited the same four cities. However, existing techniques that display individual relationships, such as between lists of entities, require repetitious manual selection and significant mental aggregation in cluttered visualizations to find coordinated relationships.
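To make the task concrete, the following is a minimal brute-force sketch of the analyst's query from the example above. It is not the technique used in this project; the function name `coordinated_groups` and the toy visit data are hypothetical and exist only for illustration.

```python
from itertools import combinations

def coordinated_groups(visits, n_people, n_cities):
    """Return (people, shared_cities) pairs for every group of n_people
    whose members all visited at least n_cities common cities."""
    results = []
    for group in combinations(sorted(visits), n_people):
        # Cities visited by every member of the candidate group.
        shared = set.intersection(*(visits[person] for person in group))
        if len(shared) >= n_cities:
            results.append((group, sorted(shared)))
    return results

# Hypothetical toy data: person -> set of cities visited.
visits = {
    "Ann": {"Paris", "Rome", "Cairo", "Lima", "Oslo"},
    "Bob": {"Paris", "Rome", "Cairo", "Lima"},
    "Cy": {"Paris", "Rome", "Cairo", "Lima", "Kyiv"},
    "Dee": {"Paris", "Oslo"},
}

groups = coordinated_groups(visits, n_people=3, n_cities=4)
print(groups)
```

The number of candidate groups grows combinatorially with the number of people, which is one reason manual selection of such relationships does not scale.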


We present a visual analytics tool, called Spectrum, to analyze the movement and communication log data from VAST Challenge 2015. Spectrum has two views: MoveView and SpectrumView. MoveView gives an overview of the movement logs at a certain timestamp by synthesizing time, location and identity information. It replays movement logs over time and demonstrates communication logs with dynamic links. SpectrumView shows the status of all visitors' activities within a period of time. Each stay of visitors in a location is visualized as a line segment.
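As an illustration of how stays could be derived before being drawn as line segments, here is a small sketch. The log format (time-ordered `(timestamp, location)` pairs for one visitor) and the function name `stays` are assumptions, not the actual VAST Challenge 2015 schema or Spectrum's implementation.

```python
def stays(log):
    """Collapse a time-ordered list of (timestamp, location) entries for one
    visitor into (location, start, end) segments, one per uninterrupted stay."""
    segments = []
    for timestamp, location in log:
        if segments and segments[-1][0] == location:
            # Same location as the previous entry: extend the current stay.
            loc, start, _ = segments[-1]
            segments[-1] = (loc, start, timestamp)
        else:
            # Location changed: start a new stay.
            segments.append((location, timestamp, timestamp))
    return segments

# Hypothetical log for a single visitor.
log = [(0, "gate"), (1, "gate"), (2, "ride"), (3, "ride"), (4, "ride"), (5, "gate")]
segments = stays(log)
print(segments)
```

Each resulting tuple corresponds to one line segment whose endpoints are the start and end times of the stay.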


"Be the Data" is a physical and immersive approach to visual analytics designed for teaching abstract statistical analysis concepts to students. In particular, it addresses the problem of exploring alternative projections of high-dimensional data points using interactive dimension reduction techniques. In our system, each student literally embodies a data point in a dataset that is visualized in the room of students; coordinates in the room are coordinates in a two-dimensional plane to which the high-dimensional data are projected.


Andromeda enables users to directly manipulate the data points in 2D plots of high-dimensional data to explore alternative dimension reduction projections. Andromeda implements interactive weighted multidimensional scaling (WMDS) with semantic interaction. Andromeda allows for both parametric and observation-level interaction to provide in-depth data exploration. A machine learning approach enables WMDS to learn from user manipulated projections.

See the Andromeda demo video here: https://youtu.be/lyfUMCu-wC8
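The idea behind weighted multidimensional scaling can be sketched as follows. This is a minimal illustration under stated assumptions, not Andromeda's implementation: it assumes a weighted Euclidean distance in the high-dimensional space and a raw squared-error stress, and the names `weighted_distance` and `stress` are hypothetical.

```python
import math
from itertools import combinations

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two high-dimensional points."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def stress(high_dim, low_dim, weights):
    """Sum of squared differences between 2D layout distances and
    weighted high-dimensional distances, over all pairs of points."""
    total = 0.0
    for i, j in combinations(range(len(high_dim)), 2):
        d_high = weighted_distance(high_dim[i], high_dim[j], weights)
        d_low = math.dist(low_dim[i], low_dim[j])
        total += (d_low - d_high) ** 2
    return total

# Hypothetical data: three points with two attributes each, and a 2D layout
# in which the points are spread out according to the first attribute only.
high_dim = [(0, 0), (1, 5), (2, 0)]
layout = [(0, 0), (1, 0), (2, 0)]

stress_first = stress(high_dim, layout, weights=(1, 0))
stress_second = stress(high_dim, layout, weights=(0, 1))
print(stress_first, stress_second)
```

In the forward direction, a layout is chosen to lower the stress for fixed weights. When a user rearranges points, the inverse question is which weights best explain the new layout; in this toy case, weighting the first attribute fits the layout far better than weighting the second.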


The goal of this project is to enable the creation of new human-centered computing tools that will help people effectively analyze large collections of textual documents by providing powerful statistical analysis functionality in a usable and intuitive form. To accomplish that, this project investigates “semantic interaction” in visual analytics as a method to combine the large-data computationally-intensive foraging abilities of formal statistical mining algorithms with the intuitive cognitively-intensive sensemaking abilities of human analysts.


We present an event-based approach for solving a directed sensemaking task in which we combine powerful information foraging tools with intuitive synthesis spaces to solve the VAST 2014 Mini-Challenge 1 (MC1). A combination of student-created and commercially available software is used to solve various aspects of the scenario.


Semantic interaction offers an intuitive communication mechanism between human users and complex statistical models. By shielding users from manipulating model parameters, it lets them focus instead on directly manipulating the spatialization, so they remain in their cognitive zone. We present the concept of multi-model semantic interaction, where semantic interactions can be used to steer multiple models at multiple levels of data scale, enabling users to tackle big data problems. We also present an updated visualization pipeline model for generalized multi-model semantic interaction.


User reviews, like those found on Yelp and Amazon, have become an important reference for decision making in daily life, for example, in dining, shopping and entertainment. However, the large number of available reviews makes the reading process tedious. Existing word cloud visualizations attempt to provide an overview, but their randomized layouts do not reveal content relationships to users. In this paper, we present ReCloud, a word cloud visualization of user reviews that arranges semantically related words as spatially proximal.


This section describes our work on Semantic Interaction, a design space for user interaction in visual analytic tools that infer analytic reasoning of users for model steering.


Distributed cognition and embodiment provide compelling models for how humans think and interact with the environment. Our examination of the use of large, high-resolution displays from an embodied perspective has led directly to the development of a new sensemaking environment called Analyst’s Workspace (AW). AW leverages the embodied resources made more accessible through the physical nature of the display to create a spatial workspace.


The data set for the VAST Challenge 2012 Mini Challenge 1 (MC1) requires a large-scale situation awareness analysis to understand a large data set containing the network health and activity status of approximately one million online devices over three days. One of the tables in the dataset contains 158,530,955 records, and the devices can be ordered hierarchically by business unit and facility. The main visualization challenge was therefore to support very large quantities of hierarchical information.
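One common way to make such a hierarchy tractable is to aggregate device status at each level before drawing anything. The sketch below shows that roll-up on toy data. The record layout `(business_unit, facility, healthy)` and the function name `roll_up` are assumptions for illustration and do not reflect the actual challenge schema or our solution.

```python
from collections import defaultdict

def roll_up(records):
    """Aggregate device counts and unhealthy counts per business unit
    and per (business unit, facility) pair."""
    by_unit = defaultdict(lambda: {"devices": 0, "unhealthy": 0})
    by_facility = defaultdict(lambda: {"devices": 0, "unhealthy": 0})
    for unit, facility, healthy in records:
        # Update both levels of the hierarchy for this device record.
        for bucket in (by_unit[unit], by_facility[(unit, facility)]):
            bucket["devices"] += 1
            bucket["unhealthy"] += 0 if healthy else 1
    return dict(by_unit), dict(by_facility)

# Hypothetical status records: (business_unit, facility, healthy).
records = [
    ("region-1", "branch-a", True),
    ("region-1", "branch-a", False),
    ("region-1", "branch-b", True),
    ("region-2", "hq", False),
]

by_unit, by_facility = roll_up(records)
print(by_unit)
print(by_facility)
```

A visualization can then show the coarse level first and expose facility-level or device-level detail only on demand.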


The Vehicle Terrain Measurement System (VTMS) allows highly detailed terrain modeling and vehicle simulations. Visualization of large-scale terrain datasets taken from VTMS provides insights into the characteristics of the pavement or road surface. However, the resolution of these terrain datasets greatly exceeds the capability of traditional graphics displays and computer systems.


The goal of this project is to enable end users to directly manipulate data visualizations created by mathematical models for dimension reduction. Users can explore structure in high-dimensional data by directly moving data points within the visualization, causing the models to learn from the user feedback, and viewing the effects of those movements on other points.
This approach is also known as Object-Level Interaction (OLI) and Visual-to-Parametric Interaction (V2PI).
See also Semantic Interaction.


The multiplicity of computing and display devices currently available presents new opportunities for how visual analytics is performed. One of the significant inherent challenges that comes with the use of multiple and varied types of displays for visual analytics is the sharing and subsequent integration of information among different devices. Multiple devices enable analysts to employ and extend visual space for working with visualizations, but they require users to switch intermittently between activities and foci of interest over different workspaces.


VizCept is a new web-based visual analytics system which is designed to support fluid, collaborative analysis of large textual intelligence datasets. The main approach of the design is to combine individual workspace and shared visualization in an integrated environment. Collaborating analysts will be able to identify concepts and relationships from the dataset based on keyword searches in their own workspace and collaborate visually with other analysts using visualization tools such as a concept map view and a timeline view.


User-generated reviews, like those found on Yelp and Amazon, have become important reference material in casual decision making, such as dining, shopping and entertainment. However, the very large number of reviews makes the review reading process time-consuming. A text visualization can speed up the review reading process.


Large collections of documents create a cumbersome comprehension task. To lighten the load, interactive computational techniques can create visual summaries of these documents. We conducted a study comparing document highlights from humans to document highlights from a salience algorithm. We are exploring interactive computational techniques identified from the study results as well as the modified salience algorithm within an interactive tool. We developed a salience tool that visualizes highlights based on percentages of users who found sentences salient.
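The percentage-based highlighting described above can be sketched as follows. This is an illustrative computation only, not the tool's code: the input format (one set of highlighted sentence indices per user) and the function name `salience_percentages` are assumptions.

```python
def salience_percentages(num_sentences, user_highlights):
    """For each sentence index, return the percentage of users who
    highlighted that sentence as salient."""
    counts = [0] * num_sentences
    for highlighted in user_highlights:
        # Count each sentence at most once per user.
        for index in set(highlighted):
            counts[index] += 1
    num_users = len(user_highlights)
    if num_users == 0:
        return [0.0] * num_sentences
    return [100.0 * count / num_users for count in counts]

# Hypothetical study data: four users highlighting a three-sentence document.
user_highlights = [{0, 2}, {2}, {1, 2}, {0}]
percentages = salience_percentages(3, user_highlights)
print(percentages)
```

The resulting percentages could be mapped to highlight intensity, so that sentences most users found salient stand out most strongly.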