A Microsoft Research scholarship place is available to study algorithms for massive data analysis, leading to a PhD in Computer Science. We are faced with ever larger volumes of data from which to extract insights and intelligence. Of particular interest is data that can be represented as a graph or as an (adjacency) matrix.

A promising approach is to look for ways to sketch such structures: to build a representation that is much more compact than the input, but that still allows some function of interest on the original data to be approximated accurately. Such sketches are well known and widely used for data that can be represented as a vector, for example to identify the most frequent elements or to count the number of distinct items.
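
To give a flavour of what a vector sketch looks like, here is a minimal Python illustration of a Count-Min sketch, a standard structure for approximate frequency counts. It is only an illustration under stated assumptions: the class name, the default width and depth, and the use of salted BLAKE2b hashes are choices made for this example and are not part of the project description.

```python
import hashlib


class CountMinSketch:
    """Compact summary of item frequencies in a stream (illustrative only)."""

    def __init__(self, width=256, depth=4):
        # width and depth are assumed defaults; larger values reduce error.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, obtained by salting with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum over rows
        # never underestimates the true count (for non-negative updates).
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch()
for word in ["a"] * 50 + ["b"] * 30 + ["c"] * 20:
    sketch.add(word)
print(sketch.estimate("a"))  # an upper bound on the true count of 50
```

The table size is fixed in advance and does not grow with the length of the stream, which is the sense in which the sketch is more compact than the input.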

The goal of this scholarship project is to develop new algorithms for sketching massive graphs and matrices, and to demonstrate their usefulness through theoretical analysis and empirical evaluation. The hope is to advance our knowledge of the theory in this area and to design algorithms that can be used in practice, for example for querying data represented as a (massive) graph, clustering or partitioning graph-structured data, and solving optimization problems over large graphs.



