Protein Subcellular Localization
This project predicts where proteins are located in the cell, using several data sources such as sequences, text, and images. Specifically, we are interested in the probability that a protein resides at a particular subcellular location.
Gaming Networks
This project analyzes a gaming network that contains millions of nodes and edges. It involves identifying network characteristics, designing efficient parallel algorithms that compute those characteristics with MapReduce, and learning properties of the cheaters in the population, such as how cheaters evolve over time, how they influence non-cheaters, and whether future cheaters can be predicted.
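As a rough illustration of how a network characteristic can be phrased as a MapReduce job, the sketch below computes node degrees from an edge list by simulating the map, shuffle, and reduce phases in a single process. The function names (`map_edge`, `shuffle`, `reduce_degree`, `degrees`) and the toy edge list are assumptions made for this example; they are not taken from the project, which may compute different characteristics on a real MapReduce cluster.

```python
from collections import defaultdict

def map_edge(edge):
    """Map phase: emit (node, 1) for each endpoint of an undirected edge."""
    u, v = edge
    yield (u, 1)
    yield (v, 1)

def shuffle(pairs):
    """Shuffle phase: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_degree(node, counts):
    """Reduce phase: the degree of a node is the sum of its emitted ones."""
    return node, sum(counts)

def degrees(edges):
    """Run the three phases in-process over an iterable of edges."""
    pairs = (kv for edge in edges for kv in map_edge(edge))
    return dict(reduce_degree(k, vs) for k, vs in shuffle(pairs).items())

# Hypothetical toy network, used only to exercise the sketch.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
toy_degrees = degrees(toy_edges)
```

Because each edge is mapped independently and each node is reduced independently, both phases can be split across machines, which is what makes this formulation suitable for networks with millions of edges.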
Path Centrality
This project proposes a new centrality measure, called Path Centrality, that resembles betweenness centrality. It includes an experimental analysis of computing the measure on real networks and synthetic social networks, and of its correlation with betweenness centrality.
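The definition of Path Centrality is not given here, so the sketch below covers only the baseline it is compared against: betweenness centrality on an unweighted, undirected graph, computed with Brandes' algorithm. The function name `betweenness`, the adjacency-dict input format, and the toy path graph are assumptions for illustration.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adj maps each node to a list of its neighbours; every neighbour
    must itself be a key of adj.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2.0 for v, c in bc.items()}

# Hypothetical path graph a - b - c - d.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
path_bc = betweenness(path_graph)
```

On the path graph, only the two interior nodes lie on shortest paths between other pairs, so they are the only nodes with non-zero scores.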
Streaming Algorithms
This project analyzes, both theoretically and experimentally, various PageRank computation and clustering algorithms proposed for the streaming setting, and compares them with their non-streaming counterparts.
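For reference, a non-streaming counterpart of the kind mentioned above can be as simple as power iteration with the whole graph held in memory. The sketch below is a minimal version under stated assumptions: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the uniform treatment of dangling nodes are choices made for this example, not details of the algorithms studied in the project.

```python
def pagerank(out_links, damping=0.85, iters=50):
    """Power-iteration PageRank over an in-memory directed graph.

    out_links maps each node to the list of nodes it links to; every
    target must itself be a key of out_links.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Every node receives the teleportation share first.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for w in targets:
                    new_rank[w] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[v] / n
                for w in nodes:
                    new_rank[w] += share
        rank = new_rank
    return rank

# Hypothetical three-node cycle; by symmetry all ranks are equal.
cycle = {"a": ["b"], "b": ["c"], "c": ["a"]}
cycle_rank = pagerank(cycle)
```

A streaming algorithm cannot make repeated passes over stored adjacency lists in this way, which is the constraint that motivates the comparison.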
Mining Associations
This project proposes a model based on directed hypergraphs for mining associations in databases. Using experiments on an S&P 500 dataset, we show that the model can be used to infer which stocks are similar and how effective a particular stock is for prediction.
Influence Maximization
This study addresses influence maximization and the identification of minimal sources in the context of distribution networks. It proposes a flow-based diffusion model and analyzes these problems under other well-established diffusion models, such as independent cascade and linear threshold.
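To make one of the established models concrete, the sketch below simulates the independent cascade model with a single activation probability `p` on every edge and estimates the expected spread of a seed set by Monte Carlo. It does not implement the flow-based model proposed in the study, which is not specified here. The function names, the uniform edge probability, and the toy graph are assumptions for illustration.

```python
import random

def independent_cascade(graph, seeds, p, rng):
    """Run one independent-cascade simulation and return the active set.

    graph maps each node to a list of out-neighbours. Each newly active
    node gets exactly one chance to activate each inactive neighbour,
    succeeding with probability p.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def expected_spread(graph, seeds, p, trials=1000, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(graph, seeds, p, rng))
                for _ in range(trials))
    return total / trials

# Hypothetical toy network with a single source "s".
toy_graph = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
all_reached = independent_cascade(toy_graph, ["s"], 1.0, random.Random(0))
only_seeds = independent_cascade(toy_graph, ["s"], 0.0, random.Random(0))
```

Influence maximization then asks which seed set of a given size maximizes `expected_spread`; a common heuristic is to add seeds greedily, one at a time, by marginal gain.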