MSc in Computer Science, 2016
Federal University of Minas Gerais, Brazil
BSc in Computer Science, 2013
Federal University of Uberlândia, Brazil
I am a Ph.D. candidate in the Computer Science Department (DCC) at the Federal University of Minas Gerais (UFMG), Brazil. Before that, I received my M.Sc. from the same institution and my B.Sc. in Computer Science from the Federal University of Uberlândia (UFU), Brazil.
Broadly, I am interested in research problems concerning system performance, especially in large-scale distributed systems. Recently, I have been working on performance diagnosis and reconfiguration of data-parallel programming models (a.k.a. Big Data). In particular, the following challenges have been catching my attention (and stealing some of my sleep :P):
Task Model in Data-Parallel Frameworks: Data-intensive frameworks (e.g., Spark and Flink) are based on specific task models that describe how resources (memory, CPU, disk, etc.) are consumed in parallel computations.
Scalable (Single-)Graph Pattern Mining: Graph pattern mining covers any task that involves discovering, matching, searching, or summarizing “interesting” subgraph structures in a single (large) input graph. Instead of worrying about definitions of what is “interesting”, I am personally interested in developing execution models that support this kind of computation. Given the combinatorial nature of these problems, I would also like to find out how far we can abstract such models without serious performance degradation, and how expressive they can be in order to facilitate the implementation of graph mining algorithms.
Multi-Resource Characterization in Parallel Programs: Parallel processing frameworks perform computation by statically requesting multi-dimensional resource containers from cluster schedulers (e.g., YARN and Mesos). Community experience has shown that tuning these dimensions appropriately for each individual application can have a great impact on performance.
Autonomic Computing: Computational demands can vary greatly during a parallel execution due to faults, over- or under-subscription of resources, heterogeneous environments, and so on. Modern distributed systems could therefore benefit from autonomic computing characteristics and principles.