dr. Alexandru Uta
Vrije Universiteit Amsterdam
De Boelelaan 1081A
Room no.: P430
I am a postdoctoral researcher in the Computer Systems Section at the Department of Computer Science of the Vrije Universiteit Amsterdam. Currently, I am working in the Massivizing Computer Systems Group, led by prof.dr.ir. Alexandru Iosup.
Previously, I briefly worked as a postdoctoral researcher under the supervision of prof.dr.ir. Henri Bal on scalable IoT infrastructures.
In March 2017, I received my PhD from the Vrije Universiteit Amsterdam with the thesis "Optimizing the Execution of Many-Task Computing Applications Using In-Memory Distributed File Systems", under the supervision of Dr.-Ing. habil. Thilo Kielmann and prof.dr.ir. Henri Bal.
In 2012, I received my MSc with honors ("cum laude") in high-performance distributed computing at the Vrije Universiteit Amsterdam, with a thesis on GPU-accelerated video encoding, under the supervision of dr. Frank Seinstra.
I received my BSc diploma in 2009 in my home country, Romania, at the University of Bucharest, Faculty of Mathematics and Computer Science.
Please drop me a line if you want to collaborate or want to do a BSc or MSc project under my supervision. For more information on the kind of projects I supervise, please see my Projects or Publications.
Best paper award, IEEE IUCC conference, 2017
Best e-Science Service or Project, IEEE eScience conference, 2015
IEEE TCSC CCGrid Scale Challenge Finalist, 2015
Best poster award, IEEE CLUSTER conference, 2014
Junior Program Chair, ISPDC (International Symposium on Parallel and Distributed Computing), 2019
Co-Chair, CCIW (1st Workshop on Converged Computing Infrastructure), 2019
Co-Chair, HotCloudPerf (Workshop on Hot Topics in Cloud Computing), 2019
Program Committee Member
ICPE (International Conference on Performance Engineering), 2020
CCGrid (International Symposium on Cluster, Cloud and Grid Computing), 2019
CLUSTER (International Conference on Cluster Computing), 2019
EuroPar (European Conference on Parallel and Distributed Computing), 2019
HotCloudPerf (Workshop on Hot Topics in Cloud Computing), 2018, 2019
SCRAMBL (Workshop on Scalable Computing For Real-Time Big Data Applications), 2017, 2019
EuroSys (European Conference on Computer Systems, shadow PC), 2017
Journal Reviewer
IEEE TPDS (Transactions on Parallel and Distributed Systems), 2018-ongoing
IEEE ToSE (Transactions on Software Engineering), 2018-ongoing
IEEE Access (The Multidisciplinary Open Access Journal), 2018-ongoing
Elsevier FGCS (Future Generation Computer Systems), 2015-ongoing
Conference Reviewer
IPDPS (International Parallel & Distributed Processing Symposium), 2018
CCGrid (International Symposium on Cluster, Cloud and Grid Computing), 2018
ICPP (International Conference on Parallel Processing), 2017
CLUSTER (International Conference on Cluster Computing), 2014
Teaching
BSc Systems Architecture, 2017-ongoing
MSc Distributed Systems, 2017-ongoing
BSc Research-first Honors Program, 2018-ongoing
MSc Distributed Systems, 2016
MSc Large-scale Computing Infrastructure, 2014-2016
MSc Internet Programming, 2015
MSc Cluster and Grid Computing, 2013
MSc Computer Graphics, 2012
GranularGraph - Serverless Graph Processing - project lead
Distributed graph processing (GP) is complex and expensive, requiring systems to run on large-scale clusters. Managing such clusters and the processing systems is not trivial and requires dedicated personnel. This leads to capital and operational costs that are prohibitive to all but large corporations. GP systems generally run on a fixed number of compute nodes, making them brittle and unable to respond to changes in the workload by scaling. The resources required for a workload have to be determined in advance, leading to poor provisioning. To overcome these drawbacks, this project designs a cost-effective and efficient GP system built on a serverless architecture.
BDCloudVar - project lead
Public cloud computing platforms are a cost-effective solution for individuals and organizations to deploy various types of workloads, ranging from scientific applications and business-critical workloads to e-governance and big data applications. Co-locating such different types of workloads in a single datacenter leads not only to performance degradation, but also to large degrees of performance variability. Many studies have already assessed and characterized the degree of resource variability in public clouds. However, we are missing a clear picture of how resource variability impacts big data workloads. In this project, we take a step towards characterizing the behavior of big data workloads under network bandwidth variability. Moreover, we aim to create performance prediction models and variability-aware scheduling policies to help practitioners keep performance variability at bay and reduce its effects: lack of predictability, slowdowns, and extra costs.
HPGraph - project lead
Currently, the HPC and big data communities are not convergent: they operate different types of infrastructure and run complementary workloads. As computation demand and the volumes of data to be analyzed are constantly increasing, the lack of convergence is likely to become unsustainable: the costs of energy, computational, and human resources far exceed what most organizations can afford. We investigate the convergence of big data and HPC infrastructure for one of the most challenging application domains: graph processing. Through a systematic experimental study of over 300,000 core-hours, we contrast the performance of a modern many-core machine (Intel Knights Landing, KNL) with that of traditional big data hardware in processing representative graph workloads. The experimental results indicate that KNL is convergence-ready, performance-wise, but only after extensive and expert-level tuning of software and hardware parameters.
LDBC Graphalytics - tech-lead
LDBC Graphalytics is the de-facto industry-grade benchmark for graph analytics platforms. Graphalytics enables the objective comparison of graph analysis platforms. It consists of six core algorithms, standard datasets, and synthetic dataset generators. The design of the benchmark takes into account that graph processing is impeded by three dimensions of diversity: platform, algorithm, and dataset. As the tech-lead of Graphalytics, I interface regularly with our international industrial partners from Intel, Oracle, and Huawei, and work on the renewal process of the benchmark: assessing new algorithms and datasets suitable for inclusion in our harness.
MemFS - Main PhD Research
I investigated the performance and scalability of distributed storage systems for many-task computing applications. Typically, such applications exhibit data footprint variability and are composed of many short-lived tasks that communicate by means of files. In data-intensive scenarios, traditional disk-based storage systems become performance bottlenecks for such applications. To overcome this, I designed and implemented Mem(E)FS, an in-memory distributed file system that exposes the cluster nodes' memory as a fast, unified cache. Mem(E)FS is able to scale elastically at runtime, based on application demands, improving resource efficiency and adapting to bandwidth variability. On our local cluster, Mem(E)FS achieves all dimensions of scalability (vertical, horizontal, and elastic), scaling to more than 1K cores and 1 TB of memory.
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by the authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.