


Alexandru Uta



dr. Alexandru Uta
Vrije Universiteit Amsterdam
De Boelelaan 1081A
1081HV Amsterdam
The Netherlands

Email: Initial.Lastname@vu.nl

Room no.: P430


Google Scholar

General Information

I am a postdoctoral researcher in the Computer Systems Section at the Department of Computer Science of the Vrije Universiteit Amsterdam. Currently, I am working in the Massivizing Computer Systems Group, led by prof.dr.ir. Alexandru Iosup.

In the summer of 2017, I worked as a research intern at Databricks, where I designed and implemented a distributed index on Spark in collaboration with prof.dr. Peter Boncz.

Previously, I briefly worked as a postdoctoral researcher under the supervision of prof.dr.ir. Henri Bal on scalable IoT infrastructures.

In March 2017, I received my PhD from the Vrije Universiteit for my thesis, "Optimizing the Execution of Many-Task Computing Applications Using In-Memory Distributed File Systems", under the supervision of Dr.-Ing. habil. Thilo Kielmann and prof.dr.ir. Henri Bal.

In 2012, I completed my MSc with honors ("cum laude") in high-performance distributed computing at Vrije Universiteit Amsterdam, with a thesis on GPU-accelerated video encoding, under the supervision of dr. Frank Seinstra.

I received my BSc diploma in 2009 in my home country, Romania, at the University of Bucharest, Faculty of Mathematics and Computer Science.

Please drop me a line if you want to collaborate or want to do a BSc or MSc project under my supervision. For more information on the kind of projects I supervise, please see my Projects or Publications.


Grants

  1. Google, 2018 - PI, Granular Graph Processing, $5,000 grant for usage in the Google Cloud. Project on assessing the impact of FaaS and Serverless paradigms on Graph Analytics.

  2. SURFsara, 2018 - PI, BDCloudVar, 70K compute hours in the SURFsara cloud, equivalent to €70,000. Project on studying the effects of performance variability on Big Data workloads.

  3. NWO, 2017-2018 - PI, HPGraph, Pilot Project for 500K Cartesius Cluster hours, equivalent to approx. €500,000. Project on studying the HPC and Big Data convergence.

  4. Intel, 2017-2018 - PI, Intel gift for unlimited use of an Intel KNL cluster of 256 nodes.

Awards

Best paper award, IEEE IUCC conference, 2017
Best e-Science Service or Project, IEEE eScience conference, 2015
IEEE TCSC CCGrid Scale Challenge Finalist, 2015
Best poster award, IEEE CLUSTER conference, 2014

Service

Conference/Workshop Organization
Junior Program Chair - ISPDC - International Symposium on Parallel and Distributed Computing - 2019
Co-Chair - CCIW - 1st Workshop on Converged Computing Infrastructure - 2019
Co-Chair - HotCloudPerf - Workshop on Hot Topics in Cloud Computing - 2019

Program Committee Member
ICPE - International Conference on Performance Engineering - 2020
CCGrid - International Symposium on Cluster, Cloud and Grid Computing - 2019
CLUSTER - International Conference on Cluster Computing - 2019
EuroPar - European Conference on Parallel and Distributed Computing - 2019
HotCloudPerf - Workshop on Hot Topics in Cloud Computing - 2018, 2019
SCRAMBL - Workshop on Scalable Computing For Real-Time Big Data Applications - 2017, 2019
EuroSys - European Conference on Computer Systems, shadow PC - 2017

Journal Reviewer
IEEE TPDS - Transactions on Parallel and Distributed Systems - 2018-ongoing
IEEE ToSE - Transactions on Software Engineering - 2018-ongoing
IEEE Access - The Multidisciplinary Open Access Journal - 2018-ongoing
Elsevier FGCS - Future Generation Computer Systems - 2015-ongoing

External Reviewer
IPDPS - International Parallel & Distributed Processing Symposium - 2018
CCGrid - International Symposium on Cluster, Cloud and Grid Computing - 2018
ICPP - International Conference on Parallel Processing - 2017
CLUSTER - International Conference on Cluster Computing - 2014


Education

Co-Teacher
BSc Systems Architecture - 2017-ongoing
MSc Distributed Systems - 2017-ongoing
BSc Research-first Honors Program - 2018-ongoing

Teaching Assistant
MSc Distributed Systems - 2016
MSc Large-scale Computing Infrastructure - 2014-2016
MSc Internet Programming - 2015
MSc Cluster and Grid Computing - 2013
MSc Computer Graphics - 2012


Projects

GranularGraph - Serverless Graph Processing - project lead
Distributed graph processing (GP) is complex and expensive, requiring systems to run on large-scale clusters. Managing such clusters and the processing systems is not trivial, requiring dedicated personnel. This leads to capital and operational costs that are prohibitive to all but the large corporations. GP systems generally run on a fixed amount of compute nodes, making such systems brittle and unable to respond to changes in the workload by scaling. The resources required for workloads have to be determined in advance, leading to poor provisioning. To overcome such drawbacks, this project designs a cost-effective and efficient GP system built on a serverless architecture.

BDCloudVar - project lead
Public cloud computing platforms are a cost-effective solution for individuals and organizations to deploy various types of workloads, ranging from scientific applications, business-critical workloads, e-governance to big data applications. Co-locating all such different types of workloads in a single datacenter leads not only to performance degradation, but also to large degrees of performance variability. Many studies have already assessed and characterized the degree of resource variability in public clouds. However, we are missing a clear picture of how resource variability impacts big data workloads. In this project, we take a step towards characterizing the behavior of big data workloads under network bandwidth variability. Moreover, we aim to create performance prediction models, and variability-aware scheduling policies to help practitioners keep performance variability at bay, and reduce its effects: lack of predictability, slowdowns and extra costs.

HPGraph - project lead
Currently, the HPC and big data communities are not convergent: they operate different types of infrastructure and run complementary workloads. As computation demand and volumes of data to be analyzed are constantly increasing, the lack of convergence is likely to become unsustainable: the costs of energy, computational, and human resources far exceed what most organizations can afford. We investigate the convergence of big data and HPC infrastructure for one of the most challenging application domains: graph processing. We contrast through a systematic, experimental study of over 300,000 core-hours the performance of a modern multicore, and of traditional big data hardware, in processing representative graph workloads. The experimental results indicate that Intel Knights Landing (KNL) is convergence-ready, performance-wise, but only after extensive and expert-level tuning of software and hardware parameters.

LDBC Graphalytics - tech-lead
LDBC Graphalytics is the de-facto industry-grade benchmark for graph analytics platforms. Graphalytics enables the objective comparison of graph analysis platforms. It consists of six core algorithms, standard datasets, and synthetic dataset generators. The design of the benchmark takes into account that graph processing is impeded by three dimensions of diversity: platform, algorithms and datasets. As the tech-lead of Graphalytics, I interface regularly with our international industrial partners from Intel, Oracle, Huawei, and work on the renewal process of the benchmark: assessing new algorithms and datasets suitable for inclusion in our harness.

MemFS - Main PhD Research
I investigated the performance and scalability of distributed storage systems for many-task computing applications. Typically, such applications exhibit data footprint variability and are composed of many short-lived tasks that communicate by means of files. In data-intensive scenarios, traditional disk-based storage systems become performance bottlenecks for such applications. To overcome this, I designed and implemented Mem(E)FS, an in-memory distributed file system, that exposes the cluster nodes’ memory as a fast, unified cache. Mem(E)FS is able to scale elastically, during runtime, based on the application demands, improving resource efficiency, and adapting to bandwidth variability. On our local cluster, Mem(E)FS achieves all dimensions of scalability - vertical, horizontal and elastic, scaling to more than 1K cores and 1TB memory.


Publications

  1. Alexandru Uta, Bogdan Ghit, Ankur Dave, Peter Boncz: [Demo] Low-latency Spark Queries on Updatable Data, 2019 ACM SIGMOD International Conference on Management of Data, 1-5 July, Amsterdam.

  2. Lucian Toader, Alexandru Uta, Alexandru Iosup: Graphless: Toward Serverless Graph Processing, 2019 IEEE International Symposium on Parallel and Distributed Computing (ISPDC), 5-7 June, Amsterdam.

  3. Michel Cojocaru, Alexandru Uta, Ana-Maria Oprescu: Attributes Assessing the Quality of Microservices Automatically Decomposed from Monolithic Applications, 2019 IEEE International Symposium on Parallel and Distributed Computing (ISPDC), 5-7 June, Amsterdam.

  4. Maria Voinea, Alexandru Uta, Alexandru Iosup: POSUM: A Portfolio Scheduler for MapReduce Workloads, 2018 IEEE International Conference on Big Data, December 10-13, Seattle.

  5. Alexandru Uta, Ana Lucia Varbanescu, Ahmed Musaafir, Chris Lemaire, Alexandru Iosup: Exploring HPC and Big Data Convergence: a Graph Processing Study on Intel Knights Landing, 2018 IEEE International Conference on Cluster Computing (CLUSTER), September 2018, Belfast.

  6. Alexandru Uta, Sietse Au, Alexey Ilyushkin, Alexandru Iosup: Elasticity in Graph Analytics? A Benchmarking Framework for Elastic Graph Processing, 2018 IEEE International Conference on Cluster Computing (CLUSTER), September 2018, Belfast.

  7. Erwin van Eyk, Lucian Toader, Sacheendra Talluri, Laurens Versluis, Alexandru Uta, Alexandru Iosup: Serverless is More: From PaaS to Present Cloud Computing, IEEE Internet Computing, Sept, 2018.

  8. Alexandru Iosup, Alexandru Uta, Laurens Versluis, Georgios Andreadis, Erwin van Eyk, Tim Hegeman, Sacheendra Talluri, Vincent van Beek, Lucian Toader: Massivizing Computer Systems: a Vision to Understand, Design, and Engineer Computer Ecosystems through and beyond Modern Distributed Systems, ICDCS, Vienna, Austria, July 2-5, 2018.

  9. Alexandru Uta, Harry Obaseki: A Performance Study of Big Data Workloads in Cloud Datacenters with Network Variability, 1st Workshop on Hot Topics in Cloud Computing Performance, in conjunction with ICPE, 9 April 2018, Berlin.

  10. Sietse Au, Alexandru Uta, Alexey Ilyushkin, Alexandru Iosup: An Elasticity Study of Distributed Graph Processing, 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May 2018, Washington.

  11. Nicolae Vladimir Bozdog, Marc X. Makkes, Alexandru Uta, Roshan Bharath Das, Aart Van Halteren and Henri Bal: SenseLE: Exploiting Spatial Locality in Decentralized Sensing Environments, 16th IEEE International Conference on Ubiquitous Computing and Communications (IUCC 2017), Guangzhou, China, December 12-15, 2017 (best paper award)

  12. Marc X. Makkes, Alexandru Uta, Roshan Bharath Das, Vladimir Bozdog and Henri Bal: P2-SWAN: Real-time Privacy Preserving Computation for IoT Ecosystems, IEEE, 1st International Conference on Fog and Edge Computing (ICFEC 2017), May 2017, Madrid

  13. Alexandru Uta, Ove Danner, Cas van der Weegen, Ana-Maria Oprescu, Andreea Sandu, Stefania Costache, Thilo Kielmann: MemEFS: A network-aware elastic in-memory runtime distributed file system, Future Generation Computer Systems, 2017, https://doi.org/10.1016/j.future.2017.03.017

  14. Alexandru Uta, Ana-Maria Oprescu, Thilo Kielmann: Towards Resource Disaggregation - Memory Scavenging for Scientific Workloads, 2016 IEEE International Conference on Cluster Computing (CLUSTER), September 2016, Taipei

  15. Alexandru Uta, Andreea Sandu, Stefania Costache, Thilo Kielmann: MemEFS: an elastic in-memory runtime file system for escience applications, 2015 IEEE 11th International Conference on e-Science (e-Science), August/September 2015, Munchen (best eScience service or project)

  16. Alexandru Uta, Andreea Sandu, Stefania Costache, Thilo Kielmann: Scalable In-Memory Computing, 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May 2015, Shenzhen

  17. Alexandru Uta, Andreea Sandu, Thilo Kielmann: Overcoming data locality: An in-memory runtime file system with symmetrical data distribution, Future Generation Computer Systems, 2015, https://doi.org/10.1016/j.future.2015.01.013

  18. Alexandru Uta, Andreea Sandu, Thilo Kielmann: POSTER: MemFS: An in-memory runtime file system with symmetrical data distribution, 2014 IEEE International Conference on Cluster Computing (CLUSTER), September 2014, Madrid (best poster award)

  19. Alexandru Uta, Andreea Sandu, Ion Morozan, Thilo Kielmann: In-memory runtime file systems for many-task computing, International Workshop on Adaptive Resource Management and Scheduling for Cloud Computing, co-located with PODC 2014, Paris




This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.


