Vladimir Vlassov

Professor, PhD, ACM member, IEEE member

Vladimir (Vlad) Vlassov is a Professor in Computer Systems at the Division of Software and Computer Systems (SCS), Department of Computer Science (CS), School of Electrical Engineering and Computer Science (EECS), KTH Royal Institute of Technology, Stockholm, Sweden. He is a member of the Distributed Computing research group. He was a visiting scientist at the Massachusetts Institute of Technology (1998) and at the University of Massachusetts Amherst (2004), USA. Vladimir has participated in EU projects such as Grid4All (2006-2009, FP6), SELFMAN (2006-2009, FP6), ENCORE (2010-2013, FP7), PaPP (2012-2015, FP7), and CLOMMUNITY (2013-2015, FP7). He is one of the coordinators of the EMJD-DC Erasmus Mundus Joint Doctorate in Distributed Computing (2011-2020). Currently, he is a principal investigator from KTH in the EU project "ExtremeEarth: From Copernicus Big Data to Extreme Earth Analytics" (2019-2021, H2020). At KTH, he teaches courses on Data Mining, Distributed Systems, Concurrent Programming, and Stream Processing. His current research interests include Cloud computing; data-intensive computing, stream processing, and scalable distributed deep learning; autonomic computing; and distributed systems.


Research

Current research interests: Distributed systems; Cloud computing; Autonomic computing; Data-intensive computing; Stream processing; Scalable distributed deep learning; Reinforcement learning; NLP and NLU

ALEC2 - Adaptive Level of Effective and Continuous Care to common mental health disorders (CMDs)

Research Council of Norway, PN 321561, 2021-2025

Abstract The ALEC2 project combines expertise from Clinical Psychology and Computer Science to develop and implement ML methods that improve adherence, clinical efficiency and treatment efficacy in the delivery of Braive's digital psychotherapy solutions supported by iCBT (Internet-based Cognitive Behavioral Therapy). Braive (the company coordinating the project) will bring to market a new patient-centric and R&D-driven solution that will address the shortcomings of current iCBT solutions. To meet this goal, we will create a new generation of iCBT that gradually automates the timely response to a patient's development in treatment. Our new system AVA (Automated Vigilance Assistant) - an AI-supported system for online cognitive behavioural therapy for patients with common mental health disorders - will make use of AI technologies, namely NLP, NLU and DL, to automatically diagnose, predict and monitor mental health conditions, and to support patients towards full recovery. AVA will be able to: (i) Take patients' guided inputs from the clinically validated Mental Health Check (MHC) tool and support clinical decision-making by remote therapists, using quantitative scores and qualitative analysis; (ii) Understand patients' notes and queries through Deep Learning (DL) and Natural Language Understanding (NLU) systems; (iii) Monitor and detect deviations from treatment trajectories, by interpreting written input and analysing sporadic queries with patients to assess compliance and perform sentiment analysis; (iv) Trigger human- or AI-led interventions targeted to each patient and to the observed deviation from the treatment trajectory.

ExtremeEarth: From Copernicus Big Data to Extreme Earth Analytics

Abstract Copernicus is the European program for monitoring the Earth. The geospatial data produced by the Sentinel satellites puts Copernicus at the forefront of the Big Data paradigm, giving rise to all the relevant challenges: volume, velocity, variety, veracity and value. ExtremeEarth concentrates on developing the technologies that will make Europe a pioneer in the area of Extreme Earth Analytics, i.e., the Remote Sensing and Artificial Intelligence techniques that are needed for extracting information and knowledge out of the petabytes of Copernicus data. The ExtremeEarth consortium consists of Remote Sensing and Artificial Intelligence researchers and technologists with outstanding scientific track records and relevant commercial expertise. The research and innovation activities undertaken in ExtremeEarth will significantly advance the frontiers in Big Data, Earth Analytics and Deep Learning for Copernicus data and Linked Geospatial Data, and make Europe the top player internationally in these areas. The ExtremeEarth technologies will be demonstrated in two use cases with societal, environmental and financial value: the Food Security use case and the Polar use case. ExtremeEarth will bring together the Food Security and Polar communities, and will work with them to develop technologies that can be used by these communities in their respective application areas. The results of ExtremeEarth will be exploited commercially by the industrial partners of the consortium.

EMJD-DC: Erasmus Mundus Joint Doctorate in Distributed Computing

EU/EACEA EMJD, GA 2012-0030, 2011-2020

Abstract EMJD-DC is an international doctoral programme in Distributed Systems. Students carry out their research work over up to four years in two universities from different countries, with additional mobility to industry in most projects. Joint training schools cover both scientific topics and transferable skills, such as project and scientific management, communication, and innovation techniques. EMJD-DC initially awards double degrees; however, a task is evaluating the implementation of a Joint Degree. The research projects address some of the key technological challenges of our time, mainly but not exclusively: ubiquitous data-intensive applications, scalable distributed systems (including Cloud computing and P2P models), adaptive distributed systems (autonomic computing, green computing, decentralized and voluntary computing), and applied distributed systems (distributed algorithms and systems, working in an inter-disciplinary manner, in existing and emerging fields to address industrial and societal needs in the European and worldwide context). The consortium partners assembled in EMJD-DC have a high international reputation in the research fields described above. They complement each other very well in their specialisation fields of research, and in the corresponding training offers. The first language of all training and research activities will be English, but students are exposed to local languages.

CLOMMUNITY: A Community networking Cloud in a Box

Objective Community networking is an emerging model for the Future Internet across Europe and beyond where communities of citizens can build, operate and own open IP-based networks, a key infrastructure for individual and collective digital participation. The CLOMMUNITY project aims at addressing the obstacles for communities of citizens in bootstrapping, running and expanding community-owned networks that provide community services organised as community clouds. That requires solving specific research challenges imposed by the requirement of: self-managing and scalable (decentralized) infrastructure services for the management and aggregation of a large number of widespread low-cost unreliable networking, storage and home computing resources; distributed platform services to support and facilitate the design and operation of elastic, resilient and scalable service overlays and user-oriented services built over these underlying services, providing a good quality of experience at the lowest economic and environmental cost. This will be achieved through experimentally-driven research, using the FIRE CONFINE community networking testbed, the participation of large user communities (20000+ people) and software developers from several community networks, by extending existing cloud service prototypes in a cyclic participatory process of design, development, experimentation, evaluation and optimization for each challenge. The consortium has two representative community networks with a large number of end-users and developers, who use diverse applications (e.g., content distribution, multimedia communication, community participation) and also service providers, research institutions with experience and prototypes in the key related areas, and a recognized international organisation for the dissemination of the outcome.

E2E-Clouds: End-to-End Distributed Clouds

Summary The E2E-Clouds project proposes to develop a distributed and federated cloud infrastructure that meets the challenge of scale and performance for data-intensive services by aggregating, provisioning and managing computational, storage and network resources from multiple centers and providers. It is an open, secure and integrated network, storage and computing infrastructure where different nodes are owned by different organisations and organisations may combine the role of provider and user. The management of network resources is integrated with the management of computation and storage, enabling good performance of applications running across multiple centers, as well as timely and efficient delivery of content to end-users, encouraging further digital convergence between Telco, Media and ICT. The E2E-Cloud is based on an open, self-managing decentralised architecture, aggregating and managing distributed resources in a secure and fault-tolerant manner. It will facilitate the construction of novel data-intensive and media-intensive Internet services that use resources on-demand in short intense bursts, using and generating massive amounts of data. Together with industrial collaborators, a number of demonstrators will be developed to test, evaluate and illustrate the E2E-Cloud platform, ranging from telecom to media production and distribution, large scale analysis of Web content, and software testing as a service.

PaPP: Portable and Predictable Performance on Heterogeneous Embedded Manycores

Objective Today's advanced products use embedded computing systems with exacting requirements on execution speed, timeliness, and power consumption. It is a grand challenge to guarantee these requirements across product families and in the face of rapid technological evolution, as current development practices cannot manage performance requirements the same way they manage functional requirements. Even worse, with the proliferation of complex parallel target platforms, it becomes more difficult to design a system that reaches a given performance goal with just the minimum amount of resources, managed right. Today the only solution to this problem is to over-design systems: systems are equipped pragmatically with an overcapacity that likely avoids under-performance, but for this very reason are more expensive and consume more resources than necessary. The proposed project aims at making performance predictable in every development phase, from the modelling of the system, through its implementation, to its execution, by allowing for early specification and analysis of the performance of systems and their adaptation to different hardware platforms, including an adaptive runtime system. During the project, the developed methods and tools will be evaluated on a number of industrial use cases and demonstrators in three application domains important to European industry: Multimedia, Avionics and space, and Mobile communication. This approach will guarantee that the methods and tools developed are both usable and effective. To achieve our goals we have built a highly skilled European consortium consisting of a balanced mix of problem owners, domain experts, and technology providers: large enterprises as application drivers, platform providers and system integrators, SMEs as key-technology innovators, and research institutes and universities bringing leading-edge perspectives.

ENCORE: ENabling technologies for a programmable many-CORE

Objective Design complexity and power density implications stopped the trend towards faster single-core processors. The current trend is to double the core count every 18 months, leading to chips with 100+ cores in 10-15 years. Developing parallel applications to harness such multicores is the key challenge for scalable computing systems. The ENCORE project aims at achieving a breakthrough on the usability, reliability, code portability, and performance scalability of such multicores. The project achieves this through three main contributions. First, defining an easy to use parallel programming model that offers code portability across several architectures. Second, developing a runtime management system that will dynamically detect, manage, and exploit parallelism, data locality, and shared resources. And third, providing adequate hardware support for the parallel programming and runtime environment that ensures scalability, performance, and cost-efficiency. The technology will be developed and evaluated using multiple applications, provided by the partners, or industry-standard benchmarks, ranging from massively parallel high-performance computing codes, where performance and efficiency are paramount, to embedded parallel workloads with strong real-time and energy constraints. The project integrates all partners under a common runtime system running on real multicore platforms, a shared FPGA architecture prototype, and a large-scale software simulated architecture. Architecture features will be validated through implementation on ARM's detailed development infrastructure. ENCORE takes a holistic approach to parallelization and programmability by analyzing the requirements of several relevant applications ranging from High Performance Computing to embedded multicore, by parallelizing these applications using the proposed programming model, by optimizing the runtime system for a range of parallel architectures, and by developing hardware support for the runtime system.

SELFMAN: Self Management for Large-Scale Distributed Systems

Objective The goal of SELFMAN is to make large-scale distributed applications that are self managing, by combining the strong points of component models and structured overlay networks. One of the key obstacles to deploying large-scale applications running on networks such as the Internet is the issue of management. Currently many specialized personnel are needed to keep large Internet applications running. SELFMAN will contribute to removing this obstacle, and thus enable the development of many more Internet applications. In the context of SELFMAN, we define self management along four axes: self configuration (systems configure themselves according to high-level management policies), self healing (systems automatically handle faults and repair them), self tuning (systems continuously monitor their performance and adjust their behaviour to optimize resource usage and meet service level agreements), and self protection (systems protect themselves against security attacks). SELFMAN will provide self management by combining a component model with a structured overlay network.

Grid4All: Self-* Grid: Dynamic Virtual Organizations for schools, families, and all

Objective Grid4All aims to enable domestic users and non-profit organisations such as schools and small enterprises to share their resources and to access massive grid resources when needed, envisioning a future in which access to resources is democratised and cooperative. Examples include home users of image editing applications, school projects like volcanic eruption simulations, or small businesses doing data mining. Cooperation examples include joint homework between pupils, or international collaboration. Grid4All goals entail a system pooling large amounts of cheap resources (connecting to commercial cluster providers when needed); a dynamic system satisfying spikes of demand; using self-management techniques to scale; supporting isolated, secure, dynamic, geographically distributed user groups and using secure peer-to-peer techniques to federate large numbers of small-scale resources into large-scale grids. We target small communities such as domestic users, schools and SMEs (for-profit or non-profit), harnessing their resources added to resources from operated IT centres, to form on-demand service oriented grids, avoiding preconfigured infrastructures. The technical issues addressed are security, support for multiple administrative and management authorities, P2P techniques for self-management/adaptivity/dynamicity, on-demand resource allocation, heterogeneity, and fault tolerance. The proof of concept applications include: e-learning tools for collaborative editing in schools and a digital content processing service accessible by residential end users.

CoreGRID: European research network on foundations, software infrastructures and applications for large scale distributed, grid and peer-to-peer technologies

Objective CoreGRID aims at strengthening and advancing scientific and technological excellence in the area of Grid and Peer-to-Peer technologies. To achieve this objective, the Network brings together 119 permanent researchers and 165 PhD students from 42 institutions. An ambitious joint programme of activity will be conducted around 6 complementary research areas that have been selected on the basis of their strategic importance, their research challenges and the recognised European expertise to develop next generation Grid middleware, namely: knowledge & data management; programming models; system architecture; Grid information and monitoring services; resource management and scheduling; problem solving environments, tools and GRID systems.

EVERGROW: EVER-GROWing Global Scale-Free Networks, Their Provisioning, Repair and Unique Functions

The goal of the project is to build the science-based foundations for the global information networks of the future. Not only will networks soon provide us with access to all the world's knowledge, but society as a whole will become network-based, from private life and business to industry and the processes of government. The demands on the future Internet will be high. We can already see how the complexity of the Internet is continually increasing, and we know a great deal about the problems this will cause. Above all, a number of today's highly manual processes must be automated, such as network management, network provisioning and network repair on all levels.

PEPITO: Peer-To-Peer-Implementation-and-TheOry

Traditional centralised system architectures are ever more inadequate. A good understanding is lacking of future decentralised peer-to-peer (P2P) models for collaboration and computing, both of how to build them robustly and of what can be built. The PEPITO project will investigate completely decentralised models of P2P computing.



Master Degree Project Proposals

The Distributed Computing group at the Department of Computer Science of KTH is looking for candidates for master thesis projects in the areas of distributed scalable Machine Learning and Deep Learning, data-intensive computing, stream processing, and distributed systems. If you are interested or need more information, contact the academic supervisor(s) of the project you are interested in (with CC to Vladimir Vlassov).

Distributed training of deep learning models for Copernicus data (PDF)

Academic supervisors: Desta Haileselassie Hagos, PhD, Postdoc, KTH, and Tianze Wang, PhD student, KTH, Emails: {destah,tianzew}@kth.se
Examiner: Vladimir Vlassov, Email: vladv@kth.se
Requirements: a good knowledge of Python programming, machine learning, deep learning models, TensorFlow or PyTorch libraries, and a basic understanding of statistics.
Background Deep learning, characterised by a collection of computational neural network models that are composed of multiple processing layers capable of learning distributed representations of data with multiple levels of abstraction, has revolutionized and advanced the state of the art for many research problems in the computer vision and geospatial research communities. Its techniques seek to automatically discover representations of the large amounts of raw data fed into a machine in order to perform tasks such as prediction and classification. However, training the multiple processing layers of end-to-end deep learning models, and finding the correct combination of layer-to-layer weights and the parameters that transform the input data, is substantially compute-intensive. This thesis work is related to the H2020 EU project "ExtremeEarth: From Copernicus Big Data to Extreme Earth Analytics".
Problem Statement The training of sophisticated, large-scale machine learning and deep learning models is very compute-intensive and a critical challenge in the machine learning community. This has led to the emergence of, and increasing demand for, powerful compute resources. Hence, designing an approach for parallelization and distributed training of deep learning models across multiple compute devices connected by a network is important to speed up the task, improve performance, and make the system scalable and fault-tolerant.
Tasks and Expected Results include (1) Looking into existing state-of-the-art deep learning frameworks for distributed training. (2) Describing the problem statement in detail, followed by different parallelization and distributed training strategies for deep learning models. (3) Analysing which of the existing state-of-the-art distributed deep learning frameworks are most suitable for distributed learning in an Earth observation setting. (4) Designing and implementing a scalable distributed deep learning training model for Copernicus data.


PhD Students

Primary supervisor

Co-supervisor


Publications

Complete List of Publications: DBLP | Google Scholar | KTH DiVA

Services

Selected Program Committees

Selected Conference Organisations


Contact

  • Email: vladv@kth.se

  • Phone: +46 8 7904115

  • Mobile phone: +46 73 6441465

  • Postal address: Vladimir Vlassov, KTH/EECS/SCS, Electrum 229, SE-164 40, Kista, Sweden

  • Visiting address: Kistagången 16, Electrum, elevator A, level 4, Software and Computer Systems, room 2490