Gilles Fedak

INRIA Researcher / University of Lyon


Laboratoire de l'Informatique du Parallélisme
ENS Lyon
46 avenue d'Italie
69364 LYON CEDEX 07

Email: Gilles.Fedak AT inria.fr
Tel: (+33) 4 72 72 85 47
Fax: (+33) 4 72 72 80 80

Short Bio

I have been a permanent INRIA research scientist since 2004 and currently work in the AVALON team at ENS-Lyon, France. After receiving my Ph.D. from University Paris Sud in 2003, I held a postdoctoral fellowship at the University of California, San Diego in 2003-2004. My research interests are in Parallel and Distributed Computing, with a particular emphasis on the problem of using large, loosely coupled distributed computing infrastructures to support highly demanding computational and data-intensive science. I produced pioneering software and algorithms in the field of Grid and Cloud Computing that allow people to easily harness large parallel systems consisting of thousands of machines distributed over the Internet (XtremWeb, MPICH-V, BitDew, SpeQulos, Xtrem-MapReduce, Active Data, …). I have co-authored more than 80 peer-reviewed scientific papers and won two Best Paper awards. In 2012, I co-edited the Desktop Grid Computing book with C. Cérin (CRC publication). In 2015, I received the Chinese Academy of Sciences PIFI Award.

You can read a Short Bio here.

News

Research Interests

Data Management on Hybrid Distributed Infrastructures

There is a growing demand for computing power from scientific communities to run their applications and process large volumes of scientific data. Meanwhile, the supply of distributed computing infrastructures (DCIs) for scientific computing continues to diversify: from supercomputers to interconnected grids, from grids of PCs to Cloud Computing. As a result, users can now not only choose their preferred architectures based on parameters such as performance, cost or quality of service, but also combine several of these infrastructures transparently. For instance, the FP7 projects EDGeS/EDGI allow users of the EGEE Grid to run some of their computational tasks on Internet volunteers' resources transparently and safely. Hybrid computing infrastructures, consisting of several types of distributed infrastructure, are becoming a reality, and determining the relevant criteria for using them effectively is a scientific challenge that needs to be addressed. My research on Data Management on Hybrid DCIs follows three axes:

  1. The BitDew framework is a programmable environment for the management and distribution of data on Grid, Desktop Grid and Cloud systems. BitDew is a subsystem which can be easily integrated into large-scale computational systems (XtremWeb, BOINC, Hadoop, Condor, Glite, Unicore, etc.). Our approach is to break the "data wall" by providing, in a single package, the key P2P technologies (DHT, BitTorrent) and high-level programming interfaces. The BitDew framework will enable support for data-intensive parameter-sweep applications, long-running applications which require distributed checkpoint services, workflow applications and, possibly in the future, soft real-time and stream-processing applications.
  2. MapReduce for Hybrid DCI: The MapReduce programming model is well suited to data-intensive applications, and there is a growing interest in supporting MapReduce on Desktop Grids. We aim to provide a complete runtime environment for MapReduce applications on Desktop Grids; at the moment, no such environment dedicated to Desktop Grids exists. We will rely on the BitDew middleware, which is a programmable environment for automatic and transparent data management on computational Desktop Grids.
  3. Active Data is data lifecycle management software that allows users to manage large scientific data sets when they are distributed across heterogeneous software and systems.
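
As an illustration of the MapReduce programming model referred to in the second axis, here is a minimal single-process word-count sketch in Python. It is not the BitDew or Xtrem-MapReduce API: the names map_words, reduce_counts and run_mapreduce are hypothetical and chosen only for this example, and the sketch ignores everything a Desktop Grid runtime would have to handle (data distribution, node volatility, result verification).

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_words(chunk: str) -> Iterator[tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in one input chunk."""
    for word in chunk.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> tuple[str, int]:
    """Reduce phase: sum all the counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(
    chunks: Iterable[str],
    mapper: Callable[[str], Iterator[tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], tuple[str, int]],
) -> dict[str, int]:
    """Run map, shuffle and reduce sequentially in a single process.

    In a distributed runtime each chunk would be mapped on a different
    node; here the loop stands in for that (illustrative assumption).
    """
    # Shuffle: group the intermediate values by key.
    groups: dict[str, list[int]] = defaultdict(list)
    for chunk in chunks:
        for key, value in mapper(chunk):
            groups[key].append(value)
    # Reduce: one reducer call per distinct key.
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    data = ["desktop grid cloud", "cloud grid grid"]
    print(run_mapreduce(data, map_words, reduce_counts))
```

The point of the model is that map_words and reduce_counts carry no knowledge of where the data lives, which is what lets a runtime place the chunks and the intermediate results on whatever nodes are available.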

Desktop Grid in Hybrid DCI

Although Desktop Grids have been very successful in the area of High Throughput Computing, data-intensive computing is still a promising area for them: so far, Desktop Grids have mostly focused on Bag-of-Tasks applications with little I/O and no dependencies between tasks. Major achievements are expected from combining their huge storage potential with their processing capability. These would benefit applications that require a large volume of input data storage with frequent data reuse and a limited volume of output data.

For example, in the FP7 project EDGI, we are proposing mechanisms to supplement Desktop Grids with Cloud resources in order to enhance the QoS of applications executed on Desktop Grids. Conversely, we can imagine mechanisms which use local Desktop Grid resources as a cache to reduce the usage of remote Cloud resources, in order to improve performance or decrease the cost of Cloud usage. In the ANR MapReduce project, we propose mechanisms to enable scientific data-intensive applications to run simultaneously on Cloud systems (EC2, Azure, or IBM Cloud) and on Desktop Grids.

Retrieved from http://graal.ens-lyon.fr/~gfedak/pmwiki-test/pmwiki.php/Main/HomePage
Page last modified on August 31, 2017, at 03:18 PM