INRIA Researcher / University of Lyon
Laboratoire de l'Informatique du Parallélisme
ENS Lyon | Gilles.Fedak AT inria.fr
46 avenue d'Italie | Tel: (+33) 4 72 72 85 47
69364 LYON CEDEX 07 | Fax: (+33) 4 72 72 80 80
I have been a permanent INRIA research scientist since 2004 and currently work in the AVALON team. After graduating from University Paris Sud in 2003, I held a postdoctoral fellowship at the University of California, San Diego in 2003-2004. My research topics include designing and implementing XtremWeb, an open-source Desktop Grid system, and BitDew, an open-source platform for data-intensive applications on Clouds and Desktop Grids.
You can read a Short Bio here.
- 2014: Year of Active Data (Anthony Simonet's PhD thesis)! Active Data is a new programming model, oriented around the data life cycle, for managing large scientific data sets on heterogeneous systems and infrastructures (joint work with Matei Ripeanu). Typical use cases for Active Data are data progress monitoring on complex infrastructures, cross-system cooperation and optimization, handling of distributed and dynamic data sets, stream and incremental processing, etc. As an example application, we proposed a Data Surveillance Framework with Kyle Chard and Ian Foster (UC/ANL), applied to the Advanced Photon Source experiment.
- 2013: CloudPower is on the launch pad! The CloudPower project, funded by the French National Research Agency (ANR), aims to transfer to industry the Desktop Grid technologies developed by CNRS and INRIA during the last decade. We’ll build a scalable, secure, low-cost, and on-demand service for High Performance Computing based on the XtremWeb-HEP middleware. The expected outcome is the creation of a CNRS/INRIA spin-off company in 2014. Welcome to Sylvain Bernard, Haiwu He and Etienne Urbah, who have joined our team to work on this project!
- Fun: my page on Google Scholar
- The book Desktop Grid Computing presents common techniques used in numerous models, algorithms, and tools developed during the last decade to implement desktop grid computing. Edited by Christophe Cérin and Gilles Fedak: 15 chapters, 388 pages, published by Chapman & Hall/CRC Press. You can get the book from your favorite bookshop.
- I am very happy to participate in the French ANR project MapReduce, led by Gabriel Antoniu. Our first paper, presenting an original implementation of MapReduce for large-scale Desktop Grids, was presented in November at the 3PGCIC conference in Fukuoka, Japan.
- Mini-course on MapReduce: I am giving a short series of classes on MapReduce Runtime Environments: in 2012 at the University of Heidelberg in Germany, in 2013 at the ERAD-NE 2013 school in Salvador de Bahia, Brazil, and in 2014 at BBU in Cluj-Napoca, Romania, and at Paris XIII and ENS-Lyon in France! Here is the page that will gradually contain more and more material about the course.
- New member of the editorial board of the Springer Cluster Computing Journal. 2014: Member of HPDC, AINA, Euro-Par. 2013: Member of HPDC, CCGRID, CloudCom, eScience, SBAC-PAD, FedICI, BDDS, NPC, HPCC, HPDIC.
- 2012: Year of SpeQuloS! SpeQuloS is a service that provides QoS for Bag-of-Tasks applications executed on elastic infrastructures. We published the paper describing the framework at HPDC'2012. The paper won the Best Presentation award at the Grid5K winter school, was a best presentation finalist at HPDC'12 (kudos Simon), and has been selected for the HPDC Special Issue of the Journal of Cluster Computing.
Data Management on Hybrid Distributed Infrastructures
There is a growing demand for computing power from scientific communities to run their applications and process large volumes of scientific data. Meanwhile, the supply of distributed computing infrastructures (DCIs) for scientific computing continues to diversify: from supercomputers to interconnected grids, from grids of PCs to Cloud Computing. As a result, users can now not only choose their preferred architectures based on parameters such as performance, cost or quality of service, but can also transparently combine several of these infrastructures. For instance, the FP7 projects EDGeS/EDGI allow users of the EGEE Grid to run some of their computational tasks on Internet volunteers' resources transparently and safely. Hybrid computing infrastructures, consisting of several types of distributed infrastructure, are becoming a reality, and determining the relevant criteria for using such hybrid infrastructures effectively is a scientific challenge that needs to be addressed. My research on Data Management on Hybrid DCIs follows three axes:
- The BitDew framework is a programmable environment for the management and distribution of data on Grid, Desktop Grid and Cloud systems. BitDew is a subsystem that can be easily integrated into large-scale computational systems (XtremWeb, BOINC, Hadoop, Condor, Glite, Unicore, etc.). Our approach is to break the "data wall" by providing, in a single package, the key P2P technologies (DHT, BitTorrent) and high-level programming interfaces. The BitDew framework will enable support for data-intensive parameter sweep applications, long-running applications that require distributed checkpoint services, workflow applications and, possibly in the future, soft real-time and stream processing applications.
- MapReduce for Hybrid DCI: The MapReduce programming model is well suited to data-intensive applications, and there is a growing interest in supporting MapReduce on Desktop Grids. We aim to provide a complete runtime environment for MapReduce applications on Desktop Grids. At the moment, no such environment dedicated to Desktop Grids exists. We will rely on the BitDew middleware, which is a programmable environment for automatic and transparent data management on computational Desktop Grids.
- Active Data is data life cycle management software that makes it possible to manage large scientific data sets when they are distributed across heterogeneous software and systems.
Desktop Grid in Hybrid DCI
Although Desktop Grids have been very successful in the area of High Throughput Computing, data-intensive computing is still a promising area: for now, Desktop Grids have mostly focused on Bag-of-Tasks applications with little I/O and without dependencies between the tasks. Major achievements are expected from combining their huge storage potential with their processing capability. These would benefit applications that require a large volume of input data storage with frequent data reuse and a limited volume of output data.
For example, in the FP7 project EDGI, we are proposing mechanisms to supplement Desktop Grids with Cloud resources in order to enhance the QoS of applications executed on Desktop Grids. Conversely, we can imagine mechanisms that use local Desktop Grid resources as a cache to reduce the usage of remote Cloud resources, in order to improve performance or decrease the cost of Cloud resource usage. In the ANR MapReduce project, we propose mechanisms to enable scientific data-intensive applications to run simultaneously on Cloud systems (EC2, Azure, or IBM Cloud) and on Desktop Grids.