Welcome to the 2014 EcoCloud Annual Event. This year’s event took place in Lausanne and included a keynote, a poster session, an industrial speakers’ session, and presentations by EcoCloud researchers. The event opened with the keynote, poster session, and cocktail reception on the evening of Thursday, June 5th, 2014, and continued all day on Friday, June 6th, 2014.
Doug Burger, Microsoft
Moore’s Law is dying, and will soon be dead. This fact has enormous implications for the computing industry, from mobile devices through large-scale datacenters. In this talk, I will describe a computational fabric designed to achieve large performance and efficiency gains for datacenter services even in the face of the end of CMOS scaling. This fabric tightly couples many FPGAs together via a high-speed, low-latency secondary network, and permits elastic allocation of groups of reconfigurable chips to support varied services. I will also describe the implementation of an accelerated web search service on this fabric.
Doug Burger is a Director in Microsoft Research’s Technologies division, where he leads an interdisciplinary team focused on innovations in datacenter design and services, as well as new computing experiences in the mobile and wearable spaces. Before joining Microsoft in 2008, he was a Professor of Computer Science at the University of Texas at Austin, where he co-led the TRIPS project which invented EDGE architectures and NUCA caches. He is a Fellow of both the ACM and the IEEE.
This year’s industrial session featured prominent speakers from the IT industry, including:
We describe the design and implementation of FaRM, a main-memory distributed computing platform that exploits RDMA to improve both latency and throughput by an order of magnitude relative to state-of-the-art main-memory systems that use TCP/IP. FaRM exposes the memory of machines in the cluster as a shared address space. Applications can use efficient ACID transactions to allocate, read, write, and free objects with location transparency. We expect this simple programming model to be sufficient for most application code. In addition, FaRM provides two mechanisms to improve performance where required: lock-free reads over RDMA, and support for collocating objects and function shipping to enable the use of efficient single-machine transactions. We used FaRM to build a key-value store and a graph store similar to Facebook’s. Both perform well; for example, a 20-machine cluster can perform 160 million key-value lookups per second with a latency of 31 µs.
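The programming model the abstract describes (a shared address space, ACID transactions for updates, lock-free reads for the common case) can be sketched in a few lines. This is a hypothetical illustration, not the real FaRM API: all class and function names are invented, and a local dict stands in for remote memory that FaRM would actually access via RDMA.

```python
# Hypothetical sketch of a FaRM-style programming model (invented names,
# not the real FaRM API). A dict stands in for RDMA-accessible cluster memory.

class SharedAddressSpace:
    """Cluster memory exposed as one address space: address -> (version, value)."""
    def __init__(self):
        self.memory = {}

class Transaction:
    """Buffers writes and applies them atomically at commit, in the spirit of
    the ACID transactions over distributed objects described in the talk."""
    def __init__(self, space):
        self.space = space
        self.writes = {}

    def write(self, addr, value):
        self.writes[addr] = value          # also used to allocate new objects

    def read(self, addr):
        if addr in self.writes:            # read-your-own-writes
            return self.writes[addr]
        version, value = self.space.memory[addr]
        return value

    def commit(self):
        for addr, value in self.writes.items():
            version = self.space.memory.get(addr, (0, None))[0]
            self.space.memory[addr] = (version + 1, value)

def lockfree_read(space, addr):
    """One-sided read taking no locks: re-read if the version changed mid-read,
    a stand-in for the consistency check FaRM performs on RDMA reads."""
    while True:
        v1, value = space.memory[addr]
        v2, _ = space.memory[addr]
        if v1 == v2:
            return value

space = SharedAddressSpace()
tx = Transaction(space)
tx.write("key:42", "hello")
tx.commit()
print(lockfree_read(space, "key:42"))   # -> hello
```

The point of the sketch is the division of labor: transactions pay the coordination cost only on writes, while reads go straight to memory and merely verify they saw a consistent snapshot.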
Miguel Castro has been working at Microsoft Research since late 2000 on distributed systems, security, and operating systems. Miguel received a PhD from MIT in 2001 for his work on Byzantine fault tolerance. He is the recipient of the 2011 SIGOPS Mark Weiser award.
The scalability requirements of modern data centers for Web 2.0 and the cloud present new challenges to the I/O services offered by the network infrastructure. Remote Direct Memory Access (RDMA) is a communication paradigm that delivers on the promise of tight integration of very large clusters through its salient characteristics of low latency, high bandwidth, and low CPU utilization. RDMA fits naturally with network virtualization schemes and creates the ultimate I/O environment for efficient network storage, distributed databases, and parallel processing. Highly scalable RDMA implementations, purposely designed to address these needs, are a key ingredient for the deployment of these solutions.
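The low-CPU-utilization property mentioned above comes from RDMA's one-sided operations: the requesting NIC reads remote memory directly, so the remote CPU is never interrupted. A minimal conceptual contrast (all names invented; real RDMA goes through verbs and registered memory regions, not Python calls):

```python
# Conceptual contrast between two-sided (message-based) and one-sided (RDMA)
# reads. Invented names; the only point is where the remote CPU does work.

class Server:
    def __init__(self):
        self.memory = {"row:7": b"payload"}
        self.cpu_ops = 0                    # counts remote-CPU involvement

    def rpc_read(self, key):
        """Two-sided read: the server CPU must receive, look up, and reply."""
        self.cpu_ops += 1
        return self.memory[key]

    def rdma_read(self, key):
        """One-sided read: the NIC fetches registered memory directly;
        the remote CPU does no work (modelled by leaving cpu_ops untouched)."""
        return self.memory[key]

server = Server()
for _ in range(1000):
    server.rdma_read("row:7")
print(server.cpu_ops)   # -> 0: a thousand reads, no remote CPU cycles
```

In a real deployment this is why RDMA scales: read-heavy services stop burning server cores on network receive paths.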
Michael Kagan is a co-founder of Mellanox and has served as CTO since January 2009. Previously, Mr. Kagan served as vice president of architecture from May 1999 to December 2008. From August 1983 to April 1999, Mr. Kagan held a number of architecture and design positions at Intel Corporation. While at Intel Corporation, between March 1993 and June 1996, Mr. Kagan managed Pentium MMX design, and from July 1996 to April 1999, he managed the architecture team of the Basic PC product group. Mr. Kagan holds a Bachelor of Science in Electrical Engineering from the Technion — Israel Institute of Technology.
For the IBM-ASTRON DOME μServer project, we are currently building two types of compute node boards in a memory-DIMM-like form factor. The first is based on a 4-core 2.2 GHz SoC and the second on a 12-core / 24-thread 1.8 GHz SoC. Both employ the 64-bit Power instruction set. Our innovative hot-water-based cooling infrastructure also supplies the electrical power to the compute node board. We show initial performance results and conclude with the key lessons.
Ronald P. Luijten, senior IEEE member, received his Master’s degree in Electronic Engineering with honors from the Eindhoven University of Technology, Netherlands, in 1984. In the same year he joined the systems department at IBM’s Zurich Research Laboratory in Switzerland. He has contributed to the design of various communication chips, including the PRIZMA high-port-count packet switch and ATM adapter chip sets, culminating in a 15-month assignment from 1994 to 1995 as lead architect at IBM’s networking development laboratory in La Gaude, France. He currently manages the IBM DOME uServer and Algorithms and Machines teams. Formerly he managed the OSMOSIS optical switch demonstrator project in close collaboration with Corning, Inc. His team contributed the congestion control mechanism to the Converged Enhanced Ethernet standard and worked on the network validation of IBM’s PureSystems product line. His team also contributed a 2.5D pattern selection method to enhance the performance of source/mask optimization for 22 nm and 14 nm CMOS. Ronald’s personal research interests are in datacenter architecture, design, and performance (data motion in data centers). He holds more than 25 issued patents and has co-organized 7 IEEE conferences. Over the years, IBM has awarded Ronald three Outstanding Technical Achievement Awards and a corporate patent award.
As the number of on-die transistors continues to grow, new computing models are needed to exploit this growing compute capacity despite the clock-frequency scaling wall and relatively sluggish improvements in I/O bandwidth. The spatial compute and programming model, as introduced by the OpenSPL specification, provides a method for taking advantage of the compute capacity offered by current and emerging hardware technology. With spatial computing, compute processing units are laid out in space (either physically or conceptually) and connected by flows of data. The result is compute implementations that are naturally highly parallel and thus use modern transistor-rich hardware very effectively. In this talk, I will describe both the spatial computing model and a practical realization of the OpenSPL specification: Multiscale Dataflow Engines, a platform that directly implements the spatial computing model in hardware while supporting tight integration with conventional CPU-based compute resources.
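The idea of "units laid out in space, connected by flows of data" can be illustrated with an ordinary software pipeline. This is a minimal Python sketch of the dataflow style, not the OpenSPL language or Maxeler's toolchain: each function below plays the role of one spatial compute unit, and the generator wiring plays the role of the data streams between them.

```python
# Minimal sketch of the spatial / dataflow idea: computation expressed as a
# graph of units connected by data streams rather than a sequence of
# instructions. Illustrative Python only, not OpenSPL or the Maxeler API.

def stream(values):
    """Source unit: feeds a stream of values into the fabric."""
    for v in values:
        yield v

def scale(inp, factor):
    """One spatial unit: multiplies every value flowing through it."""
    for v in inp:
        yield v * factor

def offset(inp, delta):
    """A second unit, wired downstream of the first."""
    for v in inp:
        yield v + delta

# "Lay out" the units by connecting streams. In hardware, both units would
# operate simultaneously on different elements of the stream each cycle.
pipeline = offset(scale(stream(range(5)), factor=10), delta=1)
print(list(pipeline))   # -> [1, 11, 21, 31, 41]
```

On an FPGA-style spatial substrate, each unit is a physical region of the chip and the parallelism is structural: adding a stage adds hardware, not time per element.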
www.openspl.org
M. J. Flynn, O. Mencer, V. Milutinovic, G. Rakocevic, P. Stenstrom, R. Trobec, M. Valero: “Moving from Petaflops to Petadata”, Communications of the ACM, May 2013.
Oskar Mencer – Prior to founding Maxeler, Oskar was a Member of Technical Staff at the Computing Sciences Center at Bell Labs in Murray Hill, where he led the effort in “Stream Computing”. He joined Bell Labs after receiving a PhD from Stanford University. Besides driving Maximum Performance Computing (MPC) at Maxeler, Oskar was a Consulting Professor in Geophysics at Stanford University and is also affiliated with the Department of Computing at Imperial College London, having received two Best Paper Awards, an Imperial College Research Excellence Award in 2007, and a Special Award from Com.sult in 2012 for “revolutionising the world of computers”.
Datacenter network connectivity as it exists today suffers from several shortcomings: the inability to make rapid changes, the plethora of encapsulation mechanisms used by servers, the proliferation of management points, and a model for describing workload connectivity that is rooted in broadcast domains, VLAN tagging, and IP addresses. This talk describes how we can change this model by decoupling the definition of workload relationships from these old constructs, how we can simplify the task of providing virtual and physical workload connectivity, and how the result is a much more portable element of configuration that can templatize the instantiation of “applications” on the fabric.
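The decoupling described above amounts to naming workload relationships abstractly and binding them to concrete network constructs only at deployment time. A hedged sketch of that separation, with all structures and names invented for illustration (no relation to any actual Cisco API):

```python
# Invented illustration of a portable "application template": relationships
# are expressed between abstract tiers, with no VLANs or IP addresses baked in.

app_template = {
    "tiers": ["web", "app", "db"],
    "allowed": [("web", "app"), ("app", "db")],   # who may talk to whom
}

def instantiate(template, bindings):
    """Bind abstract tiers to concrete network constructs for one deployment.
    The same template can be instantiated on any fabric with any bindings."""
    rules = []
    for src, dst in template["allowed"]:
        rules.append((bindings[src], bindings[dst]))
    return rules

# One concrete deployment; a second site could bind the same template to
# entirely different VLANs or overlay segments without editing the template.
rules = instantiate(app_template, {"web": "vlan10", "app": "vlan20", "db": "vlan30"})
print(rules)   # -> [('vlan10', 'vlan20'), ('vlan20', 'vlan30')]
```

The portability claim in the abstract follows directly: because the template carries no addresses, it is the bindings, not the application definition, that change per site.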
Maurizio Portolani is a Distinguished Engineer at Cisco Systems involved in architecting and validating large-scale data center designs. He is the author of several Cisco Data Center solution architectures and of the book “Data Center Fundamentals”. Maurizio holds patents on advanced spanning-tree, load-balancing, security, and virtualization features.
Ideally located in the heart of Lausanne, the Olympic Capital, in French-speaking Switzerland, the Lausanne Palace & Spa enjoys a superb view across Lake Geneva to the Alps.
The beauty of the surrounding landscape makes it an exceptional place of refinement. The inclusion since 2007 of the Lavaux region, 10 km from the Lausanne Palace & Spa, on the UNESCO World Heritage List serves as proof of its uniqueness.
At the heart of Lausanne, the Lausanne Palace & Spa offers parking open around the clock, seven days a week, and is an ideal base from which to discover Lausanne and to quickly and easily access the hotel’s many services.