2024
NVMe and Data Systems: A Decade and Counting
NVMe is synonymous with modern storage. It was introduced as a means to efficiently expose Solid-State Drives as
PCIe 3.0 peripherals. With NVMe, I/Os were no longer the bottleneck. Initially, the challenge for operating system
and database system designers was to accommodate radically faster storage devices. Then, SSDs evolved to meet a range
of cost/performance requirements. Accordingly, NVMe 2.0 introduced new transport models, storage models, and cross-layer
optimizations. This diversity introduced new challenges. Today, NVMe passthru and Flexible Data Placement enable data
systems designers to shape how data is stored, instead of designing their systems around the characteristics of opaque
storage devices. Computational storage was supposed to further improve the ability of system designers to specialize
storage devices to fit their workloads. However, device memory management became a challenge. We discuss the proposed
standard and speculate on the role NVMe may play in future data systems, in a context where CXL is emerging, PCIe 7.0 is
being standardized, and power consumption is the bottleneck.
Philippe Bonnet (IT University of Copenhagen)
Philippe Bonnet is a professor at DIKU, the Department of Computer Science of the University of Copenhagen. He contributed
to the uFlip Benchmark, the Linux multiqueue block layer, the Linux framework for Open-Channel SSDs, the OX architecture
for computational storage, the xNVMe library and Delilah, a prototype for eBPF offload on computational storage. Philippe
is co-author of the book “Principles of Database and Solid State Drive Co-Design” with Alberto Lerner. He is currently
trustee of the VLDB endowment and chair of the ACM EIG on Reproducibility and Replicability.
Computer Architecture in Flux: The Central Processing Unit Is No Longer Central
We start with a review of the instability of modern hardware, given the slowing of Moore’s Law, the end of Dennard
scaling, and the rise of the demand for AI cycles versus traditional applications.
Data is becoming more critical than compute due to its increasing cost and slowing capacity curves for memory and
storage. Data location and movement are now central to cost and performance. To build robust systems in light of these
changes, we must shift the focus of hardware and software design from processing to the memory, storage, and network
components.
David Patterson (University of California, Berkeley)
David Patterson is a UC Berkeley Pardee professor emeritus, a Google distinguished engineer, and the RISC-V International
Vice-Chair. His most influential Berkeley projects likely were RISC (Reduced Instruction Set Computer) and RAID (Redundant
Array of Inexpensive Disks). His best-known book is Computer Architecture: A Quantitative Approach. He and his co-author
John Hennessy shared the 2017 ACM A.M. Turing Award and the 2022 NAE Charles Stark Draper Prize for Engineering. The Turing
Award is often referred to as the “Nobel Prize of Computing” and the Draper Prize is considered a “Nobel Prize of
Engineering.”
Effortless Locality Through On-the-fly Data Transformation
What if we could access any layout and ship only the relevant data through the memory hierarchy by transparently
converting rows to (arbitrary groups of) columns? We capitalize on the reinvigorated trend of hardware specialization to
propose Relational Fabric, a near-data vertical partitioner that allows memory or storage components to perform on-the-fly
transparent data transformation. By exposing an intuitive API, Relational Fabric pushes vertical partitioning to the
hardware, which has a profound impact on the process of designing and building data systems. (A) There is no need for data
duplication and layout conversion, making hybrid systems viable using a single layout. (B) It simplifies the memory and
storage manager. (C) It reduces unnecessary data movement through the memory hierarchy allowing for better hardware utilization
and, ultimately, better performance. In this talk, I will introduce the Relational Fabric vision and present our initial
results on in-memory systems. I will also share some of the challenges of building this hardware and the opportunities it
brings for simplicity and innovation in the data system software stack, including physical design, query processing, and
concurrency control, and conclude with ongoing work for data transformation for general workloads including matrix and
tensor processing.
Manos Athanassoulis (Boston University)
Manos Athanassoulis is an Assistant Professor of Computer Science at Boston University, Director and Founder of the
BU Data-intensive Systems and Computing Laboratory and co-director of the BU Massive Data Algorithms and Systems Group.
His research is in the area of data management focusing on building data systems that efficiently exploit modern
hardware (computing units, storage, and memories), are deployed in the cloud, and can adapt to the workload both at setup
time and, dynamically, at runtime. Before joining Boston University, Manos was a postdoctoral researcher at Harvard School
of Engineering and Applied Sciences. Manos obtained his PhD from EPFL, Switzerland, and spent one summer at IBM Research,
Watson. Manos’ work is published in top conferences and journals of the community, like ACM SIGMOD, PVLDB, ACM TODS,
VLDBJ, and others, and has been recognized by awards like “Best Demonstration” in VLDB 2023, “Best of SIGMOD” in 2017,
“Best of VLDB” in 2010 and 2017, and “Most Reproducible Paper” at SIGMOD in 2016. Manos has served as a program
committee member and technical reviewer for top data management conferences and journals for the past 12 years, having
received the “Distinguished PC Member Award” for SIGMOD 2018 and SIGMOD 2023. He is currently an associate editor for ACM
SIGMOD Record, co-chair of ACM SIGMOD 2023 Availability and Reproducibility, and co-chair of ICWE 2023 Industrial Track.
His work is supported by several awards, including an NSF CRII award, an NSF CAREER award, a Facebook Faculty Research Award,
multiple RedHat Collaboratory Research Incubation Awards, and a Cisco Research Award.
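The Relational Fabric abstract above describes on-the-fly vertical partitioning: converting row-major data into arbitrary column groups so that only the attributes a query touches travel through the memory hierarchy. The following is a minimal software sketch of that transformation, assuming a simple tuple-based row layout; the function name and data shapes are illustrative assumptions, not the Relational Fabric API, and the actual proposal performs this transparently in memory or storage hardware.

```python
def vertical_partition(rows, groups):
    """Split row-major tuples into column groups.

    rows:   list of equal-length tuples (conceptually, a row-store page)
    groups: list of attribute-index lists, e.g. [[0, 2], [1]]
    Returns one columnar block per group, where each block is a
    list of columns (one column per requested attribute).
    """
    return [
        [[row[i] for row in rows] for i in group]
        for group in groups
    ]

# Example: a 3-attribute table; the query needs only attributes 0 and 2,
# so only those two columns would be shipped up the hierarchy.
rows = [(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0)]
(needed,) = vertical_partition(rows, [[0, 2]])
# needed == [[1, 2, 3], [10.0, 20.0, 30.0]]
```

In software, this copy is exactly the layout-conversion overhead the abstract argues against; doing it near the data, in hardware, is what removes the duplication and movement costs.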