SPONSORS

Alibaba logo Amazon logo Google logo Intel logo SAP logo

News

Call For Papers

The aim of this one-day workshop is to bring together researchers who are interested in optimizing database performance on modern computing infrastructure by designing new data management techniques and tools.

Topics of Interest

The continued evolution of computing hardware and infrastructure imposes new challenges and bottlenecks to program performance. As a result, traditional database architectures that focus solely on I/O optimization increasingly fail to utilize hardware resources efficiently. Multi-core CPUs, GPUs, FPGAs, new memory and storage technologies (such as flash and non-volatile memory), and low-power hardware imposes a great challenge to optimizing database performance. Consequently, exploiting the characteristics of modern hardware has become an important topic of database systems research.

The goal is to make database systems adapt automatically to the sophisticated hardware characteristics, thus maximizing performance transparently to applications. To achieve this goal, the data management community needs interdisciplinary collaboration with computer architecture, compiler, operating systems and storage researchers. This involves rethinking traditional data structures, query processing algorithms, and database software architectures to adapt to the advances in the underlying hardware infrastructure.

We seek submissions bridging the area of database systems to computer architecture, compilers, and operating systems. In particular, submissions covering topics from the following non-exclusive list are encouraged:

  • database algorithms and data structures on modern hardware
  • cost models and query optimization for novel hierarchical memory systems
  • hardware systems for query processing
  • data management using co-processors
  • novel application of new storage technologies to data management
  • query processing using computing power in storage systems
  • database architectures for low-power computing and embedded devices
  • database architectures on multi-threaded and chip multiprocessors
  • performance analysis of database workloads on modern hardware
  • compiler and operating systems advances to improve database performance
  • new benchmarks for micro-architectural evaluation of database workloads
  • taking advantage of modern network capabilities for data processing
  • hardware/software co-design for modern data-intensive workloads (e.g., machine learning, graph analytics)

Submission Tracks

We invite submissions to two tracks:

  1. Full papers: A full paper must be no longer than 6 pages excluding the bibliography. There is no limit on the length of the bibliography. Full papers describe a complete work in the area of data management for new hardware. Accepted papers will be given 10 pages (plus bibliography) for the camera-ready version and a long presentation slot during the workshop.

  2. Short Papers: Short papers must not exceed 2 pages excluding the bibliography. Short papers describe very early stage works or summaries of mature systems. Short papers will be included in the proceedings, given 4 pages (plus bibliography) for the camera-ready version, and may be given a short presentation slot during the workshop.

All accepted papers (full and short) will also be presented as posters during a workshop poster session.

Best of DaMoN 2024

This year all accepted DaMoN papers will be considered for a best paper award.

We intend to invite extended versions of a selection of DaMoN'24 papers for submission to the VLDB Journal. Extended papers that are accepted by the VLDB Journal will appear in a special “Best of DaMoN 2024” section within one of the regular VLDBJ issues.

Important Dates

Paper submission: March 15th, 2024 March 22nd, 2024 (11:59pm PST)

Notification of acceptance: April 26th, 2024 April 29th, 2024

Camera-ready copies: May 10th, 2024 May 15th, 2024 (23:59pm PST)

Workshop: June 10th, 2024

Submission Instructions

Authors are invited to submit original, unpublished research papers that are not being considered for publication in any other forum. Manuscripts should be submitted electronically as PDF files using the latest ACM paper format consistent with the ACM SIGMOD formatting guidelines to the DaMoN 2024 CMT site, at https://cmt3.research.microsoft.com/DaMoN2024. Submissions will be reviewed in a single-blind manner. Submissions that are 2 pages or shorter excluding the bibliography will be reviewed as short papers. Submissions that are 6 pages or shorter excluding the bibliography will be reviewed as full papers. Submissions that are longer than 6 pages excluding the bibliography will be desk-rejected.

Accepted papers will be included within the informal online proceedings at the website. Additionally, all accepted papers will be published online in the ACM Digital Library. Therefore, the papers must include the standard ACM copyright notice on the first page.

Workshop Program

Program - Monday (June 10, 2024)


09:30 - 10:00 Coffee Break


10:00-10:10 Workshop Opening (Carsten Binnig & Nesime Tatbul)


10:10-12:15 Session 1: Data Access and Data Structures (Session Chair: TBD)

10:10-11:00 Keynote (Philippe Bonnet)

11:00-11:15 SFVInt: Simple, Fast and Generic Variable-Length Integer Decoding using Bit Manipulation Instructions. Gang Liao (University of Maryland); Ye Liu (Bytedance); Yonghua Ding (Bytedance); Le Cai (Bytedance); Jianjun Chen (Bytedance).

11:15-11:30 NULLS!: Revisiting Null Representation in Modern Columnar Formats. Xinyu Zeng (Tsinghua University); Ruijun Meng (Tsinghua University); Andrew Pavlo (Carnegie Mellon University); Wes McKinney (Posit PBC); Huanchen Zhang (Tsinghua University).

11:30-11:45 Efficient Data Access Paths for Mixed Vector-Relational Search. Viktor Sanca (EPFL); Anastasia Ailamaki (EPFL).

11:45-12:00 Simple, Efficient, and Robust Hash Tables for Join Processing. Altan Birler (TU Munich); Tobias Schmidt (TU Munich); Philipp Fent (CedarDB); Thomas Neumann (TU Munich).

12:00-12:10 In situ Neighborhood Sampling for Large-Scale GNN Training. Yuhang Song (Boston University); Po Hao Chen (Brown University); Yuchen Lu (Boston University); Naima Abrar Shami (Boston University); Vasiliki Kalavri (Boston University).


12:15-13:30 Lunch Break


13:30-14:00 Sponsors Session (Alibaba, Amazon, Google, Intel, SAP)


14:00-15:30 Session 2: Disaggregation and Prefetching (Session Chair: TBD)

14:00-14:30 Invited Talk (David Patterson)

14:30-14:45 So Far and yet so Near - Accelerating Distributed Joins with CXL. Alexander Baumstark (TU Ilmenau); Marcus Paradies (TU Ilmenau); Kai-Uwe Sattler (TU Ilmenau); Steffen Kläbe (Actian); Stephan Baumann (Actian).

14:45-14:55 Seamless: Transparent Storage Access Through Smart Switches. Simon Binder (TU Darmstadt); Matthias Jasny (TU Darmstadt); Tobias Ziegler (TU Darmstadt).

14:55-15:10 How Does Software Prefetching Work on GPU Query Processing?. Yangshen Deng (Southern University of Science and Technology); Shiwen Chen (Southern University of Science and Technology); Zhaoyang Hong (Southern University of Science and Technology); Bo Tang (Southern University of Science and Technology).

15:10-15:25 How to Be Fast and Not Furious: Looking Under the Hood of CPU Cache Prefetching. Roland Kühn (TU Dortmund); Jan Mühlig (TU Dortmund); Jens Teubner (TU Dortmund).


15:30-16:00 Coffee Break


16:00-18:30 Session 3: Privacy and Heterogeneity (Session Chair: TBD)

16:00-16:30 Fresh Thinking Talk (Manos Athanassoulis)

16:30-16:45 The Price of Privacy: A Performance Study of Confidential Virtual Machines for Database Systems. Lina Qiu (Boston University); Rebecca Taft (Cockroach Labs); Alexander Shraer (Cockroach Labs); George Kollios (Boston University).

16:45-16:55 DuckDB-SGX2: The Good, The Bad and The Ugly within Confidential Analytical Query Processing. Ilaria Battiston (CWI); Charlotte C Felius (CWI); Sam Ansmink (DuckDB Labs); Laurens Kuiper (CWI, DuckDB Labs); Peter Boncz (CWI).

16:55-17:10 Heterogeneous Intra-Pipeline Device-Parallel Aggregations. Artem Kroviakov (TU Munich); Petr Kurapov (Intel); Christoph Anneser (TU Munich); Jana Giceva (TU Munich).

17:10-17:20 Performance or Efficiency? A Tale of Two Cores for DB Workloads. Rathijit Sen (Microsoft).

17:20-17:35 Accelerating GPU Data Processing using FastLanes Compression. Azim Afroozeh (CWI); Charlotte C Felius (CWI); Peter Boncz (CWI).


17:40-18:20 20th Year Panel


18:20-18:25 Workshop Closing (Carsten Binnig & Nesime Tatbul)


18:30 SIGMOD Reception


Keynote & Invited Talks

Keynote Talk

NVMe and Data Systems: A Decade and Counting

Philippe Bonnet, IT University of Copenhagen

Potrait of Philippe Bonnet

Abstract: NVMe is synonymous with modern storage. It was introduced as a means to efficiently expose Solid-State Drives as PCIe 3.0 peripherals. With NVMe, I/Os were no longer the bottleneck. Initially, the challenge for operating system and database system designers was to accomodate radically faster storage devices. Then, SSDs evolved to meet a range of cost/performance requirements. Accordingly, NVMe 2.0 introduced new transport models, storage models and cross-layer optimizations. This diversity introduced new challenges. Today, NVMe passthru and Flexible Data Placement enable data systems designers to shape how data is stored, instead of designing their systems around the characteristics of opaque storage devices. Computational storage was supposed to further improve the ability of system designers to specialize storage devices to fit their workloads. However, device memory management became a challenge. We discuss the proposed standard and speculate on the role NVMe may play in future data systems, in a context where CXL emerges, PCIe 7.0 is being standardized and power consumption is the bottleneck.

About the SpeakerPhilippe Bonnet is professor at DIKU, the department of Computer Science of the University of Copenhagen. He contributed to the uFlip Benchmark, the Linux multiqueue block layer, the Linux framework for Open-Channel SSDs, the OX architecture for computational storage, the xNVMe library and Delilah, a prototype for eBPF offload on computational storage. Philippe is co-author of the book “Principles of Database and Solid State Drive Co-Design” with Alberto Lerner. He is currently trustee of the VLDB endowment and chair of the ACM EIG on Reproducibility and Replicability.

Invited Talk

Computer Architecture in Flux: The Central Processing Unit Is No Longer Central

David Patterson, University of California Berkeley

Potrait of David Patterson

Abstract: We start with a review of the instability of modern hardware, given the

  • slowing of Moore’s Law,
  • the end of Dennard scaling, and
  • the rise of the demand for AI cycles versus traditional applications.

Data is becoming more critical than compute due to its increasing cost and slowing capacity curves for memory and storage. Data location and movement are now central to cost and performance. To build robust systems in light of these changes, we must shift the focus of hardware and software design from processing to the memory, storage, and network components.

About the SpeakerDavid Patterson is a UC Berkeley Pardee professor emeritus, a Google distinguished engineer, and the RISC-V International Vice-Chair. His most influential Berkeley projects likely were RISC (Reduced Instruction Set Computer) and RAID (Redundant Array of Inexpensive Disks). His best-known book is Computer Architecture: A Quantitative Approach. He and his co-author John Hennessy shared the 2017 ACM A.M Turing Award and the 2022 NAE Charles Stark Draper Prize for Engineering. The Turing Award is often referred to as the “Nobel Prize of Computing” and the Draper Prize is considered a “Nobel Prize of Engineering.”

Fresh Thinking Talk

Effortless Locality Through On-the-fly Data Transformation

Manos Athanassoulis, Boston University

Potrait of David Patterson

Abstract: What if we could access any layout and ship only the relevant data through the memory hierarchy by transparently converting rows to (arbitrary groups of) columns? We capitalize on the reinvigorated trend of hardware specialization to propose Relational Fabric, a near-data vertical partitioner that allows memory or storage components to perform on-the-fly transparent data transformation. By exposing an intuitive API, Relational Fabric pushes vertical partitioning to the hardware, which has a profound impact on the process of designing and building data systems. (A) There is no need for data duplication and layout conversion, making hybrid systems viable using a single layout. (B) It simplifies the memory and storage manager. (C) It reduces unnecessary data movement through the memory hierarchy allowing for better hardware utilization and, ultimately, better performance. In this talk, I will introduce the Relational Fabric vision and present our initial results on in-memory systems. I will also share some of the challenges of building this hardware and the opportunities it brings for simplicity and innovation in the data system software stack, including physical design, query processing, and concurrency control, and conclude with ongoing work for data transformation for general workloads including matrix and tensor processing.

About the SpeakerManos Athanassoulis is an Assistant Professor of Computer Science at Boston University, Director and Founder of the BU Data-intensive Systems and Computing Laboratory and co-director of the BU Massive Data Algorithms and Systems Group. His research is in the area of data management focusing on building data systems that efficiently exploit modern hardware (computing units, storage, and memories), are deployed in the cloud, and can adapt to the workload both at setup time and, dynamically, at runtime. Before joining Boston University, Manos was a postdoctoral researcher at Harvard School of Engineering and Applied Sciences. Manos obtained his PhD from EPFL, Switzerland, and spent one summer at IBM Research, Watson. Manos’ work is published in top conferences and journals of the community, like ACM SIGMOD, PVLDB, ACM TODS, VLDBJ, and others, and has been recognized by awards like “Best Demonstration” in VLDB 2023, “Best of SIGMOD” in 2017, “Best of VLDB” in 2010 and 2017, and “Most Reproducible Paper” at SIGMOD in 2016. Manos has been acting as a program committee member and technical reviewer in top data management conferences and journals for the past 12 years, having received the “Distinguished PC Member Award” for SIGMOD 2018 and SIGMOD 2023. He is currently an associate editor for ACM SIGMOD Record, co-chair of ACM SIGMOD 2023 Availability and Reproducibility, and co-chair of ICWE 2023 Industrial Track. His work is supported by several awards, including an NSF CRII award, an NSF CAREER award, a Facebook Faculty Research Award, multiple RedHat Collaboratory Research Incubation Awards, and a Cisco Research Award.

Accepted Papers

Full Papers

  • SFVInt: Simple, Fast and Generic Variable-Length Integer Decoding using Bit Manipulation Instructions
    Gang Liao (University of Maryland); Ye Liu (Bytedance); Yonghua Ding (Bytedance); Le Cai (Bytedance); Jianjun Chen (Bytedance)

  • The Price of Privacy: A Performance Study of Confidential Virtual Machines for Database Systems
    Lina Qiu (Boston University); Rebecca Taft (Cockroach Labs); Alexander Shraer (Cockroach Labs); George Kollios (Boston University)

  • Heterogeneous Intra-Pipeline Device-Parallel Aggregations
    Artem Kroviakov (TU Munich); Petr Kurapov (Intel); Christoph Anneser (TU Munich); Jana Giceva (TU Munich)

  • Simple, Efficient, and Robust Hash Tables for Join Processing
    Altan Birler (TU Munich); Tobias Schmidt (TU Munich); Philipp Fent (CedarDB); Thomas Neumann (TU Munich)

  • How Does Software Prefetching Work on GPU Query Processing?
    Yangshen Deng (Southern University of Science and Technology); Shiwen Chen (Southern University of Science and Technology); Zhaoyang Hong (Southern University of Science and Technology); Bo Tang (Southern University of Science and Technology)

  • Efficient Data Access Paths for Mixed Vector-Relational Search
    Viktor Sanca (EPFL); Anastasia Ailamaki (EPFL)

  • So Far and yet so Near - Accelerating Distributed Joins with CXL
    Alexander Baumstark (TU Ilmenau); Marcus Paradies (TU Ilmenau); Kai-Uwe Sattler (TU Ilmenau); Steffen Kläbe (Actian); Stephan Baumann (Actian)

  • Accelerating GPU Data Processing using FastLanes Compression
    Azim Afroozeh (CWI); Charlotte C Felius (CWI); Peter Boncz (CWI)

  • How to Be Fast and Not Furious: Looking Under the Hood of CPU Cache Prefetching
    Roland Kühn (TU Dortmund); Jan Mühlig (TU Dortmund); Jens Teubner (TU Dortmund)

  • NULLS!: Revisiting Null Representation in Modern Columnar Formats
    Xinyu Zeng (Tsinghua University); Ruijun Meng (Tsinghua University); Andrew Pavlo (Carnegie Mellon University); Wes McKinney (Posit PBC); Huanchen Zhang (Tsinghua University)


Short Papers

  • In situ Neighborhood Sampling for Large-Scale GNN Training
    Yuhang Song (Boston University); Po Hao Chen (Brown University); Yuchen Lu (Boston University); Naima Abrar Shami (Boston University); Vasiliki Kalavri (Boston University)

  • Performance or Efficiency? A Tale of Two Cores for DB Workloads
    Rathijit Sen (Microsoft)

  • Seamless: Transparent Storage Access Through Smart Switches
    Simon Binder (TU Darmstadt); Matthias Jasny (TU Darmstadt); Tobias Ziegler (TU Darmstadt)

  • DuckDB-SGX2: The Good, The Bad and The Ugly within Confidential Analytical Query Processing
    Ilaria Battiston (CWI); Charlotte C Felius (CWI); Sam Ansmink (DuckDB Labs); Laurens Kuiper (CWI, DuckDB Labs); Peter Boncz (CWI)

Program Committee

Chairs

Potrait of Carsten Binnig

Carsten Binnig

TU Darmstadt, Germany
carsten.binnig@cs.tu-darmstadt.de

Potrait of Nesime Tatbul

Nesime Tatbul

Intel Labs and MIT, USA
tatbul@csail.mit.edu

Members

  • Anastasia Ailamaki, EPFL, Switzerland
  • Beng Chin Ooi, National University of Singapore, Singapore
  • Danica Porobic, Oracle, Switzerland
  • David Cohen, Intel, USA
  • Huanchen Zhang, Tsinghua University, China
  • Ken Ross, Columbia University, USA
  • Lawrence Benson, TU Munich, Germany
  • Norman May, SAP SE, Germany
  • Orestis Polychroniou, Amazon Web Services, USA
  • Pinar Tozun, IT University of Copenhagen, Denmark
  • Rathijit Sen, Microsoft, USA
  • Rebecca Taft, Cockroach Labs, USA
  • Tilmann Rabl, HPI and University of Potsdam, Germany
  • Tobias Ziegler, TU Darmstadt, Germany
  • Utku Sirin, Harvard University, USA
  • Yannis Chronis, Google, USA

Steering Committee

Potrait of Anastasia Ailamaki

Anastasia Ailamaki

EPFL, Switzerland
anastasia.ailamaki@epfl.ch

Potrait of Peter Boncz

Peter Boncz

CWI, Netherlands
boncz@cwi.nl

Potrait of Stefan Manegold

Stefan Manegold

CWI, Netherlands
stefan.manegold@cwi.nl

Potrait of Ken Ross

Ken Ross

Columbia University, USA
kar@cs.columbia.edu