# Content Addressed Computing at IPFS Camp 2022

In the Compute-Over-Data track at the 2022 IPFS Camp, we heard from several projects and lots of people about how the landscape of computing is evolving, and how we believe computing can become refocused and more powerful by embracing content addressing and a “merkle-native” way of doing things. In this post, we’ll do a quick recap of what was covered.

Content-addressing and the use of merkle data structures – identifying data by a cryptographic hash – are already a well-known revolution in how data structures can be designed for decentralization, bringing benefits like verifiability, naturally coordination-free data deduplication, and so forth. In examining content-addressed computing, and looking for ways to embrace computing over content-addressed data (sometimes referred to in this community as compute-over-data for short), we look at how we can bring those virtues of verifiability and decentralization to data processing as well as to data storage and transport (where content-addressing has already become near-ubiquitous in modern protocol design), and we look for the new virtues that are unlocked by giving computations predictable, coordination-free identifiers.
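
To make that concrete, here is a loose, illustrative sketch of the idea in Go, with plain SHA-256 hex strings standing in for the real CIDs, multihashes, and canonical IPLD encodings these systems actually use. Nothing here is taken from any particular project; it only shows what it means for both data and whole computations to get identifiers derived purely from their content.

```go
// Illustrative only: real systems use CIDs/multihashes and canonical IPLD
// encodings rather than raw SHA-256 hex strings.
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// contentID returns a content-derived address for a blob of bytes.
func contentID(data []byte) string {
	sum := sha256.Sum256(data)
	return fmt.Sprintf("%x", sum)
}

// computationID derives a predictable, coordination-free identifier for
// "run this program over these inputs" from the content addresses of the
// program and its inputs. A real system would use a canonical encoding of
// this description; the point is only that the same description always
// hashes to the same identifier.
func computationID(programID string, inputIDs []string) string {
	desc := programID + "(" + strings.Join(inputIDs, ",") + ")"
	return contentID([]byte(desc))
}

func main() {
	program := contentID([]byte("my-transform, version 1"))
	inputA := contentID([]byte("dataset A"))
	inputB := contentID([]byte("dataset B"))

	fmt.Println("computation:", computationID(program, []string{inputA, inputB}))
}
```

Because the identifier depends only on the description of the computation, two parties who describe the same job independently arrive at the same name for it, with no registry or coordinator involved; that is the property the projects below build on.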

A wide range of projects exists in this space! In the videos of the event, you’ll find presentations of projects ranging in focus from Linux containers and how to unify them with content-addressed storage, all the way to new bytecode VMs that integrate directly with content-addressed storage and content-addressed code execution primitives. You’ll find approaches to scaling and to user adoption that range from developer-centric build tools all the way to projects focused on massively scaled, parallelized compute job scheduling. And you’ll find explorations of the virtues of developing systems with computation-addressable primitives, ranging from software security and reproducible builds, to public verifiability, to sheer scaling.

All of the talks were recorded, and you can also find the full Compute-Over-Data track playlist on the IPFS YouTube channel.

Here’s a quick summary of all the talks and their topics:

  • In the keynote, David Aronchick and Wes Floyd introduce us to the potential for revolution in the big data age, and what we mean by “compute over data”.
  • In “Warpforge — Hashes go in, hashes come out, exec in the middle!”, Eric Evenchick and Eric Myhre introduce Warpforge, a tool for declarative computation and software build pipelines, and demonstrate new data structures (in IPLD!) to describe decentralized package management – emphasizing how to collaborate without enforced central coordination. (A rough sketch of the hashes-in, hashes-out idea follows this list.)
  • In “Bacalhau — Bringing the Compute to the Data!”, David Aronchick tells the story of the Bacalhau project, its origin, motivations, and progress so far, and gives demos of using it to run distributed compute jobs.
  • In “FVM – The (EVM-Compatible!) Filecoin Virtual Machine”, Matt Hamilton shows a new computing environment called the FEVM, which hosts Ethereum-compatible smart contracts on-chain in Filecoin. This allows smart contracts that integrate with the state storage mechanisms of Filecoin. Applications of this could include automatic storage deal renegotiation, among other ideas. Live demos are included!
  • In “Zapps — A new standard for go-anywhere linux executables”, Eric Myhre dives into how to ship software on Linux, and demonstrates a new, drag-and-drop way to do it, with truly minimal system dependencies and without resorting to containers.
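
As mentioned in the Warpforge item above, here is a rough sketch of the “hashes go in, hashes come out, exec in the middle” shape of things. To be clear, the struct names and layout below are hypothetical, not Warpforge’s actual formula schema; they only illustrate a computation described entirely in terms of content addresses, which can itself be content-addressed and used as a memoization key.

```go
// Hypothetical sketch of a "hashes in, hashes out" computation description.
// This is not Warpforge's real schema; it only illustrates the shape.
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// Formula describes a computation purely in terms of content addresses:
// which hashes to mount where, what command to run in that filesystem,
// and which paths to collect (and hash) as outputs afterwards.
type Formula struct {
	Inputs  map[string]string `json:"inputs"`  // mount path -> content hash
	Action  []string          `json:"action"`  // command to exec in the middle
	Outputs []string          `json:"outputs"` // paths whose contents get hashed
}

// ID content-addresses the formula itself, so a record of
// "formula hash -> output hashes" becomes shareable and memoizable.
func (f Formula) ID() string {
	enc, _ := json.Marshal(f) // a real system would use a canonical encoding
	sum := sha256.Sum256(enc)
	return fmt.Sprintf("%x", sum)
}

func main() {
	f := Formula{
		Inputs: map[string]string{
			"/app":  "sha256:placeholder-for-a-real-hash",
			"/data": "sha256:another-placeholder",
		},
		Action:  []string{"/app/bin/transform", "/data"},
		Outputs: []string{"/output"},
	}
	fmt.Println("formula id:", f.ID())
	// Executing the action in a sandbox and hashing /output would complete
	// the "hashes in, hashes out" loop; that part is omitted here.
}
```

Under that framing, a record mapping a formula hash to its output hashes is something peers can share and check without any central coordination.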

Tons of questions were asked and answered throughout these talks:

  • Verification of compute in decentralized systems: how can we do it?
  • Deterministic computation: how does it relate to verification? And is it prevalent in the wild? (A tiny sketch after this list shows why determinism makes simple verification possible.)
  • What are the interventions we can make if deterministic computation isn’t prevalent in the wild, and we want to make it so, as a community?
  • How do markets relate to these systems? Can we make decentralized markets for data as well as the processing over that data?
  • How can we make it easier for people to get started in building new decentralized software?
  • How do we get software in the hands of end-users with less fuss? How do we make software packages easier to compose, so more people can join us more easily in building data pipelines?
  • How can we label, annotate, and share references to data, without central coordination? Hash-based identifiers are a given – where do we go from there?
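
On the verification and determinism questions above, here is a tiny, purely illustrative sketch (not taken from any of these projects) of the simplest possible approach: if a job is a pure function of its inputs, a verifier can re-run it and compare output hashes.

```go
// Minimal sketch: verification by recomputation, which only works when the
// job is deterministic (a pure function of its content-addressed inputs).
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// job stands in for a deterministic compute job: same input bytes in,
// same output bytes out, on any machine, every time.
func job(input []byte) []byte {
	return []byte(strings.ToUpper(string(input)))
}

func hash(b []byte) string {
	sum := sha256.Sum256(b)
	return fmt.Sprintf("%x", sum)
}

func main() {
	input := []byte("some content-addressed input data")

	// In practice this claim would arrive over the network from an
	// untrusted worker, alongside the input's content address.
	claimedOutputHash := hash(job(input))

	// A verifier re-runs the same job on the same input and compares.
	verified := hash(job(input)) == claimedOutputHash
	fmt.Println("result verified:", verified)
}
```

Anything nondeterministic in the job (timestamps, random seeds, network calls) breaks this recompute-and-compare check, which is why several of the questions above are about making determinism more prevalent.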

… and okay, some of these questions are just asked; not all of these questions are answered. 😉 Some of them are hard questions; some have multiple answers! And for the hard problems remaining, if you want to contribute in some way, you can find more information on getting in touch below. All of these projects are looking for both users and contributors.

The summaries above hopefully pique your interest – if so, watch the videos! Almost every one of these presentations included live demos, which are very cool, and hard to summarize in text 😉

# Getting in touch

If you’re looking for follow-up contacts for these groups:

For more information about IPFS Camp 2022 overall, the event info can be found on the IPFS Camp site. More information about the other tracks can be found grouped under this tag.