Learn: What Is Ethereum’s ‘Data Availability’ Problem, and Why Does It Matter?

Separate “data availability” layers could reduce congestion on the Ethereum network by making it easier for ancillary “rollup” networks to verify that transactional details exist.

Ethereum is suddenly becoming so crowded with new “layer 2” networks – separate blockchains that sit atop the main network and specialize in fast and cheap transactions – that experts are trying to figure out how to handle the growing number of transactional details.

The whole point of these layer 2 networks in the first place is to reduce congestion. So one way to accomplish that is to reduce the number of times that data needs to be downloaded from the main network.

Enter what’s known as the “data availability problem” – how to prove that the records of the transactional details exist, and are available if needed, without actually downloading them – using cryptography and advanced mathematics.

This article is featured in the latest issue of The Protocol, our weekly newsletter exploring the tech behind crypto, one block at a time.

Top developers of Ethereum have proposed their own plan for handling the data, known as EIP-4844 or colloquially “proto-danksharding.” The measure is expected to significantly scale the blockchain by introducing “blobs” for data, which help the blockchain to process that data more efficiently, and on the cheap.

But there’s also a new breed of players – Celestia and Avail are seen as two of the leaders in the space – trying to develop alternate solutions for data availability, arguing that the fullest implementation of Ethereum’s own proposed solution, known as danksharding, might still be years away.

Analogy to Google photo upload

According to Alchemy, a blockchain infrastructure startup, the data availability layer “is a system that stores and provides consensus on the availability of blockchain data.” Its goal is to help reduce the data load from a mainnet blockchain, and therefore lower transaction fees for users of layer 2s, also known as rollups.

Data availability layers, like Celestia and Avail, are betting that they are going to become more integral to layer 2s, as users and developers look for space for their data on Ethereum.

“Data availability is a solution to the problem of needing to make data available for anyone on the internet to download,” Nick White, the COO of Celestia Labs, said in a tweet.

Understanding data availability – sometimes shortened to DA – and how it works can get quite technical, but let's give it a shot:

An analogy that the team at Avail – originally conceived as a special project within the Ethereum scaling solution Polygon, but spun out earlier this year – likes to make is to a user who has uploaded a photo to Google and then wants to make sure the photo is actually there. The user queries Google, which responds with a fragment of the photo. That response is the confirmation: the user doesn't need to download the whole photo, just to make sure it's there.
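For readers who want to see the photo analogy in code, here is a minimal Python sketch of the "ask for a fragment, check it, skip the full download" idea. It is an illustration only, not how Avail, Celestia or Ethereum actually implement data availability: the Provider class, the chunk names and the sample counts are all hypothetical, and the sketch uses a plain Merkle tree and leaves out the erasure coding that real designs rely on so that a handful of samples gives strong guarantees.

```python
import hashlib
import random


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(chunks):
    """Return every level of a Merkle tree, leaves first.
    Assumes the number of chunks is a power of two."""
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels, index):
    """Collect the sibling hash at each level, from the leaf up to (not including) the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify(root, index, chunk, proof):
    """Recompute the root from one chunk and its sibling hashes."""
    node = h(chunk)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


class Provider:
    """Hypothetical storage provider that may be withholding some chunks."""

    def __init__(self, chunks, withheld=()):
        self.chunks = list(chunks)
        self.levels = build_tree(self.chunks)
        self.root = self.levels[-1][0]
        self.withheld = set(withheld)

    def query(self, index):
        if index in self.withheld:
            return None  # the fragment is not being served
        return self.chunks[index], make_proof(self.levels, index)


def sample_availability(provider, root, num_chunks, samples, rng):
    """Ask for `samples` distinct random fragments and check each against the root."""
    for index in rng.sample(range(num_chunks), samples):
        answer = provider.query(index)
        if answer is None:
            return False
        chunk, proof = answer
        if not verify(root, index, chunk, proof):
            return False
    return True


# Illustrative run: 16 made-up fragments, one honest provider, one hiding half the data.
chunks = [f"photo-fragment-{i}".encode() for i in range(16)]
honest = Provider(chunks)
cheater = Provider(chunks, withheld=range(8))
rng = random.Random()
print("honest provider passes:", sample_availability(honest, honest.root, 16, 4, rng))
print("cheating provider passes:", sample_availability(cheater, cheater.root, 16, 4, rng))
```

The checker only ever holds the 32-byte root and the few fragments it asked for. In this toy setup, a provider hiding half the fragments slips past four random samples only about 4% of the time (70 of the 1,820 possible sample sets), and more samples push that chance lower.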

The idea is that proving that the data exists, and is available if needed, is in and of itself a major task in the interaction between a layer 2 network and the main Ethereum blockchain, and one that can be handed off to a separate blockchain.

“Celestia and DA layers will be the security and scaling backbone of the entire blockchain ecosystem,” White told CoinDesk. “They will provide the raw input for running all decentralized applications, namely secure, trust-minimized blockspace.”

Proto-danksharding, then danksharding

Ethereum developers have explored other ways in which they can address the issue of data on the blockchain. Concepts like sharding, which splits the blockchain into smaller pieces, allow for more space to process transactions and therefore lower gas fees.

Proto-danksharding, or EIP-4844, is the first prototype for this concept, and it could go live as early as the end of this year during the Dencun upgrade.
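To give a rough sense of how blobs are sized and priced, the Python sketch below follows the fee logic described in the EIP-4844 specification. The constants and the fake_exponential helper are taken from that specification and should be checked against its current text; the function names calc_excess_blob_gas and base_fee_per_blob_gas, and the five-block simulation at the end, are simplified for illustration rather than copied from any client.

```python
# Constants as given in the EIP-4844 specification (assumption: verify against the current text).
FIELD_ELEMENTS_PER_BLOB = 4096
BYTES_PER_FIELD_ELEMENT = 32
GAS_PER_BLOB = 2**17
TARGET_BLOB_GAS_PER_BLOCK = 3 * GAS_PER_BLOB  # target of three blobs per block
MIN_BASE_FEE_PER_BLOB_GAS = 1                 # in wei
BLOB_BASE_FEE_UPDATE_FRACTION = 3338477

# Each blob carries 4096 * 32 = 131072 bytes (128 KiB) of data.
BLOB_SIZE_BYTES = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT


def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer approximation of factor * e**(numerator / denominator)."""
    i = 1
    output = 0
    numerator_accum = factor * denominator
    while numerator_accum > 0:
        output += numerator_accum
        numerator_accum = (numerator_accum * numerator) // (denominator * i)
        i += 1
    return output // denominator


def calc_excess_blob_gas(parent_excess: int, parent_blob_gas_used: int) -> int:
    """Running total of blob gas used above the per-block target."""
    total = parent_excess + parent_blob_gas_used
    return max(0, total - TARGET_BLOB_GAS_PER_BLOCK)


def base_fee_per_blob_gas(excess_blob_gas: int) -> int:
    """Blob base fee in wei; grows exponentially with the excess."""
    return fake_exponential(
        MIN_BASE_FEE_PER_BLOB_GAS, excess_blob_gas, BLOB_BASE_FEE_UPDATE_FRACTION
    )


# Illustrative simulation: five consecutive blocks that each carry six blobs.
excess = 0
for block in range(5):
    excess = calc_excess_blob_gas(excess, 6 * GAS_PER_BLOB)
    print(block, excess, base_fee_per_blob_gas(excess))
```

The design choice to note is that blob space has its own fee, separate from ordinary gas. The fee sits at its floor while blocks carry no more than the target amount of blob data, and climbs only when demand stays above that target.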

Still, layer 2s need to have that access to data now, and so data availability layers see themselves as a crucial element in helping rollups to succeed.

“The reality is that rollups are now acknowledged to be the best way to do execution,” said Anurag Arjun, the co-founder of Avail, which will provide data availability to layer 2s. “Avail is really a base layer that only focuses on what is important to rollups, which is data availability.”

Some developers behind layer 2s feel that while data availability layers currently play an important role, having data available from a layer 1 is unmatched.

“Your L1 data, fundamentally from just an architectural perspective, is going to be supreme. There's never going to be better data,” Karl Floersch, CEO of OP Labs, the main developer of the layer-2 blockchain Optimism, told CoinDesk. “That doesn't mean, though, that all DA providers are not useful. They are useful because they can augment it and there’s a second class of data availability that you can use. They're not a replacement for L1 data, they can just help assist it.”

Alex Gluchowski, CEO of Matter Labs, the company spearheading the zkSync Era rollup, says proto-danksharding is the preferred way of scaling, given that it inherits the underlying security from the Ethereum blockchain.

“The preferred option for our users, if you can afford it, will probably be proto-danksharding,” Gluchowski said. “If at some point we will see the prices rising up again, then there will be a large section of users who will prefer data availability solutions.”

Gluchowski doesn’t think that data availability solutions will disappear once proto-danksharding is live.

“But it doesn't mean that the existing players will remain,” he said.

According to Floersch, “No one data availability provider should take the whole alt-DA marketplace.”

“There should be a lot of different solutions, different trade-offs, different teams,” he said.

This article first appeared in CoinDesk, written by Margaux Nijkerk
