Open source and decentralization are the core ethos of Web 3.0. Chris Dixon explains why decentralization matters in a blog post on Medium. Decentralized file storage is critical to the success of Web 3.0, so that users and creators can take control of the internet once and for all. While blockchains are themselves decentralized data stores, they are not designed to hold large files. They are meant for handling transaction data, smart contracts, and their source code.
To put it in perspective, the Bitcoin and Ethereum ledgers are each less than 1 TB in size, whereas by some estimates the internet stored 40 billion TB of data in 2020. Most value creation or value transfer on the internet in the future will involve some content or file that needs a decentralized, permanent storage solution. The good news is that a number of projects are being built with decentralized, permanent storage in mind.
What is a Decentralized Storage Network?
Before learning about the best decentralized storage networks, let’s find out what a decentralized storage network is in the first place, shall we? When it comes to the storage of data, these are our usual options:
- Local storage for small files, such as USB drives.
- Centralized cloud storage, where your data is stored on cloud space owned and controlled by a third-party entity. The problem here is that your data is now effectively in the hands of the entity operating the cloud, so you lose out on freedom and security.
This is where decentralized storage networks come into play as a solution. Here, data is stored on a network spread among multiple users across the globe. These users are incentivized to join and operate the network so as to keep it decentralized and to ensure the data is accessible at all times. The servers are thus hosted by a group of people instead of a single authoritative body.
Anyone can join a decentralized network based on a blockchain. Smart contracts ensure the users’ integrity and authenticity, and users are incentivized with native tokens to run the network. These tokens come with a range of benefits, such as governance rights.
Why should you choose a decentralized storage network over centralized storage?
- The security factor is much stronger, and blockchain technology puts up greater resistance to data breaches.
- The chances of attacks like DDoS are much lower.
- You get to have complete ownership over your personal data.
- There is much less censorship and surveillance associated with decentralized storage networks.
History of Storage Solutions
For ages, humans have been attempting to figure out the best way to keep track of records and vital information. The ability to store and retrieve data is essential in our everyday lives, both at work and at home. Here’s a brief and somewhat chronological look at storage devices from the early computer era:
- Selectron Tube: Originally developed in 1946, the memory storage device proved expensive and suffered from production problems.
- Punch Cards: Early computers often used punch cards to input both programs and data. They were first invented in 1725 and were improved in the 1940s so that the data could be read.
- Magnetic Drum Memory: Invented in 1932, it was widely used in the 1950s and 60s as the main working memory of computers.
- Hard Drives: Invented by IBM in 1956, the first hard drive took up the majority of a room and used rapidly rotating disks to store data.
- Floppy Disks: The diskette was invented by IBM and widely used from the mid-1970s to the late 1990s.
- DVDs: This revolutionary form of data storage enabled people to store all manner of media on an external source, from files, to sound recordings, to videos.
- USB Drives: USB (Universal Serial Bus) was originally developed and introduced in 1996 as a way of setting up communication between a computer and peripheral devices by replacing many varieties of serial and parallel ports.
- Cloud Storage (today): More than half of businesses use cloud storage. As of 2013, 1 exabyte of data was stored in the cloud (that’s 1 billion GB).
Top Decentralized Storage Projects
1. InterPlanetary File System (IPFS)
The InterPlanetary File System (IPFS) was invented by Juan Benet at Protocol Labs. IPFS is a distributed file system used to store and share data. It is similar to BitTorrent, but for the web: files are not hosted in a single location, but rather by anyone who has a copy and wants to host it. Its main building blocks are:
- Distributed Hash Table: used to store and retrieve data between network nodes
- BitSwap: a peer-to-peer file-sharing protocol that coordinates data transmission between untrusted swarms
- Merkle DAG: used to track changes to files on the network in a distributed-friendly way
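To make the Merkle DAG idea more concrete, here is a minimal sketch in Python. The helpers `node_id` and `build_file` are hypothetical names, and the JSON-plus-SHA-256 encoding is an assumption chosen for readability, not the encoding IPFS actually uses. It only shows that editing one chunk changes that chunk's id and the root id, while untouched chunks keep theirs.

```python
import hashlib
import json


def node_id(data: bytes, links: list[str]) -> str:
    """Hash a node's payload together with the ids of its children."""
    payload = json.dumps({"data": data.hex(), "links": links}).encode()
    return hashlib.sha256(payload).hexdigest()


def build_file(chunks: list[bytes]) -> tuple[str, list[str]]:
    """Return (root id, leaf ids) for a file split into chunks."""
    leaves = [node_id(chunk, []) for chunk in chunks]
    root = node_id(b"", leaves)  # the root only stores links to its leaves
    return root, leaves


root_v1, leaves_v1 = build_file([b"chunk-1", b"chunk-2", b"chunk-3"])
root_v2, leaves_v2 = build_file([b"chunk-1", b"chunk-2 edited", b"chunk-3"])

# Only the edited chunk gets a new leaf id, but the root id changes too.
unchanged = [a == b for a, b in zip(leaves_v1, leaves_v2)]
print(unchanged)           # [True, False, True]
print(root_v1 == root_v2)  # False
```

Because unchanged chunks keep their ids, two versions of a file can share most of their stored blocks.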
Data Sharing in IPFS
IPFS runs a process called ‘garbage collection’ to free up storage by deleting cached objects that have not been pinned. A local ‘pin’ may be added to ensure that the data is kept locally. For larger data sizes and better safety, ‘pinning services’ may be used; for a fee, they ensure that the data is saved on multiple nodes.
IPFS does not guarantee long-term storage. However, IPFS can be integrated with tools such as Filecoin to ensure permanence. These integrations may require a minimum file size, involve a cost, and can have longer retrieval times.
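The interaction between pinning and garbage collection can be sketched as follows. `ToyNode` is a hypothetical class, not the IPFS API, and it assumes the simplest possible policy: every block without a pin is eligible for collection.

```python
class ToyNode:
    """A toy stand-in for a local repository (hypothetical, not the IPFS API)."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}
        self.pinned: set[str] = set()

    def add(self, cid: str, data: bytes, pin: bool = False) -> None:
        """Store a block and optionally pin it."""
        self.blocks[cid] = data
        if pin:
            self.pinned.add(cid)

    def collect_garbage(self) -> list[str]:
        """Delete every block that is not pinned and return the removed ids."""
        removed = sorted(cid for cid in self.blocks if cid not in self.pinned)
        for cid in removed:
            del self.blocks[cid]
        return removed


node = ToyNode()
node.add("cid-report", b"quarterly report", pin=True)
node.add("cid-cache", b"page fetched while browsing")

print(node.collect_garbage())  # ['cid-cache']
print(sorted(node.blocks))     # ['cid-report']
```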
Content addressing using Content Identifiers (CIDs) ensures the immutability of content: any change in content generates a new content identifier. For Web 3.0 use cases such as NFTs, integrity is ensured by linking on-chain ‘metadata’ that includes the IPFS URL created from the CID of the content stored on IPFS.
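As a rough illustration of how metadata can pin down content, the sketch below derives an identifier from a SHA-256 digest and embeds it in an `ipfs://` URL. The `toy_cid` and `matches_metadata` helpers are hypothetical, and the digest is not a real CID (real CIDs add multihash, codec and base encoding); the metadata layout is likewise an assumption.

```python
import hashlib


def toy_cid(content: bytes) -> str:
    """Hex SHA-256 digest used as a stand-in for a content identifier."""
    return hashlib.sha256(content).hexdigest()


def matches_metadata(content: bytes, metadata: dict) -> bool:
    """Check that content hashes to the identifier recorded in the metadata."""
    return metadata["image"] == f"ipfs://{toy_cid(content)}"


artwork = b"pixel data of the artwork"
# Assumed metadata layout: a name plus a URL built from the identifier.
metadata = {"name": "Example NFT", "image": f"ipfs://{toy_cid(artwork)}"}

print(matches_metadata(artwork, metadata))                 # True
print(matches_metadata(artwork + b" tampered", metadata))  # False
```

Anyone holding the metadata can therefore detect whether the file they retrieved is the one that was originally referenced.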
The default layer of IPFS does not have a consensus mechanism. However, layers built on IPFS, such as IPFS Cluster and Filecoin, have their own consensus mechanisms.
IPFS Cluster offers CRDT (Conflict-free Replicated Data Type) and Raft consensus mechanisms. CRDT is used for a fluid peer network where peers frequently enter and exit and not all peers have modification permissions. It also supports batching of pinning and unpinning operations. Raft consensus is an older but proven mechanism used when the peer network is fully trusted. Filecoin uses its own consensus mechanism called ‘Proof of storage’.
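To show what "conflict-free" means in practice, here is a small sketch of a two-phase set, one of the simplest CRDTs. It is an illustration only and is not how IPFS Cluster implements its pinset; `TwoPhasePinSet` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class TwoPhasePinSet:
    """Toy CRDT: a set of added ids and a set of removed ids."""

    added: set = field(default_factory=set)
    removed: set = field(default_factory=set)

    def pin(self, cid: str) -> None:
        self.added.add(cid)

    def unpin(self, cid: str) -> None:
        self.removed.add(cid)

    def merge(self, other: "TwoPhasePinSet") -> "TwoPhasePinSet":
        """Merging is a union of both sets, so the order of merges is irrelevant."""
        return TwoPhasePinSet(self.added | other.added, self.removed | other.removed)

    def pins(self) -> set:
        return self.added - self.removed


peer_a = TwoPhasePinSet()
peer_a.pin("cid-1")
peer_a.pin("cid-2")

peer_b = TwoPhasePinSet()
peer_b.pin("cid-2")
peer_b.pin("cid-3")
peer_b.unpin("cid-2")

# Both peers converge on the same state whichever way the merge runs.
print(peer_a.merge(peer_b) == peer_b.merge(peer_a))  # True
print(sorted(peer_a.merge(peer_b).pins()))           # ['cid-1', 'cid-3']
```

The trade-off of this particular design is that an unpinned id can never be pinned again, which is why production systems use richer CRDTs.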
Cost of Storage
IPFS is free to use. However, uploading files to IPFS does not ensure permanence, as the data is stored only on your computer. To ensure that the data is replicated in multiple places, pinning services and permanence tools are needed, which incur a fee.
- Content addressing using Content Identifiers (CID) ensures immutability
- Optimized linking and addressing of content using Merkle Directed Acyclic Graphs (Merkle DAGs)
- Content discovery using Distributed Hash Tables (DHT)
2. BitTorrent File System
BitTorrent is a decentralized file-sharing protocol created by developer Bram Cohen in 2001. In 2018, the Tron Foundation acquired BitTorrent and launched the BTT token in 2019.
Instead of downloading or uploading files to a single server, users join a network of computers running software that allows them to exchange files and data with one another. The BTT token powers decentralized applications including DLive, BitTorrent Speed, BitTorrent File System, and many others.
Data sharing in BitTorrent
Proof of storage contract – BTFS uses multiple smart contracts between the renters (file owners) and hosts (storage providers) to ensure that the files are stored. The hosts provide periodic proof of storage, failing which, they are subject to fines.
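A periodic proof of storage can be pictured as a challenge-response exchange. The sketch below is an assumption-laden illustration, not the BTFS contract logic: the renter precomputes answers to random nonces before handing the file over, and a host can only answer correctly if it still holds the bytes. The `prove` and `make_challenges` helpers are hypothetical.

```python
import hashlib
import secrets


def prove(nonce: bytes, stored: bytes) -> str:
    """Answer a challenge by hashing the nonce together with the stored bytes."""
    return hashlib.sha256(nonce + stored).hexdigest()


def make_challenges(data: bytes, count: int) -> list[tuple[bytes, str]]:
    """Renter side: precompute (nonce, expected answer) pairs before upload."""
    challenges = []
    for _ in range(count):
        nonce = secrets.token_bytes(16)
        challenges.append((nonce, prove(nonce, data)))
    return challenges


file_data = b"contents of the rented file"
challenges = make_challenges(file_data, count=2)

honest_host = file_data  # still stores the file
lazy_host = b""          # silently dropped the file

nonce, expected = challenges[0]
print(prove(nonce, honest_host) == expected)  # True
nonce, expected = challenges[1]
print(prove(nonce, lazy_host) == expected)    # False
```

In a real system a failed answer would trigger the fine described above; here it is simply a failed comparison.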
Data never gets stored on BitTorrent servers. Once users download the BitTorrent client, they can manage a piece of data for a lifetime. BitTorrent Sync users are able to keep using Sync even if the program goes offline.
The total and maximum supply of BTT is 990 trillion tokens (990 billion before the 2022 redenomination at a 1:1,000 ratio), while the circulating supply at the time of writing was about 923,767.70 billion BTT.
CID – Content identifiers
Staking contracts: Hosts stake BitTorrent Token (BTT), and uploaded files are sent to them for storage on the basis of that stake.
Cost of storage
BTFS is a fork of IPFS. It has additional features such as token economics with BitTorrent Token (BTT) integration, file encryption, and file removal for hosts.
3. Arweave
Arweave was originally named Archain. It is a decentralized storage network founded by Sam Williams and William Jones. The goal of Arweave is to permanently store files over a distributed network of computers.
Arweave protocol works on two layers:
- Blockweave: Arweave stores its data in a graph of blocks. Each block is linked to two earlier blocks in Arweave, forming a structure called a “blockweave.”
- Permaweb: everything published on the permaweb is available forever. The permaweb offers low-cost, zero maintenance, permanent hosting of web apps and web pages.
Data sharing in Arweave
Arweave uses token economics with AR tokens to achieve persistence. A storage endowment ensures that mining on Arweave remains profitable and sustainable by keeping rewards higher than the expenditure needed to maintain the data. Miner rewards consist of a percentage of the transaction fees (the sum of all transaction fees in a block), an inflation reward (a gradually decreasing function of block height) and an endowment reward (paid out if the other two components are lower than expenditure). A percentage of the transaction fees is directed to the endowment wallet.
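The reward components described above can be put into a small worked example. Everything numeric here is hypothetical, and the sketch assumes the endowment pays exactly the shortfall between storage expenditure and the other two components, which the text does not specify. `block_reward` is an illustrative helper, not part of any Arweave client.

```python
def block_reward(tx_fees: int, miner_fee_percent: int,
                 inflation_reward: int, storage_cost: int) -> dict[str, int]:
    """Split a block's income between the miner and the endowment (toy model)."""
    fee_reward = tx_fees * miner_fee_percent // 100
    to_endowment = tx_fees - fee_reward  # remainder of the fees is saved
    shortfall = storage_cost - (fee_reward + inflation_reward)
    endowment_payout = max(0, shortfall)  # assumed: top up only when needed
    return {
        "fee_reward": fee_reward,
        "inflation_reward": inflation_reward,
        "endowment_payout": endowment_payout,
        "total": fee_reward + inflation_reward + endowment_payout,
        "to_endowment": to_endowment,
    }


reward = block_reward(tx_fees=100, miner_fee_percent=20,
                      inflation_reward=30, storage_cost=70)
print(reward["endowment_payout"])  # 20
print(reward["total"])             # 70
```

With these made-up numbers the miner's fee and inflation income (50) falls short of the storage cost (70), so the endowment contributes the remaining 20.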
Arweave is designed for ‘data permanence’ while recognizing that a new, better permanent data storage mechanism may emerge into which Arweave’s data may be subsumed.
The blockweave data structure, in which miners are incentivized to store blocks, ensures immutability. Also, unlike traditional storage, which requires a fixed number of replications, the blockweave uses a probabilistic approach to ensure the right number of replications, with higher incentives to store rare data.
Proof of Access: A node with access to a randomly selected ‘recall’ block wins the block and validators validate this proof. PoA is an enhancement of Proof of Work and incentivizes miners to store blocks and win rewards.
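A simplified way to picture Proof of Access is shown below. It assumes the recall block is chosen by hashing the previous block's hash and reducing it modulo the current height; Arweave's actual derivation and mining rules are more involved. `recall_index` and `can_mine` are hypothetical helper names.

```python
import hashlib


def recall_index(prev_hash: str, height: int) -> int:
    """Pick a pseudo-random earlier block from the previous block's hash."""
    digest = hashlib.sha256(prev_hash.encode()).digest()
    return int.from_bytes(digest, "big") % height


def can_mine(stored_blocks: set[int], prev_hash: str, height: int) -> bool:
    """A miner may compete only if it holds the selected recall block."""
    return recall_index(prev_hash, height) in stored_blocks


height = 100
prev_hash = "hash-of-block-99"  # placeholder value for illustration

full_node = set(range(height))  # stores every earlier block
empty_node: set[int] = set()    # stores nothing

print(can_mine(full_node, prev_hash, height))   # True
print(can_mine(empty_node, prev_hash, height))  # False
```

A miner that stores only part of the history can compete in roughly that fraction of rounds, which is the incentive to store more blocks.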
Cost of storage
Approximately $8 per GB as a one-time fee for lifetime storage, with no monthly subscription.
Pay only once for permanent storage
Blockweave: Each block is linked to 2 prior blocks: the previous block and a random block from the previous history of the blockchain, a recall block.
Proof of access: Miners are incentivized to store rare blocks thereby following a probabilistic and incentive-driven approach to replicate data in the network.
Memoization of state: New nodes need not download all the previous blocks. New users can download only the current block from trusted peers, or use the block data structures (Block Hash List and Wallet List) to verify or request old blocks. These lists are synchronized and kept up to date by the miners. Each node can prioritize its storage according to its preferences or resources, and the network will still be able to guarantee storage and replication.
4. Filecoin
Filecoin is a decentralized storage network in which anybody can rent storage space. Instead of being entrusted to one company, your documents can be split up and stored on computers all over the world. It is an incentive layer built on top of IPFS that encourages users to rent out their storage space by paying them in FIL tokens.
Data sharing in Filecoin
Filecoin uses ‘Proof of Replication’ to verify that a node has stored data and ‘Proof of Spacetime’ to ensure that the data stays stored over a period of time.
Filecoin uses FIL tokens and multiple mechanisms associated with the tokens to incentivize long-term storage.
- Network baseline: In order to prevent miners from exiting the network after cashing out early rewards, the block rewards are scaled up till the network reaches a certain baseline.
- Collateral: Miners pledge collateral funds to guarantee storage for a certain period of time. In case the miner fails to provide storage or goes offline during the guaranteed duration, a portion of the collateral and block rewards would be taken away.
- Reliability: Filecoin provides the option to store files with multiple independent miners and the option to verify that independent copies are being stored, which makes it fault-tolerant
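The collateral mechanism in the list above can be sketched as a simple penalty rule. The class name, the token amounts and the 10% penalty are all hypothetical; Filecoin's real penalty schedule is more detailed and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class StorageDeal:
    collateral: int     # pledged funds, in hypothetical token units
    slash_percent: int  # assumed share of collateral lost per missed proof

    def record_proof(self, proof_ok: bool) -> int:
        """Return the amount slashed for this proving period."""
        if proof_ok:
            return 0
        penalty = self.collateral * self.slash_percent // 100
        self.collateral -= penalty
        return penalty


deal = StorageDeal(collateral=1000, slash_percent=10)
print(deal.record_proof(True))   # 0
print(deal.record_proof(False))  # 100
print(deal.collateral)           # 900
```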
Filecoin is built on IPFS. It uses CIDs for addressing content, which ensures immutability: any change in content will result in a new CID.
Filecoin uses a ‘Useful work’ consensus protocol in which the probability of a miner being chosen to create a block is proportional to the storage that miner contributes.
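The idea of selection proportional to storage can be illustrated with a weighted random draw. This is only a sketch of the proportionality, not Filecoin's actual election procedure; the miner names and storage figures are made up.

```python
import random
from collections import Counter

# Hypothetical storage contributed by each miner, in TiB.
storage_tib = {"miner-a": 60, "miner-b": 30, "miner-c": 10}


def elect(rng: random.Random) -> str:
    """Pick one miner with probability proportional to its storage."""
    names = list(storage_tib)
    weights = [storage_tib[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed so repeated runs give the same counts
wins = Counter(elect(rng) for _ in range(10_000))

for name in storage_tib:
    print(name, wins[name] / 10_000)
```

Over many rounds the share of wins for each miner should approach its share of total storage (60%, 30% and 10% in this example).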
Cost of storage
Some storage providers such as web3.storage offer storage on Filecoin for free. This is because miners’ rewards are higher if they are already storing data. The average cost as per file.app is $0.0000009 per GB per month.
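As a quick worked comparison, the snippet below plugs the per-GB prices quoted in this article into two small functions. The prices are the article's own figures and will drift over time; the function names and the 1,000 GB example are assumptions for illustration.

```python
from decimal import Decimal

FILECOIN_USD_PER_GB_MONTH = Decimal("0.0000009")  # monthly figure quoted above
ARWEAVE_USD_PER_GB_ONCE = Decimal("8")            # one-time figure from the Arweave section


def filecoin_cost(gigabytes: int, months: int) -> Decimal:
    """Recurring cost: price per GB per month times size times duration."""
    return FILECOIN_USD_PER_GB_MONTH * gigabytes * months


def arweave_cost(gigabytes: int) -> Decimal:
    """One-time cost: price per GB times size."""
    return ARWEAVE_USD_PER_GB_ONCE * gigabytes


print(f"{filecoin_cost(1000, 12):.4f}")  # 0.0108
print(f"{arweave_cost(1000):.2f}")       # 8000.00
```

The two numbers are not directly comparable, since one buys a year of storage and the other is meant to buy permanence.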
- Proof of storage using proof of replication and proof of space-time
- Consensus protocol rewards miners with higher storage
- Collateral to ensure reliability
5. Sia
Siacoin (SC) is the native token of the Sia network. It allows any computer to rent out unused hard drive space to users looking to store files. Sia has designed software capable of creating a peer-to-peer (P2P) storage network that anyone can be part of.
Its platform has the ability to use smart contracts. They serve as an agreement between the customer and the storage service provider. Being decentralized storage, it has no single point of failure.
Data sharing in Sia
6. Storj
Storj is an open-source decentralized network for storing data. It aims to achieve high durability with minimum expansion. It does not follow the typical approach of replicating whole files across multiple nodes; instead, it fragments files across multiple nodes, thereby maintaining high durability (the probability of surviving an outage) with low expansion (the additional storage required).
Storage nodes and applications: enable anyone with spare disk space and an internet connection to join the network.
Uplink clients, developer tools: mechanisms to upload and download data.
Satellite nodes: in the absence of a consensus mechanism, trusted nodes manage the metadata, node information, data repair and payouts.
Data sharing in Storj
Storj uses erasure coding to ensure that data persists without increasing network traffic. Data is broken into fragments, encoded, and saved on multiple nodes. Compared with replication, this needs a much lower expansion factor (that is, additional storage) for the same durability, or probability of recovering the data in case of an outage.
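The trade-off between expansion and durability can be checked with a short calculation. The sketch assumes each node fails independently with the same probability, and the parameters (3 full copies versus a 20-of-40 erasure code, 10% failure chance) are illustrative choices, not Storj's production settings.

```python
from math import comb


def expansion_factor(k: int, n: int) -> float:
    """Stored bytes divided by original bytes when any k of n pieces suffice."""
    return n / k


def survival_probability(k: int, n: int, p_fail: float) -> float:
    """Probability that at least k of n independently failing pieces survive."""
    p_ok = 1 - p_fail
    return sum(comb(n, i) * p_ok**i * p_fail**(n - i) for i in range(k, n + 1))


p_fail = 0.1  # assumed chance that any single node is lost

replication = survival_probability(1, 3, p_fail)  # three full copies
erasure = survival_probability(20, 40, p_fail)    # any 20 of 40 pieces

print(expansion_factor(1, 3))    # 3.0
print(expansion_factor(20, 40))  # 2.0
print(erasure > replication)     # True
```

Under these assumptions the erasure-coded layout stores less extra data than triple replication and is still more likely to survive.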
Audits are run to check how the system is performing. A failed audit results in the file being recovered from the remaining nodes, with the missing pieces reconstructed and saved on other nodes.
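Repair after a failed audit can be demonstrated with the simplest possible erasure code: two data pieces plus one XOR parity piece. This is a deliberately simpler technique than the Reed-Solomon coding used by production networks, and the `encode` and `repair` helpers are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes) -> list:
    """Split data into two halves and add one parity piece."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even length
    half = len(data) // 2
    first, second = data[:half], data[half:]
    return [first, second, xor_bytes(first, second)]


def repair(pieces: list) -> list:
    """Rebuild a single missing piece (marked None) from the other two."""
    missing = [i for i, piece in enumerate(pieces) if piece is None]
    if len(missing) > 1:
        raise ValueError("this toy code tolerates only one lost piece")
    if missing:
        survivors = [piece for piece in pieces if piece is not None]
        pieces[missing[0]] = xor_bytes(survivors[0], survivors[1])
    return pieces


data = b"storj-demo-data!"
pieces = encode(data)
pieces[1] = None  # a node failed its audit and its piece is gone

repaired = repair(pieces)
print(repaired[0] + repaired[1] == data)  # True
```

Any one of the three pieces can be lost and rebuilt, at an expansion factor of 1.5 instead of the 2.0 needed for a full second copy.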
Metadata entries – any modification will require metadata entries to be modified.
Storj currently has no consensus mechanism and relies on satellite operators instead. There are plans to build a mechanism in the future.
Cost of storage
- High durability with fragmentation (erasure code)
- AWS S3 compatibility
- Privacy with Encryption
Centralized Vs Decentralized Storage – The Difference
|Centralized Storage|Decentralized Storage|
|---|---|
|Single storage provider|No dependency on a single platform|
|Subject to censorship|Censorship resistant|
|Pricing decided by a centralized authority in an organization|Pricing decided by the free market and democratic protocols|
|Limited redundancy|High redundancy|
|Relatively expensive|Typically cheaper than centralized solutions|
|Profits are largely accumulated at the top and trickled down|Funds generated via fees are used to incentivize storage providers directly and to develop the decentralized ecosystem|
|Network information and source code are not publicly shared|Network information and source code are open source, with incentives for developers to contribute to the network|
Disclaimer: Opinions expressed in this publication are those of the author(s). They do not necessarily purport to reflect the opinions or views of the Shardeum Foundation.
About the Author(s) :
Harsha Karanth has been in the energy and e-commerce industries before he came across web 3.0. He is enthusiastic about building impact projects on web 3, particularly in the sectors of environment, animal care and education. You can follow him on Twitter
Shuwam Rana is a Technical Analyst, Digital marketer and SEO expert with a passion to help businesses grow. He also has an engineering background. You can follow him on Twitter