While I’m waiting for my ipfs add <200-gb-file-here>
to finish, I might as well write something...
Currently IPFS uses flatfs
as its datastore (a git-like layout of hash-named block files sharded into folders), and its performance is pretty bad due to filesystem overhead and the fact that filesystems generally don't make good databases.
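For a sense of what that means on disk, here is a rough sketch of a flatfs blocks directory (the shard names and hash below are made up for illustration; the real ones depend on your repo):

$ ls ~/.ipfs/blocks
5W/  6A/  7J/  SHARDING  _README
$ ls ~/.ipfs/blocks/5W
CIQNRIK...ZXO5WQ.data    # one small file per block, named after its hash

Every block becomes its own file, so adding a huge file means creating hundreds of thousands of files and directory entries.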
In the recent IPFS 0.5.0 release, the Badger datastore was finally marked stable, along with some other improvements.
It’s not used by default; when initializing the repo, you can run:
ipfs init -p badgerds
to enable badger as the datastore.
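For example, to spin up a fresh repo with Badger and check that the profile took effect (the IPFS_PATH override and the grep are just my own sketch, not anything the profile requires):

$ IPFS_PATH=~/.ipfs-badger ipfs init -p badgerds
$ IPFS_PATH=~/.ipfs-badger ipfs config show | grep -i badger
# the Datastore.Spec section should now reference badgerds instead of flatfs

As far as I know the profile only applies at init time; converting an existing repo in place needs a separate tool (ipfs-ds-convert), so a fresh repo is the easy path.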
This makes ipfs add <large-file>
a little bit faster (cutting the add time for my 270 GB file from 5.5 h to 4.5 h), and can be up to 32x faster than flatfs.
But even so IPFS is pretty slow: ipfs add --offline
still only processes at ~20 MB/s, far below the disk's speed (~200 MB/s on a 2 TB HDD on GCP). I still have no idea where the bottleneck is.
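A rough way to watch the throughput as it happens (assuming pv is installed, and that your ipfs add accepts stdin, which it does on the builds I've used):

$ pv big.bin | ipfs add --offline -q
# pv prints the live transfer rate; -q keeps ipfs add's output to just the hash
$ time ipfs add --offline big.bin
# or time the whole run and divide the file size by the wall-clock time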
Welp, guess I’ll probably do an IPFS lab on the local network sometime in the future.
Update: Remember to include the -w flag when running ipfs add!
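(-w is --wrap-with-directory: it wraps the added file in a directory object so the original filename is preserved. Roughly, with placeholder hashes:)

$ ipfs add -w big.bin
added <file-hash> big.bin
added <dir-hash>
# the file stays reachable by name as /ipfs/<dir-hash>/big.bin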