<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>distributed - musicmatzes blog</title>
    <link>https://beyermatthias.de/tag:distributed</link>
    <description></description>
    <pubDate>Wed, 06 May 2026 11:43:36 +0200</pubDate>
    <item>
      <title>Blueprint of a distributed social network on IPFS - and its problems (2)</title>
      <link>https://beyermatthias.de/blueprint-of-a-distributed-social-network-on-ipfs-and-its-problems-2</link>
      <description>&lt;![CDATA[After thinking a while about the points I layed out&#xA;in my previous post&#xA;I&#39;d like to update my ideas here.&#xA;&#xA;It is not necessary to read the first post to understand what I am talking&#xA;about in this second one, but it also does not do any harm.&#xA;&#xA;  Matrix and Mastodon are nice - but federation is only the first step - we&#xA;  have to go towards fully distributed applications!&#xA;&#xA;(me, at the 34. Chaos Communication Congress 2017)&#xA;&#xA;The idea&#xA;&#xA;With the rise of protocols like the matrix protocol, activitypub and others,&#xA;decentralized social community platforms like matrix, mastodon and others gained&#xA;power and were made real.&#xA;I consider these platforms, especially mastodon and matrix, to be great steps&#xA;into the future and am using both enthusiastically.&#xA;But I think we can do better. Federation is the first step out of&#xA;centralization and definitively a good one. But we have to push further -&#xA;towards full distributed environments!&#xA;&#xA;(For a &#34;Why?&#34; have a look at the end of the article!)&#xA;&#xA;How would it work?&#xA;&#xA;The foundations how a social network on IPFS would work are rather simple.&#xA;I am very tempted to use the un-word &#34;blockchain&#34; in this article, but because&#xA;of the hype around that word and because nobody really understands what a&#xA;&#34;blockchain&#34; actually is, I refrain from using it.&#xA;&#xA;I use a better one instead: &#34;DAG&#34; - &#34;Directed Acyclic Graph&#34;. Also &#34;Merkle-Tree&#34;&#xA;is a term which could be used, but when using this term, a notion of&#xA;implementation-details comes to mind and I want to avoid that. One instantly&#xA;thinks of crypto, hash values and blobs when talking about hash trees or&#xA;merkle trees. 
A DAG though is a bit more abstract concept which fits my ideas&#xA;better.&#xA;&#xA;What we would need to develop a social network (its core functionality) on IPFS&#xA;is a DAG and some standard data formats we agree upon.&#xA;We also need a private-public-key infrastructure, which IPFS already has.&#xA;&#xA;There are two &#34;kinds&#34; of data which must be considered: meta-data (which&#xA;should be replicated by as many nodes as possible) and actual user data&#xA;(posts, messages, images, videos, files).&#xA;I&#39;m not talking about the second one here very much, because the meta-data is&#xA;where the problems are.&#xA;&#xA;Consider the following metadata blob:&#xA;&#xA;{&#xA;  &#34;version&#34;: 1,&#xA;  &#34;previous&#34;: [ &#34;Qm...1234567890&#34; ],&#xA;&#xA;  &#34;profile&#34;: [ &#34;Qm...098765&#34;, &#34;Qm...54312&#34; ],&#xA;&#xA;  &#34;post&#34;: {&#xA;    &#34;mimetype&#34;: &#34;text/plain&#34;,&#xA;    &#34;nodes&#34;: [&#34;Qm...abc&#34;],&#xA;    &#34;date&#34;: &#34;2018-01-02T03:04:05+0200&#34;,&#xA;    &#34;reply&#34;: [],&#xA;  },&#xA;&#xA;  &#34;publicfollow&#34;: [ &#34;Qm...efg&#34;, &#34;Qm...hij&#34; ]&#xA;}&#xA;&#xA;The version key describes the version of the protocol, of course.&#xA;&#xA;Here, the previous array points to the previous metadata blob(s).&#xA;  We need multiple entries here (an array) because we want to create a DAG.&#xA;&#xA;The profile key holds a list of IPNS names which are associated with the&#xA;  profile.&#xA;&#xA;The version, previous and profile keys are the only ones required in&#xA;such a metadata blob.&#xA;All other keys shown above are optional, though one metadata-blob should&#xA;only contain one at a time (or none).&#xA;&#xA;The post table describes the actual userdata. Some meta-information is&#xA;  added, for example the mimetype (&#34;text/plain&#34; in this case) and the date it&#xA;  was created. 
More can be thought of.&#xA;  The nodes key points to a list of actual content (again via IPFS hashes).&#xA;  I&#39;m not yet convinced whether this shall be a list or a single value.&#xA;  Details!&#xA;  I&#39;d say that these three keys are required in a post table.&#xA;  The reply key notes that this post is a reply to another post. This is, of&#xA;  course, optional.&#xA;&#xA;The publicfollow is a list of IPNS hashes to other profiles which the user&#xA;  follows publicly.&#xA;  Whether such a thing is desireable is to be discussed.&#xA;  I show it here to give a hint on the possibilities.&#xA;&#xA;More such data could be considered, though the meta-data blobs should be&#xA;  held small: If one thinks of 4kb per meta-data blob (which is a lot) and&#xA;  10 million blobs (which I do not consider that much, because every&#xA;  interaction which is a input into the network in one form or another results&#xA;  in a new meta-data blob), we have roughly 38 GB of meta-data content, which is&#xA;  really too much.&#xA;  If we have 250 bytes per metadata-blob (which sounds like a reasonable size)&#xA;  we get 2.3 GB of meta-data for 10 million blobs. That sounds much better.&#xA;&#xA;The profile DAG&#xA;&#xA;The idea of linking the previous version of a profile from each new version of&#xA;the profile is of course one of the key points.&#xA;With this approach, nobody has to fetch the whole list of profile versions.&#xA;Traversing the whole chain backwards is only required if a user wants to see&#xA;old content from the profile she&#39;s browsing.&#xA;&#xA;Because of IPFS and its caching, content automatically gets replicated over&#xA;nodes as users browse profiles.&#xA;Nodes can cache either only meta-data blobs (not so much data) or user content&#xA;as well (more data). 
This can happen automatically or user-driven - several&#xA;possibilities here!&#xA;It is even possible that users &#34;pin&#34; content if they think its important to&#xA;keep it.&#xA;&#xA;Profile updates can even be &#34;announced&#34; using PubSub so other nodes can then&#xA;fetch the new profile versions and cache them. The latest profile&#xA;metadata-blob (or &#34;version&#34;) can be published via a IPNS name.&#xA;The IPNS name should be published per-device and not per-account.&#xA;(This is also why there is a devices array in the metadata JSON blob!)&#xA;&#xA;Why should we publish IPNS names per-device and why do we actually need a DAG&#xA;here? That&#39;s actually because of we want multi-device support!&#xA;&#xA;Multi-device support&#xA;&#xA;I already mentioned that the profile-chain would be a DAG.&#xA;I also mentioned that there would be a profile key in the meta-data blob.&#xA;&#xA;This is because of the multi-device support.&#xA;If two, three or even more devices need to post to one account, we need to be&#xA;able to merge different versions of an account: Consider Alice and Bob sharing&#xA;one account (which would be possible!). Now, Bob loses connection to the&#xA;internet. 
But because we are on IPFS and work offline, this is not a problem.&#xA;Alice and Bob could continue creating content and thus new profile versions:&#xA;&#xA;A &lt;--- B &lt;--- C &lt;--- D &lt;--- E&#xA;        \&#xA;         C&#39; &lt;--- D&#39; &lt;--- E&#39;&#xA;&#xA;In the shown DAG, Alice posts C, D and E, each referring to the former.&#xA;Bob creates C&#39;, D&#39; and E&#39; - each refering to the former.&#xA;Of course both C and C&#39; would refer to B.&#xA;&#xA;As soon as Bob comes back online, Alice notices that there is another chain of&#xA;posts to the profile and can now merge the chains be publishing a new&#xA;version F which points to both E and E&#39;:&#xA;&#xA;A &lt;--- B &lt;--- C &lt;--- D &lt;--- E &lt;--- F&#xA;        \                         /&#xA;         C&#39; &lt;--- D&#39; &lt;--- E&#39; &lt;-----&#xA;&#xA;Because Bob would also see another chain, his client would also provide a new&#xA;version of the profile (F&#39;) where E and E&#39; are merged - one of the&#xA;problem which must be sorted out. But a rather trivial one in my opinion, as&#xA;the clients need only to do some sort of leader-election. And this election is&#xA;temporary until a new node is published - so not really a complicated form&#xA;of concensus-finding!&#xA;&#xA;What has to be sorted out, though, is that the devices/nodes which share an&#xA;account and now need to agree upon which one merges the chains need some form&#xA;of communication between them. I have not yet thought about how this should be&#xA;done. Maybe IPFS PubSub is a viable option for this. Cryptographic signatures&#xA;play a important role here.&#xA;&#xA;This gets a bit more complicated if there are more than two devices posting to&#xA;one account and also if some of them are not available yet - though it is&#xA;still in a problem space near &#34;we have to think hard about this&#34; ... 
and&#xA;nowhere in the space of &#34;seems impossible&#34;!&#xA;&#xA;The profile key is provided in the account data so the client knows which&#xA;other chains should be checked and merged. Thus, only nodes which are already&#xA;allowed to publish new profile versions are actually allowed to add new nodes&#xA;to that list.&#xA;&#xA;Deleting content in the DAG&#xA;&#xA;Deleting old versions of the profile - or old content - is possible, too.&#xA;Because the previous key is an array, we can refer to multiple old&#xA;versions of a profile.&#xA;&#xA;Consider the following chain of profile versions:&#xA;&#xA;A&lt;---B&lt;---C&lt;---D&lt;---E&#xA;&#xA;Now, the user wants to drop profile version C. This is possible by creating&#xA;a new profile version which refers to E and B in the previous field and&#xA;then dropping C. The following chain (DAG) is the result:&#xA;&#xA;A&lt;---B     &lt;---D&lt;---E&lt;---F&#xA;      \                 /&#xA;       -----------------&#xA;&#xA;Of course, D would now point to a node which does not exist. But that is not&#xA;a problem. Indeed, its a fundamental key point of the idea - that content may be&#xA;unavailable.&#xA;&#xA;F should not contain new content.&#xA;If F would contain new content, dropping this content would become harder as&#xA;the previous key would be copied over, creating even more links to previous&#xA;versions in the new profile version.&#xA;&#xA;&#34;Forgetting&#34; content&#xA;&#xA;Because clients won&#39;t traverse the whole chain of a profile, but only the&#xA;newest 10, 100 or 1,000 entries, older content gets &#34;forgotten&#34; slowly.&#xA;Of course it is still there and the device hosting it still has it (and other&#xA;devices which post to the same account, eventually also caching servers).&#xA;Either way, content gets forgotten slowly. If the user who published the&#xA;content deletes it, the network may be unable to fetch it at some point.&#xA;&#xA;Is that bad? I don&#39;t think so! 
Important content gets replicated by others, so&#xA;if I post a comment on an article, I could (automatically or manually) pin the&#xA;article itself in my IPFS instance to preserve it.&#xA;If I do not and the author of the article thinks that it might not be that&#xA;interesting, the article may be deleted and gets unavailable to the network.&#xA;&#xA;And I think that is fine. Replicate important content, delete unimportant&#xA;content. The user has the power to decide here!&#xA;&#xA;Comments on posts (and comments)&#xA;&#xA;Consider you want to comment on a post. Of course you create new content,&#xA;which links to the post you just commented.&#xA;But the person who wrote the original post does not automatically link to your&#xA;comment, so nobody is able to find your comment.&#xA;&#xA;The approach for solving this is to provide updates to content.&#xA;An update is simply a new meta-data blob in the profile.&#xA;The blob would contain a link to the original post and the comment on it:&#xA;&#xA;{&#xA;  &#34;version:&#34; 1,&#xA;  &#34;previous&#34;: [ &#34;Qm...1234567890&#34; ],&#xA;&#xA;  &#34;profile&#34;: [ &#34;Qm...098765&#34;, &#34;Qm...54312&#34; ],&#xA;&#xA;  &#34;post&#34;: {&#xA;    &#34;update&#34;: &#34;Qm...abc&#34;,&#xA;    &#34;new-reply&#34;: &#34;Qm...ghjjk&#34;,&#xA;  },&#xA;}&#xA;&#xA;The  post.update and post.new-reply would link to meta-data blobs: The&#xA;update one to the original post or the latest update on the post - the&#xA;new-reply one on the post from the other user which provides a comment on the&#xA;post.&#xA;Maybe it would also be an option to list all direct replies to the post here. 
Details!&#xA;&#xA;Because this works on both &#34;posts&#34; and &#34;reply&#34; kind of data, comments on&#xA;comments are possible.&#xA;&#xA;Comments deep down the chain of comments would have to slowly propagate to the&#xA;top - to the actual post.&#xA;&#xA;Here, several configurations are possible:&#xA;&#xA;Automatically include comments and publish new profile versions for them&#xA;Publishing/propagating comments until some mark is hit (original post is&#xA;  more than 1 month old, more than 100 comments are propagated)&#xA;User can select other users where comments are automatically propagated and others have to be moderated&#xA;User has to confirm propagation (moderated comments).&#xA;&#xA;The key difference to known approaches here is that not the author of the original post permits&#xA;comments, but always the author of the post or comment the reply was filed&#xA;for.&#xA;I don&#39;t know whether this is a nice thing or a problem.&#xA;&#xA;Unavailable content&#xA;&#xA;The implementation of the social network has to be error-resistant, of course.&#xA;IPFS hashes might not be there, fetching content might not be possible&#xA;(temporarily or at all). But that&#39;s an implementation-detail to me and I&#xA;will not lose any more words about it.&#xA;&#xA;Federated component&#xA;&#xA;One might think &#34;If I go offline with my node, my posts are not accessible if&#xA;nobody else is online having them&#34;. 
And that&#39;s true.&#xA;&#xA;That&#39;s why I would introduce a federated component, which would run a&#xA;stripped-down version of the application.&#xA;&#xA;As soon as another instance connects and a new post is announced,&#xA;the instance automatically pins or caches it.&#xA;Of course, this would mean that all of these federated instances would pin all&#xA;content, which is surely not nice.&#xA;Posts which are pinned for a certain amount of time are most likely&#xA;distributed well enough so the federated component nodes can drop them...&#xA;maybe after 90 days, maybe after 10... Details!&#xA;&#xA;Subscribing (privately) and other private information&#xA;&#xA;Another issue with multi-device support would be subscribing privately to&#xA;another account. For example, if a user (lets call her Amy) subscribes to&#xA;another user (lets call him Sheldon) on her Notebook, this information needs&#xA;to be stored somehow.&#xA;And because Amys machines do not necessarily sync with each other, her mobile&#xA;phone may never know that following Sheldon is a thing now!&#xA;&#xA;This problem could by solved by storing the &#34;follow&#34;-information in her public&#xA;profile. Although, some users might not like everyone to know who to follow.&#xA;Cryptographic things could be considered to fix visibility.&#xA;&#xA;But then, users may want to &#34;categorize&#34; their friends, store them in groups&#xA;or whatever. 
This information would be stored in the public profile as well,&#xA;which would create even more noise on the network.&#xA;Also, because cryptography is hard and information would be stored forever,&#xA;this might not be an option as some day, the crypto might be broken and reveal&#xA;all the things that were stored privately before.&#xA;&#xA;Another solution for this would be that Amys devices would have to somehow&#xA;sync directly, without others beeing able to read any of that data.&#xA;Something like a&#xA;CRDT&#xA;which holds a configuration file which is then shared&#xA;between the devices directly (think of a git-repository which is pushed&#xA;between the devices directly without accessing a server on the internet).&#xA;This would, of course, only work if the devices are on the same network.&#xA;&#xA;As you see, I have not thought about this particular problem very much yet.&#xA;&#xA;Discovering content&#xA;&#xA;What I did not spend much time thinking about as well was how clients discover&#xA;new content.&#xA;When a user installs a client, this client does not know any IPFS peers - or&#xA;rather any &#34;social network nodes&#34; where it can fetch user profiles/data from -&#xA;yet.&#xA;Even if it knows some bootstrap nodes to connect to, it might not get content&#xA;from them if they do not serve any social network data and if the user does&#xA;not know any hashes of user profiles.&#xA;To be able to find new social network IPFS nodes, a client has to know their&#xA;IPNS hashes - But how to discover them?&#xA;&#xA;This is a hard problem. My first idea would be a PubSub channel where each&#xA;client periodically announces their IPNS hashes.  I&#39;m not sure whether PubSub&#xA;nodes must be connected directly. If this is the case, the problem just got&#xA;harder.  There&#39;s also the problem that this channel would be rather&#xA;high-volume as soon as the network grows. 
If each client announces their IPNS&#xA;hash every 15 minutes, for example, we get 4 messages per client each hour.&#xA;That&#39;s already a lot of bandwidth if we speak about 1,000, 10,000 or even&#xA;100,000 clients!&#xA;It is also an attack-vector how the system can be flooded. Not nice!&#xA;&#xA;One way to think about this is that if only nodes which are subscribed to a&#xA;topic do also forward the topics messages&#xA;(like this comment suggests),&#xA;we could reduce the time between &#34;publishing&#34; messages in the topic.&#xA;Such a message would contain all IPNS hashes a node knows about, thus the&#xA;amount of data would be rather much. As soon as the network grows, a node would&#xA;need to send this message less and less often, to reduce the number of&#xA;messages and bytes send. Still, if each node knows 10,000 nodes and sends this&#xA;list once an hour, we get&#xA;&#xA;bytesperhash = 46&#xA;numberofnodes = 10000&#xA;messagesize = bytesperhash  numberofnodes&#xA;bytesperhour = numberofnodes  messagesize&#xA;&#xA;4,28 GiB of &#34;I know these nodes&#34; messages per hour. That does&#xA;obviousely not scale!&#xA;&#xA;Maybe each client should offer an API where other clients can ask them about&#xA;which IPNS hashes they know. That would be a &#34;pull&#34; approach rather than a&#xA;&#34;push&#34; approach then, which would limit bandwidth a bit.  This could even be&#xA;done via PubSub as well, where the channel name is generared from the IPFS&#xA;instance hash, for example.&#xA;I don&#39;t know whether this would be a nice idea.&#xA;Still, this would need some &#34;internet-facing&#34; software where clients need to&#xA;be able to talk directly to eachother. I don&#39;t know whether IPFS offers&#xA;functionality to do this in a simple way.&#xA;&#xA;Either way, I have no solution for this problem yet.&#xA;&#xA;Why IPFS?&#xA;&#xA;Platforms like scuttlebutt or movim also implement distributed social&#xA;networks, why not use those? 
Also, why IPFS and not the dat protocol or&#xA;something else?&#xA;&#xA;That question is rather simple to answer: IPFS provides functionality and&#xA;semantics other tools/frameworks do not provide. Most importantly the notion&#xA;that content is immutable, but also full decentralization (not federation like&#xA;with services like movim or mastodon, for example).&#xA;&#xA;Having immutable content is a key point. The dat protocol, for example,&#xA;features mutable content as it is roughly based on bittorrent (if I understood&#xA;everything correctly, feel free to point out mistakes).&#xA;That might be nice in some cases, though I think immutability is the way to go.&#xA;Distributed applications or frameworks for distributed content with immutability&#xA;as core concept are better suited for netsplit, slow connections and&#xA;peer-to-peer applications.&#xA;From what I saw from the last weeks and months of looking at frameworks for&#xA;distributed content storage is that IPFS is way more mature than these other&#xA;frameworks. IPFS is build to replace existing contents and to stay, and that&#39;s a&#xA;nice thing to build applications on.&#xA;&#xA;Remaining Questions&#xA;&#xA;Some questions are remaining:&#xA;&#xA;Is it possible to talk to a specific node directly in IPFS? This would be&#xA;  helpful for discovering content by asking nodes what profiles they know.&#xA;  It would also be a nice way for finding consensus when multiple devices have&#xA;  to agree on which node publishes a merge.&#xA;How fast is IPFS with small files? If I need to traverse a long chain of&#xA;  profile updates, I constantly request a small file, parse it and continue&#xA;  requesting the previous node in the chain. That should be fast.&#xA;  If it is not, we might need to introduce some &#34;pack files&#34; where a list of&#xA;  metadata-nodes is provided and traversing becomes unnecessary with. 
But that&#xA;  makes deleting content rather complicated, TBH.&#xA;&#xA;That&#39;s all I can think of right now, but there might be more questions which&#xA;are not yet answered.&#xA;&#xA;Problems are hard in distributed environments&#xA;&#xA;Distributed systems involve a lot of new complexity where we have to carefully&#xA;think about details and how to design our system.&#xA;New ways to design systems can be discovered by the &#34;distributed approach&#34; and&#xA;new paradigms emerge.&#xA;&#xA;Moving away from a central authority which holds the truth, the global state and&#xA;also the data results in a&#xA;paradigm shift&#xA;we really have to be careful about.&#xA;&#xA;I think we can do it and design new, powerful and fully distributed systems&#xA;with user freedom, usability, user-convenience and state-of-the-art in mind.&#xA;Users want to have a system which is reliable, failure proof, convenient and&#xA;easy to use.&#xA;They want to &#34;always be connected&#34;.&#xA;I think we can provide such software.&#xA;Developers want nice abstractions to build upon, data integrity, failure-proof&#xA;software with simplicity designed into the system and reusable data structures&#xA;and be able to scale.&#xA;I think IPFS is the way to for this.&#xA;In addition, I think we can provide free software with free data.&#xA;&#xA;I do not claim to know the final solution to any of the problems layed out in&#xA;this article.&#xA;Its just that I think of them and would love to get an open conversation started&#xA;on the whole subject of distributed social networks and problems that come with&#xA;them.&#xA;&#xA;And maybe we can come up with a prototype for this?&#xA;&#xA;tags:  #distributed #network #open-source #social #software&#xA;]]&gt;</description>
      <content:encoded><![CDATA[<p>After thinking a while about the points I laid out
<a href="/blog/2017/10/31/blueprint-of-a-distributed-social-network-on-ipfs---and-its-problems/">in my previous post</a>
I&#39;d like to update my ideas here.</p>

<p>It is not necessary to read the first post to understand what I am talking
about in this second one, but it also does not do any harm.</p>

<blockquote><p>Matrix and Mastodon are nice – but federation is only the first step – we
have to go towards fully distributed applications!</p></blockquote>

<p>(me, at the 34th Chaos Communication Congress, 2017)</p>

<h1 id="the-idea" id="the-idea">The idea</h1>

<p>With the rise of protocols like the Matrix protocol, ActivityPub and others,
decentralized social community platforms like Matrix, Mastodon and others
became real and gained traction.
I consider these platforms, especially Mastodon and Matrix, to be great steps
into the future and am using both enthusiastically.
But I think we can do better. Federation is the first step out of
centralization and definitely a good one. But we have to push further -
towards fully distributed environments!</p>

<p>(For a “Why?” have a look at the end of the article!)</p>

<h1 id="how-would-it-work" id="how-would-it-work">How would it work?</h1>

<p>The foundations of how a social network on IPFS would work are rather simple.
I am very tempted to use the un-word “blockchain” in this article, but because
of the hype around that word and because nobody really understands what a
“blockchain” actually is, I refrain from using it.</p>

<p>I use a better one instead: “DAG” – “Directed Acyclic Graph”. “Merkle tree”
is another term which could be used, but it brings a notion of
implementation details to mind, and I want to avoid that. One instantly
thinks of crypto, hash values and blobs when talking about hash trees or
Merkle trees. A DAG, though, is a more abstract concept which fits my ideas
better.</p>

<p>What we would need to develop a social network (its core functionality) on IPFS
is a DAG and some standard data formats we agree upon.
We also need a private-public-key infrastructure, which IPFS already has.</p>

<p>There are two “kinds” of data which must be considered: meta-data (which
should be replicated by as many nodes as possible) and actual user data
(posts, messages, images, videos, files).
I&#39;m not talking about the second one here very much, because the meta-data is
where the problems are.</p>

<p>Consider the following metadata blob:</p>

<pre><code class="language-json">{
  &#34;version&#34;: 1,
  &#34;previous&#34;: [ &#34;Qm...1234567890&#34; ],

  &#34;profile&#34;: [ &#34;Qm...098765&#34;, &#34;Qm...54312&#34; ],

  &#34;post&#34;: {
    &#34;mimetype&#34;: &#34;text/plain&#34;,
    &#34;nodes&#34;: [&#34;Qm...abc&#34;],
    &#34;date&#34;: &#34;2018-01-02T03:04:05+0200&#34;,
    &#34;reply&#34;: []
  },

  &#34;publicfollow&#34;: [ &#34;Qm...efg&#34;, &#34;Qm...hij&#34; ]
}
</code></pre>
<ul><li><p>The <code>version</code> key describes the version of the protocol, of course.</p></li>

<li><p>Here, the <code>previous</code> array points to the previous metadata blob(s).
We need <em>multiple</em> entries here (an array) because we want to create a <em>DAG</em>.</p></li>

<li><p>The <code>profile</code> key holds a list of <code>IPNS</code> names which are associated with the
profile.</p></li></ul>

<p>The <code>version</code>, <code>previous</code> and <code>profile</code> keys are the only ones required in
such a metadata blob.
All other keys shown above are <em>optional</em>, and a single metadata blob should
contain at most one of them.</p>
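<p>To make these rules a bit more concrete, here is a minimal Python sketch of such a check.
The function name <code>validate_blob</code> and the two key sets are assumptions made for
illustration; nothing here is an agreed-upon format yet.</p>

<pre><code class="language-python">REQUIRED_KEYS = {"version", "previous", "profile"}
OPTIONAL_KEYS = {"post", "publicfollow"}  # assumption: only the keys shown above


def validate_blob(blob):
    """Return a list of problems found in a metadata blob (empty if none)."""
    problems = []
    # Every blob must carry all three required keys.
    missing = REQUIRED_KEYS.difference(blob)
    if missing:
        problems.append("missing required keys: " + ", ".join(sorted(missing)))
    # At most one optional key is allowed per blob.
    present = OPTIONAL_KEYS.intersection(blob)
    if len(present) not in (0, 1):
        problems.append("more than one optional key: " + ", ".join(sorted(present)))
    return problems
</code></pre>

<p>Note that the example blob above carries both <code>post</code> and <code>publicfollow</code>, so a strict
check like this one would flag it. A real client might prefer to be more lenient.</p>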
<ul><li><p>The <code>post</code> table describes the actual user data. Some meta-information is
added, for example the mimetype (<code>&#34;text/plain&#34;</code> in this case) and the date it
was created. More can be thought of.
The <code>nodes</code> key points to a list of actual content (again via IPFS hashes).
I&#39;m not yet convinced whether this should be a <em>list</em> or a single value.
Details!
I&#39;d say that these three keys (<code>mimetype</code>, <code>nodes</code> and <code>date</code>) are required in a <code>post</code> table.
The <code>reply</code> key notes that this post is a reply to another post. This is, of
course, optional.</p></li>

<li><p>The <code>publicfollow</code> key is a list of IPNS hashes of other profiles which the user
follows <em>publicly</em>.
Whether such a thing is desirable is to be discussed.
I show it here to give a hint of the possibilities.</p></li>

<li><p>More such data could be considered, though the meta-data blobs should be
kept <em>small</em>: If one thinks of 4 KiB per meta-data blob (which is a lot) and
10 million blobs (which I do not consider that much, because every
interaction which is an input into the network in one form or another results
in a new meta-data blob), we have roughly 38 GiB of meta-data content, which is
really too much.
If we have 250 bytes per metadata blob (which sounds like a reasonable size)
we get 2.3 GiB of meta-data for 10 million blobs. That sounds much better.</p></li></ul>
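<p>The size estimate can be reproduced with a few lines of Python. This is only the
arithmetic from the paragraph above, assuming binary units (1 GiB = 2<sup>30</sup> bytes);
the helper name <code>metadata_size_gib</code> is made up for this sketch.</p>

<pre><code class="language-python">def metadata_size_gib(blob_bytes, blob_count):
    """Total size of blob_count blobs of blob_bytes each, in GiB."""
    return blob_bytes * blob_count / 2**30


print(round(metadata_size_gib(4 * 1024, 10_000_000), 1))  # 38.1
print(round(metadata_size_gib(250, 10_000_000), 1))       # 2.3
</code></pre>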

<h2 id="the-profile-dag" id="the-profile-dag">The profile DAG</h2>

<p>The idea of linking the previous version of a profile from each new version of
the profile is of course one of the key points.
With this approach, nobody has to fetch the whole list of profile versions.
Traversing the whole chain backwards is only required if a user wants to see
old content from the profile she&#39;s browsing.</p>
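<p>A rough sketch of such a bounded backwards walk could look like this. It is not an
implementation against IPFS: the <code>store</code> dictionary merely stands in for "fetch this hash
from the network", and <code>newest_versions</code> is a hypothetical helper name.</p>

<pre><code class="language-python">from collections import deque


def newest_versions(store, head, limit):
    """Walk the profile DAG backwards from head, roughly newest first.

    store is a plain dict standing in for IPFS: hash -> metadata blob.
    Hashes that cannot be fetched are skipped, since content may be gone.
    """
    seen = {head}
    queue = deque([head])
    result = []
    while queue and len(result) != limit:
        current = queue.popleft()
        blob = store.get(current)
        if blob is None:
            continue  # unavailable content is expected, not an error
        result.append(current)
        for parent in blob["previous"]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return result
</code></pre>

<p>With a chain <code>A</code> to <code>E</code> and a limit of 3, the walk stops after <code>E</code>, <code>D</code> and <code>C</code> and never
touches the older versions.</p>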

<p>Because of IPFS and its caching, content automatically gets replicated over
nodes as users browse profiles.
Nodes can cache either only meta-data blobs (not so much data) or user content
as well (more data). This can happen automatically or be user-driven – several
possibilities here!
It is even possible for users to “pin” content if they think it&#39;s important to
keep it.</p>

<p>Profile updates can even be “announced” using PubSub so other nodes can then
fetch the new profile versions and cache them. The latest profile
metadata-blob (or “version”) can be published via an IPNS name.
The IPNS name should be published per-device and not per-account.
(This is also why the <code>profile</code> key in the metadata JSON blob holds an <em>array</em>!)</p>

<p>Why should we publish IPNS names per-device and why do we actually need a DAG
here? Because we want multi-device support!</p>

<h2 id="multi-device-support" id="multi-device-support">Multi-device support</h2>

<p>I already mentioned that the profile-chain would be a DAG.
I also mentioned that there would be a <code>profile</code> key in the meta-data blob.</p>

<p>This is because of the multi-device support.
If two, three or even more devices need to post to one account, we need to be
able to merge different versions of an account: Consider Alice and Bob sharing
one account (which would be possible!). Now, Bob loses connection to the
internet. But because we are on IPFS and can work offline, this is not a problem.
Alice and Bob could continue creating content and thus new profile versions:</p>

<pre><code>A &lt;--- B &lt;--- C &lt;--- D &lt;--- E
        \
         C&#39; &lt;--- D&#39; &lt;--- E&#39;
</code></pre>

<p>In the shown DAG, Alice posts <code>C</code>, <code>D</code> and <code>E</code>, each referring to its predecessor.
Bob creates <code>C&#39;</code>, <code>D&#39;</code> and <code>E&#39;</code> – each referring to its predecessor.
Of course both <code>C</code> and <code>C&#39;</code> would refer to <code>B</code>.</p>

<p>As soon as Bob comes back online, Alice notices that there is another chain of
posts to the profile and can now <em>merge</em> the chains by publishing a new
version <code>F</code> which points to both <code>E</code> and <code>E&#39;</code>:</p>

<pre><code>A &lt;--- B &lt;--- C &lt;--- D &lt;--- E &lt;--- F
        \                         /
         C&#39; &lt;--- D&#39; &lt;--- E&#39; &lt;-----
</code></pre>

<p>Because Bob would also see another chain, his client would also provide a new
version of the profile (<code>F&#39;</code>) where <code>E</code> and <code>E&#39;</code> are merged – one of the
problems which must be sorted out. But a rather trivial one in my opinion, as
the clients only need to do some sort of leader election. And this election is
<em>temporary</em> until a new node is published – so not really a complicated form
of consensus-finding!</p>

<p>What has to be sorted out, though, is that the devices/nodes which share an
account and now need to agree upon which one merges the chains need some form
of communication between them. I have not yet thought about how this should be
done. Maybe IPFS PubSub is a viable option for this. Cryptographic signatures
play an important role here.</p>
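<p>To illustrate the merge step, here is a small Python sketch. It assumes that every
device already sees the same set of blobs (in a plain dict instead of IPFS) and uses a
made-up tie-break, "the smallest IPNS name merges", as a stand-in for a real leader
election. All function names are hypothetical.</p>

<pre><code class="language-python">def find_heads(store):
    """Return hashes that no other blob lists in its previous array."""
    referenced = {p for blob in store.values() for p in blob["previous"]}
    return sorted(h for h in store if h not in referenced)


def pick_merger(device_names):
    """Hypothetical tie-break: the lexicographically smallest IPNS name merges."""
    return min(device_names)


def merge_blob(store, profile):
    """Build a content-free version whose previous array lists every head."""
    return {
        "version": 1,
        "previous": find_heads(store),
        "profile": sorted(profile),
    }
</code></pre>

<p>For the DAG shown above, <code>find_heads</code> would return <code>E</code> and <code>E&#39;</code>, and the resulting blob
is exactly the version <code>F</code> from the diagram. Only the device chosen by <code>pick_merger</code> would
publish it, which avoids a competing <code>F&#39;</code> as long as all devices agree on the device list.</p>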

<p>This gets a bit more complicated if there are more than two devices posting to
one account and also if some of them are not available yet – though it is
still in a problem space near “we have to think hard about this” ... and
nowhere in the space of “seems impossible”!</p>

<p>The <code>profile</code> key is provided in the account data so the client knows which
other chains should be checked and merged. Thus, only nodes which are already
allowed to publish new profile versions are actually allowed to add new nodes
to that list.</p>

<h2 id="deleting-content-in-the-dag" id="deleting-content-in-the-dag">Deleting content in the DAG</h2>

<p>Deleting old versions of the profile – or old content – is possible, too.
Because the <code>previous</code> key is an <em>array</em>, we can refer to multiple old
versions of a profile.</p>

<p>Consider the following chain of profile versions:</p>

<pre><code>A&lt;---B&lt;---C&lt;---D&lt;---E
</code></pre>

<p>Now, the user wants to drop profile version <code>C</code>. This is possible by creating
a new profile version which refers to <code>E</code> and <code>B</code> in the <code>previous</code> field and
then dropping <code>C</code>. The following chain (DAG) is the result:</p>

<pre><code>A&lt;---B     &lt;---D&lt;---E&lt;---F
      \                 /
       -----------------
</code></pre>

<p>Of course, <code>D</code> would now point to a node which does not exist. But that is not
a problem. Indeed, it&#39;s a key point of the idea – that content may be
unavailable.</p>

<p><code>F</code> should not contain new content.
If <code>F</code> contained new content, dropping that content later would become harder, as
the <code>previous</code> key would be copied over, creating even more links to previous
versions in the new profile version.</p>
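
<p>The deletion can be sketched roughly as follows. A dictionary again stands in
for IPFS, single letters stand in for hashes, and <code>drop_version</code> and
<code>reachable</code> are hypothetical helper names.</p>

<pre><code class="language-python">def drop_version(store, head, dropped, new_id):
    # The bridging version carries no content of its own; it only links to
    # the old head and to whatever the dropped version pointed at.
    store[new_id] = {"previous": [head] + store[dropped]["previous"]}
    del store[dropped]
    return new_id

def reachable(store, start):
    # Walk `previous` links and silently skip versions that are unavailable.
    seen = []
    todo = [start]
    while todo:
        current = todo.pop(0)
        if current in seen or current not in store:
            continue
        seen.append(current)
        todo.extend(store[current]["previous"])
    return seen

store = {
    "A": {"previous": []},
    "B": {"previous": ["A"]},
    "C": {"previous": ["B"]},
    "D": {"previous": ["C"]},
    "E": {"previous": ["D"]},
}
head = drop_version(store, "E", "C", "F")
print(reachable(store, head))  # prints ['F', 'E', 'B', 'D', 'A']
</code></pre>

<p>The dangling link from <code>D</code> to <code>C</code> is simply skipped, while <code>A</code> and <code>B</code>
stay reachable through <code>F</code>.</p>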

<h2 id="forgetting-content" id="forgetting-content">“Forgetting” content</h2>

<p>Because clients won&#39;t traverse the whole chain of a profile, but only the
newest 10, 100 or 1,000 entries, older content gets “forgotten” slowly.
Of course it is still there and the device hosting it still has it (as do other
devices which post to the same account, and possibly caching servers).
Either way, content gets forgotten slowly. If the user who published the
content deletes it, the network may be unable to fetch it at some point.</p>
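
<p>A bounded traversal of this kind could look roughly like the following sketch.
The dictionary stands in for IPFS and the name <code>newest_entries</code> is
made up for this example.</p>

<pre><code class="language-python">def newest_entries(store, head, limit):
    # Follow `previous` links from the head and stop after `limit` entries;
    # anything older is never requested.
    entries = []
    todo = [head]
    while todo and limit > len(entries):
        current = todo.pop(0)
        if current in entries or current not in store:
            continue
        entries.append(current)
        todo.extend(store[current]["previous"])
    return entries

# A linear chain v0, v1, ..., v999; each version links to the one before it.
store = {"v0": {"previous": []}}
for i in range(1, 1000):
    store["v%d" % i] = {"previous": ["v%d" % (i - 1)]}

recent = newest_entries(store, "v999", 10)
</code></pre>

<p>Here only <code>v999</code> down to <code>v990</code> are ever touched. The other 990 versions
are neither fetched nor cached by this client.</p>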

<p>Is that bad? I don&#39;t think so! Important content gets replicated by others, so
if I post a comment on an article, I could (automatically or manually) pin the
article itself in my IPFS instance to preserve it.
If I do not, and the author of the article thinks that it is not that
interesting, the article may be deleted and become unavailable to the network.</p>

<p>And I think that is fine. Replicate important content, delete unimportant
content. The user has the power to decide here!</p>

<h2 id="comments-on-posts-and-comments" id="comments-on-posts-and-comments">Comments on posts (and comments)</h2>

<p>Suppose you want to comment on a post. Of course you create new content,
which links to the post you just commented on.
But the person who wrote the original post does not automatically link to your
comment, so nobody is able to find your comment.</p>

<p>The approach for solving this is to provide updates to content.
An update is simply a new meta-data blob in the profile.
The blob would contain a link to the original post and the comment on it:</p>

<pre><code class="language-json">{
  &#34;version&#34;: 1,
  &#34;previous&#34;: [ &#34;Qm...1234567890&#34; ],

  &#34;profile&#34;: [ &#34;Qm...098765&#34;, &#34;Qm...54312&#34; ],

  &#34;post&#34;: {
    &#34;update&#34;: &#34;Qm...abc&#34;,
    &#34;new-reply&#34;: &#34;Qm...ghjjk&#34;
  }
}
</code></pre>

<p>The <code>post.update</code> and <code>post.new-reply</code> keys would link to meta-data blobs:
<code>update</code> to the original post or the latest update on the post,
<code>new-reply</code> to the post from the other user which provides a comment on the
post.
Maybe it would also be an option to list all direct replies to the post here. Details!</p>

<p>Because this works on both “posts” and “reply” kind of data, comments on
comments are possible.</p>
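
<p>Building such an update blob could be sketched like this. The field names
follow the JSON example above, the <code>Qm-...</code> values are placeholders for real
hashes, and sorting the keys is my own assumption for getting a stable
serialization (and therefore a stable hash) across clients.</p>

<pre><code class="language-python">import json

def reply_update(previous, profile, last_update, reply):
    # `last_update` is the original post or its latest update,
    # `reply` is the other user's comment.
    return {
        "version": 1,
        "previous": [previous],
        "profile": list(profile),
        "post": {"update": last_update, "new-reply": reply},
    }

blob = reply_update("Qm-prev", ["Qm-dev1", "Qm-dev2"], "Qm-post", "Qm-reply")

# Identical content must always yield identical bytes, so fix the key order
# and the separators before hashing or publishing.
encoded = json.dumps(blob, sort_keys=True, separators=(",", ":"))
</code></pre>

<p>A reply to a comment would use the same function, with <code>last_update</code>
pointing at the comment instead of a post.</p>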

<p>Comments deep down the chain of comments would have to slowly propagate to the
top – to the actual post.</p>

<p>Here, several configurations are possible:</p>
<ul><li>Automatically include comments and publish new profile versions for them</li>
<li>Publishing/propagating comments until some mark is hit (original post is
more than 1 month old, more than 100 comments are propagated)</li>
<li>User can select other users whose comments are automatically propagated, while all others have to be moderated</li>
<li>User has to confirm propagation (moderated comments).</li></ul>

<p>The key difference to known approaches here is that it is not the author of the original post who permits
comments, but always the author of the post or comment the reply was filed
for.
I don&#39;t know whether this is a nice thing or a problem.</p>

<h2 id="unavailable-content" id="unavailable-content">Unavailable content</h2>

<p>The implementation of the social network has to be error-resistant, of course.
IPFS hashes might not be there, fetching content might not be possible
(temporarily or at all). But that&#39;s an implementation detail to me and I
will not spend any more words on it.</p>

<h1 id="federated-component" id="federated-component">Federated component</h1>

<p>One might think “If I go offline with my node, my posts are not accessible if
nobody else is online having them”. And that&#39;s true.</p>

<p>That&#39;s why I would introduce a federated component, which would run a
stripped-down version of the application.</p>

<p>As soon as another instance connects and a new post is announced,
the instance automatically pins or caches it.
Of course, this would mean that all of these federated instances would pin all
content, which is surely not nice.
Posts which are pinned for a certain amount of time are most likely
distributed well enough so the federated component nodes can drop them...
maybe after 90 days, maybe after 10... Details!</p>
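
<p>The “pin for a while, then drop” behaviour can be sketched as follows. This is
only bookkeeping with timestamps: the class name <code>PinCache</code> is made up,
and no real IPFS pinning call is involved.</p>

<pre><code class="language-python">import time

class PinCache:
    def __init__(self, max_age_days=90):
        self.max_age = max_age_days * 24 * 60 * 60
        self.pinned = {}  # content hash -> time it was pinned (seconds)

    def announce(self, content_hash, now=None):
        # Remember when a post was first seen; repeated announcements
        # do not reset the clock.
        stamp = time.time() if now is None else now
        self.pinned.setdefault(content_hash, stamp)

    def expire(self, now=None):
        # Drop everything that has been pinned for longer than max_age.
        now = time.time() if now is None else now
        old = [h for h, t in self.pinned.items() if now - t > self.max_age]
        for h in old:
            del self.pinned[h]
        return old

day = 24 * 60 * 60
cache = PinCache(max_age_days=90)
cache.announce("Qm-old", now=0)
cache.announce("Qm-new", now=80 * day)
dropped = cache.expire(now=91 * day)
</code></pre>

<p>After the last call only <code>Qm-old</code> has been dropped, while <code>Qm-new</code> stays
pinned for another 79 days.</p>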

<h1 id="subscribing-privately-and-other-private-information" id="subscribing-privately-and-other-private-information">Subscribing (privately) and other private information</h1>

<p>Another issue with multi-device support would be subscribing privately to
another account. For example, if a user (let&#39;s call her Amy) subscribes to
another user (let&#39;s call him Sheldon) on her notebook, this information needs
to be stored somehow.
And because Amy&#39;s machines do not necessarily sync with each other, her mobile
phone may never know that following Sheldon is a thing now!</p>

<p>This problem could be solved by storing the “follow”-information in her public
profile. However, some users might not like everyone to know whom they follow.
Cryptography could be considered to restrict visibility.</p>

<p>But then, users may want to “categorize” their friends, store them in groups
or whatever. This information would be stored in the public profile as well,
which would create even <em>more</em> noise on the network.
Also, because cryptography is hard and information would be stored forever,
this might not be an option as some day, the crypto might be broken and reveal
all the things that were stored privately before.</p>

<p>Another solution for this would be that Amy&#39;s devices would have to somehow
sync directly, without others being able to read any of that data.
Something like a
<a href="https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type">CRDT</a>
which holds a configuration file which is then shared
between the devices directly (think of a git-repository which is pushed
between the devices directly without accessing a server on the internet).
This would, of course, only work if the devices are on the same network.</p>

<p>As you see, I have not thought about this particular problem very much yet.</p>

<h1 id="discovering-content" id="discovering-content">Discovering content</h1>

<p>What I also did not spend much time thinking about is how clients discover
new content.
When a user installs a client, this client does not know any IPFS peers – or
rather any “social network nodes” where it can fetch user profiles/data from -
yet.
Even if it knows some bootstrap nodes to connect to, it might not get content
from them if they do not serve any social network data and if the user does
not know any hashes of user profiles.
To be able to find new social network IPFS nodes, a client has to know their
IPNS hashes – But how to discover them?</p>

<p>This is a hard problem. My first idea would be a PubSub channel where each
client periodically announces their IPNS hashes.  I&#39;m not sure whether PubSub
nodes must be connected directly. If this is the case, the problem just got
harder.  There&#39;s also the problem that this channel would be rather
high-volume as soon as the network grows. If each client announces their IPNS
hash every 15 minutes, for example, we get 4 messages per client each hour.
That&#39;s already a lot of bandwidth if we speak about 1,000, 10,000 or even
100,000 clients!
It is also an attack vector: the system can be flooded. Not nice!</p>

<p>One way to think about this is that if only nodes which are subscribed to a
topic also forward the topic&#39;s messages
(like <a href="https://discuss.ipfs.io/t/when-will-ipfs-pubsub-be-scaleable/1923/2?u=musicmatze">this comment suggests</a>),
we <em>could</em> reduce the time between “publishing” messages in the topic.
Such a message would contain all IPNS hashes a node knows about, thus the
amount of data would be rather large. As soon as the network grows, a node would
need to send this message less and less often, to reduce the number of
messages and bytes sent. Still, if each node knows 10,000 nodes and sends this
list once an hour, we get</p>

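<pre><code># Back-of-the-envelope estimate; all values are the assumptions from the text.
bytes_per_hash = 46                                 # one hash as a text string
number_of_nodes = 10_000
message_size = bytes_per_hash * number_of_nodes     # one full list
bytes_per_hour = number_of_nodes * message_size     # every node sends it once
gib_per_hour = bytes_per_hour / 2**30               # convert bytes to GiB
</code></pre>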

<p>4.28 GiB of “I know these nodes” messages per hour. That obviously does
not scale!</p>

<p>Maybe each client should offer an API where other clients can ask them about
which IPNS hashes they know. That would be a “pull” approach rather than a
“push” approach then, which would limit bandwidth a bit.  This could even be
done via PubSub as well, where the channel name is generated from the IPFS
instance hash, for example.
I don&#39;t know whether this would be a nice idea.
Still, this would need some “internet-facing” software where clients need to
be able to <em>talk directly to each other</em>. I don&#39;t know whether IPFS offers
functionality to do this in a simple way.</p>
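
<p>To make the “pull” idea a bit more concrete, here is an in-memory sketch. The
class <code>Client</code> and its methods are hypothetical, and a plain method call
stands in for whatever transport the clients would really use.</p>

<pre><code class="language-python">class Client:
    def __init__(self, own_hash):
        # Every client starts out knowing only its own IPNS hash.
        self.known = {own_hash}

    def known_hashes(self):
        # The "API" other clients would call; here it is a plain method call.
        return set(self.known)

    def pull_from(self, peer):
        # Ask one peer for its list and keep only what is new to us.
        new = peer.known_hashes() - self.known
        self.known |= new
        return new

amy = Client("Qm-amy")
sheldon = Client("Qm-sheldon")
penny = Client("Qm-penny")

sheldon.pull_from(penny)           # Sheldon learns about Penny
learned = amy.pull_from(sheldon)   # Amy learns about both in one request
</code></pre>

<p>Knowledge spreads transitively this way, but only when somebody asks, so each
client decides for itself how much bandwidth it spends on discovery.</p>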

<p>Either way, I have no solution for this problem yet.</p>

<h1 id="why-ipfs" id="why-ipfs">Why IPFS?</h1>

<p>Platforms like scuttlebutt or movim also implement distributed social
networks, why not use those? Also, why IPFS and not the dat protocol or
something else?</p>

<p>That question is rather simple to answer: IPFS provides functionality and
semantics other tools/frameworks do not provide. Most importantly the notion
that content is immutable, but also full decentralization (not federation like
with services like movim or mastodon, for example).</p>

<p>Having immutable content is a key point. The dat protocol, for example,
features mutable content as it is roughly based on bittorrent (if I understood
everything correctly, feel free to point out mistakes).
That might be nice in some cases, though I think immutability is the way to go.
Distributed applications or frameworks for distributed content with immutability
as a core concept are better suited for netsplits, slow connections and
peer-to-peer applications.
From what I saw during the last weeks and months of looking at frameworks for
distributed content storage, IPFS is way more mature than these other
frameworks. IPFS is built to replace existing contents and to stay, and that&#39;s a
nice thing to build applications on.</p>

<h1 id="remaining-questions" id="remaining-questions">Remaining Questions</h1>

<p>Some questions are remaining:</p>
<ul><li>Is it possible to talk to a specific node directly in IPFS? This would be
helpful for discovering content by asking nodes what profiles they know.
It would also be a nice way for finding consensus when multiple devices have
to agree on which node publishes a merge.</li>
<li>How fast is IPFS with small files? If I need to traverse a long chain of
profile updates, I constantly request a small file, parse it and continue
requesting the previous node in the chain. That should be fast.
If it is not, we might need to introduce some “pack files” where a list of
metadata-nodes is provided, which makes traversing unnecessary. But that
makes deleting content rather complicated, TBH.</li></ul>

<p>That&#39;s all I can think of right now, but there might be more questions which
are not yet answered.</p>

<h1 id="problems-are-hard-in-distributed-environments" id="problems-are-hard-in-distributed-environments">Problems are hard in distributed environments</h1>

<p>Distributed systems involve a lot of new complexity where we have to carefully
think about details and how to design our system.
New ways to design systems can be discovered by the “distributed approach” and
new paradigms emerge.</p>

<p>Moving away from a central authority which holds the truth, the global state and
also the data results in a
<a href="https://ruben.verborgh.org/blog/2017/12/20/paradigm-shifts-for-the-decentralized-web/">paradigm shift</a>
we really have to be careful about.</p>

<p>I think we can do it and design new, powerful and fully distributed systems
with user freedom, usability, user-convenience and state-of-the-art in mind.
Users want to have a system which is reliable, failure proof, convenient and
easy to use.
They want to “always be connected”.
I think we can provide such software.
Developers want nice abstractions to build upon, data integrity, failure-proof
software with simplicity designed into the system and reusable data structures
and be able to scale.
I think IPFS is the way to go for this.
In addition, I think we can provide free software with free data.</p>

<p>I do not claim to know the final solution to any of the problems laid out in
this article.
It&#39;s just that I think of them and would love to get an open conversation started
on the whole subject of distributed social networks and problems that come with
them.</p>

<p>And maybe we can come up with a prototype for this?</p>

<p>tags:  <a href="https://beyermatthias.de/tag:distributed" class="hashtag"><span>#</span><span class="p-category">distributed</span></a> <a href="https://beyermatthias.de/tag:network" class="hashtag"><span>#</span><span class="p-category">network</span></a> <a href="https://beyermatthias.de/tag:open" class="hashtag"><span>#</span><span class="p-category">open</span></a>-source <a href="https://beyermatthias.de/tag:social" class="hashtag"><span>#</span><span class="p-category">social</span></a> <a href="https://beyermatthias.de/tag:software" class="hashtag"><span>#</span><span class="p-category">software</span></a></p>
]]></content:encoded>
      <guid>https://beyermatthias.de/blueprint-of-a-distributed-social-network-on-ipfs-and-its-problems-2</guid>
      <pubDate>Sun, 25 Feb 2018 16:39:10 +0100</pubDate>
    </item>
    <item>
      <title>Blueprint of a distributed social network on IPFS - and its problems</title>
      <link>https://beyermatthias.de/blueprint-of-a-distributed-social-network-on-ipfs-and-its-problems</link>
      <description>&lt;![CDATA[  #matrix , #ipfs , #scuttlebutt and now #mastodon - We&#39;re living in awesome&#xA;  times! centralization &lt; decentralization/federation &lt; distribution!&#xA;  #lovefortech&#xA;&#xA;(me, April 10, 2017, on mastodon)&#xA;&#xA;The idea&#xA;&#xA;With the rise of protocols like the matrix protocol, activitypub and others,&#xA;decentralized social community platforms like matrix, mastodon and others gained&#xA;power and were made real.&#xA;I consider these platforms, especially mastodon and matrix, to be great steps&#xA;into the future and am using both enthusiastically.&#xA;&#xA;But can we do better? Can we do more distribution,? I think so!&#xA;&#xA;So far we have a twitter-like thumbleblog platform (mastodon), a chat platform&#xA;(matrix) and facebook-like platforms (diaspora and friendica) which&#xA;are federated (some form of decentralization). I think we can make a&#xA;completely distributed social network platform reality today.&#xA;&#xA;Let me reiterate on that: I think, we can make a facebook/googleplus/etc clone&#xA;which works without a central component, today. And I would even go one step&#xA;further and state: All we need for this is IPFS (and&#xA;related technology like IPLD and IPNS)!&#xA;&#xA;This platform would feature personal profiles, publishing&#xA;articles/posts/images/videos/voice messages/etc, instant messaging, following&#xA;others, and all the things one would want in such a platform.&#xA;&#xA;How would it work?&#xA;&#xA;What do we need for this? 
Well, as stated before: not much!&#xA;From what I can think of, we would need IPFS, some sort of public/private key&#xA;functionality (which IPFS already has), a nice frontend-framework and that&#39;s&#xA;basically it.&#xA;&#xA;Let me tell you how I think such a platform would work.&#xA;&#xA;The moment a user starts the application, the application would boot an IPFS&#xA;node.&#xA;The username and all other information about the profile are added to IPFS as&#xA;structured data.&#xA;If the profile changes because the user edits it, it is added to IPFS again,&#xA;using IPLD to link to its previous version.&#xA;&#xA;If a user adds a post to her profile, that post is added to IPFS as well and&#xA;linked from the profile via IPLD.&#xA;All other nodes are informed about the new content via pubsub and are free to&#xA;pin the new content (the new profile version) or only cache it for a while (or&#xA;to not care at all).&#xA;The post itself could add a link to the IPNS hash of the profile under which the&#xA;post is published. 
This way, a link from the post to the current version of the&#xA;profile would always exist.&#xA;&#xA;Because the profile always links to its previous version as well as to the&#xA;post content, that would imply that the node the user of the profile runs would&#xA;always keep all data the user adds to the network.&#xA;As the data is only kept by links, the user is free to drop published&#xA;content at any point in time.&#xA;&#xA;This means that basically each operation would &#34;generate&#34; a new profile, which&#xA;is of course published as an IPNS name.&#xA;Following others would be a matter of subscribing to their &#34;pub&#34; channel (as&#xA;in &#34;pubsub&#34;) or their IPNS name.&#xA;&#xA;Chat&#xA;&#xA;A chat application using IPFS is already implemented with&#xA;orbit, so that&#39;s a matter of integrating&#xA;one application into another.&#xA;Peer-to-Peer (or rather Profile-to-Profile) messaging is therefore no problem.&#xA;&#xA;Data format&#xA;&#xA;All the data would be saved in a structured format. 
For example Json (though&#xA;order of serialization is important, because of cryptographic hashes) or Bson&#xA;or any other data serialization format that is widely adopted.&#xA;&#xA;small&#xA;  Sidenote: As long as it is made clear that any client must support all&#xA;  formats, the format itself doesn&#39;t matter that much.&#xA;  For simplicity of this article, I stick to Json (and also because it is most&#xA;  widely known).&#xA;/small&#xA;&#xA;A Profile(-version) would look roughly like this (consider &#39;ipfs hash&#39; to&#xA;mean &#34;some kind of IPLD link&#34; in this context):&#xA;&#xA;{&#xA;  &#34;previous&#34;: [ &#34;ipfs hash&#34; ],&#xA;  &#34;post&#34;: {&#xA;    &#34;type&#34;: &#34;post type&#34;,&#xA;    &#34;nodes&#34;: [&#34;ipfs hash&#34;],&#xA;    &#34;metadata&#34;: {&#xA;      &#34;date&#34;: &#34;2017-12-12T12:00:00+0200&#34;,&#xA;      &#34;tags&#34;: [],&#xA;      &#34;category&#34;: &#34;kittens&#34;,&#xA;      &#34;custom&#34;: {}&#xA;    }&#xA;  }&#xA;}&#xA;&#xA;Let me explain:&#xA;&#xA;The previous key would point to the previous profile version(s).&#xA;  It would only contain IPFS hashes (Why plural, see below in&#xA;  &#34;Multi-Device Support&#34;).&#xA;The post key would contain information about the post published with this&#xA;  profile version.&#xA;  The type of the post could be &#34;article&#34;, &#34;image&#34;, &#34;video&#34;... normal stuff.&#xA;    But also &#34;biography&#34; for the biography shown on the profile or other things.&#xA;    Even &#34;username&#34; would be possible, for adding a user name to the profile.&#xA;  The nodes key would point to an IPFS hash containing the actual payload;&#xA;    either the text of the article (only one hash then) or the ipfs hashes of&#xA;    the pictures, the video(s) or other binary content.&#xA;    Of course, posts could be formatted using Markdown, reStructured Text or&#xA;    whatever format one likes to use. 
It would be a clients job to render it&#xA;    properly.&#xA;  The metadata field would contain plain meta information, like&#xA;    published date, tags, category and also custom metainformation as&#xA;    key-value pairs.&#xA;&#xA;Maybe a version attribute for protocol version could be added as well.&#xA;Of course, this should be considered an incomplete example, as I almost&#xA;certainly forgot things here.&#xA;&#xA;The idea of linking the previous version of a profile from each new version of&#xA;the profile is very much blockchain-like, of course, with the difference that&#xA;nobody needs to fetch the whole chain but only the latest one to get a&#xA;profile.&#xA;The more content a viewer of the profile wants to see, the more she needs to&#xA;traverse the graph of profile versions (and automatically caching the content&#xA;for others).&#xA;This would automatically result in older content beeing &#34;forgotten&#34; slowly&#xA;(but the content would not be forgotten until the publisher itself and all&#xA;other &#34;pinners&#34; drop it).&#xA;Because the actual payload is not stored in the fetched data, the actual&#xA;amount of data which is required to simply view a profile is rather small.&#xA;A client could be configured to fetch all textual content of a file, but not&#xA;more than 10 versions, or one screenpage, or something like that. The&#xA;possibilities are endless here.&#xA;&#xA;Federated component&#xA;&#xA;One might think &#34;If I go offline with my node, my posts are not accessible if&#xA;nobody else is online having them&#34;. 
And that&#39;s true.&#xA;&#xA;That&#39;s why I would introduce a federated component, which would run a&#xA;stripped-down version of the application.&#xA;&#xA;As soon as another instance connects and a new post is announced via pubsub,&#xA;the instance automatically pins or caches it.&#xA;Of course, this would mean that all of these federated instances would pin all&#xA;content, which is surely not nice.&#xA;One (rather simple and maybe even stupid) option would be to roll a dice and&#xA;make the chance that a post is pinned a 50-50 thing, or something like that.&#xA;Also, posts which are pinned for a certain amount of time are most likely&#xA;distributed well enough so the federated component nodes can drop them...&#xA;maybe after 90 days, maybe after 10... Details!&#xA;&#xA;Blockchain-Approaches&#xA;&#xA;The fundamental problem with Blockchains is that every peer in the network&#xA;hosts the complete content. Nobody benefits from that, especially if you think&#xA;of a social network which should also work on mobile devices.&#xA;With users loading up images, videos and other large blobs of data, a&#xA;blockchain is the wrong approach.&#xA;&#xA;That&#39;s why I think a social network on Euthereum, Bitcoin or any other&#xA;crypto-currency/blockchain is not an option at all.&#xA;&#xA;IPLD&#xA;&#xA;IPLD can be used not only to link posts and profiles, but&#xA;also to link from content to content. 
Namely to link from one post to another,&#xA;from a post to an image, a video, a voice message,...&#xA;but also to link from one post to a git commit, an euthereum transaction or&#xA;any other IPLD-supported data structure.&#xA;&#xA;Once nice detail is that one does not have to traverse these links.&#xA;If a user sees a post which links to other posts, for example, she does not&#xA;have to fetch these links to see the post itself, only if she wants to see the&#xA;linked content.&#xA;Caching nodes, on the other hand, can automatically traverse the whole graph&#xA;and fetch all the content into their cache.&#xA;&#xA;That makes a IPLD-based linking approach really beneficial.&#xA;&#xA;Scuttlebutt&#xA;&#xA;Scuttlebutt is a first step into the right direction.&#xA;One can say what one wants about electron and the whole technology stack which&#xA;is used in Scuttlebutt (and like or dislike the whole Javascript world), but&#xA;so far Scuttlebutt seems like the first social network that is completely&#xA;distributed.&#xA;&#xA;I thought about whether it would be a great idea to port Scuttlebutt to use&#xA;IPFS in the backend.&#xA;From what I know right now, it would be a nice way of bringing IPFS and IPLD&#xA;to the mix and therefor enhancing and extending the capabilities of&#xA;Scuttlebutt itself.&#xA;&#xA;I have not final conclusion on that thought, though.&#xA;&#xA;Problems&#xA;&#xA;There are several problems one has to think about when designing such a&#xA;system.&#xA;&#xA;Comments on Posts (and comments)&#xA;&#xA;Consider you want to comment on a post. 
Of course you create new content,&#xA;which links to the post you just commented.&#xA;But the person who wrote the original post does not automatically link to your&#xA;comment, so is neither able to find the comment (which could be solved via&#xA;pubsub), nor are others able to find them.&#xA;&#xA;The approach to this problem is simple: Notification about comments can be&#xA;done via pubsub.&#xA;And, if a user gets a notification about a new comment, she can approve it and&#xA;automatically publish a new version of her post, with some added meta information:&#xA;&#xA;A link to the comment&#xA;A link to the &#34;old version of the content in IPFS&#34;&#xA;&#xA;Now, if a client fetches all posts of a profile, it resolves all entries for&#xA;their newest version (so basically the one entry which does not link to an&#xA;older version of itself) and only shows the latest versions of it.&#xA;&#xA;Comments on comments (and so on) would be possible with the exact same approach.&#xA;That would, of course, cause a whole tree of comments to be rebuild every time&#xA;a new comment is added.&#xA;&#xA;Maybe not the best idea in that regard.&#xA;&#xA;Multi-Device Support&#xA;&#xA;There are several problems regarding multi-device support.&#xA;&#xA;Publishing content&#xA;&#xA;Publishing from multiple devices with the same profile is possible - one just&#xA;needs to import the private key for the signatures and the profile information&#xA;to the other device.&#xA;&#xA;Though, this needs some sort of merging mechanism if two posts are published&#xA;from two devices (or more) at the same time / without the other devices beeing&#xA;online to get notifications of the new point of truth.&#xA;&#xA;As creating two posts from two seperate devices would create two new versions of&#xA;the profile (because of IPLD linking), which means two points of truth suddenly&#xA;exists, a merging-mechanism must be implemented to merged multiple points of&#xA;truth for the profile.&#xA;This could yield a 
rather large network of profile versions, but ultimatively&#xA;a DAG (Directed Acyclic Graph).&#xA;&#xA;        Profile Init&#xA;             ^&#xA;             |&#xA;          Post A&#xA;             ^&#xA;             |&#xA;          Post B &lt;----+&#xA;             ^        |&#xA;             |        |&#xA;  +-----  Post C    Post C&#39;&#xA;  |          ^        ^&#xA;  |          |        |&#xA;Post D    Post D&#39;   Post D&#39;&#39;&#xA;  ^          ^        ^&#xA;  |          |        |&#xA;  |          +--------+&#xA;  |          |&#xA;  |       Post E&#xA;  |          ^&#xA;  |          |&#xA;  +----------+&#xA;             |&#xA;             |&#xA;          Post F&#xA;&#xA;A scenario like the one above (each Post also represents a new version of&#xA;the profile) would be easy to create with three devices:&#xA;&#xA;One starts using the network on a notebook&#xA;Post A published from the notebook&#xA;Post B published from the notebook&#xA;Profile added on the workstation&#xA;Post C published from the notebook while off of the internet&#xA;Post C&#39; published on the workstation&#xA;Profile added to the mobile phone (from the notebook)&#xA;Post D published from the mobile while off of the internet&#xA;Post D&#39; published from the notebook while off of the internet&#xA;Post D&#39;&#39; published on the workstation&#xA;Notebook comes back online, Post E published, merging the state from&#xA;   Post D&#39;&#39; from the workstation and Post D&#39; from the notebook itself.&#xA;Phone comes online, one of the devices is used to publish Post F, merging&#xA;   the state from Post D and Post E.&#xA;&#xA;In this scenario, there would still be one problem, though: If the profile is&#xA;published as an IPNS name, branching off of versions would be problematic.&#xA;If C is published while C&#39; is published, both devices would publish their&#xA;version as an IPNS name.&#xA;Now, first come first serve applies. 
And of course that is problematic,&#xA;because every device would always see one of the posts, but no device could see&#xA;the other.&#xA;Only at E (in the above example), when the branches are merged, both C and&#xA;C&#39; would be visible (though D wouldn&#39;t be visible as long as it isn&#39;t&#xA;merged into the chain).&#xA;But how does a device discover that there are two &#34;current&#34; versions which&#xA;have to be linked to the new post?&#xA;&#xA;So, discoverability is an issue in this approach. Maybe someone can come up&#xA;with a clean and easy solution that would work for netsplit and all those&#xA;scenarios.&#xA;&#xA;One idea would be that there is a profile-key which is used to publish profile&#xA;versions under an IPNS name as well as a device-key, which is used to&#xA;announce profile versions as a seperate IPNS name.&#xA;That IPNS name could be added to the profile, so each other device can find it&#xA;and fetch &#34;current&#34; versions from each device.&#xA;Only the initial setup of a new device would need to be made carefully then.&#xA;&#xA;Or, maybe, the whole approach is wrong and another approach would fit better&#xA;for this kind of problem. I don&#39;t know.&#xA;&#xA;Subscribing&#xA;&#xA;Another issue with multi-device support would be subscribing. For example, if&#xA;a user (lets call her Amy) subscribes to another user (lets call him Sheldon) on&#xA;her Notebook, this information needs to be stored somehow.&#xA;And because Amys machines do not necessarily sync with each other, her&#xA;mobile phone may never know that following Sheldon is a thing now!&#xA;&#xA;This problem could by solved by storing the &#34;follow&#34;-information in her public&#xA;profile. Although, some users might not like everyone to know who to follow.&#xA;Cryptographic things could be considered to fix visibility.&#xA;&#xA;But then, users may want to &#34;categorize&#34; their friends, store them in groups&#xA;or whatever. 
This information would be stored in the public profile as well,&#xA;which would create even more noise on the network.&#xA;Also, because cryptography is hard and information would be stored forever,&#xA;this might not be an option as some day, the crypto might be broken and reveal&#xA;all the things that were stored privately before.&#xA;&#xA;Deleting profile versions&#xA;&#xA;Some time, a user may want to remove a biography entry or a user name she once&#xA;published.&#xA;Because all the information is chained in a long chain of versions, one may&#xA;think that deleting a node is not possible.&#xA;But it is!&#xA;&#xA;Consider the following (simple) graph of profile versions:&#xA;&#xA;A&lt;---B&lt;---C&lt;---D&lt;---E&#xA;&#xA;If the user now wants to delete node C in this graph, she simply drops it.&#xA;Now, E beeing the latest point of truth, one may think that finding B and&#xA;A is not possible anymore. That&#39;s true. But why not shipping around this by&#xA;creating a new profile version and linking the previous versions:&#xA;&#xA;A&lt;---B     &lt;---D&lt;---E&lt;---F&#xA;      \                 /&#xA;       -----------------&#xA;&#xA;Of course, D would now point to a node which does not exist. But that is not&#xA;a problem. Indeed, its a fundamental concept of the idea - that content may be&#xA;unavailable.&#xA;&#xA;F must not contain new content. It even should not, because dropping F&#xA;because of its content becomes harder this way. Also, new versions of the&#xA;profile is simple and cheap.&#xA;&#xA;Problems are hard in distributed environments&#xA;&#xA;I do not claim to know the final solution to any of these problems. Its just&#xA;that I think of them and would love to get an open conversation started on the&#xA;whole subject of distributed social networks and problems that come with them.&#xA;&#xA;tags:  #distributed #network #open-source #social #software&#xA;]]&gt;</description>
      <content:encoded><![CDATA[<blockquote><p><a href="https://beyermatthias.de/tag:matrix" class="hashtag"><span>#</span><span class="p-category">matrix</span></a> , <a href="https://beyermatthias.de/tag:ipfs" class="hashtag"><span>#</span><span class="p-category">ipfs</span></a> , <a href="https://beyermatthias.de/tag:scuttlebutt" class="hashtag"><span>#</span><span class="p-category">scuttlebutt</span></a> and now <a href="https://beyermatthias.de/tag:mastodon" class="hashtag"><span>#</span><span class="p-category">mastodon</span></a> – We&#39;re living in awesome
times! centralization &lt; decentralization/federation &lt; distribution!
<a href="https://beyermatthias.de/tag:lovefortech" class="hashtag"><span>#</span><span class="p-category">lovefortech</span></a></p></blockquote>

<p>(me, April 10, 2017, on <a href="https://mastodon.fun/@musicmatze/11236">mastodon</a>)</p>

<h1 id="the-idea" id="the-idea">The idea</h1>

<p>With the rise of protocols like the matrix protocol, activitypub and others,
decentralized social community platforms like matrix, mastodon and others gained
power and were made real.
I consider these platforms, especially mastodon and matrix, to be great steps
into the future and am using both enthusiastically.</p>

<p>But can we do better? Can we distribute even <em>more</em>? I think so!</p>

<p>So far we have a Twitter-like microblogging platform (mastodon), a chat platform
(matrix) and Facebook-like platforms (diaspora and friendica), all of which
are federated (some form of decentralization). I think we can make a
<em>completely distributed social network platform</em> reality <em>today</em>.</p>

<p>Let me reiterate: I think we can build a facebook/googleplus/etc clone
which works without a central component, today. And I would even go one step
further and state: All we need for this is <a href="https://ipfs.io">IPFS</a> (and
related technology like <a href="https://ipld.io/">IPLD</a> and IPNS)!</p>

<p>This platform would feature personal profiles, publishing
articles/posts/images/videos/voice messages/etc, instant messaging, following
others, and all the things one would want in such a platform.</p>

<h1 id="how-would-it-work" id="how-would-it-work">How would it work?</h1>

<p>What do we need for this? Well, as stated before: not much!
From what I can think of, we would need IPFS, some sort of public/private key
functionality (which IPFS already has), a nice frontend-framework and that&#39;s
basically it.</p>

<p>Let me tell you how I think such a platform would work.</p>

<p>The moment a user starts the application, the application would boot an IPFS
node.
The username and all other information about the profile are added to IPFS as
structured data.
If the profile changes because the user edits it, it is added to IPFS again,
using IPLD to link to its previous version.</p>

<p>If a user adds a post to her profile, that post is added to IPFS as well and
linked from the profile via IPLD.
All other nodes are informed about the new content via pubsub and are free to
pin the new content (the new profile version) or only cache it for a while (or
to not care at all).
The post itself could add a link to the IPNS hash of the profile under which the
post is published. This way, a link from the post to the current version of the
profile would always exist.</p>

<p>Because the profile always links to its previous version as well as to the
post content, the node run by the profile&#39;s owner would
always keep all data the user adds to the network.
As the data is only kept alive by <em>links</em>, the user is free to drop published
content at any point in time.</p>

<p>This means that basically each operation would “generate” a new profile, which
is of course published as an IPNS name.
Following others would be a matter of subscribing to their “pub” channel (as
in “pubsub”) or their IPNS name.</p>
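
<p>To make the linking more concrete, here is a minimal Python sketch of the idea.
It does not talk to IPFS at all: the <code>store</code> dictionary stands in for
content-addressed storage, the <code>names</code> dictionary for IPNS, and the helpers
<code>add</code> and <code>publish</code> are names made up for this illustration only.</p>

<pre><code class="language-python">import hashlib
import json

store = {}   # stand-in for IPFS: content hash -> serialized bytes
names = {}   # stand-in for IPNS: name -> hash of the latest profile version


def add(obj):
    """Serialize obj deterministically, store it and return its hash."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(data).hexdigest()
    store[digest] = data
    return digest


def publish(name, post_type, payload):
    """Add the payload plus a new profile version that links to the old one."""
    previous = [names[name]] if name in names else []
    version = {
        "previous": previous,
        "post": {"type": post_type, "nodes": [add(payload)]},
    }
    names[name] = add(version)   # only the mutable pointer changes
    return names[name]


first = publish("amy", "username", "Amy")
second = publish("amy", "article", "Hello, distributed world!")
latest = json.loads(store[names["amy"]])
</code></pre>

<p>Every call to <code>publish</code> yields a new profile version whose <code>previous</code> list
names the version before it. Old versions are never modified; only the name
pointer moves.</p>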

<h1 id="chat" id="chat">Chat</h1>

<p>A chat application using IPFS is already implemented with
<a href="https://github.com/orbitdb/orbit">orbit</a>, so that&#39;s a matter of integrating
one application into another.
Peer-to-Peer (or rather Profile-to-Profile) messaging is therefore no problem.</p>

<h1 id="data-format" id="data-format">Data format</h1>

<p>All the data would be saved in a structured format, for example JSON (though
the order of serialization is important, because the same data serialized in a
different order yields a different cryptographic hash), BSON
or any other data serialization format that is widely adopted.</p>

<p><small>
  Sidenote: As long as it is made clear that any client must support <em>all</em>
  formats, the format itself doesn&#39;t matter that much.
  For simplicity of this article, I stick to JSON (and also because it is the most
  widely known).
</small></p>
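
<p>Why the serialization order matters can be shown in a few lines of Python.
This is only a sketch using the standard library: sorting keys and fixing the
separators is one simple way to get deterministic bytes, not the encoding IPLD
itself uses, and the function names are invented for this example.</p>

<pre><code class="language-python">import hashlib
import json

a = {"type": "article", "tags": ["ipfs", "social"]}
b = {"tags": ["ipfs", "social"], "type": "article"}   # same data, other key order


def naive_hash(obj):
    # json.dumps keeps insertion order, so equal data can yield different bytes.
    return hashlib.sha256(json.dumps(obj).encode("utf-8")).hexdigest()


def canonical_hash(obj):
    # Sorted keys and fixed separators make the byte output deterministic.
    data = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


same_data = a == b
naive_match = naive_hash(a) == naive_hash(b)
canonical_match = canonical_hash(a) == canonical_hash(b)
</code></pre>

<p>Here <code>same_data</code> and <code>canonical_match</code> are true while <code>naive_match</code> is
false: two clients that serialize the same profile differently would end up
with two different hashes for it.</p>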

<p>A profile (version) would look roughly like this (consider <code>&lt;ipfs hash&gt;</code> to
mean “some kind of IPLD link” in this context):</p>

<pre><code class="language-json">{
  &#34;previous&#34;: [ &#34;&lt;ipfs hash&gt;&#34; ],
  &#34;post&#34;: {
    &#34;type&#34;: &#34;&lt;post type&gt;&#34;,
    &#34;nodes&#34;: [&#34;&lt;ipfs hash&gt;&#34;],
    &#34;metadata&#34;: {
      &#34;date&#34;: &#34;2017-12-12T12:00:00+02:00&#34;,
      &#34;tags&#34;: [],
      &#34;category&#34;: &#34;kittens&#34;,
      &#34;custom&#34;: {}
    }
  }
}
</code></pre>

<p>Let me explain:</p>
<ul><li>The <code>previous</code> key would point to the previous profile version(s).
It would only contain IPFS hashes (for why this is <em>plural</em>, see
“Multi-Device Support” below).</li>
<li>The <code>post</code> key would contain information about the post published with this
profile version.
<ul><li>The <code>type</code> of the post could be “article”, “image”, “video”... normal stuff.
But also “biography” for the biography shown on the profile or other things.
Even “username” would be possible, for adding a user name to the profile.</li>
<li>The <code>nodes</code> key would list the IPFS hashes of the actual payload:
either the text of the article (only one hash then) or the IPFS hashes of
the pictures, the video(s) or other binary content.
Of course, posts could be formatted using Markdown, reStructuredText or
whatever format one likes to use. It would be the client&#39;s job to render it
properly.</li>
<li>The <code>metadata</code> field would contain plain meta information, like
publication date, tags, category and also custom meta information as
key-value pairs.</li></ul></li></ul>

<p>Maybe a <code>version</code> attribute for protocol version could be added as well.
Of course, this should be considered an incomplete example, as I almost
certainly forgot things here.</p>

<p>The idea of linking the previous version of a profile from each new version of
the profile is very much blockchain-like, of course, with the difference that
nobody needs to fetch the <em>whole</em> chain: the latest version is enough to get a
profile.
The more content a viewer of the profile wants to see, the further she needs to
traverse the graph of profile versions (automatically caching the content
for others along the way).
This would automatically result in older content being “forgotten” slowly
(but the content would not be gone until the publisher herself and all
other “pinners” drop it).
Because the actual <em>payload</em> is not stored in the fetched data, the actual
amount of data which is required to simply <em>view</em> a profile is rather small.
A client could be configured to fetch all textual content of a profile, but not
more than 10 versions, or one screen page, or something like that. The
possibilities are endless here.</p>
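
<p>Such a bounded traversal could look roughly like the following sketch. The
<code>versions</code> dictionary with its made-up hashes stands in for data that would
really be fetched from the network, and <code>walk</code> is a hypothetical helper, not
part of any IPFS API.</p>

<pre><code class="language-python">from collections import deque

# Hypothetical profile versions, keyed by a made-up hash.
versions = {
    "v1": {"previous": [], "post": {"type": "username"}},
    "v2": {"previous": ["v1"], "post": {"type": "biography"}},
    "v3": {"previous": ["v2"], "post": {"type": "article"}},
    "v4": {"previous": ["v3"], "post": {"type": "image"}},
}


def walk(head, limit):
    """Return up to `limit` version hashes, newest first (breadth-first)."""
    seen = []
    queue = deque([head])
    while queue and len(seen) != limit:
        current = queue.popleft()
        if current in seen or current not in versions:
            continue   # already visited, or not available on the network
        seen.append(current)
        queue.extend(versions[current]["previous"])
    return seen


recent = walk("v4", limit=2)        # a quick look at the profile
everything = walk("v4", limit=10)   # a curious reader (or a caching node)
</code></pre>

<p>With <code>limit=2</code> only <code>v4</code> and <code>v3</code> are touched; the older versions are
fetched only if somebody actually asks for them.</p>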

<h1 id="federated-component" id="federated-component">Federated component</h1>

<p>One might think “If I go offline with my node, my posts are not accessible if
nobody else is online having them”. And that&#39;s true.</p>

<p>That&#39;s why I would introduce a federated component, which would run a
stripped-down version of the application.</p>

<p>As soon as another instance connects and a new post is announced via pubsub,
the instance automatically pins or caches it.
Of course, this would mean that all of these federated instances would pin all
content, which is surely not nice.
One (rather simple and maybe even stupid) option would be to roll a die and
make the chance that a post is pinned a 50-50 thing, or something like that.
Also, posts which are pinned for a certain amount of time are most likely
distributed well enough so the federated component nodes can drop them...
maybe after 90 days, maybe after 10... Details!</p>
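
<p>As a rough sketch of that policy, with all numbers and names being assumptions
for illustration (the <code>CachingNode</code> class, the 50% chance and the 90 days are
not part of any existing software):</p>

<pre><code class="language-python">import random

PIN_PROBABILITY = 0.5                  # the "50-50 thing" from above
MAX_AGE_SECONDS = 90 * 24 * 60 * 60    # drop pins after 90 days


class CachingNode:
    """Hypothetical federated component that pins announced posts by chance."""

    def __init__(self, rng):
        self.rng = rng
        self.pinned = {}   # content hash -> time at which it was pinned

    def on_announce(self, content_hash, now):
        """Called for every post announced via pubsub."""
        keep = PIN_PROBABILITY > self.rng.random()
        if keep:
            self.pinned[content_hash] = now
        return keep

    def expire(self, now):
        """Unpin everything older than MAX_AGE_SECONDS and return it."""
        old = [h for h, t in self.pinned.items() if now - t > MAX_AGE_SECONDS]
        for h in old:
            del self.pinned[h]
        return old


node = CachingNode(random.Random(42))
for i in range(1000):
    node.on_announce(f"hash-{i}", now=0)
pinned_count = len(node.pinned)
dropped = node.expire(now=MAX_AGE_SECONDS + 1)
</code></pre>

<p>Roughly half of the announced posts end up pinned, and all of them are
released again once they are older than the configured age.</p>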

<h1 id="blockchain-approaches" id="blockchain-approaches">Blockchain-Approaches</h1>

<p>The fundamental problem with blockchains is that every full node in the network
hosts the complete chain. Nobody benefits from that, especially if you think
of a social network which should also work on mobile devices.
With users uploading images, videos and other large blobs of data, a
blockchain is the wrong approach.</p>

<p>That&#39;s why I think a social network on Ethereum, Bitcoin or any other
crypto-currency/blockchain is not an option at all.</p>

<h1 id="ipld" id="ipld">IPLD</h1>

<p><a href="https://ipld.io/">IPLD</a> can be used not only to link posts and profiles, but
also to link from content to content. Namely to link from one post to another,
from a post to an image, a video, a voice message,...
but also to link from one post to a git commit, an Ethereum transaction or
any other IPLD-supported data structure.</p>

<p>One nice detail is that one does not have to traverse these links.
If a user sees a post which links to other posts, for example, she does not
have to fetch these links to see the post itself, only if she wants to see the
linked content.
Caching nodes, on the other hand, can automatically traverse the whole graph
and fetch all the content into their cache.</p>

<p>That makes an IPLD-based linking approach really beneficial.</p>

<h1 id="scuttlebutt" id="scuttlebutt">Scuttlebutt</h1>

<p>Scuttlebutt is a first step in the right direction.
One can say what one wants about electron and the whole technology stack which
is used in Scuttlebutt (and like or dislike the whole Javascript world), but
so far Scuttlebutt seems like the first social network that is completely
distributed.</p>

<p>I thought about whether it would be a great idea to port Scuttlebutt to use
IPFS in the backend.
From what I know right now, it would be a nice way of bringing IPFS and IPLD
into the mix and therefore enhancing and extending the capabilities of
Scuttlebutt itself.</p>

<p>I have no final conclusion on that thought, though.</p>

<h1 id="problems" id="problems">Problems</h1>

<p>There are several problems one has to think about when designing such a
system.</p>

<h2 id="comments-on-posts-and-comments" id="comments-on-posts-and-comments">Comments on Posts (and comments)</h2>

<p>Suppose you want to comment on a post. Of course you create new content,
which links to the post you just commented on.
But the person who wrote the original post does not automatically link to your
comment, so neither she (which could be solved via
pubsub) nor anyone else is able to find it.</p>

<p>The approach to this problem is simple: Notification about comments can be
done via pubsub.
And, if a user gets a notification about a new comment, she can approve it and
automatically publish a new version of her post, with some added meta information:</p>
<ul><li>A link to the comment</li>
<li>A link to the “old version of the content in IPFS”</li></ul>

<p>Now, if a client fetches all posts of a profile, it resolves each entry to
its newest version (so basically the one version that no newer version links
to as its “old version”) and only shows that latest version.</p>

<p>Comments on comments (and so on) would be possible with the exact same approach.
That would, of course, cause a whole tree of comments to be rebuilt every time
a new comment is added.</p>

<p>Maybe not the best idea in that regard.</p>
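
<p>For a single level of comments, the resolution step could be sketched like
this. The field names <code>replaces</code> and <code>comments</code> as well as the hashes are
invented for this example; the data format above does not define them.</p>

<pre><code class="language-python"># Hypothetical post entries: "replaces" is the link to the old version,
# "comments" holds the links to the approved comments.
posts = {
    "p1": {"text": "Hello", "replaces": None, "comments": []},
    "p1b": {"text": "Hello", "replaces": "p1", "comments": ["c1"]},
    "p1c": {"text": "Hello", "replaces": "p1b", "comments": ["c1", "c2"]},
    "p2": {"text": "Second post", "replaces": None, "comments": []},
}


def latest_versions(entries):
    """Return the hashes of all entries that no other entry replaces."""
    replaced = {e["replaces"] for e in entries.values() if e["replaces"]}
    return sorted(h for h in entries if h not in replaced)


visible = latest_versions(posts)
</code></pre>

<p>Only <code>p1c</code> and <code>p2</code> remain visible; <code>p1</code> and <code>p1b</code> are still reachable
through the <code>replaces</code> links, but a client would not show them by default.</p>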

<h2 id="multi-device-support" id="multi-device-support">Multi-Device Support</h2>

<p>There are several problems regarding multi-device support.</p>

<h3 id="publishing-content" id="publishing-content">Publishing content</h3>

<p>Publishing from multiple devices with the same profile is possible – one just
needs to import the private key for the signatures and the profile information
to the other device.</p>

<p>This needs some sort of merging mechanism, though, if two posts are published
from two devices (or more) at the same time / without the other devices being
online to get notifications of the new point of truth.</p>

<p>As creating two posts from two separate devices would create two new versions of
the profile (because of IPLD linking), two points of truth suddenly
exist, so a merging mechanism must be implemented to merge multiple points of
truth for the profile.
This could yield a rather large network of profile versions, but ultimately
still a DAG (Directed Acyclic Graph).</p>

<pre><code>        Profile Init
             ^
             |
          Post A
             ^
             |
          Post B &lt;----+
             ^        |
             |        |
  +-----&gt; Post C    Post C&#39;
  |          ^        ^
  |          |        |
Post D    Post D&#39;   Post D&#39;&#39;
  ^          ^        ^
  |          |        |
  |          +--------+
  |          |
  |       Post E
  |          ^
  |          |
  +----------+
             |
             |
          Post F

</code></pre>

<p>A scenario like the one above (each <code>Post</code> also represents a new version of
the profile) would be easy to create with three devices:</p>
<ol><li>One starts using the network on a notebook</li>
<li>Post <code>A</code> published from the notebook</li>
<li>Post <code>B</code> published from the notebook</li>
<li>Profile added on the workstation</li>
<li>Post <code>C</code> published from the notebook while offline</li>
<li>Post <code>C&#39;</code> published on the workstation</li>
<li>Profile added to the mobile phone (from the notebook)</li>
<li>Post <code>D</code> published from the mobile phone while offline</li>
<li>Post <code>D&#39;</code> published from the notebook while offline</li>
<li>Post <code>D&#39;&#39;</code> published on the workstation</li>
<li>Notebook comes back online, Post <code>E</code> published, merging the state from
<code>Post D&#39;&#39;</code> from the workstation and <code>Post D&#39;</code> from the notebook itself.</li>
<li>Phone comes online, one of the devices is used to publish <code>Post F</code>, merging
the state from <code>Post D</code> and <code>Post E</code>.</li></ol>
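
<p>The merge steps at the end of this scenario can be sketched in a few lines of
Python. The graph below is the one from the diagram; <code>heads</code> and <code>merge</code>
are hypothetical helpers, and which versions a device “knows” is simply passed
in as a set instead of being discovered over the network.</p>

<pre><code class="language-python"># Each profile version maps to the versions it names in "previous".
dag = {
    "Init": [],
    "A": ["Init"],
    "B": ["A"],
    "C": ["B"],
    "C'": ["B"],
    "D": ["C"],
    "D'": ["C"],
    "D''": ["C'"],
}


def heads(graph, known):
    """Versions in `known` that no other known version links to."""
    linked = {p for v in known for p in graph[v]}
    return sorted(v for v in known if v not in linked)


def merge(graph, known, new_version):
    """Publish new_version with every current head as a predecessor."""
    graph[new_version] = heads(graph, known)
    return graph[new_version]


# Step 11: the notebook knows its own D' and learns about D'' from the
# workstation, but has not seen the phone's D yet.
notebook = set(dag) - {"D"}
e_parents = merge(dag, notebook, "E")

# Step 12: the phone is online, so every version is known.
f_parents = merge(dag, set(dag), "F")
</code></pre>

<p>The results match the diagram: <code>E</code> links to <code>D&#39;</code> and <code>D&#39;&#39;</code>, and <code>F</code>
links to <code>D</code> and <code>E</code>. The hard part, finding out which versions exist at
all, is exactly what this sketch leaves out.</p>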

<p>In this scenario, there would still be one problem, though: If the profile is
published as an IPNS name, branching versions would be problematic.
If <code>C</code> is published while <code>C&#39;</code> is published, both devices would publish their
version under the same IPNS name.
Now, first come, first served applies. And of course that is problematic,
because every device would always see one of the posts, but no device could see
the other.
Only at <code>E</code> (in the above example), when the branches are merged, both <code>C</code> and
<code>C&#39;</code> would be visible (though <code>D</code> wouldn&#39;t be visible as long as it isn&#39;t
merged into the chain).
But how does a device discover that there are two “current” versions which
have to be linked to the new post?</p>

<p>So, discoverability is an issue in this approach. Maybe someone can come up
with a clean and easy solution that would work for netsplit and all those
scenarios.</p>

<p>One idea would be that there is a profile-key which is used to publish profile
versions under an IPNS name as well as a <em>device</em>-key, which is used to
announce profile versions under a separate IPNS name.
That IPNS name could be added to the profile, so each other device can find it
and fetch “current” versions from each device.
Only the initial setup of a new device would need to be made carefully then.</p>
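
<p>A sketch of how a device could use such per-device names to find all
“current” versions. Everything here is an assumption for illustration: the
<code>device_names</code> mapping plays the role of the per-device IPNS names listed in
the profile, and the small graph is made up.</p>

<pre><code class="language-python"># Profile versions and the versions they name in "previous".
dag = {"B": [], "C": ["B"], "C'": ["B"], "D'": ["C"]}

# Hypothetical per-device names, each pointing at that device's newest version.
device_names = {"notebook": "D'", "workstation": "C'", "phone": "C"}


def ancestors(graph, start):
    """All versions reachable from `start` via "previous" links."""
    found = set()
    todo = list(graph.get(start, []))
    while todo:
        version = todo.pop()
        if version not in found:
            found.add(version)
            todo.extend(graph.get(version, []))
    return found


def current_versions(graph, pointers):
    """Device tips that are not already contained in another device's history."""
    tips = set(pointers.values())
    covered = set()
    for tip in tips:
        covered |= ancestors(graph, tip)
    return sorted(tips - covered)


current = current_versions(dag, device_names)
</code></pre>

<p>The phone&#39;s <code>C</code> is already part of the notebook&#39;s history, so only <code>C&#39;</code>
and <code>D&#39;</code> are left as versions the next post would have to link to.</p>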

<p>Or, maybe, the whole approach is wrong and another approach would fit better
for this kind of problem. I don&#39;t know.</p>

<h3 id="subscribing" id="subscribing">Subscribing</h3>

<p>Another issue with multi-device support would be subscribing. For example, if
a user (let&#39;s call her Amy) subscribes to another user (let&#39;s call him Sheldon) on
her notebook, this information needs to be stored somehow.
And because Amy&#39;s machines do not necessarily sync with each other, her
mobile phone may never know that following Sheldon is a thing now!</p>

<p>This problem could be solved by storing the “follow”-information in her public
profile. However, some users might not like everyone to know whom they follow.
Encryption could be considered to restrict visibility.</p>

<p>But then, users may want to “categorize” their friends, store them in groups
or whatever. This information would be stored in the public profile as well,
which would create even <em>more</em> noise on the network.
Also, because cryptography is hard and information would be stored forever,
this might not be an option as some day, the crypto might be broken and reveal
all the things that were stored privately before.</p>

<h2 id="deleting-profile-versions" id="deleting-profile-versions">Deleting profile versions</h2>

<p>At some point, a user may want to remove a biography entry or a user name she once
published.
Because all the information is chained in a long chain of versions, one may
think that deleting a node is not possible.
But it is!</p>

<p>Consider the following (simple) graph of profile versions:</p>

<pre><code>A&lt;---B&lt;---C&lt;---D&lt;---E
</code></pre>

<p>If the user now wants to delete node <code>C</code> in this graph, she simply drops it.
Now, <code>E</code> being the latest point of truth, one may think that finding <code>B</code> and
<code>A</code> is not possible anymore. That&#39;s true. But why not work around this by
creating a new profile version which links to both sides of the gap:</p>

<pre><code>A&lt;---B     &lt;---D&lt;---E&lt;---F
      \                 /
       -----------------
</code></pre>

<p>Of course, <code>D</code> would now point to a node which does not exist. But that is not
a problem. Indeed, it&#39;s a fundamental concept of the idea: content may be
unavailable.</p>

<p><code>F</code> does not need to contain new content. It even <em>should</em> not, because otherwise dropping <code>F</code>
later because of its content becomes harder. Also, creating new versions of the
profile is simple and cheap.</p>
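
<p>The whole procedure fits into a short sketch. The <code>store</code> dictionary again
only stands in for what is available on the network, and <code>reachable</code> is a
helper made up for this example.</p>

<pre><code class="language-python">store = {
    "A": {"previous": []},
    "B": {"previous": ["A"]},
    "C": {"previous": ["B"]},
    "D": {"previous": ["C"]},
    "E": {"previous": ["D"]},
}


def reachable(head):
    """Collect every version that can still be fetched starting at head."""
    found, todo = [], [head]
    while todo:
        version = todo.pop(0)
        if version in found or version not in store:
            continue   # dangling link: the content is simply unavailable
        found.append(version)
        todo.extend(store[version]["previous"])
    return found


del store["C"]                          # the user drops version C
after_drop = reachable("E")

store["F"] = {"previous": ["E", "B"]}   # F carries no post, only links
after_bridge = reachable("F")
</code></pre>

<p>After dropping <code>C</code>, only <code>E</code> and <code>D</code> can be found from <code>E</code>. Once <code>F</code> is
published, <code>A</code> and <code>B</code> are reachable again, while the link from <code>D</code> to the
dropped <code>C</code> is silently skipped.</p>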

<h2 id="problems-are-hard-in-distributed-environments" id="problems-are-hard-in-distributed-environments">Problems are hard in distributed environments</h2>

<p>I do not claim to know the final solution to any of these problems. It&#39;s just
that I think of them and would love to get an open conversation started on the
whole subject of distributed social networks and problems that come with them.</p>

<p>tags:  <a href="https://beyermatthias.de/tag:distributed" class="hashtag"><span>#</span><span class="p-category">distributed</span></a> <a href="https://beyermatthias.de/tag:network" class="hashtag"><span>#</span><span class="p-category">network</span></a> <a href="https://beyermatthias.de/tag:open" class="hashtag"><span>#</span><span class="p-category">open</span></a>-source <a href="https://beyermatthias.de/tag:social" class="hashtag"><span>#</span><span class="p-category">social</span></a> <a href="https://beyermatthias.de/tag:software" class="hashtag"><span>#</span><span class="p-category">software</span></a></p>
]]></content:encoded>
      <guid>https://beyermatthias.de/blueprint-of-a-distributed-social-network-on-ipfs-and-its-problems</guid>
      <pubDate>Tue, 31 Oct 2017 16:38:24 +0100</pubDate>
    </item>
  </channel>
</rss>