I wrote the following post almost in one sitting. After powering through this, I do not find the energy to review it and fear that I will never post it because of that. That's also why the “Final thoughts” paragraph is rather short.
I don't want it to bit-rot in my editor, so here it goes...
Lately I have started thinking again about writing a distributed social network application. I've been thinking about this on and off since 2017, but back then the available libraries for Rust were unusable for me, or at least too much hassle for a free time side project.
But lately I started to think about it again, and some toots I have sent out to the world got some responses that made me think harder.
So let me (again) tell you about that idea.
The outcome I am targeting is the following: you can start up a GUI application on your desktop where you can post text, audio, video, much like you can with mastodon. The difference is that you don't need to be connected to a server or even to the internet at all. The application is an interface to a fully distributed network, where you can connect directly to your followers or the people you follow.
The application is focused on “macro blogging”, so interface-wise it is more like diaspora or Lemmy/kbin and not so much like mastodon, and tree-style discussions are supported.
You can boot that application while being air-gapped, you can boot it on several devices and post to the same account (which you can't with for example scuttlebutt, which is a “distributed social network” app).
That sounds not so fancy? Well, the UI and UX shouldn't be “fancy” in that regard at all. The underlying technology should be, though. So let's talk about it.
It is always hard to describe an implementation of something in text. This is no exception.
First of all, let's have a look at the actors in the system I think of.
A user wants to post text, images, videos, maybe polls, calendars, etc to their profile. They want to tag their content. Maybe they want to have different “streams” in their profile.
Maybe they even want several profiles, but that's not very important since it would only be an implementation detail.
They want to be able to post to that profile from multiple devices. Maybe they have two notebooks, maybe they have another workstation, maybe a tablet, a smartphone and maybe (just maybe) they also have a server from where they want to post via a bot, all to the same profile.
They want to be able to reply to posts from other users, they want to boost them (republish), like them, hide them, block other users, follow other users to see their content.
Maybe they also want to follow a single “stream” of one user, maybe they want to follow single tags.
Actor: discover server
Because of the distributed nature of the system (as we see further down this article), users cannot simply search for “TheAwesomeCat” user and get their profile. Each profile lives on the devices of the user of the profile, at first.
To solve this issue, there should be “discover server” instances. These instances can be hosted by power users or organizations. These servers do nothing but “announce” to connected instances of the application about profiles they know. Users can then decide (via interaction or configuration) whether to fetch these profiles.
Actor: Pin servers
Content lives, at first, only on the devices of the users that post it. If a user follows a profile, they automatically replicate the content that was posted on that profile, which helps distribute that content.
Of course, some might not want to replicate content, which is totally fine and should be configurable (per followed profile, actually). Reposting content would be an option to ensure content is replicated.
Content replication by users might or might not be enough. To improve on this, there should/could be “pin servers” that are always online, which a user can tell to pin their content for a decent amount of time. That would help distribution of the content. There could be multiple or a whole network of such servers. They are not really federated, but are more or less distributed as well.
These servers could be run by power users or organizations as well. Power users can choose to pin only their own content or also that of friends, or even more content from other users they select or even get paid by.
Data Structures / The “Model”
The above might be a bit confusing because we haven't yet talked about how the system would work from a technical point.
That was on purpose, because I wanted to talk about this top-down rather than bottom-up, since I have better experience describing it this way.
Let's now look at the data structures.
There are two data types that we have to concern ourselves with in this system. The first one is, obviously, the “Profile” which a user posts to.
A profile would be a DAG and Merkle Tree. Think about a profile like a git repository. A user posts to their profile by creating an entry in their “repository” (much like a commit in git). Binary content like audio or video would also be possible. “Commits” would contain metadata about the post, such as time, maybe location, tags, mime types of the content, etc.
In detail, there would be three “layers” in that repository. The lowest would be the actual content: text blobs, audio, video, files. The middle layer would be the “metadata” layer, pointing to the content (if there is any, as we'll see later there could be posts without content), the uppermost layer would be nothing more than a DAG of objects that link to their parent objects (multiple, as we need merges in such a system, or none for the first object in a profile) and contain some minimal metadata about the format (basically just a version number).
The uppermost layer may also contain a list of public keys of the instances that post to that profile. This way, discovering a full profile from just one object in the uppermost layer is possible.
For size reasons, that list of public keys may also be embedded in the “metadata” layer.
The objects from the uppermost layer are really small and users could potentially cache these for other users. As they do not contain much data, but are vital to discovering profiles, they should be highly available in such a system.
Loading a profile of another user would now mean that the application would fetch the newest object on the uppermost layer and then traverse the DAG until it hits its root node. As an optimization, “pack nodes” could be added, which refer to a larger number of ancestor nodes, to speed up traversal. That would make deletion of stuff more complicated though. More on that later.
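To make the layers a bit more concrete, here is a minimal sketch of the two upper layers and of the traversal from the newest object back to the root. All names (`ObjectId`, `Metadata`, `DagNode`, `traverse`) are made up for illustration, and a plain `u64` stands in for what would really be a cryptographic content hash.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Stand-in for a content address. A real system would use a cryptographic hash.
type ObjectId = u64;

/// Middle layer: describes one post and optionally points at a content blob
/// (text, audio, video, files) in the lowest layer.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Metadata {
    content: Option<ObjectId>,
    mime_type: String,
    tags: Vec<String>,
    timestamp: u64,
}

/// Uppermost layer: a deliberately small node that only links to its parents
/// and to one metadata object.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct DagNode {
    version: u32,
    parents: Vec<ObjectId>, // empty for the first node, two or more for a merge
    metadata: ObjectId,
}

/// Walk breadth-first from `head` towards the root, visiting every node once.
/// Nodes that are not available locally are listed but not expanded.
fn traverse(nodes: &HashMap<ObjectId, DagNode>, head: ObjectId) -> Vec<ObjectId> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::from([head]);
    let mut order = Vec::new();
    while let Some(id) = queue.pop_front() {
        if !seen.insert(id) {
            continue; // already reached through another parent
        }
        order.push(id);
        if let Some(node) = nodes.get(&id) {
            queue.extend(node.parents.iter().copied());
        }
    }
    order
}
```

The `seen` set matters as soon as merges exist, because the same ancestor is then reachable through more than one parent.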
If a user wants to post from multiple devices, the application must be able to trust other devices.
If a user posts from device A, and device B sees that new post on the DAG, device B could potentially fast-forward its HEAD to the new node. But it must be able to verify that this new node is actually from a device that is owned by the user.
For such a thing, public-private key crypto is there and solves the problem just fine. Each node (in the uppermost layer) could get a signature and there we are.
There could also be another way of solving this, by having the two devices communicate directly (off the DAG), which is something we talk about later.
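The fast-forward decision could be sketched roughly like this. The names are hypothetical, and the signature check is passed in as a closure because the Rust standard library ships no signature scheme; a real implementation would plug in an actual scheme from a crypto crate. The sketch also only accepts nodes that directly extend the current HEAD, which is a simplification.

```rust
use std::collections::HashSet;

/// Toy-sized public key, only for illustration.
type DeviceKey = [u8; 4];

struct SignedNode {
    id: u64,
    parents: Vec<u64>,
    author: DeviceKey,
    signature: Vec<u8>,
}

/// Decide whether a device may move its HEAD to `candidate`.
/// `verify` is an assumed hook that checks the signature over the node id.
fn can_fast_forward<F>(
    current_head: u64,
    candidate: &SignedNode,
    trusted: &HashSet<DeviceKey>,
    verify: F,
) -> bool
where
    F: Fn(&DeviceKey, u64, &[u8]) -> bool,
{
    // The new node must build directly on the HEAD we already have,
    candidate.parents.contains(&current_head)
        // come from one of the user's own devices,
        && trusted.contains(&candidate.author)
        // and carry a signature that matches that device's key.
        && verify(&candidate.author, candidate.id, &candidate.signature)
}
```

If any of the three checks fails, the device simply keeps its current HEAD and ignores the node.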
Multi Device functionality / Merging
To be able to use multiple devices, the application must be able to merge diverging DAGs.
If a user has two (or more) devices that are not connected, but posts on both, their profile DAG advances on both devices and effectively leads to a diverging “HEAD” (in git-speak). For such a scenario, merges must be possible.
Because we're not actually merging content here, creating a merge is trivial. A new node has to be added to the profile, referring to the current two head nodes.
The concerning part here is that the two devices must agree on which one executes the merge, and the other device must follow. Such problems are solved with consensus algorithms. As we're talking about a low number of actors here (a user with more than 10 devices posting at the same time would be a lot for a user, but not for a consensus mechanism), it shouldn't be that much of a problem.
But it is important that the merge is cheap, so it is vital that the objects that represent the DAG are lightweight.
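A merge node could look roughly like the following sketch. The names are hypothetical, and the "smallest device id merges" rule is only a placeholder for a real consensus algorithm; it assumes that all devices already agree on the list of devices.

```rust
type ObjectId = u64;
type DeviceId = u32;

#[derive(Debug, Clone, PartialEq)]
struct DagNode {
    parents: Vec<ObjectId>,
    /// `None` for a pure merge node, which carries no post of its own.
    metadata: Option<ObjectId>,
}

/// Build the merge node for two diverged heads. No content is merged,
/// the node only names both heads as its parents.
fn merge_heads(a: ObjectId, b: ObjectId) -> DagNode {
    let mut parents = vec![a, b];
    // Canonical order, so every device would build the identical node.
    parents.sort_unstable();
    DagNode { parents, metadata: None }
}

/// Placeholder for the consensus step: the device with the smallest id merges,
/// all others wait and then fast-forward to the merge node.
fn elect_merger(devices: &[DeviceId]) -> Option<DeviceId> {
    devices.iter().copied().min()
}
```

Because the merge node is this small, creating it costs next to nothing, which is the point made above about keeping DAG objects lightweight.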
Reposting content would be a rather simple matter. The “metadata layer” we talked about earlier would simply note that the content it points to is actually a repost. It would also point to the metadata object (or even the DAG object that points to it), for discoverability.
Commenting on a post is a bit more complicated. Well, the actual comment is not, but making that comment visible is.
First of all, a comment would just be another post to the profile DAG, with an entry in its metadata linking to the post it replies to (which of course could itself be a reply).
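As a rough sketch, the difference between a normal post, a repost and a reply could be a single field in the metadata object. The type and field names below are assumptions for illustration only, and `u64` again stands in for a content address.

```rust
use std::collections::HashMap;

type ObjectId = u64;

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum PostKind {
    Original,
    /// Points at the metadata object of the post that is being reposted.
    Repost { original_metadata: ObjectId },
    /// Points at the post this one replies to, which may itself be a reply.
    Reply { in_reply_to: ObjectId },
}

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Metadata {
    kind: PostKind,
    content: Option<ObjectId>,
}

/// Follow reply links upwards until a post is reached that is not a reply
/// (or that is not known locally).
fn thread_root(posts: &HashMap<ObjectId, Metadata>, start: ObjectId) -> ObjectId {
    let mut current = start;
    while let Some(Metadata { kind: PostKind::Reply { in_reply_to }, .. }) = posts.get(&current) {
        current = *in_reply_to;
    }
    current
}
```

With real content addresses a reply chain cannot loop back onto itself, since a post can only reference objects that existed before it.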
Consider a system where Alice, Bob and Clara post content, and Bob and Clara follow Alice. Bob and Clara don't know of each other in that system. Now if Alice posts something and Bob publishes a comment on that post, that does not mean that Clara sees that reply.
Also, Alice may not necessarily see Bob's post! But Bob's device(s) should tell Alice's device that there has been a reply (see “Gossipping Applications” below). Alice is now able to configure her device for different behaviours if a reply is encountered:
- Replies are not allowed, the device ignores the reply
- Replies are allowed, Alice's device posts a new object to her DAG that notes the reply on her profile
  - And the device also replicates (for a configurable amount of time) the reply content
  - The device does not replicate the content
That setting could even be per post, and there could even be a note in the “metadata” object telling followers what would happen with replies (although that is then set in stone, and I don't really like that, because this configuration should be mutable).
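The reply settings described above could be modelled roughly like this. It is only a sketch with made-up names: a device-wide fallback policy plus optional per-post overrides that live in local, mutable configuration rather than in the immutable metadata object.

```rust
use std::collections::HashMap;
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
enum ReplyPolicy {
    /// Replies are not allowed, incoming replies are ignored.
    Ignore,
    /// The reply is noted on the profile. `replicate_for` is `Some(duration)`
    /// if the reply content is replicated as well, `None` if it is not.
    Announce { replicate_for: Option<Duration> },
}

struct ReplySettings {
    fallback: ReplyPolicy,
    /// Overrides keyed by the id of the post that is replied to.
    per_post: HashMap<u64, ReplyPolicy>,
}

impl ReplySettings {
    /// The per-post setting wins, otherwise the device-wide fallback applies.
    fn policy_for(&self, post: u64) -> ReplyPolicy {
        self.per_post.get(&post).copied().unwrap_or(self.fallback)
    }
}
```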
Replies on replies would need to be propagated to “upstream” as well, so that trees of replies are possible. They are not necessary from a technical standpoint, because if an instance loads a profile, finds “reply metadata nodes” and loads the profile that posted the replies, it would encounter replies on that reply as well (yeeees I know that sounds confusing).
As an optimization, replies should “bubble up” the chain until they hit the original post. This would give the author of the original post the opportunity to moderate the replies on their post, but it wouldn't give them the possibility to deny people speaking. They can only reduce the replication and visibility of replies.
This is also why metadata objects and uppermost layer objects must be designed to be small! Consider this: Alice publishes a post. Bob and Clara reply. And on each of these replies, 10 other people reply. If Alice has her device configured to re-announce all replies, and replies to replies, that would mean that this one thread of 23 posts accounts for 46 objects in her profile:
1 Post Alice made -> 1 DAG object and 1 metadata object = 2 objects
+ 1 Reply from Bob -> 2 objects
+ 1 Reply from Clara -> 2 objects
+ 10 Replies from Bob's followers -> 20 objects (+ 20 objects in Bob's profile)
+ 10 Replies from Clara's followers -> 20 objects (+ 20 objects in Clara's profile)
= 46 objects in the profile DAG of Alice
As more replies are posted further down the tree of replies, more objects land in Alice's profile DAG. That's why this has to be optimized and configurable.
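The tally above is just "every post costs two objects", which can be written down as a tiny helper. The function name and the per-level input are made up for this example.

```rust
/// One DAG object plus one metadata object per post, as in the tally above.
const OBJECTS_PER_POST: u64 = 2;

/// `replies_per_level[0]` is the number of direct replies,
/// `replies_per_level[1]` the number of replies to those replies, and so on.
fn objects_in_thread(replies_per_level: &[u64]) -> u64 {
    let posts = 1 + replies_per_level.iter().sum::<u64>();
    posts * OBJECTS_PER_POST
}
```

For the example thread, `objects_in_thread(&[2, 20])` covers 1 + 2 + 20 = 23 posts and yields 46 objects.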
The second data structure we need is some form of gossipping protocol. Instances of the application, especially instances which are posting to the same profile, must be able to communicate directly with each other. Not only for the DAG merges we looked at above, but also for announcing their latest state.
Also, a gossipping protocol can be used to tell other users about one's own profile, about profiles an instance “knows” and so on.
We did not yet talk about how an instance can find the current HEAD of other instances. That's exactly what the gossip protocol is (also) for: It sends, periodically, information about itself to that channel.
I considered a system like IPNS for that, but that seems to be not the right tool for the job, especially when we talk about updating that state every other second. Gossipping seems to be a better solution to this problem.
Such a gossipping protocol should basically be a global channel application instances send information into:
- “Hey I am
- “My current HEAD is
- “Hey I know these instances: list of Profile IDs”
- I just posted
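The message kinds listed above could map onto an enum like the following. The exact payloads are not spelled out in the list, so every field here is a guess, and all names are hypothetical.

```rust
use std::collections::HashMap;

type ProfileId = u64;
type ObjectId = u64;

/// Assumed message shapes for the global gossip channel.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Gossip {
    Hello { profile: ProfileId },
    Head { profile: ProfileId, head: ObjectId },
    KnownProfiles { profiles: Vec<ProfileId> },
    Posted { profile: ProfileId, node: ObjectId },
}

/// Remember the most recently announced HEAD per profile.
/// This assumes a freshly posted node is also the new HEAD of the sender.
fn apply(heads: &mut HashMap<ProfileId, ObjectId>, msg: &Gossip) {
    match msg {
        Gossip::Head { profile, head } | Gossip::Posted { profile, node: head } => {
            heads.insert(*profile, *head);
        }
        // These carry no HEAD information.
        Gossip::Hello { .. } | Gossip::KnownProfiles { .. } => {}
    }
}
```

An instance would feed every message it receives into something like `apply` and afterwards know where to start loading each profile.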
Another concern, especially in scenarios where a user has multiple devices, is synchronization of configuration.
Configuration is not only what color scheme they prefer for their application instance, but also which profiles they “follow”, which ones they “block” and so on.
This data has to be synchronized between devices. That's not too complicated, as this can easily be solved by CRDTs. We do not have to concern ourselves with preservation of history here, only the current state is of relevance.
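One of the simplest CRDTs that would fit here is a last-writer-wins map: every setting stores when and on which device it was written, and merging keeps the newer entry. The following is a minimal sketch under that assumption, with made-up names and keys. It relies on reasonably synchronized clocks and uses the device id as a tie-breaker.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
struct Entry {
    timestamp: u64,
    device: u32,
    value: String,
}

/// Last-writer-wins map: only the current state is kept, no history.
#[derive(Debug, Clone, Default, PartialEq)]
struct Config {
    entries: BTreeMap<String, Entry>,
}

impl Config {
    fn set(&mut self, key: &str, value: &str, timestamp: u64, device: u32) {
        let entry = Entry { timestamp, device, value: value.to_string() };
        self.merge_entry(key.to_string(), entry);
    }

    /// Keep whichever entry has the larger (timestamp, device) pair.
    fn merge_entry(&mut self, key: String, incoming: Entry) {
        let keep_existing = self.entries.get(&key).map_or(false, |existing| {
            (existing.timestamp, existing.device) >= (incoming.timestamp, incoming.device)
        });
        if !keep_existing {
            self.entries.insert(key, incoming);
        }
    }

    /// Merging is commutative and idempotent, so devices can sync in any order.
    fn merge(&mut self, other: &Config) {
        for (key, entry) in &other.entries {
            self.merge_entry(key.clone(), entry.clone());
        }
    }
}
```

Two devices that exchange their `Config` in either direction end up with the same state, which is exactly the property needed for follow and block lists.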
Network of Trust
We already talked about having public-private key crypto in there for device authentication. Having public-private key crypto could also serve to build a network of trust.
Users might want to make sure that profiles they follow are actually humans. A network of trust would be ideal for that. Each post (or each node on the uppermost layer) should be signed with a private key anyway. Users might want to sign other users' keys, so that they can build a network of trust.
Words on networks of trust have been written extensively, so I won't lose any more words about them here. I think they are a viable tool that should be explored for this application.
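To give a rough idea of how an instance could use such signatures: treat "A signed B's key" as a directed edge and look for a short path from the user's own key to the key in question. The sketch below does that with a breadth-first search. The names and the hop limit are assumptions, and signature verification itself is left out.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

type KeyId = u32;

/// `signed[a]` lists the keys that `a` has signed.
/// Returns the number of hops from `from` to `to`, if `to` is reachable
/// within `max_hops` signatures.
fn trust_distance(
    signed: &HashMap<KeyId, Vec<KeyId>>,
    from: KeyId,
    to: KeyId,
    max_hops: usize,
) -> Option<usize> {
    let mut seen = HashSet::from([from]);
    let mut queue = VecDeque::from([(from, 0usize)]);
    while let Some((key, hops)) = queue.pop_front() {
        if key == to {
            return Some(hops);
        }
        if hops == max_hops {
            continue; // do not look further than the configured limit
        }
        for &next in signed.get(&key).into_iter().flatten() {
            if seen.insert(next) {
                queue.push_back((next, hops + 1));
            }
        }
    }
    None
}
```

A client could, for example, show profiles within two hops as "trusted by people you trust" and everything else as unknown.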
I've written a lot about what the system would look like, and some may already think of certain problems such a system could have.
Here I want to address some of them.
First of all: deleting content is not of importance to me, although technically somewhat possible. To make that clear: If devices replicate content that is essentially content-addressed, there's no way to fully delete content from the network. That is the same as with publishing a git repository as open source. If someone has a copy, there's just no way to remove it anymore.
It would be possible, though, to re-write the full profile chain and remove a single piece of content. As other instances may still have the “old” version of the profile DAG, they may see the content still, but devices that discover the content newly may never see it.
Moderation is “possible”. I've put that in quotes because it is and is not at the same time. So first of all, everyone can post everything and nobody can forbid them to.
But users can (and should) configure their instances for what to replicate and what not. Replies to posts can be replicated for other followers to see, or not be replicated to make discoverability of them harder. But nobody can prevent someone from posting a reply to a post to their own profile.
For Spam and malice, block-lists could potentially exist and there could even be a feature in the “discover servers”. I am not really a fan of that, but I still want to mention that it is possible.
What I think would be a more valuable alternative would be a network of trust, essentially resulting in some form of reputation system. If I sign another user's profile as “is trustworthy for posting content, not spam” for example, my followers would see that and could themselves trust the other user's profile better.
A spammer could still create hundreds of profiles and create a network of trusted accounts among them. I think there's no perfect solution to this problem anyway, but I believe that a network of trust is a really good step towards a solution.
(Real-Time-)Chat is not a concern I have here. I think Matrix is a solution to that problem and I also think that Matrix will become fully distributed at some point and Matrix servers will go away.
Encrypted (one-to-one or one-to-many) communication could be possible in such a system, given that public-private key crypto would be part of the system anyways.
That said, though, it is not really a good fit, because messages in such communication would still be posted to the profile and be potentially replicated forever, meaning that if crypto breaks some time in the future, the communication wouldn't be private anymore.
Of course that's true for all online communication that's encrypted. I am not sure whether I would implement such a thing, but it is possible.
Now that I've written a lot about what the ideas for the social network application are, I also have to write a bit about the technology I have in mind for implementing that application.
As said, I've been thinking a long time (since 2017) about this, and I have refined my idea several times during that time. The core of the idea is still the same though: there should be no need for servers (the server software mentioned above is simply an optimization and not really necessary technically speaking), a user may post from several devices to the same profile and they should be able to post content without being connected to the internet.
IPFS was the technology that sparked that idea in the first place. IPFS and especially IPLD were the pieces of technology that I thought could provide the functionality that would be necessary for implementing the application.
Also, IPFS/IPLD would be a “common denominator” and linking to git repositories or other data that is served via IPFS (think of Wikipedia, IIRC there has been a project to put Wikipedia on IPFS) would be easily possible.
But IPFS never had a decent Rust implementation or client libraries. All libraries that were available in my research were too complicated to use or too bare-bones to be a possible solution to the problems I am facing.
I more or less recently discovered libp2p and it seems to be very complex to use, but also powerful enough to solve all the problems. And, and that's really important, there's a decent Rust implementation that seems to be on-par feature-wise.
Developing the application on libp2p is certainly involved, and a good plan must be made before starting such a project. Also I think I would need extensive help from others implementing the networking stuff, as I really do not comprehend where to even start!
I am mainly thinking about implementing a GUI application here. I think there also should be CLI tooling, especially for power users, but it is not my main target here. The CLI tooling should not be for actually posting and “production use” but rather as a “plumbing toolbox” for inspecting profiles and instance state.
For GUI technology I am not really happy with the toolkits/frameworks I have used so far in the Rust world, except for Iced. I am not a fan of the React style of framework and rather fancy Elm-style frameworks (hence Iced).
I also thought about trying Qt for that application, but given that it is mostly C++ only, and I really don't want to use C++ ever, it is not really an option either.
One framework that I have not yet had the pleasure of trying out is Slint. From what I have read online and on its website, it seems to be a really valuable alternative and a good framework. I will read more about it and try it out.
I know this might be a major undertaking... I am not sure I am able to do all the things listed here – especially because my motivation drops as soon as I fail to understand docs or see ways to implement certain things.
Still, I think this whole idea is worth exploring and implementing.
Feel free to contact me via mastodon or mail about this topic, of course!