mastodon

Switching from hugo to writefreely

March 30, 2021

I've been a fan of the #ActivityPub stuff (or rather: the #fediverse) for a long time now, running a #mastodon account on @musicmatze@mastodon.technology for some time already, and I also have a #pixelfed account at @musicmatze@pixelfed.social.

So it is just a logical step to switch the #blogging software I use to an ActivityPub supporting solution as well.

Thus, as of today, this blog runs of writefreely.

Blogging software

I've used a lot of different blogging software until now. Most of the time, it was static site compilers, because this way I did not have to maintain (mainly update) the software running the blog. Because I host on uberspace, I am free to chose from a range of solutions though. In the very beginning, I was running ghost as far as I can remember.

Lately, I did write less and less on this blog, which is a shame. I hope that with this new solution, I start writing more often even though I have to have a internet connection while writing (possible workarounds like preparing locally with vim or some markdown editor exist, of course).

Importing old posts

All my old posts were already written in markdown, so importing was not that complicated. I had to apply some vim-skills to my markdown files and then import them into #writefreely, but I had to adapt the timestamps. Also, the formatting sucks with the imported articles and maybe even links are broken.

TBH, that's not that important to me to make the effort of fixing every single (of the more than 200) articles.

#hugo

Writing a mastodon bot

January 17, 2021

Today, I wrote a mastodon bot.

Shut up and show me the code!

Here you go.

The idea

My idea was, to write a bot that fetches the lastest master from a git repository and counts some commits and then posts a message to mastodon about what it counted.

Because I always complain about people pushing to the master branch of a big community git repository directly, I decided that this would be a perfect fit.

(Whether pushing to master directly is okay and when it is not okay to do this is another topic and I won't discuss this here)

The dependencies

Well, because I didn't want to implement everything myself, I started pulling in some dependencies:

log and env_logger for logging
structopt, toml, serde and config for argument parsing and config reading
anyhow because I don't care too much about error handling. It just has to work
getset for a bit cleaner code (not strictly necessary, tbh)
handlebars for templating the status message that will be posted
elefren as mastodon API crate
git2 for working with the git repository which the bot posts about

The Plan

How the bot should work was rather clear from the outset. First of all, it shouldn't be a always-running-process. I wanted it to be as simple as possible, thus, triggering it via a systemd-timer should suffice. Next, it should only fetch the latest commits, so it should be able to work on a working clone of a repository. This way, we don't need another clone of a potentially huge repository on our disk. The path of the repository should of course not be hardcoded, as shouldn't the “upstream” remote name or the “master” branch name (because you might want to track a “release-xyz” branch or because “master” was renamed to something else).

Also, it should be configurable how many hours of commits should be checked. Maybe the user wants to run this bot once a day, maybe once a week. Both is possible, of course. But if the user runs it once a day, they want to check only the commits of the last 24 hours. If they run it once a week, the last 168 hours would be more appropriate.

The message that gets posted should also not be hardcoded, but a template where the variables the bot counted are available.

All the above goes into the configuration file the bot ready (and which can be set via the --config option on the bots CLI).

The configuration struct for the setup described above is rather trivial, as is the CLI setup.

The setup

The first things the bot has to do is reading the commandline and the configuration after initializing the logger, which is a no-brainer, too:

fn main() -> Result<()> {
    env_logger::init();
    log::debug!("Logger initialized");

    let opts = Opts::from_args_safe()?;
    let config: Conf = {
        let mut config = ::config::Config::default();

        config
            .merge(::config::File::from(opts.config().to_path_buf()))?
            .merge(::config::Environment::with_prefix("COMBOT"))?;
        config.try_into()?
    };
    let mastodon_data: elefren::Data = toml::de::from_str(&std::fs::read_to_string(config.mastodon_data())?)?;

The mastodon data is read from a configuration file that is different from the main configuration file, because it may contain sensitive data and if a user wants to put their configuration of the bot into a (public?) git repository, they might not want to include this data. That's why I opted for another file here, its format is described in the configuration example file (next to the setting where the file actually is).

Next, the mastodon client has to be setup and the repository has to be opened:

    let client = elefren::Mastodon::from(mastodon_data);
    let status_language = elefren::Language::from_639_1(config.status_language())
        .ok_or_else(|| anyhow!("Could not parse status language code: {}", config.status_language()))?;
    log::debug!("config parsed");

    let repo = git2::Repository::open(config.repository_path())?;
    log::debug!("Repo opened successfully");

which is rather trivial, too.

The Calculations

Then, we fetch the appropriate remote branch and count the commits:

    let _ = fetch_main_remote(&repo, &config)?;
    log::debug!("Main branch fetched successfully");

    let (commits, merges, nonmerges) = count_commits_on_main_branch(&repo, &config)?;
    log::debug!("Counted commits successfully");

    log::info!("Commits    = {}", commits);
    log::info!("Merges     = {}", merges);
    log::info!("Non-Merges = {}", nonmerges);

The functions called in this snippet will be described later on. Just consider them working for now, and let's move on to the status posting part of the bot now.

First of all, we use the variables to compute the status message using the template from the configuration file.

    {
        let status_text = {
            let mut hb = handlebars::Handlebars::new();
            hb.register_template_string("status", config.status_template())?;
            let mut data = std::collections::BTreeMap::new();
            data.insert("commits", commits);
            data.insert("merges", merges);
            data.insert("nonmerges", nonmerges);
            hb.render("status", &data)?
        };

Handlebars is a perfect fit for that job, as it is rather trivial to use, albeit a very powerful templating language is used. The user could, for example, even add some conditions to their template, like if there are no commits at all, the status message could just say “I'm a lonely bot, because nobody commits to master these days...” or something like that.

Next, we build the status object we pass to mastodon, and post it.

        let status = elefren::StatusBuilder::new()
            .status(status_text)
            .language(status_language)
            .build()
            .expect("Failed to build status");

        let status = client.new_status(status)
            .expect("Failed to post status");
        if let Some(url) = status.url.as_ref() {
            log::info!("Status posted: {}", url);
        } else {
            log::info!("Status posted, no url");
        }
        log::debug!("New status = {:?}", status);
    }

    Ok(())
} // main()

Some logging is added as well, of course.

And that's the whole main function!

Fetching the repository.

But we are not done yet. First of all, we need the function that fetches the remote repository.

Because of the infamous git2 library, this part is rather trivial to implement as well:

fn fetch_main_remote(repo: &git2::Repository, config: &Conf) -> Result<()> {
    log::debug!("Fetch: {} / {}", config.origin_remote_name(), config.master_branch_name());
    repo.find_remote(config.origin_remote_name())?
        .fetch(&[config.master_branch_name()], None, None)
        .map_err(Error::from)
}

Here we have a function that takes a reference to the repository as well as a reference to our Conf object. We then, after some logging, find the appropriate remote in our repository and simply call fetch for it. In case of Err(_), we map that to our anyhow::Error type and return it, because the callee should handle that.

Counting the commits

Counting the commits is the last part we need to implement.

fn count_commits_on_main_branch(repo: &git2::Repository, config: &Conf) -> Result<(usize, usize, usize)> {

The function, like the fetch_main_remote function, takes a reference to the repository as well as a reference to the Conf object of our program. It returns, in case of success, a tuple with three elements. I did not add strong typing here, because the codebase is rather small (less than 160 lines overall), so there's not need to be very explicit about the types here.

Just keep in mind that the first of the three values is the number of all commits, the second is the number of merges and the last is the number of non-merges.

That also means:

tuple.0 = tuple.1 + tuple.2

Next, let's have a variable that holds the branch name with the remote, like we're used from git itself (this is later required for git2). Also, we need to calculate the timestamp that is the lowest timestamp we consider. Because our configuration file specifies this in hours rather than seconds, we simply * 60 * 60 here.

    let branchname = format!("{}/{}", config.origin_remote_name(), config.master_branch_name());
    let minimum_time_epoch = chrono::offset::Local::now().timestamp() - (config.hours_to_check() * 60 * 60);

    log::debug!("Branch to count     : {}", branchname);
    log::debug!("Earliest commit time: {:?}", minimum_time_epoch);

Next, we need to instruct git2 to create a Revwalk object for us:

    let revwalk_start = repo
        .find_branch(&branchname, git2::BranchType::Remote)?
        .get()
        .peel_to_commit()?
        .id();

    log::debug!("Starting at: {}", revwalk_start);

    let mut rw = repo.revwalk()?;
    rw.simplify_first_parent()?;
    rw.push(revwalk_start)?;

That can be used to iterate over the history of a branch, starting at a certain commit. But before we can do that, we need to actually find that commit, which is the first part of the above snippet. Then, we create a Revwalk object, configure it to consider only the first parent (because that's what we care about) and push the rev to start walking from it.

The last bit of the function implements the actual counting.

    let mut commits = 0;
    let mut merges = 0;
    let mut nonmerges = 0;

    for rev in rw {
        let rev = rev?;
        let commit = repo.find_commit(rev)?;
        log::trace!("Found commit: {:?}", commit);

        if commit.time().seconds() < minimum_time_epoch {
            log::trace!("Commit too old, stopping iteration");
            break;
        }
        commits += 1;

        let is_merge = commit.parent_ids().count() > 1;
        log::trace!("Merge: {:?}", is_merge);

        if is_merge {
            merges += 1;
        } else {
            nonmerges += 1;
        }
    }

    log::trace!("Ready iterating");
    Ok((commits, merges, nonmerges))
}

This is done the simple way, without making use of the excelent iterator API. First, we create our variables for counting and then, we use the Revwalk object and iterate over it. For each rev, we unwrap it using the ? operator and then ask the repo to give us the corresponding commit. We then check whether the time of the commit is before our minimum time and if it is, we abort the iteration. If it is not, we continnue and count the commit. We then check whether the commit has more than one parent, because that is what makes a commit a merge-commit, and increase the appropriate variable.

Last but not least, we return our findings to the caller.

Conclusion

And this is it! It was a rather nice journey to implement this bot. There isn't too much that can fail here, some calculations might wrap and result in false calculations. Possibly a clippy run would find some things that could be improved, of course (feel free to submit patches).

If you want to run this bot on your own instance and for your own repositories, make sure to check the README file first. Also, feel free to ask questions about this bot and of course, you're welcome to send patches (make sure to --signoff your commits).

And now, enjoy the first post of the bot.

tags: #mastodon #bot #rust

34c3

January 1, 2018

34c3 was awesome. I prepared a blog article as my recap, though I failed to provide enough content. That's why I will simply list my “toots” from mastodon here, as a short recap for the whole congress.

(2017-12-26, 4:04 PM) – Arrived at #34c3
(2017-12-27, 9:55 AM) – Hi #31c3 ! Arrived in Adams, am excited for the intro talk in less than 65 min! Yes, I got the tag wrong on this one
(2017-12-27, 10:01 AM) – Oh my god I'm so excited about #34c3 ... this is huge, girls and boys! The best congress ever is about to start!
(2017-12-27, 10:25 AM) – Be awesome to eachother #34c3 ... so far it works beautifully!
(2017-12-27, 10:31 AM) – #34c3 first mate is empty.
(2017-12-27, 10:46 AM) – #34c3 – less than 15 minutes. Oh MY GOOOOOOOOOD
(2017-12-27, 10:49 AM) – Kinda sad that #fefe won't do the Fnord this year at #34c3 ... but I also think that this year was to shitty to laugh about it, right?
(2017-12-27, 10:51 AM) – #34c3 oh my good 10 minutes left!
(2017-12-27, 11:02 AM) – #34c3 GO GO GO GO!
(2017-12-27, 11:16 AM) – Vom Spieltrieb zur Wissbegierig! #34c3
(2017-12-27, 12:17 PM) – People asked me things because I am wearing a #nixos T-shirt! Awesome! #34c3
(2017-12-27, 12:59 PM) – I really hope i will be able to talk to the #secushare people today #34c3
(2017-12-27, 1:44 PM) – I talked to even more people about #nixos ... and also about #rust ... #34c3 barely started and is already awesome!
(2017-12-27, 4:28 PM) – Just found a seat in Adams. Awesome! #34c3
(2017-12-27, 8:16 PM) – Single girls of #34c3 – where are you?
(2017-12-28, 10:25 AM) – Day 2 at #34c3 ... Yeah! Today there will be the #mastodon #meetup ... Really looking forward to that!
(2017-12-28, 12:32 PM) – Just saw ads for a #rust #wayland compositor on an info screen at #34c3 – yeah, awesome!
(2017-12-28, 12:37 PM) – First mate today. Boom. I'm awake! #34c3
(2017-12-28, 12:42 PM) – #mastodon ads on screen! Awesome! #34c3
(2017-12-28, 12:45 PM) – #taskwarrior ads on screen – #34c3
(2017-12-28, 3:14 PM) – I think I will not publish a blog post about the #34c3 but simply list all my toots and post that as an blog article. Seems to be much easier.
(2017-12-28, 3:15 PM) – #34c3 does not feel like a hacker event (at least not like the what I'm used to) because there are so many (beautiful) women around here.
(2017-12-28, 3:36 PM) – The food in the congress center in Leipzig at #34c3 is REALLY expensive IMO. 8.50 for a burger with some fries is too expensive. And it is even less than the Chili in Hamburg was.
(2017-12-28, 3:43 PM) – Prepare your toots! #mastodon meetup in less than 15 minutes! #34c3
(2017-12-28, 3:50 PM) – #34c3 Hi #mastodon #meetup !
(2017-12-28, 3:55 PM) – Whuha... there are much more people than I've expected here at the #mastodon #meetup #34c3
(2017-12-28, 4:03 PM) – Ok. Small #meetup – or not so small. Awesome. Room is packed. #34c3 awesomeness!
(2017-12-28, 4:09 PM) – 10 minutes in ... and we're already discussing pineapples. Community ftw! #34c3 #mastodon #meetup
(2017-12-28, 4:46 PM) – Limiting sharing of #toots does only work if all instances behave! #34c3 #mastodon #meetup
(2017-12-28, 4:56 PM) – Who-is-who #34c3 #mastodon #meetup doesn't work for me... because I don't know the 300 usernames from the top of my head...
(2017-12-28, 5:17 PM) – From one #meetup to the next: #nixos ! #34c3
(2017-12-28, 5:57 PM) – Unfortunately the #nixos community has no space for their #meetup at #34c3 ... kinda ad-hoc now!
(2017-12-28, 7:58 PM) – Now... Where are all the single ladies? #34c3
(2017-12-28, 9:27 PM) – #34c3 can we have #trance #music please?
(2017-12-28, 9:38 PM) – Where are my fellow #34c3 #mastodon #meetup people? Get some #toots posted, come on!
(2017-12-29, 1:44 AM) – Day 2 ends for me now. #34c3
(2017-12-29, 10:30 AM) – Methodisch Inkorrekt. Approx. 1k people waiting in line. Not nice. #34c3
(2017-12-29, 10:43 AM) – Damn. Notebook battery ran out of power last night. Cannot check mails and other unimportant things while waiting in line. One improvement proposal for #34c3 – more power lines outside hackcenter!
(2017-12-29, 10:44 AM) – Nice. Now the wlan is breaking down. #34c3
(2017-12-29, 10:57 AM) – LAOOOOLAAA through the hall! We did it #34c3 !
(2017-12-30, 3:45 AM) – 9h Party. Straight. I'm dead. #34c3
(2017-12-30, 9:08 PM) – After some awesome days at the #34c3 I am intellectually burned out now. That's why the #trance #techno #rave yesterday was exactly the right thing to do!
(2017-12-30, 11:35 PM) – Where can I get the set from yesterday night Chaos Stage #34c3 ??? Would love to trance into the next year with it!
(2017-12-31, 11:05 PM) – My first little #34c3 congress résumé: I should continue on #imag and invest even more time. Not that I do not continue it, but progress is slowing down with the last months of my masters thesis... Understandable I guess.

That was my congress. Yes, there are few toots after 28th... because I was really tired by then and also had people to talk to all the time, so little time for microblogging there. All in all: It was the best congress so far!

tags: #ccc #social

Blueprint of a distributed social network on IPFS – and its problems

October 31, 2017

#matrix , #ipfs , #scuttlebutt and now #mastodon – We're living in awesome times! centralization < decentralization/federation < distribution! #lovefortech

(me, April 10, 2017, on mastodon)

The idea

With the rise of protocols like the matrix protocol, activitypub and others, decentralized social community platforms like matrix, mastodon and others gained power and were made real. I consider these platforms, especially mastodon and matrix, to be great steps into the future and am using both enthusiastically.

But can we do better? Can we do more distribution,? I think so!

So far we have a twitter-like thumbleblog platform (mastodon), a chat platform (matrix) and facebook-like platforms (diaspora and friendica) which are federated (some form of decentralization). I think we can make a completely distributed social network platform reality today.

Let me reiterate on that: I think, we can make a facebook/googleplus/etc clone which works without a central component, today. And I would even go one step further and state: All we need for this is IPFS (and related technology like IPLD and IPNS)!

This platform would feature personal profiles, publishing articles/posts/images/videos/voice messages/etc, instant messaging, following others, and all the things one would want in such a platform.

How would it work?

What do we need for this? Well, as stated before: not much! From what I can think of, we would need IPFS, some sort of public/private key functionality (which IPFS already has), a nice frontend-framework and that's basically it.

Let me tell you how I think such a platform would work.

The moment a user starts the application, the application would boot an IPFS node. The username and all other information about the profile are added to IPFS as structured data. If the profile changes because the user edits it, it is added to IPFS again, using IPLD to link to its previous version.

If a user adds a post to her profile, that post is added to IPFS as well and linked from the profile via IPLD. All other nodes are informed about the new content via pubsub and are free to pin the new content (the new profile version) or only cache it for a while (or to not care at all). The post itself could add a link to the IPNS hash of the profile under which the post is published. This way, a link from the post to the current version of the profile would always exist.

Because the profile always links to its previous version as well as to the post content, that would imply that the node the user of the profile runs would always keep all data the user adds to the network. As the data is only kept by links, the user is free to drop published content at any point in time.

This means that basically each operation would “generate” a new profile, which is of course published as an IPNS name. Following others would be a matter of subscribing to their “pub” channel (as in “pubsub”) or their IPNS name.

Chat

A chat application using IPFS is already implemented with orbit, so that's a matter of integrating one application into another. Peer-to-Peer (or rather Profile-to-Profile) messaging is therefore no problem.

Data format

All the data would be saved in a structured format. For example Json (though order of serialization is important, because of cryptographic hashes) or Bson or any other data serialization format that is widely adopted.

Sidenote: As long as it is made clear that any client must support all formats, the format itself doesn't matter that much. For simplicity of this article, I stick to Json (and also because it is most widely known).

A Profile(-version) would look roughly like this (consider 'ipfs hash' to mean “some kind of IPLD link” in this context):

{
  "previous": [ "<ipfs hash>" ],
  "post": {
    "type": "<post type>",
    "nodes": ["<ipfs hash>"],
    "metadata": {
      "date": "2017-12-12T12:00:00+0200",
      "tags": [],
      "category": "kittens",
      "custom": {}
    }
  }
}

Let me explain:

The previous key would point to the previous profile version(s). It would only contain IPFS hashes (Why plural, see below in “Multi-Device Support”).
The post key would contain information about the post published with this profile version.
- The type of the post could be “article”, “image”, “video”... normal stuff. But also “biography” for the biography shown on the profile or other things. Even “username” would be possible, for adding a user name to the profile.
- The nodes key would point to an IPFS hash containing the actual payload; either the text of the article (only one hash then) or the ipfs hashes of the pictures, the video(s) or other binary content. Of course, posts could be formatted using Markdown, reStructured Text or whatever format one likes to use. It would be a clients job to render it properly.
- The metadata field would contain plain meta information, like published date, tags, category and also custom metainformation as key-value pairs.

Maybe a version attribute for protocol version could be added as well. Of course, this should be considered an incomplete example, as I almost certainly forgot things here.

The idea of linking the previous version of a profile from each new version of the profile is very much blockchain-like, of course, with the difference that nobody needs to fetch the whole chain but only the latest one to get a profile. The more content a viewer of the profile wants to see, the more she needs to traverse the graph of profile versions (and automatically caching the content for others). This would automatically result in older content beeing “forgotten” slowly (but the content would not be forgotten until the publisher itself and all other “pinners” drop it). Because the actual payload is not stored in the fetched data, the actual amount of data which is required to simply view a profile is rather small. A client could be configured to fetch all textual content of a file, but not more than 10 versions, or one screenpage, or something like that. The possibilities are endless here.

Federated component

One might think “If I go offline with my node, my posts are not accessible if nobody else is online having them”. And that's true.

That's why I would introduce a federated component, which would run a stripped-down version of the application.

As soon as another instance connects and a new post is announced via pubsub, the instance automatically pins or caches it. Of course, this would mean that all of these federated instances would pin all content, which is surely not nice. One (rather simple and maybe even stupid) option would be to roll a dice and make the chance that a post is pinned a 50-50 thing, or something like that. Also, posts which are pinned for a certain amount of time are most likely distributed well enough so the federated component nodes can drop them... maybe after 90 days, maybe after 10... Details!

Blockchain-Approaches

The fundamental problem with Blockchains is that every peer in the network hosts the complete content. Nobody benefits from that, especially if you think of a social network which should also work on mobile devices. With users loading up images, videos and other large blobs of data, a blockchain is the wrong approach.

That's why I think a social network on Euthereum, Bitcoin or any other crypto-currency/blockchain is not an option at all.

IPLD

IPLD can be used not only to link posts and profiles, but also to link from content to content. Namely to link from one post to another, from a post to an image, a video, a voice message,... but also to link from one post to a git commit, an euthereum transaction or any other IPLD-supported data structure.

Once nice detail is that one does not have to traverse these links. If a user sees a post which links to other posts, for example, she does not have to fetch these links to see the post itself, only if she wants to see the linked content. Caching nodes, on the other hand, can automatically traverse the whole graph and fetch all the content into their cache.

That makes a IPLD-based linking approach really beneficial.

Scuttlebutt

Scuttlebutt is a first step into the right direction. One can say what one wants about electron and the whole technology stack which is used in Scuttlebutt (and like or dislike the whole Javascript world), but so far Scuttlebutt seems like the first social network that is completely distributed.

I thought about whether it would be a great idea to port Scuttlebutt to use IPFS in the backend. From what I know right now, it would be a nice way of bringing IPFS and IPLD to the mix and therefor enhancing and extending the capabilities of Scuttlebutt itself.

I have not final conclusion on that thought, though.

Problems

There are several problems one has to think about when designing such a system.

Comments on Posts (and comments)

Consider you want to comment on a post. Of course you create new content, which links to the post you just commented. But the person who wrote the original post does not automatically link to your comment, so is neither able to find the comment (which could be solved via pubsub), nor are others able to find them.

The approach to this problem is simple: Notification about comments can be done via pubsub. And, if a user gets a notification about a new comment, she can approve it and automatically publish a new version of her post, with some added meta information:

A link to the comment
A link to the “old version of the content in IPFS”

Now, if a client fetches all posts of a profile, it resolves all entries for their newest version (so basically the one entry which does not link to an older version of itself) and only shows the latest versions of it.

Comments on comments (and so on) would be possible with the exact same approach. That would, of course, cause a whole tree of comments to be rebuild every time a new comment is added.

Maybe not the best idea in that regard.

Multi-Device Support

There are several problems regarding multi-device support.

Publishing content

Publishing from multiple devices with the same profile is possible – one just needs to import the private key for the signatures and the profile information to the other device.

Though, this needs some sort of merging mechanism if two posts are published from two devices (or more) at the same time / without the other devices beeing online to get notifications of the new point of truth.

As creating two posts from two seperate devices would create two new versions of the profile (because of IPLD linking), which means two points of truth suddenly exists, a merging-mechanism must be implemented to merged multiple points of truth for the profile. This could yield a rather large network of profile versions, but ultimatively a DAG (Directed Acyclic Graph).

        Profile Init
             ^
             |
          Post A
             ^
             |
          Post B <----+
             ^        |
             |        |
  +-----> Post C    Post C'
  |          ^        ^
  |          |        |
Post D    Post D'   Post D''
  ^          ^        ^
  |          |        |
  |          +--------+
  |          |
  |       Post E
  |          ^
  |          |
  +----------+
             |
             |
          Post F

A scenario like the one above (each Post also represents a new version of the profile) would be easy to create with three devices:

One starts using the network on a notebook
Post A published from the notebook
Post B published from the notebook
Profile added on the workstation
Post C published from the notebook while off of the internet
Post C' published on the workstation
Profile added to the mobile phone (from the notebook)
Post D published from the mobile while off of the internet
Post D' published from the notebook while off of the internet
Post D'' published on the workstation
Notebook comes back online, Post E published, merging the state from Post D'' from the workstation and Post D' from the notebook itself.
Phone comes online, one of the devices is used to publish Post F, merging the state from Post D and Post E.

In this scenario, there would still be one problem, though: If the profile is published as an IPNS name, branching off of versions would be problematic. If C is published while C' is published, both devices would publish their version as an IPNS name. Now, first come first serve applies. And of course that is problematic, because every device would always see one of the posts, but no device could see the other. Only at E (in the above example), when the branches are merged, both C and C' would be visible (though D wouldn't be visible as long as it isn't merged into the chain). But how does a device discover that there are two “current” versions which have to be linked to the new post?

So, discoverability is an issue in this approach. Maybe someone can come up with a clean and easy solution that would work for netsplit and all those scenarios.

One idea would be that there is a profile-key which is used to publish profile versions under an IPNS name as well as a device-key, which is used to announce profile versions as a seperate IPNS name. That IPNS name could be added to the profile, so each other device can find it and fetch “current” versions from each device. Only the initial setup of a new device would need to be made carefully then.

Or, maybe, the whole approach is wrong and another approach would fit better for this kind of problem. I don't know.

Subscribing

Another issue with multi-device support would be subscribing. For example, if a user (lets call her Amy) subscribes to another user (lets call him Sheldon) on her Notebook, this information needs to be stored somehow. And because Amys machines do not necessarily sync with each other, her mobile phone may never know that following Sheldon is a thing now!

This problem could by solved by storing the “follow”-information in her public profile. Although, some users might not like everyone to know who to follow. Cryptographic things could be considered to fix visibility.

But then, users may want to “categorize” their friends, store them in groups or whatever. This information would be stored in the public profile as well, which would create even more noise on the network. Also, because cryptography is hard and information would be stored forever, this might not be an option as some day, the crypto might be broken and reveal all the things that were stored privately before.

Deleting profile versions

Some time, a user may want to remove a biography entry or a user name she once published. Because all the information is chained in a long chain of versions, one may think that deleting a node is not possible. But it is!

Consider the following (simple) graph of profile versions:

A<---B<---C<---D<---E

If the user now wants to delete node C in this graph, she simply drops it. Now, E beeing the latest point of truth, one may think that finding B and A is not possible anymore. That's true. But why not shipping around this by creating a new profile version and linking the previous versions:

A<---B     <---D<---E<---F
      \                 /
       -----------------

Of course, D would now point to a node which does not exist. But that is not a problem. Indeed, its a fundamental concept of the idea – that content may be unavailable.

F must not contain new content. It even should not, because dropping F because of its content becomes harder this way. Also, new versions of the profile is simple and cheap.

Problems are hard in distributed environments

I do not claim to know the final solution to any of these problems. Its just that I think of them and would love to get an open conversation started on the whole subject of distributed social networks and problems that come with them.

tags: #distributed #network #open-source #social #software