Effective CSAM filters are impossible because what CSAM is depends on context
2 July 2024 | 1:21 am

Automatically tagging or filtering child sexual abuse material (CSAM) cannot be effective and preserve privacy at the same time, regardless of what kind of tech one throws at it, because what is and what is not CSAM is highly dependent on context.

Literally the same photo, bit-by-bit identical, can be innocent family memorabilia when sent between family members, and a case of CSAM if shared in a child porn group.

The information necessary to tell whether or not it is CSAM is not available in the file being shared. It is impossible to tell the two cases apart by any kind of technical means based on the file alone. The current debate about filtering CSAM on end-to-end encrypted messaging services, like all previous such debates (of which there were many), mostly ignores this basic point.

All the tech in the world

Whenever CSAM and filtering of it become a topic of public debate, there is always a bunch of technical tools being peddled as a “solution” to it. These might involve hashing algorithms, machine learning systems (or as the marketing department would call them, “AI”), or whatever else.

And whenever this happens, technical experts spend countless hours verifying the extraordinary claims made by vendors and promoters of these tools, inevitably finding no proof of their purported effectiveness meant to justify their broad deployment to scan everyone’s communication.

But that’s not my point today.

My point today is that even if we had a magical tool that can tell – with perfect 100% accuracy – that a given image or video contains a depiction of a naked child, or that can – with perfect 100% accuracy – identify everything depicted in that media file, it would still not be enough to be able to filter out CSAM accurately and without mis-labeling huge numbers of innocent people’s messages as CSAM.

Context is king

This is because, as already mentioned, the information necessary to say whether a given image or video is or is not CSAM is in most cases not available in the file itself.

A video of a naked toddler, sent between two parents or other close family members, is just innocent family memorabilia. A nude photo of a child sent by a parent to a family doctor can be safely assumed to be a request for medical advice. Explicit materials sent consensually between infatuated teens exploring their sexuality are their business, and theirs alone.

But should any of these same images or videos leak from a compromised phone or e-mail account, and end up in a context of, say, a child porn sharing ring, they immediately become child sexual exploitation materials. Nothing changed about the media files themselves, and yet everything changed about their classification.

Not just the file

So, to establish whether a given media file is CSAM or not, whatever magical technological tool is being used has to have access to the context. The media file alone won’t do.

This means information on who is talking to whom, when, and what about. What is their relation (close relatives? family doctor? dating?). Contents of their other communication, not just images or videos sent between them. This would have to include historical conversations, as the necessary context might not be obvious from messages immediately surrounding the shared file.

There is simply no way around it, and anyone who claims otherwise is either lying, or has no idea what they are talking about.

Importantly, this is not related to any limitations of our current technology. No amount of technological magic could squeeze any information from a media file that is not available in it. And there is always going to be pertinent contextual information that is not going to be contained in any such file.
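
To illustrate the point, here is a minimal Python sketch. It assumes a hypothetical filter, file_only_filter, that sees nothing but the bytes of a file. The function name, the stand-in bytes, and the context labels are all made up for illustration and do not describe any real scanning product.

```python
import hashlib


def file_only_filter(file_bytes: bytes) -> str:
    """Hypothetical filter that sees only the file contents.

    Whatever it computes internally (a hash, a perceptual fingerprint,
    a machine learning score), its verdict depends on the bytes alone.
    Here a SHA-256 digest stands in for that computation.
    """
    digest = hashlib.sha256(file_bytes).hexdigest()
    # Arbitrary decision rule; any rule based only on the file would do.
    return "flag" if digest[0] in "01234567" else "pass"


# The same photo, bit-by-bit identical, shared in two different contexts.
photo = b"stand-in bytes for a family photo"
shares = [
    {"context": "parent to grandparent", "file": photo},
    {"context": "leaked to a sharing ring", "file": photo},
]

verdicts = [file_only_filter(share["file"]) for share in shares]

# The two contexts call for different answers, but the filter cannot
# give them: identical input always yields an identical verdict.
assert verdicts[0] == verdicts[1]
print(verdicts)
```

Whichever verdict the filter returns, it is wrong for one of the two shares. Making the function smarter does not help, because the context never reaches it.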

The problem is real, the “tech solutions” are not

This is not to say that access to all that context is enough to “solve” CSAM. It is not: there are plenty of other problems with any CSAM-filtering proposal and tool out there.

This is also not to say that CSAM – and more broadly, sexual exploitation of children – is not a serious problem. It sadly absolutely is!

But it is not limited to the Internet. It is not going to be solved by a magical filter on our private, intimate communications, even if we could technically build a filter with sufficient accuracy (which we cannot).

If politicians wanted to be serious about solving the problem of sexual exploitation of children they would stop wasting their (and everybody else’s) time and energy on wishful thinking, misdirection, and technosolutionism.

And instead focus all that effort on programs that can actually do something about the problem.


Things I'd like more people to understand in 2024
7 January 2024 | 11:24 pm

We find ourselves in a peculiar place. We are more interconnected, yet more misinformed. At ease with more advanced technologies, but more easily misled by them. “Doing our own research”, but ending up deeper in conspiratorial rabbit holes.

When discussing complex topics — pandemic, war, the housing crisis, or some thorny family affairs — it is surprisingly easy to jump to conclusions, to oversimplify, ignore crucial nuance, and thus get untethered from reality. To label someone as “evil”, “unethical”, fall back on tribalism. Our brains are always looking for a shortcut, and many of these shortcuts lead us astray. Sometimes we get fooled, sometimes we fool others. Neither helps in the long run.

I am as guilty of this as anyone else. But I also feel the only way we can deal with problems we’re facing, on any level, is by talking them through. Here’s a list of a few rules of thumb I find particularly helpful to keep in mind when thinking about and discussing complex politics- and society-adjacent topics.

They are not absolutes, and do not always apply, but they can help avoid some pitfalls we fall into all too often.

Explanation is not a justification

The fact that there exists an explanation of an action or decision does not automatically mean that the action or decision was justified. Explanation is only about being able to understand why somebody did something. Justification is about the moral judgment over that person and what they did.

It is chillingly easy to fall into the trap of assuming that a person is justifying an unethical act of someone’s just because that person is trying to understand or explain it. Making such an assumption easily leads to dismissing that person as a “supporter” of that unethical act, and thus unethical themselves. This in turn makes it very difficult to talk about causes of a given situation, and about making it less likely it happens in the future.

We have to be able to discuss reasons behind a specific decision or action, regardless of how we feel about the morality of it. If we want to make sure something bad does not happen again, understanding the reasons it happened is often more important than passing moral judgment.

The flip-side of this is that providing an explanation of something is not the same as providing a justification for it. “This is why I did it” is not the same as “this is why I was in the right doing it”. If by explaining an action somebody is trying to deflect blame, they probably should get called out on that.

Of course, this is not to say that an explanation can never be an important element of valid justification. It can, and it often is. But explanation and justification are different, even if one can support the other to some degree.

Hanlon’s razor

We humans are great at ascribing agency and intentionality where there is none. We love to make things about ourselves. We see faces in the clouds, deity’s wrath in volcanic eruptions, and targeted, premeditated malice in somebody else’s decisions or actions — especially ones that affect us in a bad way.

Hanlon’s razor states:

Never attribute to malice that which is adequately explained by stupidity.

I personally expand it to also include incompetence, laziness, and other lesser vices. It is, basically, a tool for assessing explanations of a given set of actions or decisions. In many cases, there is no need to assume malice in order to explain a problematic action or decision. In some cases assuming malice is actually counter-productive.

We don’t need to assume maliciousness on part of civil servants in the Netherlands who deployed the (as it turns out) racist system for flagging “suspicious” use of childcare benefits to know this was unacceptable. Pondering whether that was malicious on their part or not is in this case moot, and can distract from a broader and more immediately important question of: how to fix the broader system such that this never happens again, regardless of malice or incompetence?

That’s not to say that there is never malice, of course. Sometimes there very much is. But in the end, in a lot of cases it might not matter much — bad outcomes are bad regardless of whether they are caused by malice, or by incompetence. Important systems, especially ones on which our livelihoods or health and well-being depend, should be resilient to either.

Or, as Grey’s law puts it:

Any sufficiently advanced incompetence is indistinguishable from malice

Which is closely related to…

A system’s purpose is what it does

Let’s say we have a complex system — technical, political, social, whatever the kind. And let’s say that it keeps having certain bad outcomes. Everyone involved in creating and maintaining it keeps insisting that these bad outcomes are accidental, and keeps promising this can be fixed, but somehow it never is. At some point it just makes sense to treat these bad outcomes as the actual purpose of the system. If they really were not, surely the system would have been fixed already!

Coined by Stafford Beer, of Cybersyn fame, this rule is a great way of cutting through elaborate excuses given for any unacceptable outcomes of a system.

For example: if a government policy supposedly meant to fight the housing crisis (say, by guaranteeing low-interest loans to prospective buyers) ends up raising apartment prices but not causing actual improvement in the overall housing availability, at some point it’s reasonable to say that the purpose of this policy is not to fight the housing crisis — but to funnel free money to real estate developers.

Or: if a policy intended to combat drug abuse ends up predominantly incarcerating a specific part of the population (say, young Black men), but results in no real reduction in overall drug use, then it is reasonable to say that the purpose of the policy is not reduction of drug use — but persecution of a specific group.

Mind you, this doesn’t necessarily mean that the system in question was deliberately designed to be like this! It doesn’t necessarily mean its designers and maintainers are intentionally lying about what its purpose is or was supposed to be, maliciously hiding the fact that the purpose was different (see Hanlon’s razor above). It might be accidental, or related to incompetence, or to the fact that we’re all a product of the society we grew up in and the circumstances we inhabit.

In the end it doesn’t really matter what the original idea for that system was. If a system is allowed to stay in place even though it is clearly ineffective in its stated purpose, then it is fair to say that the actual purpose has to be something else.

Life is not a zero-sum game

There are situations which are a zero-sum game. Trying to get tickets to a popular concert is an example: if you get your tickets, I might not get mine. The resource is strictly limited and we are competing for it. Your win is my loss.

But in a lot of cases, things that are talked about as if they were a zero-sum game — are not. Take immigration: it is often talked about in “us vs. them” terms, with an implied assumption that there is some kind of resource that is strictly limited, and that the migrants, once let into the country, will compete over it with its current residents.

This is simply not the case. Yes, people coming into the country might need education, healthcare, social services — but they will also create more demand for local goods and services, strengthening the economy. Often they might be willing to work jobs that nobody else wants to take. They will pay taxes. They will bring their culture and cuisine with them, enriching the lives of everyone.

This is true for a lot of thorny political and social issues that are portrayed publicly or talked about as if they were a zero-sum game. Sometimes this becomes outright absurd and almost self-parodying, as with the so-called “Schrödinger’s immigrant”, who supposedly “steals our jobs” and simultaneously is “too lazy” to get one, hanging on unemployment benefits instead.

Two things can be true at the same time

In a way, truth is also often not a zero-sum game. For example, it is true I work a lot, but it is also true that I am quite a lazy person. It is true that the actions of the Titanic’s captain can be considered reckless by today’s standards, and contributed to the catastrophe, but it is also true they probably did not appear reckless to him or his peers at the time.

This perhaps sounds obvious, but becomes much less so when strong emotions come into play.

Are COVID vaccines a miracle of science, developed and tested in impossibly short time and saving countless lives? Or are they another vestige of Big Pharma’s flavor of neo-colonialism, based on who gets easy access to them and who doesn’t; who gets to manufacture them and who doesn’t; and who gets to profit from them? Both are true. We should be able to admire the former while insisting the latter is outright unacceptable.

This became particularly stark (and somewhat personal) to me when Putin’s Russia launched a full-scale invasion against Ukraine in February 2022. A lot of left-leaning, anarchist-y people seemed to defend Russian aggression by pointing out atrocities committed by US and NATO in Iraq or Afghanistan. How dare I “take side of NATO” here, have they not done enough evil?

But two things can be true at the same time — the US and NATO should be rightfully made accountable for their actions, of course, but that does not make Russia’s invasion and the atrocities it brought on civilians in Ukraine acceptable or justifiable in any sense.

This is a form of a false dichotomy, making it seem as if we have to “choose a side” out of a limited set of options. But the world is more complex than that. We have to be able to walk and chew gum.

These are not absolutes

All of these are guidelines, not absolute and unshakable rules. In some cases they might even run against one another. That’s okay.

An explanation can be an important part of a justification of some action — it’s just that it should not automatically, always be assumed so. An action or a decision can be underpinned by malice, and in some cases it is important to establish if it is — it’s just that it’s not necessarily always so, and it’s not always worthwhile to get stuck on that question.

A system’s outcomes might misalign with its stated purpose temporarily, and a fix might be on the way — question is, how long has the system been allowed to remain broken, and will it actually get fixed? Even if some problem is not a zero-sum game, resources are rarely truly unlimited and it might still make sense to ask about how they get allocated. And sometimes we do have to choose a side.

To me, these guidelines act as useful safety valves when thinking about and discussing complex subjects. They help me notice when an argument might be going astray.

Bringing it all together

I find it startling how easily, how eagerly we retreat into tribalism when discussing important, complex, emotionally charged subjects. How quickly we decide there surely is malice involved, how quickly we can be manipulated into thinking something is a zero-sum game and we better, in our own interest, deny somebody’s access to some perceived “limited resource.”

And once we do, we gleefully dismiss “the other side” — suddenly there’s an “other side”, as if every problem only ever had two possible solutions! — as unethical, outright malicious or at least woefully misinformed. Then we don’t have to consider arguments that go against our strongly-held convictions anymore, we don’t have to deal with the fact that the world is more complex than “us vs. them.” After all we are “us”, and if “they” are not with us, they’re clearly against us.

The complexity, however, does not go away, regardless of how hard we try to ignore or hide it.


Mastodon monoculture problem
7 May 2023 | 12:23 am

Recent moves by Eugen Rochko (known as Gargron on fedi), the CEO of Mastodon-the-non-profit and lead developer of Mastodon-the-software, got some people worried about the outsized influence Mastodon (the software project and the non-profit) has on the rest of the Fediverse.

Good. We should be worried.

Mastodon-the-software is by far the most widely used software on fedi. The biggest instance, mastodon.social, is home to over 200,000 active accounts as of this writing. This is roughly 1/10th of the whole Fediverse, on a single instance. Worse, Mastodon-the-software is often identified with the whole social network, obscuring the fact that the Fediverse is a much broader system composed of much more diverse software.

This has poor consequences now, and it might have worse consequences later. What also really bothers me is that I have seen some of this before.

As seen on OStatus-verse

Years ago, I had an account on a precursor to the Fediverse. It was based mainly around StatusNet-the-software (since renamed as GNU social) and the OStatus protocol. The biggest instance by far was identi.ca — where I had my account. There was also a bunch of other instances, and there were other software projects that also implemented OStatus — notably, Friendica.

For the purpose of this blogpost, let’s call that social network “OStatus-verse”.

Compared to the Fediverse today, OStatus-verse was minuscule. I do not have specific numbers, but my pull-numbers-out-of-thin-air rough estimate is, say, ~100,000 to ~200,000 active accounts on a very good day (if you have the actual numbers, do tell and I will gladly update this blogpost). I do not have the exact numbers for identi.ca either, but my rough estimate is that it had between 10,000 and 20,000 active accounts.

So, around 1/10th of the entire social network.

OStatus-verse was small but lively. There were discussions, threads, and hashtags. It had groups a decade before Mastodon-the-software-project implemented groups. It had (desktop) apps — I still miss the usability of Choqok! And after a bit of nagging I was even able to convince a Polish ministry to have official presence there. As far as I know this is the earliest example of a government-level institution having an official account on a free-software-run, decentralized social network.

Identipocalypse

Then one day, Evan Prodromou, the administrator of identi.ca (and the original creator of StatusNet-the-software), decided to redeploy it as a new service, running pump.io. The new software was supposed to be better and leaner. A new protocol was created because OStatus had very real limitations.

There was just one snag: that new protocol was incompatible with the rest of OStatus-verse. It tore the heart out of that social network.

People with identi.ca accounts lost their connections on all OStatus-compatible instances. People with accounts on other instances lost contact with people on identi.ca, some of whom were pretty popular in OStatus-verse (sounds familiar?..).

It turned out that if an instance is 1/10th of the whole social network, a lot of social connections lead through it. Even though other instances existed, suddenly a huge chunk of active users just vanished. Many groups fell mostly silent. Even if one had an account on a different instance, and contacts on other instances, a lot of familiar faces just disappeared. I stopped using it soon after that.
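
A rough back-of-the-envelope sketch of why that is, under a deliberately crude assumption that is an illustration and not a measurement: both ends of every follow relationship are picked independently and uniformly at random among all accounts. With an instance holding a share p of all accounts, the chance that a given connection touches that instance is then 1 - (1 - p)^2. The function name below is made up for this example.

```python
def share_of_connections_touching(instance_share: float) -> float:
    """Fraction of connections with at least one endpoint on the instance.

    Assumes (unrealistically) that both endpoints of every connection are
    picked independently and uniformly at random among all accounts.
    """
    return 1 - (1 - instance_share) ** 2


# identi.ca was estimated above at roughly 1/10th of the OStatus-verse.
print(f"{share_of_connections_touching(0.10):.0%}")  # prints 19%
```

Even under this flat assumption, nearly a fifth of all connections involve the big instance. The real share was likely higher, since some of the most popular accounts lived on identi.ca.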

From my perspective, this single action set us back at least five if not ten years as far as promoting decentralized social media is concerned. Redeployment of identi.ca fractured the OStatus-verse not just in the social connections sense, but also in the protocol and developer community sense. As pettter, a fellow OStatus-verse veteran, put it:

I think a bit of nuance on the huge-blow thing is that it didn’t only impact by cutting social connections, but also in protocol fragmentation, and in fragmenting developer efforts into rebuilding basic blocks of a federated social web time and again. Perhaps it was a necessary step to them come back together in designing AP, but personally I don’t think so.

Of course, Evan had all the right to do that. It was a service he ran, pro bono, on his own terms, with his own money. But that does not change the fact that it crippled the OStatus-verse.

I believe we need to learn from this history. Once we do, we should be worried about the sheer size of mastodon.social. We should be worried by the apparent monoculture of Mastodon-the-software on the Fediverse. And we should also be worried about identifying all of Fediverse with just “Mastodon”.

Cost of going big

There are real costs and real risks related to going as big as mastodon.social has. Those costs and especially those risks are both to that instance itself, and to the broader Fediverse.

Moderation on the Fediverse is largely instance-centric. A single gigantic instance is difficult to moderate effectively, especially if it has registrations open (as mastodon.social currently does). As the flagship instance, promoted directly in official mobile apps, it draws a lot of new registrations — including quite a few problematic ones.

At the same time, this also makes it more difficult for admins and moderators of other instances to make moderation decisions about mastodon.social.

If an admin of a different instance decides mastodon.social’s moderation is lacking for whatever reason, should they silence it or even defederate from it (as some already have, apparently), thus denying members of their instance access to a lot of popular people who have accounts there? Or should they keep that access, risking exposing their own community to potentially harmful actions?

The sheer size of mastodon.social makes any such decision of another instance immediately a huge deal. This is a form of power: “sure, you can defederate from us if you don’t like how we moderate, but it would be a shame if people on your instance lost access to 1/10th of the whole fedi!” As GoToSocial’s site puts it:

We also don’t believe that flagship instances with thousands and thousands of users are very good for the Fediverse, since they tend towards centralization and can easily become ‘too big to block’.

Mind you, I am not saying this power dynamic is consciously and purposefully exploited! But it undeniably exists.

Being a gigantic flagship instance also means mastodon.social is more likely to be a target of malicious actions. On multiple occasions over the last few months it found itself under DDoS, for example. A couple of times it went down because of it. Resilience of a federated system relies on removing large points of failure, and mastodon.social is a huge one today.

The size of that instance and it being a juicy target also means that certain hard choices need to be made. For example, due to being a likely target of DDoS, it is now behind Fastly. This is a problem from the privacy perspective, and from the perspective of centralization of Internet infrastructure. It is also a problem that smaller instances avoid completely by simply being smaller and thus less interesting targets for anyone to take down with a DDoS.

Apparent monoculture

While the Fediverse is not exactly a monoculture, it is too close to being one for comfort. Mastodon-the-non-profit has outsized influence on all of fedi. This makes things tense for people using the social network, developers of Mastodon-the-software and other instance software projects, and instance admins.

Mastodon is neither the only instance software project on fedi, nor the first. For example, Friendica has been around for a decade and a half, long before Mastodon-the-software got its first git commit. There are Friendica instances (e.g. pirati.ca) operating today within Fediverse which had been part of the OStatus-verse a decade ago!

But calling all of Fediverse “Mastodon” makes it seem as if only Mastodon-the-software exists on the Fediverse. This leads people to demand features to be added to Mastodon and to ask for changes that have sometimes already been implemented by other instance software. Calckey already has quote-toots. Friendica has threaded conversations and text formatting.

Identifying Mastodon with the whole fedi is also bad for Mastodon-the-software developers. They find themselves under pressure to implement features that might not entirely fit with Mastodon-the-software. Or, they find themselves dealing with two groups of vocal users, one demanding a certain feature, the other insisting it not be implemented because it is too big of a change. Many such situations could probably be more easily dealt with by clearly drawing a line, and pointing people to other instance software that might fit their use-case better.

Finally, Mastodon is currently by far (measured by active users, and by number of instances) the most popular implementation of the ActivityPub protocol. Every implementation has its quirks. With time, and with new features being implemented, Mastodon’s implementation might have to drift further away from the strict spec. It’s tempting, after all: why go through an arduous process of standardizing any protocol extensions if you’re the biggest kid on the block anyway?

If that happens, will every other implementation have to follow it, thus drifting along with it but without actual agency in what changes to the de facto spec are implemented? Will that create more tensions between Mastodon-the-software developers and developers of other instance software projects?

The best solution to “Mastodon misses feature X” is not always “Mastodon should implement feature X.” Often it might be better to just use a different instance software, better suited for a particular task or community. Or to work on a protocol extension that would allow a particularly popular feature to be reliably implemented by as many instances as possible.

But that can only work if it’s clear to everyone that Mastodon is only a part of a bigger social network: the Fediverse. And that we already do have a lot of choice as far as instance software is concerned, and as far as individual instances are concerned, and as far as mobile apps are concerned.

Sadly, that seems to go against recent decisions by Eugen, which go towards a pretty top-down (not quite vertically integrated, but gravitating towards that) model of official Mastodon mobile apps promoting the flagship mastodon.social instance. And that is something to worry about, in my opinion.

A better way

I want to be clear I am not arguing here for freezing Mastodon development and never implementing any new features. I also agree that the signup process needs to be better and more streamlined than it had been before, and that plenty of UI/UX changes need to be implemented. But all this can and should be done in a way that improves resilience of the Fediverse, instead of undermining it.

Broader changes

My laundry list for broader needed changes to Mastodon and the Fediverse would be:

  1. Close registrations on mastodon.social, now
    It is already too big and too much of a risk for the rest of the Fediverse.
  2. Make profile migration even easier, also across different instance types
    On Mastodon, profile migration currently only moves followers. Who you follow, bookmarks, block and mute lists can be moved manually. Posts and lists cannot be moved — and that’s a big problem for a lot of people, keeping them tied to the first instance they signed up for. It’s not insurmountable — I have moved my profile twice and found it perfectly fine. But it is too much friction. Some other instance software projects are working on allowing post migrations too, thankfully. But it’s not going to be a quick and easy fix, as ActivityPub design makes it very hard to move posts between instances.
  3. By default, official apps should offer new people a random instance out of a small list of verified ones
    At least some of these promoted instances should not be controlled by Mastodon-the-non-profit. Ideally, some instances should run different instance software as long as it uses compatible client API.
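
The third item could be sketched roughly as below. This is a minimal illustration under stated assumptions, not how the official apps work: the instance domains, the VERIFIED_INSTANCES list, and the suggest_instance function are all hypothetical.

```python
import random
from typing import Optional

# Hypothetical list of vetted instances; the domains are placeholders.
# Per the suggestion above, not all would be run by Mastodon-the-non-profit,
# and some would run other instance software with a compatible client API.
VERIFIED_INSTANCES = [
    {"domain": "instance-a.example", "software": "mastodon"},
    {"domain": "instance-b.example", "software": "mastodon"},
    {"domain": "instance-c.example", "software": "other-compatible"},
]


def suggest_instance(rng: Optional[random.Random] = None) -> dict:
    """Pick the default signup instance uniformly at random from the list."""
    rng = rng or random.Random()
    return rng.choice(VERIFIED_INSTANCES)


suggestion = suggest_instance()
print(f"Suggested instance: {suggestion['domain']}")
```

A uniform pick spreads new registrations evenly across the list, so no single instance becomes the default destination. A real app would likely also need to weigh language, capacity, and moderation policy.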

What can I do myself?

And here are things we ourselves can do, as people using the Fediverse:

  1. Consider moving off of mastodon.social if you have an account there.
    That’s admittedly a big step, but also something you can do that most directly helps fix the situation. I migrated from mastodon.social years ago, and never looked back.
  2. Consider using an instance based on a different software project
    The more people migrate to instances using other instance software than Mastodon-the-software, the more balanced and resilient Fediverse we get. Hearing a lot of positive opinions about Calckey, for example. GoToSocial is also looking interesting.
  3. Remember that Fediverse is more than just Mastodon
    Language matters. When talking about the Fediverse, calling it “Mastodon” is only making the issues I mention above more difficult to deal with.
  4. If you can, support projects other than the official Mastodon ones
    At this point Mastodon-the-software project has a lot of contributors, a stable development team, and enough solid funding to continue safely for a long while. That’s great! But the same cannot be said about other fedi-adjacent projects, including independent mobile apps or instance software. In order to have a diverse, resilient Fediverse, we need to make sure these projects are also supported, including financially.

Closing thoughts

First of all, the Fediverse is a much more resilient, more long-term viable, safer, and more democratized social network than any centralized walled garden. Even with its Mastodon monoculture problem, it is still not (and can’t be) owned or controlled by any single company or person. I also feel that it is a better, safer choice than social networks that only cosplay decentralization and pay lip service to it, like BlueSky.

In a very meaningful way, OStatus-verse can be said to have been an early version of the Fediverse; as noted before, some instances that had been part of it then are still running and part of the Fediverse today. In other words, Fediverse has been around for a decade and a half by now, and survived the Identipocalypse even as it got badly hurt by it, while observing both the birth and the untimely passing of Google+.

I do believe Fediverse is leaps and bounds more resilient today than OStatus-verse had been before the identi.ca redeploy. It’s an order of magnitude (at least) larger in terms of user base. There are dozens of different instance software projects and tens of thousands of active instances. There are also serious institutions invested in its future. We should not be panicking over all I wrote above. But I do think we should be worried.

I do not attribute malice to recent actions of Eugen (like making official Mastodon apps funnel new people towards mastodon.social), nor to past actions of Evan (redeploying identi.ca on pump.io). And I don’t think anyone should. This stuff is hard, and we’re all learning as we go, trying to do our best with the limited time we have available and restricted resources in our hands.

Evan went on to be one of the main creators of ActivityPub, the protocol the Fediverse runs on. Eugen started the Mastodon-the-software project in the first place, which I strongly believe allowed Fediverse to flourish into what it is today. I really appreciate their work, and recognize that it’s impossible to do anything in social media space without someone having opinions on it.

That does not mean, however, we cannot scrutinize these decisions and should not have these opinions.


Update: I did a silly; mastodon.social is behind Fastly, not CloudFlare, of course. Fixed, thank you to those who poked me about it!

Update 2: Heartfelt thanks to Jorge Maldonado Ventura for providing a Spanish translation of this blogpost, published under CC BY-SA 4.0. ¡Gracias!


