Facebook is still struggling to weed out fake accounts and bad content

Image credit: Lewis Tse Pui Lung / Shutterstock.com

Facebook disclosed Q1 enforcement data for the first time this week, and while it is clear that the company is trying very hard, there are still massive problems that need to be overcome.

During Q1 2018, Facebook deactivated (and I presume deleted) 583 million fake accounts and took down 837 million pieces of spam, 21 million pieces of nudity, 3.5 million pieces of violent content and 2.5 million instances of hate speech. The vast majority of these 583 million fake accounts never made it into the 2.2-billion MaU count for Q1 2018, as at any one time 3% to 4% of all accounts are estimated to be fake.

For spam, nudity and violence, automated performance was pretty good, with between 86% and nearly 100% of this content being found and removed before it was reported by an employee or user. This is a good first step for Facebook’s AI, which I have consistently rated as one of the weakest in the industry, but there remains a very long way to go. Despite Facebook’s feverish efforts, its competitors remain very far ahead, and I see no reason to change my assessment of its AI just yet.

However, on fake accounts and hate speech, Facebook fared less well, although, to be fair, these are much more difficult to spot reliably. In the same disclosure (and in its 10-Q), Facebook also stated that at the end of Q1 2018 it estimated that around 3% to 4% of its 2.2-billion MaU count were fake accounts that were still active. This is a massive increase from 2015 and 2016, when Facebook estimated that at any one time around 1% to 2% were fake (Facebook 10-K filings).

I suspect this is a combination of a big increase in the targeting of Facebook by fake and spam accounts as well as an improved ability to spot them. In Q1 2018, 3% to 4% represents about 77 million accounts.

Facebook also said that it was “usually” able to deactivate or delete these fake accounts within minutes of their creation, but the numbers do not bear this out. During Q1 2018, Facebook spotted and deactivated around 6.6 million fake accounts per day. However, if there are still 77 million fake accounts present at any one time, it implies that on average a fake account survives for around 12 days before it is spotted (see the back-of-envelope sketch below).
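For those who want to check the arithmetic, here is a minimal back-of-envelope sketch in Python. The 90-day quarter and the 3.5% midpoint of the 3%-4% fake-account estimate are my own assumptions, not part of Facebook's disclosure; depending on how you round, you get roughly 6.5 to 6.6 million removals per day and an average lifetime of around 11.7 to 12 days.

```python
# Back-of-envelope check of the fake-account figures quoted above.
# Assumptions (mine): a 90-day quarter and the 3.5% midpoint of the
# 3%-4% fake-account estimate from Facebook's disclosure.

MAU = 2.2e9            # monthly active users, Q1 2018
FAKE_SHARE = 0.035     # midpoint of the 3%-4% estimate
REMOVED_IN_Q1 = 583e6  # fake accounts deactivated during Q1 2018
DAYS_IN_Q1 = 90        # Jan + Feb + Mar 2018

standing_pool = MAU * FAKE_SHARE                 # ~77 million fake accounts live at any time
removals_per_day = REMOVED_IN_Q1 / DAYS_IN_Q1    # ~6.5 million removals per day

# Little's-law-style average: time in the system = items in the system / throughput
avg_days_to_detection = standing_pool / removals_per_day  # ~12 days, not minutes

print(f"Standing pool of fake accounts: {standing_pool / 1e6:.0f} million")
print(f"Fake accounts removed per day:  {removals_per_day / 1e6:.1f} million")
print(f"Average days before detection:  {avg_days_to_detection:.1f}")
```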

This strongly suggests that this task is mostly being carried out by the humans that Facebook has been recruiting in droves to do the jobs its machines cannot. Furthermore, only 38% of the hate speech was identified by the technology, leaving 62% to be flagged by humans.

To be fair, this is a very difficult task and one that could have disastrous consequences if the company gets it wrong.

Facebook, Twitter, Google, ISPs and so on are immune from prosecution triggered by the content they carry as long as they are deemed to be completely neutral forums for the dissemination of information and ideas. The problem here is that the definition of hate speech is a very grey area, and the removal of content deemed hate speech by some could be seen as censorship and bias by others.

This is not an issue in itself, but if Facebook is judged to be exercising editorial control rather than acting as a neutral forum, it could lose this critical immunity, leaving Facebook et al. fully liable for the content on their services and turning Facebook into a completely different kind of service.

This is why Facebook is on very thin ice when it comes to hate speech in particular and why this is such a difficult AI problem to solve.

The net result is that Facebook is going to need more and more humans to keep its service in order, resulting in the decline in operating performance that I have been flagging for some time.

Hence, I think 2018 is going to continue to be a difficult year for Facebook, and I see no reason to hold the shares. Of the companies exposed to advertising, Google and Baidu are in the best position, but while the privacy theme dominates, privacy advocate Apple is the best place to hide.

This article was originally published at RadioFreeMobile
