Defamation by algorithm

By Legal Eagle

In a previous post, I mentioned the annoying algorithmic habits of Facebook (suggesting that you reconnect with dead friends, or putting up ads asking, “Do you want to get pregnant?” Um, no, not right now!).

Google also uses algorithms to suggest searches related to the one you have just undertaken. A French man has successfully sued Google after alleging that when his name was entered into the search engine, the terms “rapist” and “satanist” were suggested as associated search terms. The man had in fact been convicted of corruption of a minor, receiving a three-year suspended jail sentence. He said that he had tried contacting Google to get the suggestions removed, with no success. Saul Weintraub at CNN Fortune reports:

‘Suggest’ is a service that gives additional terms for further searching after a query is done.  Those words are based on terms that are grouped on the web and Google’s PageRank algorithm.

Those results were likely a manifestation of news reports of the man’s crimes and related searches based on those terms.  Google states that the results aren’t its responsibility, they are just a manifestation of their computers reporting what’s out there on the web.
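
For readers wondering how a machine produces such pairings with no human intent, here is a purely illustrative sketch (not Google’s actual code; the mini “query log” below is invented) of ranking suggestions by how often terms follow a name in past queries:

    from collections import Counter

    # Invented mini query log standing in for aggregated search data.
    query_log = [
        "mr x trial", "mr x rapist", "mr x satanist",
        "mr x rapist verdict", "mr x appeal", "mr x satanist claims",
    ]

    def suggest(prefix, log, k=3):
        """Rank the words that most often follow `prefix` in past queries."""
        counts = Counter()
        for query in log:
            if query.startswith(prefix + " "):
                counts.update(query[len(prefix) + 1:].split())
        return [word for word, _ in counts.most_common(k)]

    print(suggest("mr x", query_log))  # -> ['rapist', 'satanist', 'trial']

Nothing in that loop asks whether an association is fair or true; it just counts.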

The French court concluded that the search engine’s linking of his name to such words was defamatory. Google CEO Eric Schmidt and Google were ordered to pay a symbolic €1 in damages plus €5,000 towards the man’s court costs. Google was also ordered to remove the offending results of its algorithm, on pain of a daily fine until it did so.

Interestingly, the court felt that Google France wasn’t liable, but that Eric Schmidt, in his editorial capacity at Google HQ in the US, somehow was.

Because, on this view, Google isn’t simply pointing to search results but making “editorial decisions” through its robots, it may find itself in trouble in other jurisdictions around the globe – at least until the courts understand what is happening behind the scenes.

The judgment is here (en français).

The Google algorithm infers that there is a link between X’s name and the terms “rapist” and “satanist”. Would Google be protected by a defence of “truth” under our defamation laws? It probably depends very much on what the newspaper reports said. If the newspaper reports said, “X was accused of being a satanist and a rapist, but was found not guilty,” then the inference created by the search suggestion would be untrue. I suspect that this is probably what happened.

It is interesting, though, when a “ghost in the machine” creates the defamatory communication. Perhaps the position of Google is analogous to the situation where a defamatory comment is made by a commenter on a blog post. The blogger will be liable for defamation if she leaves the comment up, even if she did not make the comment herself and does not agree with it. Similarly, here the problem is not the algorithm itself, but Google’s failure to deal with the man’s complaint in a timely fashion.

I suspect organisations like Facebook and Google need to have better mechanisms for dealing with things like this. Issues do not just arise in defamation law, either. Facebook is working with Australian police to develop better protocols for dealing with illegal activities after Australian police smashed a pedophilia ring on Facebook. Facebook repeatedly shut down the groups which exchanged indecent images, but did not report them to police, even though one of the users alerted Facebook. In another incident, a mother was worried that her 12-year-old daughter was being stalked by a pedophile on Facebook, and alleged that she was having difficulty getting the social networking site to respond. Facebook does not have an Australian office. (Grotesquely, the stalker turned out to be another 12-year-old girl masquerading as an older male pedophile.)

I suppose that we’re all still learning the pitfalls and possibilities of online communication. Companies, lawmakers and courts are all having to develop better mechanisms to deal with the legal ramifications of online communication.

(Hat tip: Heath G)

45 Comments

  1. Jacques Chester
    Posted September 28, 2010 at 7:40 am | Permalink

    I think truth would be a defence if the algorithm is properly understood.

    The suggest algorithm merely looks for correlations in searches and web pages. It is not an editorial decision and it is not a statement of correctness of the underlying correlation.

    But it is true that the man’s name and the undesirable words were correlated in searches and web pages.

    The only time Google might be seen to exercise some indirect editorial control is if they do not address conscious suborning of the algorithm — ie ‘googlebombing’ (for which, see the Justin Bieber/syphilis incident).

    However, Google are quite active in tweaking their algorithms to defeat gaming.

  2. Jacques Chester
    Posted September 28, 2010 at 7:58 am | Permalink

    Analogy: suppose I place a poll online. “John Q Exampleson is guilty of murder. Yes or No?”

    Am I defaming him by publishing the results?

  3. TerjeP
    Posted September 28, 2010 at 7:59 am | Permalink

    I’ve observed kids using pedophile as a term of abuse. A bit disturbing I must say.

  4. desipis
    Posted September 28, 2010 at 9:30 am | Permalink

    Jacques,

    If it was framed as “67% of respondents to an online survey believe John Q Exampleson is guilty of murder”, then I’d argue no. I think, however, if it was framed “Online survey indicates John Q Exampleson is guilty of murder”, then there could be an issue. The latter could be seen as using a potential falsehood – that the survey actually indicates anything – to defame Mr Exampleson.

    The man’s name might be statistically associated with the terms “rapist” and “satanist”, but is it appropriate to frame that as a “suggestion” rather than simply informing the user of the statistical ‘fact’?

  5. Posted September 28, 2010 at 9:31 am | Permalink

    Speaking as a non-lawyer, I love law 🙂

  6. desipis
    Posted September 28, 2010 at 9:38 am | Permalink

    I’ve observed kids using pedophile as a term of abuse. A bit disturbing I must say.

    I think it’s just a sign of the irrationality and hyperbole adults exhibit when using the term. See “gay”, “bastard”, “communist”, etc.

  7. Jacques Chester
    Posted September 28, 2010 at 10:09 am | Permalink

    Google don’t frame the results at all. They merely list related search terms.

    So in my analogy, the framing is simply the question and a table of results. Nothing else is written to preface it. No editorial input whatsoever. Purely numbers.

  8. kvd
    Posted September 28, 2010 at 1:42 pm | Permalink

    Desipis’ reply re JC’s “online murder poll” talked about how the reporting of the results might/might not be defamatory. I must admit I was more worried by the publication of the question/poll in the first place. (“How many times a week does LE beat her children?” Click to view results…)

    And it also worries me that “the algorithm doesn’t mean to infer” and “it’s just a blind machine” seem to imply some sort of excuse for the (human) authors of the software to evade responsibility for their program outcomes. Wish I’d had that excuse back when I did that sort of work.

  9. Posted September 28, 2010 at 3:03 pm | Permalink

    And it also worries me that “the algorithm doesn’t mean to infer” and “it’s just a blind machine” seems to imply some sort of excuse for the (human) authors of the software to evade responsibility for their program outcomes.

    1. You’re asking software engineers to accurately predict all outputs of their algorithms. For certain classes of algorithms this is mathematically impossible. For others it is physically impossible. For most it is merely humanly impossible.

    2. If they knew the outcomes of the algorithm in advance, they wouldn’t need the algorithm. That’s the entire point of this class of algorithms (broadly classifiable as ‘machine learning’). It is supposed to pick out associations that a statistician might have found, given enough time and an inhumanly good memory.
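
    (A toy illustration only, with invented documents and nothing to do with Google’s real pipeline: the pairs this miner emits are a function of whatever data is fed in, not of the code, so its author cannot enumerate the outputs in advance.)

        from collections import Counter
        from itertools import combinations

        def mine_associations(documents, min_count=2):
            """Return word pairs co-occurring in at least `min_count` documents."""
            pair_counts = Counter()
            for doc in documents:
                words = sorted(set(doc.lower().split()))
                pair_counts.update(combinations(words, 2))
            return [pair for pair, n in pair_counts.items() if n >= min_count]

        # Change the corpus and you change the output -- unforeseeably.
        docs = ["X convicted in trial", "X trial verdict", "weather sunny today"]
        print(mine_associations(docs))  # -> [('trial', 'x')]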

  10. Posted September 28, 2010 at 3:07 pm | Permalink

    Jacques, while the algorithm doesn’t mean to infer anything in particular (it’s just a blind machine picking out associations), because of the way that the human mind works, such an inference is created anyway.

    Refer to my last comment. If this is the legal argument, then lawyers are asking for computer scientists to either a) circumvent mathematical impossibilities and/or b) invent hard AI.

    The common law yields to physical laws and mathematical laws. The findings of computer science grew out of the 20th century quest for mathematical certainty. See Logicomix and The Annotated Turing (or a textbook on computability, if ye dare).

  11. desipis
    Posted September 28, 2010 at 3:29 pm | Permalink

    You’re asking software engineers to accurately predict all outputs of their algorithms.

    I’m not sure that’s the case. I think it’s more about expecting engineers to make a reasonable attempt to alter the system to prevent certain outcomes that are causing harm. I don’t see how the relatively abstract nature of the harm changes that responsibility.

    I think the fact that there were basically no damages awarded indicates the court understood that Google couldn’t reasonably be expected to predict the outcome. However, the order to take down the result also indicated that the court expected Google, once aware of the outcome, to take steps to prevent it in the future.

  12. Posted September 28, 2010 at 3:41 pm | Permalink

    However, the order to take down the result also indicated that the court expected Google, once aware of the outcome, to take steps to prevent it in the future.

    Well they can add a specific clause to the algorithm for this particular case. And perhaps a more general clause to stop ‘satanist’ and ‘pedophile’ appearing next to anyone’s name. But what they cannot reasonably do is to predict all possible undesirable outputs from the algorithm, because the number of combinations of ways people can impute meaning to combinations of words is impossibly large to model.
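
    (A minimal sketch of such a “specific clause”, with invented blocked terms: a post-processing filter on the suggestion output. It handles exactly the cases on the list and nothing else, which is why it is a patch rather than a general solution.)

        BLOCKED_TERMS = {"rapist", "satanist", "pedophile"}

        def filter_suggestions(suggestions, blocked=BLOCKED_TERMS):
            """Drop any suggestion containing a blocked term."""
            return [s for s in suggestions
                    if not blocked & set(s.lower().split())]

        print(filter_suggestions(["mr x rapist", "mr x trial"]))
        # -> ['mr x trial']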

    Otherwise you need an ‘oracle’: some agent to watch the results and decide whether they’re acceptable. That takes you back to requiring hard AI … which may not be available in time for the court order, very sorry.

    If courts take the view that Google must prevent all such occurrences in future or be considered in breach, then the only guaranteed way to achieve that would be to switch the algorithm off.

    If instead there is a ‘reasonable person’ test, they might get away with a few heuristics (like a ‘no pedophile’ rule-of-thumb).

    But a truly general solution isn’t available. That’s the nature of the class of algorithms.

  13. kvd
    Posted September 28, 2010 at 3:41 pm | Permalink

    Jacques:
    1: I agree that software doesn’t even remotely approach “accuracy” in all outcomes.
    2. Machine Learning? Ha.

    And you are comfortable with that? As applied to a specific human subject?

    JC11: I think lawyers would be better served by unyielding scepticism re computer generated anything – and I would prefer that “the common law yields” to nothing of the sort.

    How do ye dare?

  14. Posted September 28, 2010 at 3:43 pm | Permalink

    The algorithms return associations, and even idiots know association does not imply causation. (Imagine plugging in the name of a famous barrister or judge alongside a heinous crime.)

    There is also the problem of the maths genius required to get even a vague understanding of Google’s algorithms (how many lawyers, judges and juries could?), and the problem of the trial causing public exposure of Google’s crown jewels of intellectual property.

    To show an ordinary juror how silly it is to draw an inference from counts of semantic association without delving into the details of the source documents, plug in the following searches:

    * god+good (247 million hits)
    * god+evil (101 million hits)
    * satan+good (14 million hits)
    * satan+evil (6 million hits)

    How many people would infer that god is much more evil than satan (101/6) or that god and satan aren’t that different when it comes to good/evil proportions (247/101=2.4; 14/6=2.3), especially as most writers on such topics would, presumably, make statements that god is good and satan is evil?
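
    (Spelling the arithmetic out, using the hit counts as quoted; live counts would of course drift over time:)

        # Hit counts in millions, as quoted above.
        hits = {("god", "good"): 247, ("god", "evil"): 101,
                ("satan", "good"): 14, ("satan", "evil"): 6}

        print(hits["god", "evil"] / hits["satan", "evil"])    # ~16.8
        print(hits["god", "good"] / hits["god", "evil"])      # ~2.4
        print(hits["satan", "good"] / hits["satan", "evil"])  # ~2.3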

    (For JC’s giggling: imagine trials as a munge of RDFs and ontologies from both sides…)

  15. Posted September 28, 2010 at 3:47 pm | Permalink

    I think lawyers would be better served by unyielding scepticism re computer generated anything

    There are limits to what computers can do. There are limits to what we can determine about a given problem which a computer is applied to. These limits are not limits of expense, inconvenience or technological immaturity. They are mathematical and physical limits which can’t be circumvented by the common law or statute.

    Supposing that a judge takes the least reasonable approach of asking Google to universally prevent such breaches in future, then he or she is doing something analogous to ordering the meteorology office to only produce good weather, or ordering that Pi henceforth equals 3. It’s a nonsense.

    I don’t think that this magistrate has made such an order; I am merely trying to caution against anyone thinking that such an order could be fulfilled.

  16. Posted September 28, 2010 at 3:48 pm | Permalink

    …if someone alerts them to an outcome such as this, and the outcome is defamatory or potentially defamatory, then they should amend the algorithm.

    I think that’s reasonable too. In fact it’s the only reliable option short of giving up the algorithm altogether.

  17. Posted September 28, 2010 at 3:54 pm | Permalink

    (For JC’s giggling: imagine trials as a munge of RDFs and ontologies from both sides…)

    I’d laugh, but somebody got VC money for it.

  18. Posted September 28, 2010 at 4:00 pm | Permalink

    The issue is acting when the (potential) plaintiff brings it to their attention. Much (if not all) of this nastiness can be headed off at the pass by simple courtesy.

    It seems here that the problem emerged when Google sat on its hands despite being ‘put on notice’ as lawyers like to say.

  19. kvd
    Posted September 28, 2010 at 4:11 pm | Permalink

    “There are limits to what computers can do” – thank you.
    JC18 seems to reinforce JC13: “Well they can add a specific clause to the algorithm for this particular case”. You are not seriously suggesting this as reasonable restitution, or as an ongoing “fix”?

    Dave Bath nearly gets the problem, but seems to think it’s funny. Guilt by Google association. Not funny to the Googled.

  20. Posted September 28, 2010 at 4:13 pm | Permalink

    SL: Google have notoriously pisspoor customer service.

  21. Posted September 28, 2010 at 4:14 pm | Permalink

    kvd: I don’t follow what you’re getting at.

  22. kvd
    Posted September 28, 2010 at 4:21 pm | Permalink

    Jacques, you agreed at 18 that the Google algorithm should be amended. Just saying that that short answer is quite unrealistic given the amount of “exceptions” that Google has to deal with. Not an attack – just pointing out the on-the-ground reality.

  23. Posted September 28, 2010 at 4:34 pm | Permalink

    I suppose Google could go Gödel, noting that a judge’s order must be possible, and say “dear judge, give us an algorithm that works to your satisfaction, prove your algorithm will not cause damage, and we’ll implement it”. After all, an order must be at least theoretically capable of being executed, or it is invalid.

    Ho ho ho, formal undecidability…

    Google is very specific about what their tools do. People using a product against suitability statements should be on their own. If I’m stupid enough to drink a topical disinfectant because I have a stomach bug, and the disinfectant kills germs, then it is MY fault. If I affect someone else (give THEM the product contrary to product description), then I am responsible.

    Honi soit qui mal y pense – the people assuming evil without good reasons are themselves evil.

  24. desipis
    Posted September 28, 2010 at 4:53 pm | Permalink

    Just saying that that short answer is quite unrealistic given the amount of “exceptions” that Google has to deal with.

    If Google can’t economically provide the service without either limiting the damage it does or providing sufficient restitution to appease those it harms, then it shouldn’t be operating the service.

    That’s beside the point that, with all the technological power Google has at its disposal, providing a simple word association filter at the end points of the system would be rather trivial to implement and would have minimal impact on the system.

  25. desipis
    Posted September 28, 2010 at 4:58 pm | Permalink

    People using a product against suitability statements should be on their own.

    So if a newspaper puts a disclaimer in fine print that their reports shouldn’t be taken seriously and include a report that “Dave Bath” is a rapist and a Satanist, you’d have no problem with it?

  26. kvd
    Posted September 28, 2010 at 5:03 pm | Permalink

    desipis: “if … it shouldn’t be operating the service”. That’s exactly what JC suggested at 18 – but only as an improbable, unrealistic fallback.

    Dave Bath puts it well, without recognising what he is thereby ceding to the great god Google.

    Anyway, there is no evil in Google that I can see – except the small evil of collating, summarising, and averaging us all, and the occasional guilt by association. What’s not to look forward to, occasionally?

  27. Posted September 28, 2010 at 5:16 pm | Permalink

    So if a newspaper puts a disclaimer in fine print that their reports shouldn’t be taken seriously and include a report that “Dave Bath” is a rapist and a Satanist, you’d have no problem with it?

    Google, and their algorithm, are doing neither of these things.

  28. Posted September 28, 2010 at 5:18 pm | Permalink

    That’s beside the point that, with all the technological power Google has at its disposal, providing a simple word association filter at the end points of the system would be rather trivial to implement and would have minimal impact on the system.

    And the point Dave and I are raising is that strictly, this is not a general solution. There will always be exceptions, to a mathematical certainty. Asking Google to foresee and prevent all of them is asking for the impossible.

    It Can’t. Be. Done. With. A. Computer.

  29. Posted September 28, 2010 at 5:22 pm | Permalink

    Despite the energy of the thread so far, I can’t help but wonder if we’re all talking past each other. Another potential case of ferocious agreement.

  30. kvd
    Posted September 28, 2010 at 5:23 pm | Permalink

    “Google is very specific about what their tools do.”

    Can’t remember seeing on the Google front page a statement such as:

    “Look, most of these results are just crap really, but if you want to click on something just be our guest, and if you draw some erroneous conclusions from reading any of that crap – well, more fool you.

    Just don’t sue us – that’s not cool. What did you say your name was?”

  31. Posted September 28, 2010 at 7:44 pm | Permalink

    LE, above, makes an important point with lots of associated issues:

    specifically delegated to deal with legal issues in the jurisdiction, and, if necessary, to report issues to the police.

    Here we go into what the product /is/.

    Facebook, a social platform, could well discover things from “inside”, things naturally of legitimate relevance to police, as well as things of what I’d consider illegitimate interest to oppressive authorities.

    Google’s search engine is a very different beastie: results are as available to the authorities as they are to the general public. The stuff Google /does/ have that authorities want includes search histories. It’s worth noting that when the NSA asked for all searches and IP addresses, Microsoft and Yahoo handed everything over, whereas Google said “that’s trawling… Come back with a decent warrant and we’ll happily oblige”. No such warrant came, because it was an illegitimate request, and the NSA /knew/ it.

    We need to be /very/ careful about how platforms like Google or Facebook kowtow to governments. It’s a very tricky balance. Should Google bow to every censorship or trawling request of every government? How would any one of us figure out appropriate responses to such requests if we ran Google?

    LE’s quick remark, entirely valid, raises so many questions… and such important ones. I suspect, on such issues, I’m as libertarian and antiauthoritarian as any of the regulars. (Shock, horror!)

  32. desipis
    Posted September 28, 2010 at 8:21 pm | Permalink

    There will always be exceptions, to a mathematical certainty. Asking Google to foresee and prevent all of them is asking for the impossible.

    I’m not suggesting they should foresee the exceptions. Rather, they add them to the filter list as they receive judicial orders for them to do so.

  33. desipis
    Posted September 28, 2010 at 8:26 pm | Permalink

    Google, and their algorithm, are doing neither of these things.

    The point was that (arguably) by publishing the suggestions Google was creating a false and damaging public perception, and, as in my hypothetical, did so in spite of it not being an explicitly suitable use of the publication.

  34. Jeremy Dawson
    Posted September 30, 2010 at 9:48 am | Permalink

    The article says

    It is interesting, though, when a “ghost in the machine” creates the defamatory communication. Perhaps the position of Google is analogous to the situation where a defamatory comment is made by a commenter on a blog post. The blogger will be liable for defamation if she leaves the comment up, even if she did not make the comment herself and does not agree with it. Similarly, here the problem is not the algorithm itself, but Google’s failure to deal with the man’s complaint in a timely fashion.

    I totally disagree. If Google write and use a computer program, they alone are entirely responsible. Whereas when a person makes a comment on a blog, that person is entirely responsible for making the comment (even if the blog maintainer shares responsibility for the comment’s publication).

  35. kvd
    Posted September 30, 2010 at 4:34 pm | Permalink

    JC, on the general point of the usefulness of generated links, I always scan the Google Ads generated at the bottom of news articles just out of curiosity. Pointless, silly habit, I know.

    So, I am reading LE’s entry re Andrew Bolt, and clicked over to his linked “White fellas in the black” article, and at the bottom are the following (with links removed):

    # Aboriginal Cultural Tourism

    Live our stories as you create your own. Experience Canada’s culture.

    # African Dating

    Most Delicious Women from Africa Seeking Men for Love & Marriage!

    # Gorgeous Colombian Women

    Are Waiting For a Real Man. Sign Up & Choose Your Fiancee Here!

    – which seem to suggest (as usual) that the Google relevance algorithm might need a further refinement or (forty) three.

    This is not a criticism as such, because I do know just how hard it must be to get even this close in relevance and, as you said:

    “It Can’t. Be. Done. With. A. Computer.”

    But it always leaves me wondering why they pretend to try; and how they can charge the advertiser whilst maintaining a straight face – and without being sued?

  36. Posted September 30, 2010 at 6:28 pm | Permalink

    kvd@35:
    There are two very different algorithms at play, with very different purposes:

    Advertisement placement:
    This works by an auction algorithm, people paying for higher placement with certain keywords.

    It has nothing to do with truth; it is not a representation of the landscape. If I pay for an ad for cars when the keyword search is trams, that’s my prerogative. Courts can legitimately get involved if customers don’t get fair dealing, if unlawful advertisements are made, etc.
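
    (A toy sketch of the kind of auction described here – a simplified generalized second-price auction with invented bidders and bids; real ad auctions also weigh quality scores and much else. Note that nothing in it asks whether an ad is true:)

        def run_keyword_auction(bids, slots=2):
            """Highest bidders win the slots; each pays roughly the next bid down."""
            ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
            winners = []
            for i, (bidder, _) in enumerate(ranked[:slots]):
                price = ranked[i + 1][1] if i + 1 < len(ranked) else 0.0
                winners.append((bidder, price))
            return winners

        bids = {"cars_r_us": 1.50, "tram_tours": 1.20, "acme_autos": 0.80}
        print(run_keyword_auction(bids))
        # -> [('cars_r_us', 1.2), ('tram_tours', 0.8)]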

    The other, the search service, is a reflection of the landscape, and /should/ be nothing more, no spin, no political or judicial interference.

    Imagine you publish a dictionary or encyclopaedia that reflects the way language is used, or the world.

    If your dictionary has integrity, it has a fixed set of rules for which words are in or out… frequencies in works read by the population, or frequencies of use within the population, combined with adjustments for the age of readers… so I don’t find phlegmatic phantasmagoria in a pre-school ABC.

    It is simply /wrong/ for a court or politician or pressure group to want the dictionary to change the rules for inclusion or alphabetic ordering in particular cases, to expurgate commonly used words, or to avoid inconvenient juxtapositions or acrostics (even the gubernator’s “f*** you” acrostic veto notice could have been avoided, but would you want a law against it, or him prosecuted?).

    Such twisting of sensible rules for reporting of survey results (whether in the real world, the noosphere, or the landscape of the net) is extremely dangerous, the sort of thing that my libertarian side should fight against tooth and nail.

    Mind you, it can be done technically to a moderate degree… Reports of dain dramage in Microsoft products, using Microsoft Bing as a search tool, come up /much/ less often than when using other engines. In other words, Microsoft /twist/ truth, but cannot prevent the truths getting out completely.

    If I was Google, I’d be tempted either to pay the fine, handing the cash over on an upraised digitus medius, or just to block searches in French or from .fr as the **only** way to provably stop the “offending” truth coming up… China is more important than France, and Google certainly gave the Chinese government pause.

  37. Posted September 30, 2010 at 6:35 pm | Permalink

    oh my, dain bramaged typos in my previous post… Tiny keys and screen on phone, missed drugs this morning, 3 yo jumping on me in bed…. Aaaaah! Sorry.

  38. Posted September 30, 2010 at 7:04 pm | Permalink

    Actually, in the case of advertisement placement auctions, where /I/ see more legitimate involvement of courts, I wonder how my more rightie friends would view restrictions on bidding for keywords:

    Fighter jet manufacturers bidding for:
    * “air supremacy”
    * theater+operations
    * peace
    * (name of a particular country)

    Shotgun/rifle manufacturer bidding for
    * feral pigs
    * kangaroos
    * panda
    * charlton heston (the firearms advocate)
    * (name of politician)

    Sex aids manufacturer bidding for
    * teddies (it /is/ a form of female night attire, not just a kiddy cuddly toy)
    * dolls
    * toy dolls

    Kiddy toy manufacturer:
    * dolls
    * peace
    * toy dolls
    * thomas the tank engine

    Junk food manufacturer targeting kids
    * hamburger
    * food
    * nutrition
    * dolls
    * thomas the tank engine

    Now… Which of those bids are unobjectionable, objectionable but should be lawful, or should be unlawful?

    Difficult? That’s just the commercial thing; that’s not the this-is-what-the-net-has-in-it survey and reporting, which I see as /much/ more of a freedom issue.

  39. kvd
    Posted October 1, 2010 at 4:42 am | Permalink

    Thank you, DB, for your replies – and much impressed I am if they were generated on a phone screen! Your points are well put, and reinforce my own small understanding of the keyword advertising process, having twice signed up for promotional campaigns with AdWords attached to search results. But no longer.

    My general point remains on the usefulness of the generated links, even given your good examples just above – some of which might be morally objectionable. How either the advertiser, or the consumer might benefit from those placements is just beyond me. Anyway, thank you for taking my query as an honest one.

  40. Jacques Chester
    Posted October 1, 2010 at 2:26 pm | Permalink

    There are obviously some manual entries in the algorithm, as I just found out.

    Try searching for “anagram”.

  41. Posted October 2, 2010 at 7:47 am | Permalink

    DB@38: Let the kids cope with advertising, I say: that is one area where knowing cynicism is a survival habit. Like not living an excessively germ-free existence.

  42. Posted October 2, 2010 at 8:15 am | Permalink

    @41 – I was after where my less lefty friends saw issues, particularly along the rough progressions I tried to give. As to kids and ads, the only message is one of overconsumption… no competing memes… The open source computing producers don’t have the cash Microsoft does. The imbalance between the ads for woo and the lack of anti-woo, rationalist messages is no different from the “atheism=evil” messages kids got in my day.

