In a previous post, I mentioned the annoying algorithmic habits of Facebook (suggesting that you reconnect with dead friends or putting up ads asking, “Do you want to get pregnant?”…um no, not right now!).
Google also uses algorithms which suggest searches related to the one you have just undertaken. A French man has successfully sued Google because he alleged that when his name was entered into the search engine, the terms “rapist” and “satanist” were suggested by Google as associated search terms. The man had in fact been convicted for corruption of a minor, receiving a three-year suspended jail sentence. He said that he had tried contacting Google to get them to remove the suggestions, with no success. Saul Weintraub at CNN Fortune reports:
‘Suggest’ is a service that gives additional terms for further searching after a query is done. Those words are based on terms that are grouped on the web and Google’s PageRank algorithm.
Those results were likely a manifestation of news reports of the man’s crimes and related searches based on those terms. Google states that the results aren’t its responsibility, they are just a manifestation of their computers reporting what’s out there on the web.
The French court concluded that the search engine’s linking his name to such words was defamatory. Google CEO Eric Schmidt and Google were ordered to pay €1 settlement plus €5,000 for the man’s court costs. Google was also ordered to take down the results of its algorithm and would be fined daily until such action had been taken.
…
Interestingly, the court felt that Google France wasn’t liable, but somehow Eric Schmidt in his editorial capacity at the Google HQ in the US were in fact responsible.
…
Because Google isn’t simply pointing to search results but its robots are making “Editorial decisions”, it may find itself in trouble in other jurisdictions around the globe – at least until the courts understand what is happening behind the scenes.
The judgment is here (en français).
The Google algorithm infers that there is a link between X’s name and the terms “rapist” and “satanist”. Would Google be protected by a defence of “truth” under our defamation laws? It probably depends very much on what the newspaper reports said. If the newspaper reports said, “X was accused of being a satanist and a rapist, but was found not guilty,” then the inference created by the search suggestion would be untrue. I suspect that this is probably what happened.
It is interesting, though, when a “ghost in the machine” creates the defamatory communication. Perhaps the position of Google is analogous to the situation where a defamatory comment is made by a commenter on a blog post. The blogger will be liable for defamation if she leaves the comment up, even if she did not make the comment herself and does not agree with it. Similarly, here the problem is not the algorithm itself, but Google’s failure to deal with the man’s complaint in a timely fashion.
I suspect organisations like Facebook and Google need to have better mechanisms for dealing with things like this. Issues do not just arise in defamation law, either. Facebook is working with Australian police to develop better protocols for dealing with illegal activities after Australian police smashed a pedophilia ring on Facebook. Facebook repeatedly shut down the groups which exchanged indecent images, but did not report them to police, even though one of the users alerted Facebook. In another incident, a mother was worried that her 12-year-old daughter was being stalked by a pedophile on Facebook, and alleged that she was having difficulty getting the social networking site to respond. Facebook does not have an Australian office. (Grotesquely, the stalker turned out to be another 12-year-old girl masquerading as an older male pedophile.)
I suppose that we’re all still learning the pitfalls and possibilities of online communication. Companies, lawmakers and courts are all having to develop better mechanisms to deal with the legal ramifications of online communication.
(Hat tip: Heath G)

45 Comments
I think truth would be a defence if the algorithm is properly understood.
The suggest algorithm merely looks for correlations in searches and web pages. It is not an editorial decision and it is not a statement of correctness of the underlying correlation.
But it is true that the man’s name and the undesirable words were correlated in searches and web pages.
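To make the "correlation, not assertion" point concrete, here is a minimal sketch of a co-occurrence counter. It is not Google's Suggest algorithm, whose details are not public; the function name `related_terms` and the query log are invented purely for illustration.

```python
from collections import Counter


def related_terms(query_log, name, top_n=3):
    """Return the words that most often appear alongside `name` in logged queries.

    Hypothetical illustration only: it counts co-occurrence and nothing else.
    """
    name_words = set(name.lower().split())
    counts = Counter()
    for query in query_log:
        words = query.lower().split()
        # Only look at queries that contain every word of the name.
        if name_words.issubset(words):
            counts.update(w for w in words if w not in name_words)
    return [term for term, _ in counts.most_common(top_n)]


# Invented query log, purely for illustration.
log = [
    "john exampleson trial",
    "john exampleson trial verdict",
    "john exampleson acquitted",
    "john exampleson trial",
    "weather sydney",
]
suggestions = related_terms(log, "John Exampleson")
print(suggestions)
```

Nothing in the sketch knows what "trial" or "acquitted" means, or whether any underlying claim is true. It only reports which words people typed together.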
The only time Google might be seen to exercise some indirect editorial control is if they do not address conscious suborning of the algorithm, i.e. ‘googlebombing’ (for which, see the Justin Bieber/syphilis incident).
However Google are quite active in tweaking their algorithms to defeat gaming.
Analogy: suppose I place a poll online. “John Q Exampleson is guilty of murder. Yes or No?”
Am I defaming him for publishing the results?
I’ve observed kids using pedophile as a term of abuse. A bit disturbing I must say.
Jacques,
If it was framed as “67% of respondents to an online survey believe John Q Exampleson is guilty of murder” then I’d argue no. I think, however, that if it was framed as “Online survey indicates John Q Exampleson is guilty of murder” then there could be an issue. The latter could be seen as using a potential falsehood (that the survey actually indicates anything) to defame Mr Exampleson.
The man’s name might be statistically associated with the terms “rapist” and “satanist”, but is it appropriate to frame that as a “suggestion” rather than simply informing the user of the statistical ‘fact’?
Speaking as a non-lawyer, I love law
I think it’s just a sign of the irrationality and hyperbole adults exhibit when using the term. See “gay”, “bastard”, “communist”, etc
Google don’t frame the results at all. They merely list related search terms.
So in my analogy, the framing is simply the question and a table of results. Nothing else is written to preface it. No editorial input whatsoever. Purely numbers.
Jacques, while the algorithm doesn’t mean to infer anything in particular (it’s just a blind machine picking out associations) because of the way that the human mind works, such an inference is created. And, as Weintraub points out above, it would make it very hard if the guy were trying to get a job. Although presumably even if the suggestion were taken away, the story about the corrupting of a minor would come up anyway, and he’s been convicted of that.
Desipis’ reply re JC’s “online murder poll” talked about how the reporting of the results might/might not be defamatory. I must admit I was more worried by the publication of the question/poll in the first place. (“How many times a week does LE beat her children?” Click to view results…)
And it also worries me that “the algorithm doesn’t mean to infer” and “it’s just a blind machine” seems to imply some sort of excuse for the (human) authors of the software to evade responsibility for their program outcomes. Wish I’d had that when I used to do that.
1. You’re asking software engineers to accurately predict all outputs of their algorithms. For certain classes of algorithms this is mathematically impossible. For others it is physically impossible. For most it is merely humanly impossible.
2. If they knew the outcomes of the algorithm in advance, they wouldn’t need the algorithm. That’s the entire point of this class of algorithms (broadly classifiable as ‘machine learning’). It is supposed to pick out associations that a statistician might have, given enough time and an inhumanly good memory.
Refer to my last comment. If this is the legal argument, then lawyers are asking for computer scientists to either a) circumvent mathematical impossibilities and/or b) invent hard AI.
The common law yields to physical laws and mathematical laws. The findings of computer science grew out of the 20th century quest for mathematical certainty. See Logicomix and The Annotated Turing (or a textbook on computability, if ye dare).
I’m not sure that’s the case. I think it’s more about expecting engineers to make a reasonable attempt to alter the outcome of the system to prevent certain outcomes that are causing harm. I don’t see how the relative abstract nature of the harm changes that responsibility.
I think the fact there were basically no damages awarded indicates the court understood that Google couldn’t reasonably be expected to predict the outcome. However the order to take down the result also indicated the court expected that once Google became aware of the outcome that they take steps to prevent it in the future.
Well they can add a specific clause to the algorithm for this particular case. And perhaps a more general clause to stop ‘satanist’ and ‘pedophile’ appearing next to anyone’s name. But what they cannot reasonably do is to predict all possible undesirable outputs from the algorithm, because the number of combinations of ways people can impute meaning to combinations of words is impossibly large to model.
Otherwise you need an ‘oracle’: some agent to watch the results and decide whether they’re acceptable. That takes you back to requiring hard AI … which may not be available in time for the court order, very sorry.
If courts take the view that Google must prevent all such occurrences in future or be considered in breach, then the only guaranteed way to achieve that would be to switch the algorithm off.
If instead there is a ‘reasonable person’ test, they might get away with a few heuristics (like a ‘no pedophile’ rule-of-thumb).
But a truly general solution isn’t available. That’s the nature of the class of algorithms.
Jacques:
1: I agree that software doesn’t even remotely approach “accuracy” in all outcomes.
2. Machine Learning? Ha.
And you are comfortable with that? As applied to a specific human subject?
JC11: I think lawyers would be better served by unyielding scepticism re computer generated anything – and I would prefer that “the common law yields” to nothing of the sort.
How do ye dare?
The algorithms return association, and even idiots know association does not imply causation. (Imagine plugging in the name of a famous barrister or judge with a heinous crime).
There is also the problem of the maths genius required to get even a vague understanding of Google’s algorithms (how many lawyers, judges and juries could!), and the problem of the trial causing public exposure of Google’s crown jewels of intellectual property.
To show to an ordinary juror how silly making an inference is from counts of semantic association without delving into the details of the source documents, plug in the following searches:
* god+good (247 million hits)
* god+evil (101 million hits)
* satan+good (14 million hits)
* satan+evil (6 million hits)
How many people would infer that god is much more evil than satan (101/6) or that god and satan aren’t that different when it comes to good/evil proportions (247/101=2.4; 14/6=2.3), especially as most writers on such topics would, presumably, make statements that god is good and satan is evil?
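The arithmetic above can be reproduced in a few lines. This sketch simply re-derives the ratios from the hit counts quoted in the list; the variable names are invented for illustration, and the counts are the ones reported in this comment, not fresh search results.

```python
# Hit counts in millions, as quoted in the list above.
hits = {
    ("god", "good"): 247,
    ("god", "evil"): 101,
    ("satan", "good"): 14,
    ("satan", "evil"): 6,
}

# How much more often "evil" co-occurs with "god" than with "satan".
god_vs_satan_evil = hits[("god", "evil")] / hits[("satan", "evil")]

# Good-to-evil proportions for each name.
god_good_to_evil = hits[("god", "good")] / hits[("god", "evil")]
satan_good_to_evil = hits[("satan", "good")] / hits[("satan", "evil")]

print(round(god_vs_satan_evil, 1))
print(round(god_good_to_evil, 1))
print(round(satan_good_to_evil, 1))
```

The raw counts mostly reflect how often each name is written about at all, which is why reading meaning into them without looking at the source documents goes wrong.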
(For JC’s giggling: imagine trials as a munge of RDFs and ontologies from both sides…)
I’d say that Google ought reasonably only to remove such instances where the complainant brings it to their attention. It’s totally unreasonable to expect them to predict every outcome. But if someone alerts them to an outcome such as this, and the outcome is defamatory or potentially defamatory, then they should amend the algorithm.
There are limits to what computers can do. There are limits to what we can determine about a given problem which a computer is applied to. These limits are not limits of expense, inconvenience or technological immaturity. They are mathematical and physical limits which can’t be circumvented by the common law or statute.
Supposing that a judge takes the least reasonable approach of asking Google to universally prevent such breaches in future, then he or she is doing something analogous to ordering the meteorology office to only produce good weather, or ordering that Pi henceforth equals 3. It’s a nonsense.
I don’t think that this magistrate has made such an order; I am merely trying to caution against them thinking that such an order is possibly fulfillable.
I think that’s reasonable too. In fact it’s the only reliable option short of giving up the algorithm altogether.
I’d laugh, but somebody got VC money for it.
The issue is acting when the (potential) plaintiff brings it to their attention. Much (if not all) of this nastiness can be headed off at the pass by simple courtesy.
It seems here that the problem emerged when Google sat on its hands despite being ‘put on notice’ as lawyers like to say.
“There are limits to what computers can do” – thank you.
JC18: seems to reinforce JC13: “Well they can add a specific clause to the algorithm for this particular case”. You are not seriously suggesting this as reasonable restitution, or as an ongoing “fix”?
Dave Bath nearly gets the problem, but seems to think it’s funny. Guilt by Google association. Not funny to the Googled.
SL: Google have notoriously pisspoor customer service.
kvd: I don’t follow what you’re getting at.
Jacques, you agreed at 18 that the Google algorithm should be amended. Just saying that that short answer is quite unrealistic given the amount of “exceptions” that Google has to deal with. Not an attack – just pointing out the on-the-ground reality.
I suppose Google could go Gödel, saying a judge’s order must be possible, and say “dear judge, give us an algorithm that works to your satisfaction, prove your algorithm will not cause damage, and we’ll implement it”. After all, an order must be at least theoretically capable of being executed, or it is invalid.
Ho ho ho, formal undecidability…
Google is very specific about what their tools do. People using a product against suitability statements should be on their own. If I’m stupid enough to drink a topical disinfectant because I have a stomach bug, and the disinfectant kills germs, then it is MY fault. If I affect someone else (give THEM the product contrary to product description), then I am responsible.
Honi soit qui mal y pense – the people assuming evil without good reasons are themselves evil.
If google can’t economically provide the service without either limiting the damage it does or providing sufficient restitution to appease those it harms, then it shouldn’t be operating the service.
That’s beside the point that with all the technological power Google has at its disposal, providing a simple word association filter at the end points of the system would be rather trivial to implement and have minimal impact on the system.
So if a newspaper puts a disclaimer in fine print that their reports shouldn’t be taken seriously and include a report that “Dave Bath” is a rapist and a Satanist, you’d have no problem with it?
desipis: “if … it shouldn’t be operating the service”. That’s exactly what JC suggested at 18 – but only as an improbable, unrealistic fallback.
Dave Bath puts it well, without recognising what he is thereby ceding to the great god Google.
Anyway, there is no evil in Google that I can see – except the small evil of collating, summarising, and averaging us all, and the occasional guilt by association. What’s not to look forward to, occasionally?
Google, and their algorithm, are doing neither of these things.
And the point Dave and I are raising is that strictly, this is not a general solution. There will always be exceptions, to a mathematical certainty. Asking Google to foresee and prevent all of them is asking for the impossible.
It Can’t. Be. Done. With. A. Computer.
Despite the energy of the thread so far, I can’t help but wonder if we’re all talking past each other. Another potential case of ferocious agreement.
“Google is very specific about what their tools do.”
Can’t remember seeing on the Google front page a statement such as:
“Look, most of these results are just crap really, but if you want to click on something just be our guest, and if you draw some erroneous conclusions from reading any of that crap – well, more fool you.
Just don’t sue us – that’s not cool. What did you say your name was?”
I suspect that none of us would expect Google to predict what the algorithm would come up with, and would not think it reasonable to punish them for something they could not predict.
I think it’s important that these companies have local representatives who are specifically delegated to deal with legal issues in the jurisdiction, and, if necessary, to report issues to the police.
LE@33 makes an important point with lots of associated issues.
Here we go into what the product /is/.
Facebook, a social platform, could well discover things from “inside”, things naturally of legitimate relevance to police, as well as what I’d consider illegitimate interest to oppressive authorities.
Google’s search engine is a very different beastie: results are as available to the authorities as they are to the general public. The stuff Google /does/ have that authorities want includes search histories. It’s worth noting that when the NSA asked for all searches and IP addresses, Microsoft and Yahoo handed everything over, whereas Google said “that’s trawling… Come back with a decent warrant and we’ll happily oblige”. No such warrant came because it was an illegitimate request, and the NSA /knew/ it.
We need to be /very/ careful how platforms like Google or Facebook kowtow to governments. It’s a very tricky balance. Should Google bow to every censorship or trawling request of every government? How would any one of us try to figure out appropriate responses to such requests if we ran Google?
LE’s quick remark, entirely valid, raises so many questions… and such important ones. I suspect, on such issues, I’m as libertarian and antiauthoritarian as any of the regulars. (Shock, horror!)
I’m not suggesting they should foresee the exceptions. Rather, they add them to the filter list as they receive judicial orders for them to do so.
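A filter list of that kind might look something like the sketch below. It is a hypothetical illustration, not a description of anything Google actually runs: the class `SuggestionFilter`, its methods and the sample names are all invented. Entries are only added after the fact (say, on receipt of an order), so nothing here tries to foresee offensive pairings.

```python
def normalise(text):
    """Lower-case and collapse whitespace so lookups are not case-sensitive."""
    return " ".join(text.lower().split())


class SuggestionFilter:
    """Hypothetical end-point filter: drops specific (name, term) pairings."""

    def __init__(self):
        self._blocked = set()

    def block(self, name, term):
        # Called reactively, e.g. when a complaint or court order arrives.
        self._blocked.add((normalise(name), normalise(term)))

    def apply(self, name, suggestions):
        # Keep every suggestion that has not been blocked for this name.
        key = normalise(name)
        return [s for s in suggestions if (key, normalise(s)) not in self._blocked]


filt = SuggestionFilter()
filt.block("John Q Exampleson", "murderer")

filtered = filt.apply("john q exampleson", ["trial", "Murderer", "verdict"])
untouched = filt.apply("Jane Doe", ["trial", "murderer"])
print(filtered)
print(untouched)
```

The filter only removes pairings it has been told about, which is the reactive approach suggested here rather than the general solution argued about above.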
The point was that (arguably) by publishing the suggestions Google was creating a false and damaging public perception, and, like my hypothetical, this was done in spite of it not being an explicitly suitable use of the publication.
The article says
I totally disagree. If Google write and use a computer program they alone are entirely responsible. Whereas when a person makes a comment on a blog, that person is entirely responsible for making the comment (even if the blog maintainer shares responsibility for the comment’s publication)
JC, on the general point of the usefulness of generated links, I always scan the Google Ads generated at the bottom of news articles just out of curiosity. Pointless, silly habit, I know.
So, I am reading LE’s entry re Andrew Bolt, and clicked over to his linked “White fellas in the black” article, and at the bottom are the following (with links removed):
# Aboriginal Cultural Tourism
Live our stories as you create your own. Experience Canada’s culture.
# African Dating
Most Delicious Women from Africa Seeking Men for Love & Marriage!
# Gorgeous Colombian Women
Are Waiting For a Real Man. Sign Up & Choose Your Fiancee Here!
- which seem to suggest (as usual) that the Google relevance algorithm might need a further refinement or (forty) three.
This is not a criticism as such, because I do know just how hard it must be to get even this close in relevance and, as you said:
“It Can’t. Be. Done. With. A. Computer.”
But it always leaves me wondering why they pretend to try; and how they can charge the advertiser whilst maintaining a straight face – and without being sued?
kvd@38
There are two very different algorithms at play, and very different purposes:
Advertisement placement:
This works by an auction algorithm, people paying for higher placement with certain keywords.
It is nothing to do with truth, it is not a representation of the landscape. If I pay for an ad for cars when the keyword search is trams, that’s my prerogative. Courts can legitimately get involved if customers don’t get fair dealing, if unlawful advertisements are made, etc.
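As a rough sketch of the idea, a keyword auction can be reduced to "collect the bids on the searched keyword and sort by amount". Real ad auctions are more elaborate than this; the function `rank_ads`, the advertisers and the bid amounts below are all invented for illustration.

```python
def rank_ads(bids, keyword):
    """Order advertisers bidding on `keyword` from highest bid to lowest.

    `bids` is a list of (advertiser, keyword, bid_amount) tuples.
    Simplified illustration: placement depends on the bid alone.
    """
    matching = [(adv, amount) for adv, kw, amount in bids if kw == keyword]
    ranked = sorted(matching, key=lambda pair: pair[1], reverse=True)
    return [adv for adv, _ in ranked]


# Invented bids: a car dealer is free to outbid a tram museum on "trams".
bids = [
    ("tram museum", "trams", 1.00),
    ("car dealer", "trams", 2.50),
    ("car dealer", "cars", 3.00),
]
placement = rank_ads(bids, "trams")
print(placement)
```

Relevance or truth plays no part in the ordering here, only who paid for the keyword.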
The other, the search service, is a reflection of the landscape, and /should/ be nothing more, no spin, no political or judicial interference.
Imagine you publish a dictionary or encyclopaedia that reflects the way language is used, or the world.
If your dictionary has integrity, it has a fixed set of rules for words that are in or out… Frequencies in works read by the population, or frequencies of use within the population, combined with adjustments for age of readers… so I don’t find phlegmatic phantasmagoria in a pre-school ABC.
It is simply /wrong/ for a court or politician or pressure group to want the dictionary to change the rules for inclusion or alphabetic ordering in particular cases, to expurgate commonly used words, to avoid inconvenient juxtapositions or acrostics (even the gubernator’s “f*** you” acrostic veto notice could have been avoided, but would you want a law against it, or him prosecuted?).
Such twisting of sensible rules for reporting of survey results (whether in the real world, the noosphere, or the landscape of the net) is extremely dangerous, the sort of thing that my libertarian side should fight against tooth and nail.
Mind you, it can be done technically to a moderate degree… Reports of dain dramage in Microsoft products, using Microsoft Bing as a search tool, come up /much/ less often than when using other engines. In other words, Microsoft /twist/ truth, but cannot prevent the truths getting out completely.
If I was Google, I’d be tempted to either pay the fine, handing the cash over on an upraised digitus medius, or just block searches in French or from .fr as the **only** way to provably stop the “offending” truth coming up… China is more important than France, and Google certainly gave the Chinese government pause.
oh my, dain bramaged typos in my previous post… Tiny keys and screen on phone, missed drugs this morning, 3 yo jumping on me in bed…. Aaaaah! Sorry.
actually, in the case of advertisement placement auctions, where /I/ see more legitimate involvements of courts, I wonder how more rightie friends would view restrictions of bidding for keywords:
Fighter jet manufacturers bidding for:
* “air supremacy”
* theater+operations
* peace
* (name of a particular country)
Shotgun/rifle manufacturer bidding for
* feral pigs
* kangaroos
* panda
* charlton heston (the firearms advocate)
* (name of politician)
Sex aids manufacturer bidding for
* teddies (it /is/ a form of female night attire, not just a kiddy cuddly toy)
* dolls
* toy dolls
Kiddy toy manufacturer:
* dolls
* peace
* toy dolls
* thomas the tank engine
Junk food manufacturer targeting kids
* hamburger
* food
* nutrition
* dolls
* thomas the tank engine
Now… Which of those bids are unobjectionable, objectionable but should be lawful, or should be unlawful?
difficult? That’s just the commercial thing, that’s not the this-is-what-the-net-has-in-it survey and reporting, which I see as /much/ more of a freedom issue.
Thank you, DB for your replies – and much impressed I am if generated on a phone screen! Your points are well put, and reinforce my own small understanding of the keyword advertising process, having twice signed up for promotional campaigns with Adwords attaching to search results. But no longer.
My general point remains on the usefulness of the generated links, even given your good examples just above – some of which might be morally objectionable. How either the advertiser, or the consumer might benefit from those placements is just beyond me. Anyway, thank you for taking my query as an honest one.
There are obviously some manual entries in the algorithm, as I just found out.
Try searching for “anagram”.
DB@41 Let the kids cope with advertising I say: that is one area where knowing cynicism is a survival habit. Like not living an excessively germ-free existence.
L@44 – I was after where my less lefty friends saw issues, particularly along the rough progressions I tried to give. As to kids and ads, the only message is one of overconsumption… No competing memes… The open source computing producers don’t have the cash Microsoft does. It’s no different between the ads for woo and the messages to kids of my day “atheism=evil” versus the lack of anti-woo and rationalist message.