
AOL Just Did the Unthinkable – Boycott AOL?

(Updated)
Thank you, Google, for resisting the DOJ’s effort to obtain user search data. You put up a good fight to protect our privacy, and you won. Too bad it was all in vain.

AOL, in blatant violation of its users’ privacy, just released a log of three months’ worth of searches by 650,000 users. Not to the DOJ, but for open download by anyone. The claim:

“This collection is distributed for non-commercial research use only. Any application of this collection for commercial purposes is STRICTLY PROHIBITED”

Prohibited. Yeah, right. As if they could control it. The data is supposedly “anonymized”, which in AOL-speak means the screen name is replaced by a unique user number. Anyone a little bit familiar with data mining knows what this means, and obviously some commenters on the AOL blog have already put two and two together, “outing” certain users whose identity was easy to find based on their search patterns. I don’t even want to think about what data mining pros will do with this.
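
To make the point concrete, here is a minimal Python sketch of why swapping a screen name for a unique number protects so little. It assumes the release works the way described above (one stable number per user); the function names and the sample rows are made up for illustration and are not taken from the AOL file.

```python
from collections import defaultdict
from itertools import count


def pseudonymize(log):
    """Replace each screen name with a stable numeric ID (one ID per user)."""
    ids = {}
    next_id = count(1)
    released = []
    for screen_name, query in log:
        if screen_name not in ids:
            ids[screen_name] = next(next_id)
        released.append((ids[screen_name], query))
    return released


def profiles(released):
    """Group released rows by user ID, rebuilding each user's query history."""
    by_user = defaultdict(list)
    for user_id, query in released:
        by_user[user_id].append(query)
    return dict(by_user)


# Hypothetical rows, purely illustrative.
raw_log = [
    ("alice_screenname", "jane example springfield"),
    ("bob_screenname", "cheap flights"),
    ("alice_screenname", "dentists near elm street springfield"),
    ("alice_screenname", "embarrassing medical symptom"),
]

released = pseudonymize(raw_log)
user_profiles = profiles(released)
for user_id, queries in user_profiles.items():
    print(user_id, queries)
```

The screen names are gone from the released rows, but every query by the same person still carries the same number, so one ego search or street address is enough to attach a name to the whole history.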

AOL, you betrayed your users. If they have any sense, they will boycott your services.

Update #1 (8/6): I’m going out on a limb here with this prediction: as they realize the magnitude of what they did (or, if they don’t, because of the PR nightmare), AOL will apologize, the finger-pointing will start, and heads will roll. They will remove the download link, though not before anyone who wanted the data has obtained it.

Update #2 (8/6): TechCrunch further elaborates on the “utter stupidity” of this move by AOL:

“The data includes personal names, addresses, social security numbers and everything else someone might type into a search box. The most serious problem is the fact that many people often search on their own name, or those of their friends and family, to see what information is available about them on the net. Combine these ego searches with porn queries and you have a serious embarrassment. Combine them with “buy ecstasy” and you have evidence of a crime. Combine it with an address, social security number, etc., and you have an identity theft waiting to happen. The possibilities are endless.”

Update #3 (8/6): The download link leads to a blank page. Perhaps AOL execs are waking up… I wish all my predictions (see the first update above) would materialize this fast. I wonder if there will be a black market for the “limited edition” downloaded dataset… eBay, anyone?

Update #4 (8/6): Dennis ponders possible ramifications, partly based on our Skype IM:

  1. Zoli estimates maybe 1,500-2,000 downloads by the time AOL woke up to what they’d done. What’s the real number?
  2. How long was the file in the wild?
  3. Could illicit copies end up on eBay?
  4. Could market data derived from the file end up on eBay or as part of a market intelligence offering? Almost certainly the second if not the first.

  5. What will be the impact on AOL’s stock price?
  6. Might shorters speculate on the impact?
  7. What about a class action lawsuit? For once I think there are decent grounds for one of the ambulance chasers to send out its hit squad – they may even get what they need from the file.

  8. Will AOL be able to track who got the file?
  9. What is the potential for wholesale identity theft among those 650,000 AOL users?

Update #5 (8/6): The last thing I expected was to find myself deleting comments; but this situation forced me to. A commenter provided a link to his site where he put up the file for anyone to download. I know the cat is out of the bag, and there will be several other sites, but at least I don’t want to actively promote making a bad situation even worse. Since I can’t edit comments, my only choice was to delete it.

Update #6 (8/7): ZDNet agrees: “People will be boycotting the company because of their blatent disregard for the privacy of users.”
The news is out on InfoWorld, as well as in mainstream news media all the way to Korea.

Update #7 (8/7): AOL responded by email to John Battelle, also quoted at SiliconBeat. The summary: Man, did we screw up.


Comments

  1. Anonymous says:

    “Blatant violence”? Did you mean “Blatant violation”? :D

  2. Anonymous says:

    Oh, thanks for catching it, corrected :-)

  3. Anonymous says:

    Reality check:

    1. anonymized search logs are perfectly legal to distribute, and in compliance with all rules

    2. anonymized Query logs are widely used in research, only that people have to specifically license it from google / yahoo etc, by working for them / with them.

    3. Query logs are an invaluable commodity for the research community. Without them all web research is like trying to design something without having any requirements to cater to (tailoring clothes without measurements!).

    4. Big players like Google and Yahoo DO NOT share this with the research community, using it internally to maintain their monopoly.

    Regarding comparisons with Google:

    1. technically, the “Google Trends” service is also an exposed view of their search logs

    2. Google Research also released internal data on the same day, for the same reason as AOL Research. Maybe you should produce a blog post worshipping Google about this.

    3. Google not only studies what you search for, it also studies every bit of your email at Gmail, your calendar, and your IM conversations. Anyone who knows how to de-anonymize these search logs can also use adsense to deanonymize your email (using impression statistics).

    I myself am totally against AOL and its spammy habit of shoving their CDs at everyone. However, I’m also against ignorant Google fanaticism, hence this comment. Imagine the advances in search technology that this data will bring. Why should a few kilobytes of data hurt?

  4. Anonymous says:

    Yes, I think about 5,000 AOL heads will roll over this.

  5. Anonymous says:

    5000 + 1 ?

  6. AOL releases private search data

    AOL just released information about 20 million web queries from 650,000 users. They just changed usernames into random strings, but they kept user-data association. Techcrunch makes privacy implications very clear.
    Blogs are buzzing, AOL users are gett…

  7. Aol Releases Googles most prized Keyword List… Google is gonna get mega spammed.

    I’m shocked that AOL released this data,

  8. AOL Releases User Data

    TechCrunch is reporting that AOL has made available the search histories of 650,000 of their users. The user account name is replaced with an ID number, but as Michael Arrington correctly points out, there is often enough information in search

  9. AOL screws the pooch – or at least about 650,000 of their own users

  10. Stunning Privacy Breach by AOL

    While most reports have commented on personally identifiable information in the queries, there’s a greater risk of identification due to ability to link “questionable” queries to requests to government web sites.

  11. Anonymous says:

    The real threat to privacy isn’t as much the personal information as the presence of timestamps. That allows potentially any query, and thus user, to be tracked back to IP address. Especially if government owned sites are involved.

    See details at Stunning Privacy Breach by AOL.

  12. Boycott AOL

    Zoli Erdos: “AOL, in blatant violation of its users privacy just released the log of 3 month’s worth of searches by 650,000 users. Not to the DOJ, but for open download by anyone.”
    Luckily, I’ve never used AOL for search. Almost…

  13. AOL discloses 650,000 AOL users’ search data

    Well this isn’t going to help AOL’s image. Over the weekend, AOL researchers posted a 400MB+ tarball of the raw search query data of some 650K AOL users over the period from March 1, 2006 to May 30, 2006. While…

  14. Anonymous says:

    Apart from any ethical issues, AOL has breached its contract with its users. This disclosure contradicts AOL’s own privacy policy, which names search data as being part of a user’s network information, says that a user’s network information will only be disclosed as described in the privacy policy, and makes no mention of just publishing the data for public research. (There is a mention of using the data for researching use of the AOL network, but that’s not the same as letting the whole world do that).

    See:

    http://about.aol.com/aolnetwork/aol_pp

    Sean (http://www.prompt-communications.com)

  15. Anonymous says:

    1) There will not be a black market for this data on ebay. Predictably, it is already mirrored and torrented.

    2) searching for drugs or words with drug connotations is not a crime. I doubt there is even probable cause for any police force to get a warrant. At this point, I could type “buy ecstasy” and this page might be the top hit.

    3) Google has released 6 DVDs worth of 5-gram search terms. They will not give a hoot about this “large” dataset

    4) AOL still has users?

  16. Ver

  17. AOL: No to the US Department of Justice, yes to everyone else

    In an unprecedented move, AOL decided to release its subscribers’ activity log to the general public. The use of the information,

  18. Anonymous says:

    time to short AOL stock!

  19. Anonymous says:

    What Google did was release information informing everyone of how often everything is searched for – not who searched for it. There is absolutely no user-specific information available from Google’s (soon to be) published data and it should come as no surprise if “porn” tops the list of words searched for. On the other hand, maybe some names are searched for often and it shouldn’t be too much of a surprise to see some in the list, but that doesn’t mean that Steve Jobs or Bill Gates are typing their own names into Google.

    Hey, even with the AOL data there’s still an amount of deniability, but it’s appalling that anyone should be put in the position of having to deny anything. And the idea of a “unique” user ID means that at some point, somewhere AOL probably have a file that says how that corresponds to the user that did the search. If that file ever gets out, then all anonymity has been completely lost.

  20. Anonymous says:

    It might well be that searching for drugs on AOL is not a crime, but after reading this: http://www.twopercentco.com/rants/archives/2006/08/drop_the_sudafe.html , I wouldn’t be so sure.

  21. Don’t Blame Just AOL — The Bloggers are at Fault Too!

    AOL released a large database of searches that includes 20 million web queries from 650,000 AOL users. Even though they changed the AOL username to a random ID number, they did not filter the results in any other manner. Unfortunately, people’…

  22. AOL’s Appallingly Bone-Headed Move

    Something happened over the weekend that I’m at a complete loss to explain. AOL released a list of over 20 million searches by 500,000 users. The online giant apparently did this for “research” purposes, although a key battle was won…

  23. AOL Releases Searchs From 500,000 Users

    Remember the big hubbub of the Government trying to get search data from Google and Microsoft last year? Well, apparently no one at AOL does, they just released search data from 500,000 users, they removed the AOL username, but just changed it to a ran…

  24. Anonymous says:

    It was BAIT, you morons…

  25. Anonymous says:

    AOL will instead scapegoat the people who mirrored the data. They will use their PR team, dupe a government agency into denouncing the “dangerous” linkers, file a lawsuit, and drive the media to villainizing anyone who dared mention *their* bungle as hackers.

    It’s been done before: http://corphq.livejournal.com/60599.html

    Kill the messenger, and all you get is quiet.

    As long as the public falls for this sort of distraction tactic, they will deserve the world of corporate secrecy and cover ups they get.

  26. Anonymous says:

    The information in the database is _not_ anonymized. There are unique ID’s associated with each query, making it very possible to relate the identity of any given person in this database to their search query. The table that relates the anonymous ID’s to the users isn’t out in public – Yet. But I guarantee that information exists somewhere. You better believe it can be subpoenaed if a law enforcement agency that gets their hands on this database decides to go on a fishing expedition. I hope no innocent people on this list were doing research on questionable subjects or else they can say hello to “probable cause”.

  27. Anonymous says:

    Zoli – yawn. I knew it was coming all along because I am a Precog – -)

    read more at

    http://dealarchitect.typepad.com/deal_architect/2006/08/the_intention_e.html

  28. Anonymous says:

    Reporting about it is good. Distributing it to the public is bad–as bad as what AOL did. In fact, it is what AOL did. Yes, the data’s already out there, but that doesn’t mean the users no longer have rights. Personally, I say boycott everyone who intentionally distributes this data. (e.g., Slashdot)

  29. Anonymous says:

    Agree, in fact that’s why for the very first time I had to delete two comments – they were pointing to mirror sites.

  30. Anonymous says:

    hell, AOL won’t be able to stop it. by now, everyone on the internet has seen it, pretty much.

    http://www.aolsearchdatabase.com if you havent

  31. AOL apologizes for privacy leak

    America Online posted a file containing three months of anonymized

    search queries of 658,000 users…

  32. Anonymous says:

    I wonder if you can now search AOL Search and find the search logs online? How ironic would that be? The search engine actually showing you where to find something that they don’t want you to find…

  33. These bastards should be shot for this; no one in their position should make a mistake like this

  34. counter-reality check:

    1. not in my country they aren’t (the Netherlands), unless you told the user you would *and* had business needs in keeping the data in the first place

    2. the govt. asked for them and AOL complied. all products of American govt. research are in the public domain, last time i heard. is this AOL’s not-so-subtle way of making sure the govt. complies with that?

    3. this is not a few kilobytes but a basic violation of trust. AOL deserves to die over this.

  35. Anonymous says:

    AOL users are too dumb to boycott

  36. Anonymous says:

    Can’t call AOL ?????? always push 2 or 4 call this number

    1-866-859-0176 its for aol collection department always

    there no number to push.

  37. Anonymous says:

    wow. That is too bad AOL would do that.

  38. Anonymous says:

    SCAM!

    Your post about using the 1-866-859-0176 number to call AOL is a SCAM.

    Go to this web page:

    http://800notes.com/Area-Code.aspx/1-866

    and find the number in the list, then check the postings by others who almost were scammed.

    If it’s not a scam then, Anonymous poster, please reveal your identity and your location if you dare.

  39. Anonymous says:

    I suppose he won’t … but I still don’t understand. What’s the purpose of the scam?

  40. That’s an absolutely disgusting move from AOL. I would never have believed that they would do such a thing. I guess they want to have the government by their side as they are trying to “take over” and own the internet.

  41. […] visit. If someone gets enough information about you, they can potentially identify you — the AOL search fiasco is a great example of […]

  42. I like this theme you are using… what is it?

  43. The public should kick AOL to the curb with the rest of the trash. No need to put up with AOL’s extreme “Riech Wing” policies. Do they really think they are above the law?
    Boycotting Time Warner might be a good plan as well.


Trackbacks

  1. […] visit. If someone gets enough information about you, they can potentially identify you — the AOL search fiasco is a great example of […]

  2. […] query logs can be shipped to the academics so that they can then contribute.  But who wants the AOL fiasco?  Aha, but can’t we obfuscate, encrypt and then ship?  Andrew showed in a paper at WWW2007 […]
