Reverse image search and (profile) pics: unwanted linking of private and public information (Airbnb as an example)

Airbnb profile of Christopher Lukezic, marketeer at Airbnb

We are well aware by now that if you post your full name on a public page, that page becomes part of your public persona, aka your Google (ego)search results. You might not want your home address or holiday destinations to be Google-linked to your name. Yesterday I noticed that Airbnb [1] features user profiles that are publicly accessible (you do not have to be logged in to see them) and crawlable (there’s no meta information or robots.txt preventing search engines from indexing the profile) by default [2]. Airbnb took the wise decision to obfuscate a user’s full name (only the first character of the last name is shown in the profile), but there’s another identifier that can link your Airbnb profile to your public profile(s): your profile pic.

Images as linking identifiers

There have been reverse image search engines around for a while, but most of them had indexed just a tiny fraction of online images. Last June Google made its Google Goggles service also available on the web as “Search by Image” (test with your favourite holiday pic to see how powerful it is).

I did a little test with the Twitter profile pic of Airbnb’s marketing person; check out the video (full screen to make it readable):

Turns out that search results for a picture link together all your web presences where you used it, just as they do for a full name, including the places where you actually avoided using your real name (or where it’s obfuscated, as in Airbnb’s case).

Reverse image search results for Christopher Lukezic's Twitter avatar.

Reverse image search <> Face Recognition

As far as we know, Google Images search results are based on general image similarity, not on face recognition algorithms (that would take into account specific metrics such as relative eye-mouth-nose-cheek distances or hair/skin colour and texture).  Large-scale publicly available reverse face search engines would have far more unsettling consequences – check out the research published last week by Alessandro Acquisti, Ralph Gross and Fred Stutzman (who actually used technology of a company taken over by Google).

You might think the basketball in the example is the defining feature, yet I performed the test with pics of several other Airbnb users (male, female, different picture composition) and got similar results: pics linking the Airbnb profile with Twitter, Flickr, weblogs etc. [3]
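
To give a feel for what "general image similarity" can mean, as opposed to face recognition, here is a minimal sketch of an average hash, one simple perceptual-hashing technique. This is purely illustrative and is not a description of how Google's Search by Image works (that is not public). The tiny 2x2 grayscale "images" and all function names are made up for the example.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the image mean.

    `pixels` is a list of rows of grayscale values (0..255). Real tools would
    first shrink the picture to a small fixed grid; that step is skipped here.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions where two hashes differ (0 means same fingerprint)."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Hypothetical toy images, not real data.
original = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]    # same picture, uniformly brightened
different = [[200, 10], [30, 220]]   # a different picture

same_picture_distance = hamming_distance(average_hash(original), average_hash(brighter))
other_picture_distance = hamming_distance(average_hash(original), average_hash(different))
```

The brightened copy keeps the same fingerprint while the unrelated picture does not. Nothing in this comparison knows anything about faces; it only compares overall light and dark patterns.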

Lessons for Web companies

Airbnb about privacy

Think about image indexability when it comes to your users’ privacy. You can prevent images from being indexed with User-agent and/or folder exclusions in your robots.txt. Airbnb prides itself on its privacy features, so it would make sense to exclude profile pics from search just like it obfuscates your last name (you will still be able to find full names in the user reviews by unwitting guests or hosts, though). Some 100 000 profile pics seem to be indexed right now.
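
As a rough illustration of what such exclusions could look like, the sketch below embeds a hypothetical robots.txt and checks it with Python’s standard urllib.robotparser module. The folder name /profile_pics/ and the example paths are assumptions for illustration, not Airbnb’s actual URL layout. Googlebot-Image is the user agent token Google documents for its image crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one User-agent exclusion and one folder exclusion.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow: /profile_pics/
"""


def can_fetch(user_agent: str, path: str) -> bool:
    """Return True if the rules above allow `user_agent` to crawl `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


# The image crawler is excluded from the whole site.
image_bot_on_profile = can_fetch("Googlebot-Image", "/users/show/123")
# Every other crawler is excluded from the (assumed) profile picture folder only.
web_bot_on_picture = can_fetch("Googlebot", "/profile_pics/abc.jpg")
web_bot_on_profile = can_fetch("Googlebot", "/users/show/123")
```

Keep in mind that robots.txt is only a request to well-behaved crawlers. It is not access control, and the images remain publicly reachable.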

In general, make indexability a deliberate choice for your user (not some option tucked away in the privacy settings) [4]. A consensus document by the EU Data Protection Authorities actually requires you to do so (the document in itself is not law, but in case of litigation or a conflict with a local DPA, a judge will likely give its argumentation heavy weight).
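
One way such a deliberate choice could be wired up, sketched under assumptions: the profile page emits a robots meta tag that depends on an explicit per-user setting, with "not indexed" as the default. The Profile class and its allow_indexing field are hypothetical; noindex and noimageindex are robots meta directives documented by Google.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    username: str
    # Hypothetical setting: indexing is off unless the user explicitly opts in.
    allow_indexing: bool = False


def robots_meta_tag(profile: Profile) -> str:
    """Build the robots meta tag for a public profile page."""
    if profile.allow_indexing:
        content = "index, follow"
    else:
        # Keep the page and the images embedded in it out of search results.
        content = "noindex, noimageindex"
    return f'<meta name="robots" content="{content}">'


default_tag = robots_meta_tag(Profile("chris"))
opted_in_tag = robots_meta_tag(Profile("pascal", allow_indexing=True))
```

With this default, a page can stay publicly accessible without login and still ask search engines not to index it or its pictures.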

Lessons for you as a user

Images are part of your persona.  Use a different set of images for different contexts. Just like you might already limit your real name to your public persona (professional), and use pseudonyms for hobby, dating, activism, night life or family related stuff.

(As an extra: use a separate email address for each of these contexts as well, as many services use your email address to connect your online activities in the backend, Rapleaf probably being the most notorious, or make your profile discoverable via your email address.)

  1. An online service with social networking features where you can find and offer short-term accommodation. I am a happy user BTW; if you want you can sign up with my referral ID ;-)
  2. As I am writing this I notice that you can opt out in the privacy settings. I didn’t notice this before and I assume it was part of last night’s upgrade.
  3. Obviously I am not going to publicize these, as this probably is an unintended and unwanted consequence for them.
  4. Indexability is still distinct from accessibility without login (in other words: even if a page is publicly accessible, the user must have the option to have noindex tags inserted).

16 Responses to “Reverse image search and (profile) pics: unwanted linking of private and public information (Airbnb as an example)”

  1. gdupont Says:

    Well, I do understand the problem here: the same picture is used in different contexts on the Web, and a reverse search engine makes it possible to link these (supposedly) separate contexts.

    However, to be frank, that’s not a disclosure of private information… private information is “de facto” not on the web. What is on the web is public. Sometimes more difficult to find than others, but it’s on the web so it’s public (or could be public if the webmaster is not wise, or the crawler is aggressive…). So the problem is not on google side (or any image search engine – why reverse by the way ?), it’s on the side of user and their control over what is on the web about them…

  2. Pascal Says:

    @gdupont

    Your quote:

    “private information is “de facto” not on the web. What is on the web is public. Sometimes more difficult to find than others, but it’s on the web so it’s public”

    Seems like a pessimistic vision, no? There’s always a risk of disclosure if security measures fail, but if online services can’t best-effort-guarantee any privacy then a service like Airbnb would simply not have the success it has now.

    Your quote:

    “So the problem is not on google side (or any image search engine – why reverse by the way ?), it’s on the side of user”

    As a user, you have to play within the framework that is offered by the online service. If a service doesn’t give you a possibility to choose your privacy level (or doesn’t make you aware of it) then you do not have control…

  3. dude Says:

    Cool! Or, wait.. are you saying this is a bad thing…? If you don’t want to be found via a profile pic of yourself, then simply don’t upload one?…

  4. Jenna Says:

    This article is stupid. Why would you use a real picture if you’re hiding your real name? You don’t need any software to make that connection.

  5. Anonymous Says:

    @Jenna: I agree that people *should* come to that conclusion, but search-by-image and similar tools don’t seem widely known enough for people to realize that.

  6. Pascal Says:

    @Jenna @Dude

    Have you tried Airbnb? If you want to host someone or be someone’s guest, then you need to win their trust. That’s kindof hard if you’re faceless… using Airbnb without a good face pic doesn’t make any sense.

    And as an Airbnb user, it’s not your intention to hide your name or face from the people you will be staying with. It’s just that the rest of the world shouldn’t be able to learn where you stay and who with by either googling your name or (in case you routinely use the same pic) reverse-searching your profile pic.

  7. Chris Norstrom Says:

    Tineye.com has been doing this for years. I think they were the original little startup that did reverse image searching, and they’re good too. Too bad google didn’t buy them, or maybe it’s their fault for not selling.

  8. Christopher Lukezic Says:

    Hey Pascal,

    Thanks for covering us. I’m flattered you chose me as your example! Just wanted to point out a few corrections:

    1. The YouTube video (1:01) says the reverse image search “shows the Airbnb places where I actually stayed”. This is not correct. The page you pull up is a public RSVP list for an Airbnb Meetup that I publicly RSVP’d to. (link: https://www.airbnb.com/meetups/w2m2mn82j-lerne-die-airbnb-grunder-in-berlin-kennen).

    2. At 1:25 in the video you refer to the Meet up page as “the review pages”. These are two different things.

    3. Reviews are publicly posted content. This is stated at the time of leaving a review. In contrast, we do not show everywhere people stay. We only show where people publicly left a review. This is an important fact for us as we maintain privacy for guests who do not wish to post a review.

    4. We go beyond preventing just images from being indexed. You can remove your entire profile if you like. Inside your Airbnb account, it is easy to remove yourself from public search engines (Google, Bing, etc). This setting lives within Airbnb > Profile > Privacy. http://cl.ly/452M0b0V1c2w0F1p1Z1i

    Please let me know if you have any questions. Thanks again for taking the time to correct these things in the video and blog. It’s important to us that our community doesn’t get confused about privacy.

  9. Pascal Says:

    Hello Christopher,
    Thanks for your points of feedback. I will go through them one by one:

    1. The page I pull up is indeed not a review.  I had been testing several users and several images and had lost track.  The search result I had been planning to show was the reverse search result for your Tumblr avatar, and in these search results 3 place reviews and one host review show up (and 2 additional place reviews after clicking the link “repeat the search with the omitted results included”). 
    2. My apologies for the confusion indeed, I will insert a correcting caption in the Youtube video (by tomorrow).
    3. The wording in the video, “the places where Christopher actually stayed”, is indeed inaccurate. The search results reveal publicly posted (place and host) reviews. We can assume that these correspond to places where the reviewer stayed, but most likely not the full list. However, while the wording was inaccurate, the point I am trying to make remains the same: reverse image search makes information retrievable that, albeit publicly accessible, most users did not expect to be retrievable.
    4. I mentioned the opt-out possibility in a footnote – the wording in the blog post was that the page was crawlable by default. 

    Still a question:
    You mention in point 4: “We go beyond preventing just images from being indexed”.  It is true that if you remove yourself from public search engines, the embedded images will no longer show up in Google image search results (unless those images are embedded elsewhere, but then they still will not refer to Airbnb pages).  But I don’t see any measures preventing images from being indexed by default (whereas you do take effort to not have full names indexed by omitting the last name)?

  10. Christopher Lukezic Says:

    Thanks for the clarification Pascal.

    Your post prompted me to start exploring how this type of image search could be used down the road for cross-matching social profiles (based on the photo) across the web. Think of image matching as a piece of thread that will ultimately create the fabric for an “authentic” reputation-based web identity, which will help collaborative consumption thrive.

    Will keep you posted.

    Cheers,

    C.

  11. Pascal Says:

    As someone who worked in privacy protection before, I would be wary of such a service when it would be about face pics :-)

    But it would be a way to link up discussions about objects (landmarks, clothing, devices…) in a wording/language-independent way. Don’t see however how anyone else but Google/Tineye could do that (and what the business idea would be).

  12. YvesHanoulle Says:

    Hi Pascal,

    isn’t the reverse picture search based on the name of the picture etc.?
    When I search with a lesser-known picture of me (one where I did not change the default name of the picture), I get lots of pictures of people unknown to me.
    When I use my profile picture, I do see my online profiles (strangely enough, first the profiles of people I follow on YouTube).

    yep anything you do can and will be used against you, in court, in love or just plainly in real life

  13. Vergeet privacy, alles wat je online zet is publiek (video) « Is het nu generatie X, Y of Einstein? Says:

    [...] the video via Netlash on this site with even more information. [...]

  14. Pascal Says:

    @Yves:

    If I upload a picture, have a look at the results, rename it, upload it again to images.google.com, the textual results are the same (only some of the “similar images” vary slightly)

    I don’t see any influence of the name of the uploaded file (search for the word “filename” at http://searchengineland.com/up-close-with-google-search-by-image-82313 to find a similar conclusion there).

    It is possible that Google uses the content of the page the picture appears on to fine-tune the results, but that we can only guess.

    (BTW You see profiles of people you follow on youtube because that’s where your profile pic appears among the followers, just like you see conference pages that had your picture http://j.mp/oykxsv )

  15. KenP Says:

    I’m curious, I recently learned of a profile that was created falsely on one of the social websites I won’t name. I was suspicious that the person was not real so I used a combination of TinEye and the Google Reverse Photo Imaging site to uncover the various times these images had been used. However I have been given pictures that I uploaded from the same person and cannot find using these search engines and feel someone else’s photos have been stolen claiming to be someone else entirely. Most of the pictures are consistent with the same two people in the photographs but I guess since they were not used in some advertised form they are not searchable? Is it possible to search “real” photos to verify if they are attached to someone else’s social website?

  16. Janelle Says:

    Hi KenP,

    I have exactly the same concern as you do. I’ve been trying TinEye and Google Reverse Image Search using different photos of the same person to see if it would come up as somebody else’s to no avail. I know that there is an issue here about privacy, but I guess it just depends on how these reverse image searches are used. In this case, I want to know the real person behind the photos since there are doubts on the existence of its current user as a poser. Can anyone help KenP and me regarding this concern? Thanks.