Reverse image search and (profile) pics: unwanted linking of private and public information (Airbnb as an example)
We are well aware by now that if you post your full name on a public page, that page becomes part of your public persona, aka your Google (ego)search results. You might not want your home address or holiday destinations to be Google-linked to your name. Yesterday I noticed that Airbnb [1] features user profiles that are, by default [2], publicly accessible (you do not have to be logged in to see them) and crawlable (there’s no meta information or robots.txt rule preventing search engines from indexing the profile). Airbnb made the wise decision to obfuscate a user’s full name (only the first character of the last name is shown in the profile), but there’s another identifier that can link your Airbnb profile to your public profile(s): your profile pic.
There have been reverse image search engines around for a while, but most of them had indexed just a tiny fraction of online images. Last June Google made its Google Goggles service also available on the web as “Search by Image” (test with your favourite holiday pic to see how powerful it is).
I did a little test with the Twitter profile pic of Airbnb’s marketing person; check out the video (full screen to make it readable):
It turns out that the search results for a picture link together all the web presences where you used it, just as they do for a full name, including the places where you deliberately avoided using your real name (or where it’s obfuscated, as in Airbnb’s case).
Reverse image search <> Face Recognition
As far as we know, Google Images search results are based on general image similarity, not on face recognition algorithms (that would take into account specific metrics such as relative eye-mouth-nose-cheek distances or hair/skin colour and texture). Large-scale publicly available reverse face search engines would have far more unsettling consequences – check out the research published last week by Alessandro Acquisti, Ralph Gross and Fred Stutzman (who actually used technology of a company taken over by Google).
You might think the basketball in the example is the defining feature, yet I performed the test with pics of several other Airbnb users (male, female, different picture compositions) and got similar results: pics linking the Airbnb profile with Twitter, Flickr, weblogs etc. [3]
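To make "general image similarity" a bit more concrete, here is a minimal sketch of one well-known similarity technique, the average hash. This is an illustration only: nothing here is known to be what Google actually uses, the function names are made up for this example, and the tiny 4x4 grayscale grids stand in for real photos that would first be downscaled. The point is that the comparison looks at the overall light/dark layout of the picture, not at facial features.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if brighter than the image's mean, else 0.

    `pixels` is a small grid of grayscale values (0-255), assumed to be an
    already-downscaled version of the picture.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equally long hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Hypothetical data: a profile pic, a brightened re-upload, an unrelated pic.
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
brighter_copy = [[min(255, value + 30) for value in row] for row in original]
unrelated = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

distance_to_copy = hamming_distance(average_hash(original), average_hash(brighter_copy))
distance_to_unrelated = hamming_distance(average_hash(original), average_hash(unrelated))
```

In this toy data the brightened copy keeps the same hash as the original, while the unrelated picture differs in half of its bits. A face recognition system would instead compare measurements of the face itself, which is why it could also match two entirely different photos of the same person.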
Lessons for Web companies
Think about image indexability when it comes to your users’ privacy. You can prevent images from being indexed with User-agent and/or folder exclusions in your robots.txt. Airbnb prides itself on its privacy features, so it would make sense to exclude profile pics from search, just as it obfuscates your last name (you will still be able to find full names in the user reviews written by unwitting guests or hosts, though). Some 100 000 profile pics seem to be indexed right now.
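As a rough sketch of what such exclusions could look like, the snippet below embeds an example robots.txt and checks it with Python's standard `urllib.robotparser`. The folder `/users/pictures/` and the example.com URLs are assumptions made up for this illustration, not Airbnb's real paths. `Googlebot-Image` is the user-agent token of Google's image crawler.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt (the folder name is hypothetical):
# - keep Google's image crawler out of the whole site
# - keep every other crawler out of the profile picture folder
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow: /users/pictures/
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical URLs used only to exercise the rules above.
picture_url = "https://example.com/users/pictures/12345.jpg"
listing_url = "https://example.com/rooms/678"

image_bot_sees_listing = parser.can_fetch("Googlebot-Image", listing_url)
other_bot_sees_picture = parser.can_fetch("SomeOtherBot", picture_url)
other_bot_sees_listing = parser.can_fetch("SomeOtherBot", listing_url)
```

Keep in mind that robots.txt is a request to well-behaved crawlers, not an access control: the pictures stay publicly reachable for anyone who has the URL.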
In general, make indexability a deliberate choice for your user (not some option tucked away in the privacy settings) [4]. A consensus document by the EU Data Protection Authorities actually requires you to do so (the document itself is not law, but in case of litigation or a conflict with a local DPA, a judge will likely give its reasoning considerable weight).
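One way to read "deliberate choice" is that a profile stays out of search engines until the user explicitly says otherwise. A minimal sketch of that default, assuming a hypothetical per-user setting `user_allows_indexing`: the helper returns both a robots meta tag for the profile page and an `X-Robots-Tag` response header, which can also be sent with image files since those cannot carry a meta tag. The function name and the returned structure are invented for this example.

```python
def indexing_directives(user_allows_indexing=False):
    """Return robots directives for a profile, defaulting to noindex.

    `user_allows_indexing` is a hypothetical per-user setting; only an
    explicit opt-in lifts the noindex directive.
    """
    value = "all" if user_allows_indexing else "noindex"
    return {
        # Goes into the <head> of the profile page.
        "meta_tag": f'<meta name="robots" content="{value}">',
        # Sent as an HTTP response header, e.g. with the profile picture.
        "http_header": ("X-Robots-Tag", value),
    }


default_directives = indexing_directives()
opted_in_directives = indexing_directives(user_allows_indexing=True)
```

The page itself can remain publicly accessible without login in both cases; the directives only tell search engines whether they may list it.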
Lessons for you as a user
Images are part of your persona. Use a different set of images for different contexts, just as you might already limit your real name to your public (professional) persona and use pseudonyms for hobby, dating, activism, night life or family-related stuff.
(As an extra: use a separate email address for each of these contexts as well, as many services use your email address to connect your online activities in the backend – Rapleaf is probably the most notorious – or make your profile discoverable via your email address.)
[1] An online service with social networking features where you can find and offer short-term accommodation – I am a happy user BTW, if you want you can sign up with my referral ID ;-)
[2] As I am writing this I notice that you can opt out in the privacy settings – I didn’t notice this before and I assume it was part of last night’s upgrade.
[3] Obviously I am not going to publicize these, as this is probably an unintended and unwanted consequence for them.
[4] Indexability is still distinct from accessibility without login. In other words: even if a page is publicly accessible, the user must have the option to have noindex tags inserted.