Google Now Using ReCAPTCHA To Decode Street View Addresses

[Image: reCAPTCHA screenshot]

Have you started seeing images in online reCAPTCHAs that look suspiciously like house numbers pulled from Google Street View? Well, as it turns out, that’s exactly what they are. Google confirmed it’s currently running an experiment that involves using its reCAPTCHA spam-fighting system to improve data in Google Maps by having users identify things like street names and business addresses.

reCAPTCHA, for those unfamiliar, is the system originally developed at Carnegie Mellon University to improve upon the use of CAPTCHAs (aka, the “Completely Automated Public Turing Test To Tell Computers and Humans Apart”) – it’s the distorted text meant to stop bots from signing up for online accounts. The reCAPTCHA technology was acquired by Google in 2009, and if you use the web, you’ve definitely used it before. It’s what puts those security questions on websites that ask you to identify the words and numbers in the pictures displayed to verify you’re human.

The system is designed to cut down on spam and fraud, but it also helps digitize the text in printed materials, like books and newspapers. Google has been using reCAPTCHA to digitize content for Google Books, for example, as well as for the Google News archives.

Over the past few days, however, some users have been seeing another type of reCAPTCHA appear – photographs. The new reCAPTCHAs present an image where one side contains the warped text users are familiar with, while the other side shows a somewhat blurry (as if zoomed in) photo of numbers. The numbers are clearly street addresses, which has led to some speculation that Google was pulling these from Google Street View.

One place where this new reCAPTCHA has been known to pop up is on Google’s AdWords website, and specifically on the page hosting the keyword tool. You won’t always see this new reCAPTCHA, though – I refreshed this page a dozen or more times this morning, for example, and still couldn’t get it to appear. Your mileage may vary.

The above image is one example of what the new reCAPTCHAs look like.

A larger collection of these images also recently appeared on the Blackhatworld forums (below):

According to a Google spokesperson, the system isn’t limited to street addresses, but also involves street names and even traffic signs. We haven’t spotted any of those other types in the wild, though.

Says Google:

We’re currently running an experiment in which characters from Street View images are appearing in CAPTCHAs. We often extract data such as street names and traffic signs from Street View imagery to improve Google Maps with useful information like business addresses and locations. Based on the data and results of these reCaptcha tests, we’ll determine if using imagery might also be an effective way to further refine our tools for fighting machine and bot-related abuse online.

Although many users are just now noticing the new images appear, Google says the experiment actually began a couple of weeks ago.

Image credit: Ian for the top photo; Blackhatworld user “dirtbag” (heh.)


Meet Duolingo, Google’s Next Acquisition Target; Learn A Language, Help The Web

In 2005, then-Carnegie Mellon PhD student Luis von Ahn had an idea for a game. In one of the first examples of true crowdsourcing, the game, called ESP Game, had people look at images and label them to improve image search. Google acquired ESP Game in 2005 and renamed it Google Image Labeler.

In 2007, now-Professor von Ahn had another idea. He realized that all the time people wasted typing in CAPTCHAs could be used for some good: helping to digitize books. Out of Carnegie Mellon he launched the project as reCAPTCHA, a startup. And guess what? In 2009, Google bought it as well.

Now von Ahn is back again.

Duolingo is his latest project. It has been blowing up on Hacker News for the past day, though not too much is known about it. But we got a chance to get a bit more out of von Ahn, and not surprisingly, the idea is very cool.

Von Ahn notes that over the past year and a half, his Carnegie Mellon team has been quietly working on this new idea. It originally arose from a single question: how can you get 100 million people on the web translating everything into different languages for free?

One problem is that there aren’t that many people who are truly bilingual. Another problem is the whole “free” thing.

So along with his PhD student Severin Hacker (yep, that’s his name), von Ahn twisted the idea on its side. Instead of getting people to do something that felt like unpaid work, why not spin it as a learning experience? That’s exactly what Duolingo does.

“The solution was to transform language translation into something that millions of people WANT to do, and that helps with the problem of lack of bilinguals: language education,” von Ahn writes. “It is estimated that there are over 1 billion people learning a foreign language. So, the site that we’ve been working on, Duolingo, will be a 100% free language learning site in which people learn by helping to translate the Web. That is, they learn by doing,” he continues.

Smart.

Von Ahn didn’t want to give too much else away, as they’re still finalizing the service. He notes that it should be ready for a private beta in a few weeks. “We’re now mostly testing the site, and it really works — it teaches users a foreign language very well, and the combined translations that we get in return are as accurate as those from professional language translators,” he says.

He also says that while it’s currently just a project under Carnegie Mellon, he and Hacker may turn it into a company. And if they do, the countdown to Google buying them will officially be on. I give it 6 months until that happens. Tops.

