Disambiguating Image Queries at Google

Better Understanding Image Queries

Years ago, I wouldn’t have expected a search engine to tell a searcher about objects in a photograph or video, but search engines have been evolving and getting better at what they do.

In February, Google was granted a patent aimed at answering searches that involve identifying objects in photographs and videos. A search engine may have trouble understanding what a person is asking in a natural language query, and this patent focuses on disambiguating image queries.

The patent provides the following example:

For example, a user may ask a question about a photograph that the user is viewing on the computing device, such as “What is this?”

The patent tells us that the process it describes may apply to image queries, text queries, video queries, or any combination of those.

In response to a searcher asking a question about an image, a computing device may:

  • Capture the image that the user is viewing
  • Transcribe the question
  • Transmit that transcription and the image to a server
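
The device-side steps above can be sketched roughly as follows. This is only an illustration, not anything taken from the patent: the ImageQuery class, the build_image_query function, and the JSON field names are all hypothetical, and the sketch assumes the screenshot and the speech transcription have already been produced by something else.

```python
import base64
import json
from dataclasses import dataclass


@dataclass
class ImageQuery:
    """Hypothetical payload: the transcribed question plus the captured image."""
    transcription: str
    image_bytes: bytes

    def to_json(self) -> str:
        # Base64-encode the image so it can travel inside a JSON body.
        return json.dumps({
            "transcription": self.transcription,
            "image": base64.b64encode(self.image_bytes).decode("ascii"),
        })


def build_image_query(question_text: str, screenshot: bytes) -> ImageQuery:
    # Capture and transcription are assumed to have happened already;
    # this step only bundles their results for transmission to a server.
    return ImageQuery(transcription=question_text.strip(), image_bytes=screenshot)
```

Calling build_image_query("What is this?", screenshot).to_json() would give a single string that a device could send to a server in one request, which keeps the question and the image it refers to together.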

The server may receive the transcription and the image from the computing device, and:

  • Identify visual and textual content in the image
  • Generate labels for images in the content of the image, such as locations, entities, names, types …
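
To make the labeling step a little more concrete, here is a minimal sketch of how visual and textual findings might be merged into one list of labels. Everything in it is an assumption for illustration: the Label class, the generate_labels function, and the "kind" values are hypothetical, and the actual recognition of objects and text is assumed to be done elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    kind: str    # hypothetical categories, e.g. "location", "entity", "text"
    value: str
    source: str  # "visual" or "textual"


def generate_labels(visual_items, text_snippets):
    """Merge recognized objects and text found in the image into labels.

    visual_items: list of (kind, value) pairs from some recognizer (assumed).
    text_snippets: list of strings read from the image (assumed).
    """
    labels = []
    seen = set()
    for kind, value in visual_items:
        key = (kind, value.lower())
        if key not in seen:  # skip case-insensitive duplicates
            seen.add(key)
            labels.append(Label(kind, value, "visual"))
    for snippet in text_snippets:
        cleaned = snippet.strip()
        key = ("text", cleaned.lower())
        if cleaned and key not in seen:  # skip empty strings and duplicates
            seen.add(key)
            labels.append(Label("text", cleaned, "textual"))
    return labels
```

Keeping the source of each label alongside its value is one possible design choice; it would let a later step weigh what was seen in the image differently from what was read in it.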

