In the next five years, computers will begin to mimic and augment the senses. Over the course of this week, IBM outlines five upcoming advances that will change our world. In Part 1, we explore the future of Sight.
With the explosion in the number of images and videos on the Internet and the steady accumulation of huge volumes of visual data from digital cameras and mobile devices, we are logging our own history. The number of digital cameras and camera phones in the world has surpassed 1 billion, and 500 billion photos are taken each year.
And we’re not just taking pictures. Pictures are being taken all around us, whether it’s diagnostic medical images such as an ultrasound, traffic cameras on our drive to work or video safety and security systems for our homes and buildings.
Scientists believe that within the next five years, systems will not only be able to look at and recognize the contents of this visual data, but will also turn the pixels into meaning, beginning to make sense of them much as a human views and interprets an image.
The growth in Big Data has resulted in a flood of information that today’s systems struggle to understand. In fact, 80% of existing data is unstructured, and 70-90% of this unstructured data is in a multimedia format of some kind.
Today, we use systems like Watson, with its deep analysis capabilities, to navigate the complexities of human language and analyze massive amounts of unstructured data exceptionally effectively. But this analysis is limited to information in the form of text and numbers. In this new era of computing, we will be able to use the computational form of sight to figure out the who, what, when and where behind a picture.
Advances in computing will help bridge the semantic gap. By recognizing patterns and being taught what to look for, systems will better understand visual content, making the analysis of multimedia possible. These systems will use “brain-like” capabilities to extract information from visual media, bring more meaning and context to this data and uncover new insights. They will require much more sophisticated analytics and enormous computational resources, fueling a need for new kinds of software and hardware.
Within five years, these capabilities will be put to work in healthcare by making sense of massive volumes of medical information such as MRIs, CT scans, x-rays and ultrasound images, which capture any number of views that include anatomy, organs, pathologies, etc. Oftentimes what’s critical in these images is very subtle and requires careful measurement. By being trained to discriminate what to look for in these types of images and correlating them with other information, such as patient records and scientific literature, systems that can ‘see’ will help doctors detect health conditions earlier, making better outcomes for patients possible.
Beyond healthcare, other industries such as retail, energy and utilities, and news and entertainment will benefit from these advances. In the case of retail, this explosion of visual data presents new opportunities for consumers and marketers. In the future, we’ll have the ability to make better recommendations and suggestions for things consumers may want to buy, based on their interests and trends as gleaned from images shared on social networks such as Pinterest and Fancy. Utilities could leverage this technology to better manage their electricity networks and, in the case of a natural disaster, analyze the multitude of streaming video generated by cameras on board unmanned aerial vehicles to assess where the worst-hit areas are and to suggest where to prioritize restoration.