Computer Vision

Most people know of Larry Roberts because he received the Draper Prize in 2001 for his work in helping to develop the internet. However, in my opinion he did something far greater: his work on establishing the field of computer vision. But first let's establish some context: what is computer vision? Put simply, it is the field that trains computers to recognize, interpret and understand the real world.

Computer vision has developed a great deal within computer science, and it's important to understand why and how the field was invented. For that we ask: “What problem needed to be solved, and how does this solution solve it in a feasible way?”

What is the Problem?

Larry Roberts's problem: how can a computer recognize three-dimensional objects from a two-dimensional image? This involves understanding the shape, size, and orientation of objects in an image. He also wanted edge detection, so that the computer would be able to tell where an image ends and where an object within the image ends. In short, he wanted a computer to be able to recognize not just what a given object is but also where it is in relation to other objects in the image. It's almost as if he knew that car manufacturers in particular would be extremely interested in this technology once it developed into a consumer-grade product.
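
To make the edge detection idea a little more concrete, here is a minimal Python sketch of the Roberts cross operator, a simple gradient-based edge detector named after Roberts. The function name, the toy image, and the choice to store a grayscale image as a list of rows are all assumptions made for illustration; this is a sketch of the general technique, not a reconstruction of Roberts's original program.

```python
import math


def roberts_cross(image):
    """Return edge strengths for a grayscale image given as a list of rows.

    Each output value is the gradient magnitude computed from the two
    diagonal differences inside a 2x2 window, so the output is one row
    and one column smaller than the input.
    """
    height = len(image)
    width = len(image[0])
    edges = [[0.0] * (width - 1) for _ in range(height - 1)]
    for y in range(height - 1):
        for x in range(width - 1):
            # Difference along one diagonal of the 2x2 window.
            gx = image[y][x] - image[y + 1][x + 1]
            # Difference along the other diagonal.
            gy = image[y][x + 1] - image[y + 1][x]
            edges[y][x] = math.hypot(gx, gy)
    return edges


# Hypothetical toy image: dark left half (0), bright right half (9).
toy_image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

edge_map = roberts_cross(toy_image)
for row in edge_map:
    print([round(value, 2) for value in row])
```

In the toy image, only the middle column of the result is non-zero, which is exactly where the dark region meets the bright one. Flat regions produce zeros because neighbouring pixels have the same brightness.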

How will it be solved?

For anyone who has ever been frustrated by Google’s reCAPTCHA, do not feel too bad: you are contributing to the evolution of computer vision. Google uses your answers to its questions to train its vision systems. Notice how many of these questions have become driving related lately (click all images with motorcycles or red lights). That is very intentional, as it is not just Google that wants your data to train its AI’s vision; it's the entire automotive industry plus most other companies. (Is Google Using Us To Train Self Driving Cars? - WingArc Australia)

This works out well for Google and for the companies that pay to use Google’s data, as it is the basis of how autonomous self-driving vehicles work: they start with the basics and work their way up to more advanced recognition.

So what's the history of it?

Like most technologies, the innovations started slowly but accelerated over time, to the point where the field is now getting dramatically better year over year. In the 1970s, computer vision was used in manufacturing to find defects in products, allowing manufacturers to fix them as the vision system found them. In the 1980s, Xerox built the optical mouse, often described as the first smart camera, and the Canny edge detector was published in 1986. In the 1990s, with the rise of Al Gore and the internet, the average American household came to either use or know of the internet, which has since become ubiquitous. This was also the time of the dot-com bubble, when many people invested lots of money in random tech stocks almost simply because they were online tech companies; it was good for your stock price to be a tech company before the bubble burst. In the 2000s, the Viola-Jones framework was introduced, and it was able to detect faces in images. Just a few years later, a Japanese cell phone became the first mobile phone to have facial recognition built right in, about a decade before the feature became commonplace.

In the 2010s, however, computer vision finally hit the exponential stage; arguably more happened in this decade than in all previous decades combined. In this decade it went from a very niche field in which many people saw a lot of potential to a tool that people actually use to reach their goals. Around the middle of the decade it surpassed the average human on benchmark image recognition tasks. Then, in 2016, it was used by the largest media franchise in the world for its first big step onto cell phones rather than dedicated gaming hardware: Pokemon GO, which has made over 5 billion USD since launch. Just a year later, the largest company in the world unveiled a new flagship smartphone that dropped the already proven technology of Touch ID in favor of Face ID, which, like Pokemon GO, is built on the decades of work that have been put into computer vision. Both the iPhone X and Pokemon GO have been massive successes in their own right, proving that computer vision in the 2010s had matured enough to finally start cashing in. In 2023, ChatGPT (the current leader in generative AI for home users) added image understanding: it can look at a picture of a mug that says something like “World’s Best Aunt” and start sharing information (though not always accurate) about what is in the image.

What has it changed about the world and technologies so far?

For a lot of people, the bandwagon driven by FOMO (fear of missing out) is always something to be wary of. While computer vision is most definitely not a fad like the hoverboards of 2015, technologies like AI and NFTs faced a bandwagon-like effect earlier this decade. Despite being built on some great technologies, they reached the average consumer too early, and in my opinion that hurts the technology as a whole: people judge it as poorly made when it is not poorly made, they simply saw it too early in its evolution to appreciate what it is going to become. NFTs became a meme despite being built on the same blockchain technology that powers Bitcoin, Litecoin and Ethereum, because the people who used the technology for dumb things were also its loudest advocates, which made the technology itself look idiotic.

Bringing it back to computer vision, I would say that at this stage it hasn’t fully replaced any technologies yet, but it has certainly been a useful assistant when combined with other technologies. As the name says, it is simply the vision of the computer. That being said, in the following fields it has definitely been extremely useful.

First, OCR (optical character recognition), where text recognition is used to read the words in an image. Some phones, like those made by Apple, even let you copy text out of a photo, which makes note taking and Google Translate a lot easier.

Moving from images to barcode scanning: barcodes only work because a machine can read them. Before barcodes we used punch cards, which may have been fine 100 years ago, but grocery stores need something a bit more user friendly. It's much easier for the cashier to scan a barcode than to use punch cards for everything you are buying at Costco.

As brought up earlier, manufacturers use computer vision to look for defects in their products, as selling poorly made products to the final customer is a surefire way to hurt your company as a whole. This is especially true of automotive manufacturers, who have also been working on self-driving and self-parking cars. Self-driving cars are currently the riskier of the two, which is why most manufacturers still require you to keep your hands on the wheel and remain attentive to the road to prevent accidents; self-parking systems are a lot closer to being fully reliable. Beyond the cars themselves, parking lots with security cameras use computer vision to identify cars by their license plates, even when the average human would never be able to make sense of the blurry images that CCTV cameras produce, which are often worse than what a budget mobile phone camera can capture.

Should a self-driving car kill the baby or the grandma? Depends on where you're from. | MIT Technology Review

Soon we will have to start solving ethical dilemmas with computer vision as assistance. How do we rank the value of life in the scenario above? Should we value lived experience, or the future life yet to be lived? Or should the car do something different and prioritize the lives outside the vehicle, on the argument that the inattentive driver should be the one punished, not the seemingly innocent pedestrians who are simply trying to get to their destination safely?

Can Computer Vision be replaced?

Computer vision will most likely work with AI to get rid of simpler, more repetitive jobs; something like data entry is, ironically, currently in the process of being killed by its own semi-creation. However, can computer vision itself be replaced? Just as barcodes replaced punch cards, can something better replace computer vision? Personally, I think not. Computer vision is closer to a concept than to a physical technology, since it is software with many different implementations. As such, it is difficult to find a technology that could replace it, since that technology would basically also have to replace AI, as AI and computer vision go hand in hand. Based on the research I have done, I don’t see computer vision being replaced anytime soon. However, I am not an innovator and I am not trying to be prophetic; I simply think that at this time it's new enough to stick around for now.

How Does Computer Vision Work – InData Labs

The image above shows what I am speaking of. How do you get a computer to show you this without using computer vision? No matter how I think about it, it seems simply impossible; it would be like trying to add two numbers without addition. To me it is almost a logical contradiction: if a new technology sees better than all current computer vision, that doesn’t make it something other than computer vision, it simply makes it a better computer vision.

Bibliography:

Administrator. (2023, September 27). What is Computer Vision? an introduction. University of San Diego Online Degrees. https://onlinedegrees.sandiego.edu/introduction-to-computer-vision/

Ambika, A. (2023, September 18). What is Computer Vision? (history, applications, challenges). Medium. https://medium.com/@ambika199820/what-is-computer-vision-history-applications-challenges-13f5759b48a5

Microsoft Azure. (n.d.). What is Computer Vision? https://azure.microsoft.com/en-us/resources/cloud-computing-dictionary/what-is-computer-vision#object-classification

History of computer vision and its principles. alwaysAI. (n.d.). https://alwaysai.co/blog/history-computer-vision-principles

How artificial intelligence revolutionized computer vision: A brief history. Motion Metrics. (2021, April 16). https://www.motionmetrics.com/how-artificial-intelligence-revolutionized-computer-vision-a-brief-history/

IBM. (n.d.). What is Computer Vision? https://www.ibm.com/topics/computer-vision

Team, T. S. (2023, December 11). What is cv? computer vision solutions explained. Gemmo.AI. https://gemmo.ai/computer-vision-solutions-explained/

TELUS International. (2021, July 15). Computer vision through the ages. https://www.telusinternational.com/insights/ai-data/article/computer-vision-through-the-ages