Artificial Intelligence, Machine Learning, and Deep Learning⁠

What Exactly is "Artificial Intelligence"?

If you use an automated assistant, make a simple Google search, get recommendations on Netflix or Amazon, or find a great deal in your inbox, then you will have interacted with AI (Artificial Intelligence). Indeed, it seems that every company and service today is incorporating AI in some way or another. But let’s dissect what the phrase ‘Artificial Intelligence’ means.

Most people would agree that AI is not so advanced that these companies would have Rosie from The Jetsons analyzing customer data or Skynet making product recommendations on their store page. And on the other end, at some level it is commonly understood that AI is more complex than simple business rules and nested ‘if this, then that’ logical statements.

Things start to get murky when other phrases, often conflated with AI, are added to the mix. Amongst these terms are Machine Learning (ML) and Deep Learning (DL). One company might say they use ML in their analytics, while another might claim to use DL to help enhance creativity. Which one is better or more powerful? Are either of these actually AI? Indeed, a single company may even use these words interchangeably, or use the overlap of definitions to their marketing advantage. Still others may be considering replacing an entire analytics department with DL specialists to take advantage of this new ‘AI Revolution’.

Don’t get swept up by the hype; let’s shine a light on what these terms really mean.

Teasing out the Differences between AI, ML and DL

These three terms⁠—Artificial Intelligence, Machine Learning, and Deep Learning—are critical to understand on their own, but also how they relate to each other; from a sales team explaining the services they provide, to the data scientists who must decide which of these model types to use. And while it is true that each of AI, ML, and DL have their own definitions, data requirements, level of complexity, transparency, and limitations—what that definition is and how each relate is entirely dependent on the context at which you look at them.

For example, what constitutes Machine Learning from a data acquisition perspective might look an awful lot like Deep Learning in that both require massive amounts of labeled data, while neither look at all similar in the context of the types of problems each can solve or even in the context that examines the skill sets that are required to get a specific model up and running.

For the purposes of this thought piece, the context we will be using will be the case of complexity—how the ability of each of Artificial Intelligence, Machine Learning, and Deep Learning simulate human intelligence and how they incrementally build on each other. This simulation of human intelligence, called simply machine intelligence, is measured by the machine’s ability to predict, classify, learn, plan, reason, and/or perceive.

The interlink between Artificial Intelligence, Machine Learning, and Deep Learning is an important one, and it is built on the context of increasing complexity. Due to the strong hierarchical relation between these terms, the graphic above demonstrates how we at Aunalytics have chosen to best to organize these ideas. Artificial Intelligence is the first of the three terms as historically it originated first, as well as the fact that it is the overarching term that covers all work within the field of machine intelligence. AI, as we use it, can be best described in two ways. The most general case definition of Artificial Intelligence is any technique that enables machines to mimic human intelligence.

Indeed, it may seem that any number of things computers are capable of today could be seen as an AI, although the focus here is not the ability to do math or maintain an operating system⁠—these are not ‘intelligent’ enough. Rather, we are considering such application like game AI, assistive programs like Microsoft’s ‘Clippy’, and expert systems which must predict useful material or actions, classify tasks and use cases, or perceive user and environmental behaviors to drive some action. In short, they display machine intelligence.

The key here is that all of these things perform an activity that we might attribute with human intelligence⁠—moving a bar to follow a perceived ball in the classic video game Pong, classifying that you are writing what looks to be a letter and then provide a useful template, or predict an answer for you based on your current problems. In each scenario, the AI is provided some sort of input and must respond with some form of dynamic response based on that input.

Glossary

Artificial Intelligence (AI): Any technique that enables machines to mimic human intelligence, or any rule-based application that simulates human intelligence.

Machine Learning (ML): A subset of AI that incorporates math and statistics in such a way that allows the application to learn from data.

Deep Learning (DL): A subset of ML that uses neural network to learn from unstructured or unlabeled data.

Feature: A measurable attribute of data, determined to be valuable in the learning process.

Neural Network: A set of algorithms inspired by neural connections in the human brain, consisting of thousands to millions of connected processing nodes.

Classification: Identifying to which category a given data point belongs.

Graphics Processing Units (GPUs): Originally designed for graphics processing and output, GPUs are processing components that are capable of performing many operations at once, in parallel, allowing them to perform the more complicated processing tasks necessary for Deep Learning (which was not possible with traditional CPUs).

Reinforcement Learning: A form of Machine Learning where an agent learns to take actions in a well-defined environment to maximize some notion of cumulative reward.

Sampling: Within the context of AI/ML, sampling refers to the act of selecting or generating data points with the objective of improving a downstream algorithm.

Artificial Intelligence: Machines Simulating Human Intelligence

These kinds of activities are all rule-driven, a distinction that leads to our second, more application based definition of AI: any rule-based application that simulates human intelligence. Rule-based activities possess a very limited ability to learn, opting instead to simply execute a predetermined routine given the same input. The easy Pong AI will always execute the rule provided⁠—to follow the ball – and no matter how long it plays it will only be able to play at an easy level. Clippy will always show up on your screen when it thinks that you are writing a letter, no matter how many letters you write or how annoyed you may get. This outright inability to learn leaves much to be desired to reach the bar of what we would consider human-level intelligence.

Machine Learning: Learning from the Data

This brings us to machine learning. Machine learning is a subset of AI that incorporates math and statistics in such a way that allows the application to learn from data. Machine Learning, then, would be primarily considered a data-driven form of Artificial Intelligence, although rule-driven material can still be applied in concert here where appropriate. Again, the key differentiator is that the algorithms used to build a Machine learning model are not hardcoded to yield any particular output behavior. Rather, Machine Learning models are coded such that they are able to ingest data with labels⁠—e.g. this entry refers to flower A, that entry refers to flower B⁠—and then use statistical methods to find relationships within that data in dimensions higher than would be possible for a human to conceptualize. These discovered relationships are key as they represent the actual ‘learning’ in machine learning. Therefore it is the data, not the code, where the desired intelligence is encoded.

Because of this ability to learn from a set of data, generalized models can be made that do great for certain tasks, instead of needing to hardcode a unique AI for each use-case. Common use cases for Machine Learning models include classification tasks, where a Machine Learning model is asked to separate different examples of data into groups based on some learned features. Examples here are such things like decision trees, which learn and show how best to branch features so that you arrive at a homogenous group (all flower A, or all Churning customer). Another common case for Machine Learning is clustering, where an algorithm is not provided labeled data to train on, but rather is given a massive set of data and asked to find what entries are more alike to one another.

In both of these applications there is not only the opportunity for learning, but continual learning⁠—something that hardcoded, rule-based AI simply cannot do effectively. As more data is collected, there is a growing opportunity to retrain the Machine Learning model and thus yield a more robust form of imitated intelligence. Much of modern business intelligence is built on this style of artificial intelligence given the massive amount of data that businesses now posses.

Limitations of Machine Learning

This is not to say that machine learning is the pinnacle of AI, as there are some severe limitations. The largest limitation to this approach is that we, as humans, must painstakingly craft the datasets used to train machine learning models. While there are many generalized models to choose from, they require labeled data and handcrafted ‘features’—categories of data that are determined to be valuable in the learning process. Many datasets already contain useful data, but in some domains this is much less so. Imagine, for example, you wish to build a machine learning model that can intelligently classify cats from cars. Well, perhaps you pull out the texture of fur and the sheen of cars—but this is a very difficult thing to do, and it is made even harder when one considers that the solution of this model should be general enough to apply to all cats and cars, in any environment or position. Sphynx cats don’t have fur, and some older cars have lost their sheen. Even in simpler, non-image cases, the trouble and time spent constructing these datasets can in some cases cost more than the good they can accomplish.

Crafting these feature-rich, labeled datasets is only one of the limitations. Certain data types, like the case with images we already have described, are simply too dimensionally complex to adequately model with machine learning. Indeed, processing images, audio, and video all suffer from this, a reminder that while these forms of AI are powerful, they are not the ultimate solution to every use case. Indeed, there are other use cases, like natural language processing (NLP) where the goal is to understand unstructured text data as well as a human can, where a machine learning model can be constructed—although it should be acknowledged that there exist more powerful approaches that can more accurately model the contextual relations that exist within spoken language.

Deep Learning: A More Powerful Approach Utilizing Neural Networks

We call this more powerful approach ‘Deep Learning’. Deep Learning is a subset of Machine Learning in that it is data-driven modeling, although Deep Learning also adds the concept of neural networks to the mix. Neural networks sound like science fiction and indeed feature prominently in such work, although the concept of neural networks have been around for quite some time. They were first imagined in the field of psychology in the 1940’s around the hypothesis of neural plasticity, and migrated a time later to the field of computer science in 1948 around Turing’s B-type machines. Research around them stagnated, however, due to conceptual gaps and a lack of powerful hardware.

Modern forms of these networks, having bridged those conceptual and hardware gaps, are able to take on the insane level of dimensionality that data-driven tasks demand by simulating, at a naive level, the network-like structure of neurons within a living brain. Inside these artificial networks are hundreds of small nodes that can take in and process a discrete amount of the total data provided, and then pass its output of that interaction onto another layer of neurons. With each successive layer, the connections of the network begin to more accurately model the inherent variability present in the dataset, and thus are able to deliver huge improvements in areas of study previously thought to be beyond the ability of data modeling. With such amazing ability and such a long history, it is important to reiterate that neural networks, and thus Deep Learning, have only become relevant recently due to the availability of cheap, high volume computational power required and the bridging of conceptual gaps.

When people are talking about AI, it is Deep Learning and its derivatives that are at the heart of the most exciting and smartest products. Deep Learning takes the best from Machine Learning and builds upon it, keeping useful abilities like continual learning and data-based modeling to generalize for hundreds of use cases, while adding support for new use cases like image and video classification, or novel data generation. A huge benefit from this impressive ability to learn high dimensional relationships is that we, as humans, do not need to spend hours painstakingly crafting unique features for a machine to digest. Instead of creating custom scripts to extract the presence of fur of a cat, or a shine on a car, we simply provide the Deep Learning models the images of each class we wish to classify. From there, the artificial neurons begin to process the image and learn for itself the features most important to classify the training data. This alone frees up hundreds if not thousands of hours of development and resource time for complex tasks like image and video classification, and yields significantly more accurate results (than other AI approaches).

One of the more exciting possibilities that Deep Learning brings is the capability to learn the gradients of variability in a given dataset. This provides the unique ability to sample along that newly learned function to pull out a new, never-before-seen datapoint that matches the context of the original dataset. NVidia has done some amazing work that demonstrates this, as seen below, using a type of Deep Learning called Generative Adversarial Networks (GANs) which when provided thousands of images of human faces can then sample against the learned feature distribution and by doing so pull out a new human face, one that does not exist in reality, to a startling level of canniness.

Deep Learning Limitations

Like its complexity-predecesor Machine Learning, Deep Learning has its share of drawbacks. For one, Deep Learning yields results in an opaque way due to its methodology, an attribute known as ‘black box modeling’. In other words, the explainability of the model and why it classifies data as it does is not readily apparent. The same functionality that allows Deep Learning so much control in determining its own features is the same functionality that obscures what the model determines as ‘important’. This means that we cannot say why a general Deep Learning model classifies an image as a cat instead of a car—all we can say is that there must be some statistical commonalities within the training set of cats that differs significantly enough from that of the car dataset—and while that is a lot of words, it unfortunately does not give us a lot of actionable information. We cannot say, for example, that because an individual makes above a certain amount of money that they become more likely to repay a loan. This is where Machine Learning techniques, although more limited in their scope, outshine their more conceptually and computationally complex siblings as ML models can and typically do contain this level of information. Especially as DL models become more depended on in fields like self-driving vehicles, this ability to explain decisions will become critical to garner trust in these Artificial Intelligences.

Another large drawback to Deep Learning is the sheer size of the computational workload that it commands. Because these models simulate, even at only a basic degree, the connections present in a human brain, the volume of calculations to propagate information through that network in a time scale that is feasible requires special hardware to complete. This hardware, in the form of Graphics Processing Units (GPUs), are a huge resource cost for any up-and-coming organization digging into Deep Learning. The power of Deep Learning to learn its own features may offset the initial capital expenditure for the required hardware, but even then it is the technical expertise required to integrate GPUs into any technology stack that is still more often than not the true pain point in the whole acquisition, and can be the straw that breaks the camel’s back. Even with such a large prerequisite, the power and flexibility of Deep Learning for a well-structured problem cannot be denied.

Looking Forward

As the technology continues to grow, so too will the organizing ontology we submit today. One such example will be with the rise of what is known as reinforcement learning, a subset of Deep Learning and AI (specific) that learns not necessarily from data alone, but from a combination of data and some well-defined environment. Such technologies take the best of data-driven and rule-driven modeling to become self-training, enabling cheaper data annotation due to a reduction in initial training data required. With these improvements and discoveries, it quickly becomes difficult to predict too far into the future for what may be mainstream next.

The outline of Artificial Intelligence, Machine Learning, and Deep Learning presented here will remain relevant for some time to come. With a larger volume of data every day, and the velocity of data creation increasing with mass adoption of sensors and the mainstream support of the Internet of Things, data-driven modeling will continue to be a requirement for businesses that wish to remain relevant, and important for consumers to be aware of how all this data is actually being used. All of this in the goal of de-mystifying AI, and pulling back the curtain on these models that have drummed up so much public trepidation. Now that the curtain has been pulled back on the fascinating forms of AI available, we can only hope that the magic of mystery has been replaced with the magic of opportunity. Each of AI, ML, and DL has a place in any organization that has the data and problem statements to chew through it, and in return for the effort, unparalleled opportunity to grow and better tailor themselves for their given customer base.

Special thanks to Tyler Danner for compiling this overview.

Cookie	Duration	Description
__hssrc	session	This cookie is set by Hubspot. According to their documentation, whenever HubSpot changes the session cookie, this cookie is also set to determine if the visitor has restarted their browser. If this cookie does not exist when HubSpot manages cookies, it is considered a new session.
_GRECAPTCHA	5 months 27 days	This cookie is set by Google. In addition to certain standard Google cookies, reCAPTCHA sets a necessary cookie (_GRECAPTCHA) when executed for the purpose of providing its risk analysis.
cookielawinfo-checbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-advertisement	1 year	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Advertisement".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
uncode_privacy[consent_types]	1 year	This cookie is set by Uncode WordPress theme and is used to manage privacy settings on the website.
uncodeAI.css	session	This cookie is set by Uncode WordPress theme to run the Adaptive Images system. According to their documentation, these cookies contain runtime information about the viewport and screen resolution, these data are used on any page refresh to calculate the correct Adaptive Images. No personal information are stored within these cookies.
uncodeAI.images	session	This cookie is set by Uncode WordPress theme to run the Adaptive Images system. According to their documentation, these cookies contain runtime information about the viewport and screen resolution, these data are used on any page refresh to calculate the correct Adaptive Images. No personal information are stored within these cookies.
uncodeAI.screen	session	This cookie is set by Uncode WordPress theme to run the Adaptive Images system. According to their documentation, these cookies contain runtime information about the viewport and screen resolution, these data are used on any page refresh to calculate the correct Adaptive Images. No personal information are stored within these cookies.
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Cookie	Duration	Description
__hssc	30 minutes	This cookie is set by HubSpot. The purpose of the cookie is to keep track of sessions. This is used to determine if HubSpot should increment the session number and timestamps in the __hstc cookie. It contains the domain, viewCount (increments each pageView in a session), and session start timestamp.
aka_debug		This cookie is set by the provider Vimeo.This cookie is essential for the website to play video functionality. The cookie collects statistical information like how many times the video is displayed and what settings are used for playback.
bcookie	2 years	This cookie is set by linkedIn. The purpose of the cookie is to enable LinkedIn functionalities on the page.
lang		This cookie is used to store the language preferences of a user to serve up content in that stored language the next time user visit the website.
lidc	1 day	This cookie is set by LinkedIn and used for routing.
player	1 year	This cookie is used by Vimeo. This cookie is used to save the user's preferences when playing embedded videos from Vimeo.

Cookie	Duration	Description
__hstc	1 year 24 days	This cookie is set by Hubspot and is used for tracking visitors. It contains the domain, utk, initial timestamp (first visit), last timestamp (last visit), current timestamp (this visit), and session number (increments for each subsequent session).
_ga	2 years	This cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, campaign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assign a randomly generated number to identify unique visitors.
_gat_gtag_UA_28928707_1	1 minute	This cookie is set by Google and is used to distinguish users.
_gcl_au	3 months	This cookie is used by Google Analytics to understand user interaction with the website.
_gid	1 day	This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected including the number visitors, the source where they have come from, and the pages visted in an anonymous form.
hubspotutk	1 year 24 days	This cookie is used by HubSpot to keep track of the visitors to the website. This cookie is passed to Hubspot on form submission and used when deduplicating contacts.
vuid	2 years	This domain of this cookie is owned by Vimeo. This cookie is used by vimeo to collect tracking information. It sets a unique ID to embed videos to the website.

Cookie	Duration	Description
bscookie	2 years	This cookie is a browser ID cookie set by Linked share Buttons and ad tags.
IDE	1 year 24 days	Used by Google DoubleClick and stores information about how the user uses the website and any other advertisement before visiting the website. This is used to present users with ads that are relevant to them according to the user profile.
test_cookie	15 minutes	This cookie is set by doubleclick.net. The purpose of the cookie is to determine if the user's browser supports cookies.
VISITOR_INFO1_LIVE	5 months 27 days	This cookie is set by Youtube. Used to track the information of the embedded YouTube videos on a website.

Cookie	Duration	Description
_dc_gtm_UA-28928707-1	1 minute	No description
AnalyticsSyncHistory	1 month	No description
CONSENT	16 years 7 months	No description
li_gc	2 years	No description
UserMatchHistory	1 month	Linkedin - Used to track visitors on multiple websites, in order to present relevant advertisement based on the visitor's preferences.