What is AI … to a Jury

Sonali Son
4 min read · Aug 8, 2024


AI modernization assumes many things: a robust underlying dataset, AI experts for model fine-tuning, engineering experts to design infrastructure at scale, and ample resources to spend on training. One of the overlooked assumptions is that we understand “AI” to begin with. While Google, OpenAI, and Meta race to create compelling LLMs, the ability to leverage the technology to improve productivity hinges on whether the executives who can most benefit from AI improvements understand what it is. Meanwhile, lawsuits are being filed at a record pace.

In everyday speech, when folks talk about AI, they are really talking about the new range of models that produced ChatGPT. However, there is still a whole range of “Traditional AI”: personalization, recommendation engines, and image-recognition deep learning models, which still powers much of the computer-driven decision making that humans would otherwise do.

So, what is AI? This post is intended for a non-tech audience hoping for a fun, generalized, high-level view of current popular models such as ChatGPT, set in the larger and broader context of Artificial Intelligence. By the end of the blog, the reader should be able to understand the key elements of what creates an AI product, and how to evaluate changes to a model over time.

Below is the NIST definition of AI:

“(1) A branch of computer science devoted to developing data processing systems that perform functions normally associated with human intelligence, such as reasoning, learning, and self-improvement.
(2) The capability of a device to perform functions that are normally associated with human intelligence such as reasoning, learning, and self-improvement.”

My favorite definition of AI, from Mirella Lapata, is far more straightforward:

“We get a computer program to do the work that a human would otherwise do.”

AI is not new. I have worked on machine learning systems for over 15 years, developing lifetime value, recommendation, and personalization models to essentially replace human curation. A lot of that work was done on a local machine and, eventually, over time, on a distributed system of GPUs in the cloud. Not tens of thousands of GPUs, and certainly not at the scale of infrastructure and cost powering AI models today. So, what’s different today? Is all the fuss really over a clever chatbot?

It is useful to look at how AI has evolved over the last 30 years, both to orient us for the evolution of the coming decade and to see why ChatGPT prompted the surge in AI investments.

The earliest examples include algorithmic trading, marketing choices, translation services (Google Translate launched in 2006), and search engines (Google's PageRank was invented in 1996). Though algorithmic trading and marketing came with longer latency than we see today, we have been using machines and computer programs to make buy/trade/sell decisions and marketing choices since the 90s. Standard regression techniques relied on existing inputs and outputs to create predictions for new instances of those same kinds of inputs. With translation, we take a sentence, refer to a dictionary for a word-by-word translation, and on top of that add language-based rules for how to construct a sentence in the target language. In the earliest search engines, the innovation was gathering data on engagement and associating words with an index of relevant sites based on that engagement. Both concepts are similar in that we have a fixed mapping of an input (words) to an output (translated words, websites).
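To make that "dictionary plus rules" idea concrete, here is a toy sketch in Python. The lexicon, the example sentence, and the single reordering rule are all made up for illustration; real translation systems of that era used vastly larger dictionaries and rule sets, but the shape is the same: a fixed mapping from input words to output words, plus grammar rules layered on top.

```python
# Toy word-by-word translation with one grammar rule, in the spirit of
# early rule-based systems. All vocabulary here is illustrative.
LEXICON = {"the": "le", "red": "rouge", "cat": "chat"}

def translate(sentence):
    # Step 1: dictionary lookup, word by word (unknown words pass through).
    words = [LEXICON.get(w, w) for w in sentence.lower().split()]
    # Step 2: a language-based rule. In our pretend target language,
    # adjectives follow the noun, so swap the last two words of a
    # three-word "article adjective noun" phrase.
    if len(words) == 3:
        words = [words[0], words[2], words[1]]
    return " ".join(words)

print(translate("the red cat"))  # le chat rouge
```

Notice that the system can only ever produce outputs that were already in its mapping; give it a word outside the lexicon and it has nothing new to say.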

When the trading model, marketing model, search engine, or translation service is given something entirely new, it works by extrapolating from the relationships observed in the data it was trained on to approximate an output. The classic straight line through data that many of us have seen in college statistics still very much applies.
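That "straight line through data" can be shown in a few lines of Python. The data points below are invented for illustration; the point is only that once the line is fit, a prediction for an unseen input is pure extrapolation along that line.

```python
# Fit y = a*x + b by ordinary least squares, then extrapolate.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.0, 6.2, 7.9]  # roughly y = 2x, with a little noise

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope: covariance of x and y divided by variance of x.
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
# Intercept: the line must pass through the mean point.
b = mean_y - a * mean_x

# The model has never seen x = 10; it just extends the observed trend.
prediction = a * 10 + b
print(prediction)
```

This is the heart of "traditional" predictive modeling: the output is always constrained to the relationship the model learned from its training data.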

The models used to create the mapping or index rely on training with words or data they have already seen. In addition, we need ample data to train the model on; only then can we create an output.

Then what is Generative AI?

By the early 2000s (10 years after Google search), we were able to start working with inputs that the models had not seen. For example, in 2015, style transfer was developed: a system that used neural networks to separate and recombine the content and style of any image to create new artistic images. By 2020, Instagram, TikTok, and Snap lenses allowed users to take photos with their camera and then “translate” them into paintings, drawings, and other fun images. Voice-recognition AI such as Alexa or Siri is able to recognize an entirely new voice and set of words and provide an answer (most of the time). In these cases, the input (image, audio) is entirely new, and we use the model itself to generate another entirely new image or audio clip. These are all examples of generative AI.

With generative AI, we are creating new and original content. The model can create a new image, a new piece of code, new text, new sound, new video, etc.; the model generates completely new output. LLMs (Large Language Models) are a type of generative AI which can communicate using human language. Unlike previous models, where we take new data and constrain it to the relationships we have imposed, with genAI the model is not constrained by the data it trained on…but we still do need data to train the models (more on that below).

References

What is Generative AI and how does it work?, The Turing Lectures, Mirella Lapata

What Is ChatGPT Doing … and Why Does It Work?, Stephen Wolfram

Embeddings: What they are and why they matter, Simon Willison’s Weblog, including his PyBay talk


Sonali Son

Founder & CEO Pallas Analytics: Responsible AI by Design.