Stony Brook University

Guide to Generative AI

A guide to tools, resources, and issues regarding ChatGPT and other generative AI technologies in academics and research.

Introduction

This guide is meant to provide a general overview of generative AI. It introduces you to basic concepts, effective uses, a sample of possible tools, and additional concerns about using generative AI in research and writing. 

Keep in mind:

  • Follow any rules, directions, or restrictions about AI use that are outlined in your course syllabus. Always discuss any questions you have with your instructor.
  • Be transparent. If you are using AI, document that use according to the current rules of whatever academic citation style you're using.
  • You are the author of your own work and responsible for any assignment you submit for a course.
  • AI and its implementations in relation to library databases and other platforms are constantly changing.
  • You should always review the outputs of AI models to verify their accuracy, ensure they align with your intended purpose, and check for any biases or errors before using or sharing the information.

AI Terms & Concepts

The definitions below are adapted from a combination of helpful AI glossaries and guides.

Agent

A computer program or system that is designed to perceive its environment, make decisions and take actions to achieve a specific goal or set of goals. The agent operates autonomously, meaning it is not directly controlled by a human operator.
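
To make this concrete, here is a minimal sketch (in Python, using an invented thermostat scenario) of the perceive-decide-act loop an agent runs. The environment object and its methods are hypothetical placeholders, not any real library.

    # Hypothetical agent sketch: perceive the environment, decide, then act
    # toward a goal (keeping a room near a target temperature).
    class ThermostatAgent:
        def __init__(self, target_temp=21.0):
            self.target_temp = target_temp              # the agent's goal

        def decide(self, observed_temp):
            # Decision rule: choose an action that moves toward the goal.
            if observed_temp < self.target_temp - 0.5:
                return "heat_on"
            if observed_temp > self.target_temp + 0.5:
                return "heat_off"
            return "do_nothing"

    def run(agent, environment, steps=10):
        # 'environment' is a hypothetical object supplying sensor readings.
        for _ in range(steps):
            observation = environment.read_temperature()  # perceive
            action = agent.decide(observation)            # decide
            environment.apply(action)                     # act, with no human in the loop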

Artificial intelligence

The term goes back many years and likely first became attached, in the 1950s, to computer systems that can perform complex tasks associated with human skills. It can mean different things in different contexts and often needs further clarification when used generally.

Black Box Artificial Intelligence ("black box" algorithms vs. "black box" LLMs; see "AI's mysterious 'black box' problem, explained")

An AI system whose internal workings are a mystery to its users. Users can see the system’s inputs and outputs, but they can’t see what happens within the AI tool to produce those outputs.

Deep Learning

A subset of machine learning that involves neural networks with many layers. It distinguishes itself from other machine learning approaches primarily through its capability to learn useful features automatically from data.

Deepfake

A fake or artificial image or video generated by deep learning techniques.

Fine-tuning (and fine-tuning vs training)

The process of taking a pre-trained AI model and further training it, usually in a supervised way, on a specific and often smaller dataset to adapt it to particular tasks or requirements. Fine-tuning requires much less data and compute power than the original pre-training, and a well fine-tuned model can outperform much larger general-purpose models.
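
As a rough sketch of the idea (using PyTorch and made-up stand-in data, not any particular model), fine-tuning often freezes the pre-trained layers and trains only a small new layer on the task-specific dataset:

    # Illustrative fine-tuning sketch in PyTorch: freeze a "pre-trained" model
    # and train only a new task-specific head on a small labeled dataset.
    import torch
    import torch.nn as nn

    pretrained = nn.Sequential(nn.Linear(128, 64), nn.ReLU())  # stand-in model
    for param in pretrained.parameters():
        param.requires_grad = False        # keep the pre-trained weights fixed

    head = nn.Linear(64, 2)                # new layer for a 2-class task
    model = nn.Sequential(pretrained, head)

    optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    features = torch.randn(32, 128)        # made-up fine-tuning examples
    labels = torch.randint(0, 2, (32,))    # made-up labels

    for _ in range(5):                     # a few passes over the small dataset
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()                    # only the new head receives gradients
        optimizer.step()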

Generative AI

Artificial intelligence systems that can generate new content (text, video, audio, or images) in response to prompts from a user. Generative AI analyzes and discerns patterns in extensive training datasets so it can construct new content. Although generated content might sometimes seem human, the system producing it has no human consciousness.

Generative Pre-trained Transformers (GPT)

A type of large language model (LLM) primarily used for natural language processing tasks. GPT models are based on the transformer architecture, which allows them to efficiently process and generate human-like text by learning from vast amounts of data. The “pre-trained” aspect refers to the initial extensive training these models undergo on large text corpora, allowing them to understand and predict language patterns. This pre-training equips the GPT models with a broad understanding of language, context, and aspects of world knowledge.

Hallucinations

Incorrect or misleading results that AI models generate. These errors can be caused by a variety of factors, including insufficient training data, incorrect assumptions made by the model, or biases in the data used to train the model. The concept of AI hallucinations underscores the need for critical evaluation and verification of AI-generated information, as relying solely on AI outputs without scrutiny could lead to the dissemination of misinformation or flawed analyses.

Natural language processing (NLP)

Emerging from the interdisciplinary field of computational linguistics, NLP enables computers to understand, interpret, and generate human language.

Large language models (LLMs)

Artificial intelligence systems specifically designed to understand, generate, and interact with human language on a large scale. These models are trained on enormous datasets comprising a wide range of text sources, enabling them to grasp the nuances, complexities, and varied contexts of natural language. LLMs like GPT (Generative Pre-trained Transformer) use deep learning techniques, particularly transformer architectures, to process and predict text sequences, making them adept at tasks such as language translation, question-answering, content generation, and sentiment analysis.

Machine learning

Uses data and algorithms to train computers to make classifications, generate predictions, or uncover similarities or trends across large datasets.
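
For example, a few lines of Python with the scikit-learn library show the basic pattern: fit a model on labeled data, then use it to classify examples it has not seen. The built-in iris dataset here is just a convenient illustration.

    # A small machine learning example: train a classifier, then evaluate it
    # on data that was held out of training.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)                      # learn patterns from the data
    print("accuracy:", model.score(X_test, y_test))  # check predictions on unseen data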

Neural networks

A mathematical system that actively learns skills by identifying and analyzing statistical patterns in data. This system features multiple layers of artificial neurons, which are computational models inspired by the neurons in our brain.
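
A toy example (plain NumPy with random placeholder weights, purely illustrative) shows what "layers of artificial neurons" means in practice: each layer multiplies its input by a set of weights, adds biases, and applies a simple nonlinear function.

    # A toy two-layer neural network forward pass. The weights are random
    # placeholders; in a real network they would be learned from data.
    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)   # layer 1: 4 inputs -> 3 neurons
    W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)   # layer 2: 3 neurons -> 1 output

    def relu(x):
        return np.maximum(0, x)          # a common artificial-neuron activation

    x = np.array([0.5, -1.0, 2.0, 0.1])  # one input with 4 features
    hidden = relu(x @ W1 + b1)           # first layer of neurons
    output = hidden @ W2 + b2            # second layer produces the output
    print(output)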

Parameters

The internal variables of an AI model that are learned from the training data. These parameters are the core components that define the behavior of the model and determine how it processes input data to produce output. In a neural network, parameters typically include weights and biases associated with the neurons.
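
To make "weights and biases" concrete, here is a back-of-the-envelope count for one small layer; the sizes are arbitrary and chosen only for illustration.

    # Counting the parameters of a single layer that maps 4 inputs to 3 outputs:
    # a 4x3 weight matrix plus 3 biases = 15 values learned during training.
    import numpy as np

    weights = np.zeros((4, 3))   # learned from the training data
    biases = np.zeros(3)         # also learned from the training data
    print(weights.size + biases.size)   # 15; large language models have billions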

Prompt engineering

The crafting of input prompts to effectively guide AI models, particularly those like Generative Pre-trained Transformers (GPT), in producing specific and desired outputs. This practice involves formulating and structuring prompts to leverage the AI’s understanding and capabilities, thereby optimizing the relevance, accuracy, and quality of the generated content.
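
As an informal illustration (the wording below is invented, not an official template), prompt engineering often means replacing a vague request with one that spells out a role, the task, constraints, and the desired output format:

    # The same request, first vague and then engineered with a role, explicit
    # constraints, and an output format. Both are just illustrative strings.
    vague_prompt = "Tell me about climate change."

    structured_prompt = (
        "You are a science librarian helping a first-year student.\n"
        "Task: Summarize the main causes of climate change.\n"
        "Constraints: Use plain language, stay under 150 words, and do not cite "
        "any source you are not certain exists.\n"
        "Format: Three short bullet points, then one suggested search term."
    )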

Reinforcement learning

A type of learning algorithm where an agent learns to make decisions by performing actions in an environment to achieve a certain goal. The learning process is guided by feedback in the form of rewards or punishments — positive reinforcement for desired actions and negative reinforcement for undesired actions. The agent learns to maximize its cumulative reward through trial and error, gradually improving its strategy or policy over time.
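
A compact sketch of that trial-and-error loop (tabular Q-learning on an invented one-dimensional world of five squares, with a reward on the last square):

    # Minimal Q-learning sketch: an agent on a line of 5 squares learns, by
    # trial and error, that moving right (toward the reward) pays off.
    import random

    n_states, actions = 5, [0, 1]          # action 0 = move left, 1 = move right
    Q = [[0.0, 0.0] for _ in range(n_states)]
    alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration

    for _ in range(2000):                  # many episodes of trial and error
        state = 0
        while state < n_states - 1:
            # Explore occasionally; otherwise take the best-looking action so far.
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[state][a])
            next_state = max(0, min(n_states - 1, state + (1 if action else -1)))
            reward = 1.0 if next_state == n_states - 1 else 0.0   # goal reached
            # Update the estimate of this action's long-term value.
            Q[state][action] += alpha * (
                reward + gamma * max(Q[next_state]) - Q[state][action])
            state = next_state

    print(Q)   # after training, "right" scores higher in every non-goal square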

Retrieval Augmented Generation (RAG)

Combines the strengths of both retrieval-based and generative models. In this approach, an AI system first retrieves information from a large dataset or knowledge base and then uses this retrieved data to generate a response or output. Essentially, the RAG model augments the generation process with additional context or information pulled from relevant sources. Perplexity.ai is an example of a Generative AI tool that uses RAG.
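
A simplified sketch of that retrieve-then-generate pattern (the documents and the naive keyword scoring are invented, and generate_answer stands in for whatever language model the tool actually calls):

    # Simplified RAG sketch: find the passages most relevant to a question,
    # then pass them to a generative model as context.
    documents = [
        "The library offers research consultations by appointment.",
        "Interlibrary loan lets you request books the library does not own.",
        "The makerspace provides 3D printers for student projects.",
    ]

    def retrieve(question, docs, k=2):
        # Naive relevance score: how many question words appear in each document.
        words = set(question.lower().split())
        return sorted(docs, key=lambda d: len(words & set(d.lower().split())),
                      reverse=True)[:k]

    def answer(question):
        context = "\n".join(retrieve(question, documents))
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
        return generate_answer(prompt)   # hypothetical call to a language model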

Stochastic parrot

A term conveying that large language models merely create plausible-sounding text based on probabilities derived from the data they were trained on, without actually understanding the meaning of what they generate. Like a parrot repeating what it thinks you want to hear. The term was coined in a research paper published in 2021.
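
The "plausible text from probabilities" idea can be shown with a toy word-level model: it continues a sentence by picking whichever words followed the previous word in its tiny, invented training text, with no grasp of meaning.

    # A toy "parrot": each next word is chosen only from the words that followed
    # the current word in the training text, weighted by how often they did.
    import random
    from collections import defaultdict

    corpus = "the library is open the library is closed the cafe is open".split()
    followers = defaultdict(list)
    for current_word, next_word in zip(corpus, corpus[1:]):
        followers[current_word].append(next_word)

    word, output = "the", ["the"]
    for _ in range(6):
        word = random.choice(followers[word])   # probability = observed frequency
        output.append(word)
    print(" ".join(output))   # fluent-looking, but the model understands nothing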

Training

Training is the process by which a machine learning model, such as a neural network, learns to perform a specific task. This is achieved by exposing the model to a large set of data, known as the training dataset, and allowing it to iteratively adjust its internal parameters to minimize errors in its output. During training, the model makes predictions or generates outputs based on its current state. These outputs are then compared to the desired results, and the difference (or error) is used to adjust the model’s parameters. This process is repeated numerous times, with the model gradually improving its accuracy and ability to perform the task. For example, a language model is trained on vast amounts of text so that it learns to understand and generate human-like language.
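
A bare-bones numeric version of that loop (fitting a single made-up parameter so that predictions match known answers):

    # Minimal training loop: the "model" is one parameter w, predictions are
    # compared with the right answers, and w is nudged to reduce the error.
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]      # the desired outputs (here, y = 2x)

    w = 0.0                        # the model's single parameter, before training
    learning_rate = 0.01

    for _ in range(500):           # repeat: predict, compare, adjust
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * gradient   # adjust the parameter to shrink the error

    print(w)   # close to 2.0 once training has converged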

Transformers

A transformer is a type of architecture used in deep learning, a subfield of artificial intelligence (AI). It represents a departure from previous models, which processed data sequentially. Instead, transformers use a mechanism known as 'self-attention' to process entire sequences of data (like the sentences in a paragraph) simultaneously. This approach allows transformers to capture complex relationships and dependencies in the data, regardless of their distance within the sequence.
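
Self-attention can be sketched in a few lines of NumPy: every position in a sequence is compared with every other position, and each position then takes a weighted mix of the whole sequence based on those comparison scores. The vectors below are random placeholders, and real transformers learn separate query, key, and value projections.

    # Minimal self-attention sketch: all 5 "tokens" attend to each other at once.
    import numpy as np

    rng = np.random.default_rng(0)
    sequence = rng.normal(size=(5, 8))       # 5 tokens, each an 8-number vector

    Q = K = V = sequence                     # placeholders for learned projections
    scores = Q @ K.T / np.sqrt(Q.shape[1])   # how strongly each token relates to the others
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over each row
    attended = weights @ V                   # each token now mixes in its context
    print(attended.shape)                    # still (5, 8): one vector per token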