What is an AI agent?

If you’re like me, when you hear the word “agent” you think of secret agents doing cool stuff. Well, in the world of AI, agents can absolutely be secret agents, but they can also be something like Siri on your iPhone. An AI agent is any entity that perceives its environment in some way and then takes actions that help it achieve its goals. For instance, did you know that your computer opponents in Mario Kart are not all trying to come in first? Each computer-controlled racer is assigned a rank that it is “supposed” to finish in. Each agent perceives where it is in the race, how fast it is going, how much it can accelerate, and so on, and takes actions like speeding up, slowing down, or using items to reach its goal of finishing the race in its assigned rank. That is a complex task because the racer has to weigh several factors at once and keep making decisions throughout the race.

In the constantly evolving landscape of artificial intelligence, “agents” are a cornerstone concept: things that do things. Sounds super vague, right? That’s intentional. Agents can be simple and reflexive: when you drive through a red light, the red light camera snaps a picture of your license plate. Agents can also continuously monitor internet traffic over time and use that historical data to decide when an alert needs to be sent because traffic isn’t lining up with what it “should be”. All agents have one thing in common: they are able to perceive their environment and take some kind of action.

Agents have two fundamental components: sensors to perceive their environment and actuators to interact with their environment.

All agents have some sort of mechanism to perceive their environment. This could range from simple data inputs, like a camera at an intersection, to more complex inputs like natural language or other symbolic forms that then have to be processed. What the agent does with those inputs is where the concept of a rational agent comes into play: a rational agent is designed to make the best possible decision in a given situation based on what it perceives, what it knows, and its ability to reason.

And since the ultimate purpose of an agentic AI is to act upon its environment, it must be able to do something, and actuators are how it does it. Actuators can range from a camera snapping a picture to a conversational AI chatbot producing a natural language response to a question.
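To make the sensor and actuator split concrete, here is a minimal Python sketch of a perceive, decide, act loop. The names `run_agent` and `toy_chatbot` are made up for illustration, and the keyword matching is only a stand-in for whatever a real assistant does under the hood.

```python
from typing import Callable, Iterable

def run_agent(percepts: Iterable[str],
              decide: Callable[[str], str],
              actuate: Callable[[str], None]) -> None:
    """Feed each percept through the agent and hand its chosen action to the actuator."""
    for percept in percepts:      # sensor side: one observation at a time
        action = decide(percept)  # agent logic: map the percept to an action
        actuate(action)           # actuator side: carry the action out

def toy_chatbot(message: str) -> str:
    """Hypothetical keyword-matching 'chatbot', for illustration only."""
    if "hello" in message.lower():
        return "Hi there!"
    return "Sorry, I didn't catch that."

replies = []
# The list of messages plays the role of the environment; appending to
# `replies` plays the role of the actuator.
run_agent(["Hello, agent", "What's the weather?"], toy_chatbot, replies.append)
print(replies)
```

Swapping in a different `decide` function changes how smart the agent is without touching the loop, which is roughly what separates the agent types described in the next section.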

When we move to something like the racers in Mario Kart, we add in reasoning and a goal. Each NPC has a goal position in the lineup that it’s trying to achieve. Its perception involves its current place in the lineup, its speed, whether or not it can accelerate more, whether or not it has items to use, and so on. The racer then has to reason about the best course of action to reach its goal position. Is it to use an item and then accelerate? Is it to accelerate and save the item for later? It’s likely that there’s a model behind these racers that has been trained through practice races and draws on that past experience to make the decision. If that model is probabilistic, the racer may make different decisions in different races, producing an engaging experience for the player no matter how many times they play.

Types of AI Agents

So what types of agents are there? Here’s a simple taxonomy:

Simple Reflex Agents

These agents make decisions based solely on their current perception, without considering the history of events. They are reactive agents and lack the ability to plan for the future.
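The red light camera from earlier fits this description, and a sketch of it is about as short as agent code gets. The function name and the two inputs are assumptions for illustration; the point is that the output depends only on what the agent sees right now.

```python
def red_light_camera(light: str, car_in_intersection: bool) -> str:
    """Condition-action rule: react to the current percept only, with no memory."""
    if light == "red" and car_in_intersection:
        return "snap_photo"
    return "do_nothing"

print(red_light_camera("red", True))    # snap_photo
print(red_light_camera("green", True))  # do_nothing
```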

Model-Based Reflex Agents

These agents maintain a model of the world and are able to act in a partially observable environment because of it. These agents have knowledge of how their actions affect their world.
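One loose way to picture this is the traffic-monitoring agent mentioned earlier: it cannot see what “normal” traffic is directly, so it keeps a small internal model built from past readings. The sketch below is an assumption-heavy illustration; the class name, window size, and three-standard-deviation threshold are all hypothetical choices.

```python
from collections import deque
from statistics import mean, pstdev

class TrafficMonitor:
    """Keeps a rolling window of past readings as its internal model of 'normal'."""

    def __init__(self, window: int = 5, threshold: float = 3.0):
        self.history = deque(maxlen=window)
        self.threshold = threshold  # assumed cutoff, in standard deviations

    def step(self, requests_per_min: float) -> str:
        action = "ok"
        # Only judge a reading once the window is full.
        if len(self.history) == self.history.maxlen:
            baseline = mean(self.history)
            spread = pstdev(self.history) or 1.0  # avoid dividing by zero
            if abs(requests_per_min - baseline) / spread > self.threshold:
                action = "alert"
        # Update the internal model with the newest observation.
        self.history.append(requests_per_min)
        return action

monitor = TrafficMonitor()
readings = [100, 102, 98, 101, 99, 100, 500]
results = [monitor.step(r) for r in readings]
print(results)  # six "ok" results, then "alert" for the spike to 500
```

A real monitor would probably keep anomalous readings out of its baseline; this sketch appends every reading to stay short.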

Goal-Based Agents

These agents are given a specific goal and evaluate different actions based on how well each one contributes to achieving that goal. Because they weigh alternatives, these agents have some amount of choice.
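Here is a minimal sketch of that idea using the Mario Kart racer and its assigned rank. The three actions and their one-step effects on rank are invented for illustration; an actual game would use a far richer prediction of what each action does.

```python
def choose_action(current_rank: int, target_rank: int) -> str:
    """Pick the action whose predicted result lands closest to the target rank."""
    # Hypothetical one-step effect of each action (lower rank = further ahead).
    predicted_rank = {
        "speed_up": max(1, current_rank - 1),
        "hold_pace": current_rank,
        "slow_down": current_rank + 1,
    }
    return min(predicted_rank, key=lambda a: abs(predicted_rank[a] - target_rank))

print(choose_action(current_rank=5, target_rank=3))  # speed_up
print(choose_action(current_rank=1, target_rank=3))  # slow_down
print(choose_action(current_rank=3, target_rank=3))  # hold_pace
```

The difference from a reflex agent is that the action is chosen by comparing predicted outcomes against the goal, not by a fixed rule for each percept.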

Utility-Based Agents

These agents assign a utility (a numeric value) to different outcomes and choose the action that maximizes expected utility. This introduces a consideration of preferences and trade-offs.
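As a rough sketch, expected utility is just each outcome’s utility weighted by how likely it is. The probabilities and utilities below are invented numbers for a racer deciding whether to use an item now or save it.

```python
def expected_utility(outcomes):
    """Sum of probability * utility over an action's possible outcomes."""
    return sum(p * u for p, u in outcomes)

# Hypothetical (probability, utility) pairs for each action.
actions = {
    "use_item_now": [(0.6, 10), (0.4, -2)],  # likely big gain, some risk of a loss
    "save_item": [(0.9, 4), (0.1, 0)],       # safer, smaller payoff
}

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # use_item_now (about 5.2 versus about 3.6)
```

Changing the utilities changes the choice, which is how preferences and trade-offs get encoded.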

Learning Agents

These agents incorporate feedback, typically via machine learning, to adapt and improve their behavior over time based on experience. Learning from data is also a key aspect of generative AI, where the system learns to generate new content based on its training data.
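One small, well-known way to learn from feedback is an epsilon-greedy strategy: mostly pick the action that has paid off best so far, and occasionally try something else. The sketch below is an illustration under simplifying assumptions; the class name, the fixed rewards, and the action names are all hypothetical.

```python
import random

class LearningAgent:
    """Epsilon-greedy learner: mostly exploits its best-known action, sometimes explores."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.values = {a: 0.0 for a in actions}  # estimated average reward per action
        self.counts = {a: 0 for a in actions}    # how often each action was tried
        self.epsilon = epsilon
        self.rng = random.Random(seed)           # seeded so runs are repeatable

    def choose(self):
        untried = [a for a, n in self.counts.items() if n == 0]
        if untried:
            return untried[0]                          # try every action at least once
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))  # explore
        return max(self.values, key=self.values.get)   # exploit

    def learn(self, action, reward):
        """Fold the new reward into the running average for that action."""
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]

# Hypothetical, fixed rewards standing in for feedback from practice races.
rewards = {"save_item": 0.2, "use_item": 1.0}
agent = LearningAgent(rewards)
for _ in range(100):
    action = agent.choose()
    agent.learn(action, rewards[action])

best_action = max(agent.values, key=agent.values.get)
print(best_action)  # use_item
```

Real feedback is noisy rather than fixed, which is why the running average and the occasional exploration matter in practice.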

Once you have an understanding of each agent type, you can start putting them together into multi-agent systems. Multi-agent systems can be heterogeneous, meaning agents have different capabilities and goals, or homogeneous, meaning all agents have the same capabilities and goals. These systems can be cooperative or competitive, with agents either working together towards a common goal or working against each other to each reach their own goal. We can also create hierarchical systems, where high-level agents set goals and parameters for low-level agents that perform much simpler tasks. This allows both high-level and low-level agents to be more efficient and specialized.
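A hierarchical setup can be sketched in a few lines: one high-level agent hands out goals, and the low-level agents only worry about reaching them. Everything here (the `RaceCoordinator` and `Racer` names, the racer labels, the assigned ranks) is hypothetical and not a description of how any particular game works.

```python
from dataclasses import dataclass

@dataclass
class Racer:
    """Low-level agent: only knows how to chase whatever rank it was assigned."""
    name: str
    rank: int
    target_rank: int = 1

    def act(self) -> str:
        if self.rank > self.target_rank:
            return "speed_up"   # behind the assigned spot
        if self.rank < self.target_rank:
            return "slow_down"  # ahead of the assigned spot
        return "hold_pace"

class RaceCoordinator:
    """High-level agent: decides the goals, not the moment-to-moment driving."""

    def assign_targets(self, racers, targets):
        for racer in racers:
            racer.target_rank = targets[racer.name]

racers = [Racer("Racer A", rank=3), Racer("Racer B", rank=1), Racer("Racer C", rank=2)]
RaceCoordinator().assign_targets(racers, {"Racer A": 1, "Racer B": 3, "Racer C": 2})
actions = {r.name: r.act() for r in racers}
print(actions)
# {'Racer A': 'speed_up', 'Racer B': 'slow_down', 'Racer C': 'hold_pace'}
```

The racers here are homogeneous (same capabilities, different goals); a heterogeneous system would give each low-level agent its own `act` logic.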

Applications of AI Agents

Virtual Assistants

AI agents like Siri, Google Assistant, and Alexa are examples of virtual assistants. They all perceive audio from users, process that audio, and decide the best response to any particular inquiry.

Robotics

AI agents in the field of robotics can range from controlling robots on an assembly line to controlling an autonomous car. These agents perceive everything from part dimensions to traffic, and act by doing things like screwing pieces together or changing lanes.

Cybersecurity

AI agents for cybersecurity can detect things like malware, network intrusion, and DDoS attacks. These agents perceive things like atypical network traffic and act in ways like alerting personnel to these abnormalities.

Gaming

AI agents in gaming make worlds feel more alive by adding depth to non-player characters. These agents vary from the racers in Mario Kart to The Director in Left 4 Dead. They perceive things ranging from their place in the race to how easily a player is winning fights, and act in ways ranging from speeding up to adjusting the intensity of enemies.

Many other AI agents are already in use. Some other broad examples include healthcare agents that provide personalized treatment plans, smart home monitoring agents that adjust the temperature depending on whether you’re home, and environmental monitoring agents that track things like weather patterns and crop yields and alert scientists to changes.

Evolving Challenges and Future Prospects

While AI agents have made remarkable strides, challenges persist. Issues related to ethical decision-making, biases in algorithms, and the explainability of AI agent decisions are areas of active research. As the uses for AI continue evolving and people get more familiar with both the challenges and opportunities, it’ll become increasingly important to address the potential pitfalls. For instance, while ChatGPT from OpenAI clearly states “ChatGPT can make mistakes. Consider checking important information.”, not everyone reads the fine print or takes it to heart and some pass on information as if it were fact when it may not be.

These agents have massively improved the quality of life for people all over the globe and will continue to do so. From luxurious experiences like being able to nap while your car drives itself, to life-altering gains in crop production that help feed the world, AI agents can certainly be a force for good.

AI Agents You Can Develop With Now (without having to make your own)

GPT-3 by OpenAI: Used for natural language processing tasks, including chatbots and content creation.

TensorFlow by Google: A comprehensive library for machine learning and neural network development.

PyTorch, originally developed by Facebook (now Meta): Known for its flexibility and ease of use in research and development for AI.

Azure AI by Microsoft: Offers a suite of AI services including machine learning, knowledge mining, and AI apps.

IBM Watson: Known for its powerful NLP capabilities, suitable for building complex AI solutions.

Dialogflow by Google: A user-friendly platform for creating conversational user interfaces.