added a Sessions blog - PENDING APPROVAL until vector DB is done #2228

Merged: 8 commits, Oct 18, 2024
@@ -0,0 +1,41 @@
import type { Metadata } from "next";
import { Inter } from "next/font/google";
import NavBar from "@/components/layout/navbar";
import Footer from "@/components/layout/footer";
import "@mintlify/mdx/dist/styles.css";
import { Analytics } from "@vercel/analytics/react";
logic: NavBar, Footer, and Analytics components are imported but not used in the layout


const inter = Inter({ subsets: ["latin"] });
style: Inter font is defined but not applied to the layout


export const metadata: Metadata = {
  title: "Helicone / LLM-Observability for Developers",
  description: "The open-source platform for logging, monitoring, and debugging.",
  icons: "https://www.helicone.ai/static/logo.png",
  openGraph: {
    type: "website",
    siteName: "Helicone.ai",
    title: "Debugging RAG Chatbots and AI Agents with Sessions",
    url: "https://www.helicone.ai/blog/debugging-chatbots-and-ai-agents-with-sessions",
    description: "How well do you understand your users' intents? At which point in the multi-step process does your model start hallucinating? Do you find consistent problems with a specific part of your AI agent workflow?",
    images: "https://www.helicone.ai/static/blog/agent-cover.webp",
    locale: "en_US",
  },
  twitter: {
    title: "Debugging RAG Chatbots and AI Agents with Sessions",
    description: "How well do you understand your users' intents? At which point in the multi-step process does your model start hallucinating? Do you find consistent problems with a specific part of your AI agent workflow?",
    card: "summary_large_image",
    images: "https://www.helicone.ai/static/blog/agent-cover.webp",
  },
};

export default function RootLayout({
  children,
}: Readonly<{
  children: React.ReactNode;
}>) {
  return (
    <>
      {children}
    </>
  );
style: Consider wrapping children in a main tag for better semantic structure

}
@@ -0,0 +1,52 @@
import { getCompiledServerMdx } from "@mintlify/mdx";
import path from "path";
import fs from "fs";
import Link from "next/link";
import { ChevronLeftIcon } from "@heroicons/react/20/solid";
import "@mintlify/mdx/dist/styles.css";

export default async function Home() {
  const filePath = path.join(
    process.cwd(),
    "app",
    "blog",
    "debugging-chatbots-and-ai-agents-with-sessions",
    "src.mdx"
  );
style: Consider using a constant for the blog post directory path


  const source = fs.readFileSync(filePath, "utf8");
logic: Add error handling for file read operation


  const { content, frontmatter } = await getCompiledServerMdx({
    source,
  });

  return (
    <div className="w-full bg-[#f8feff] h-full antialiased relative">
      <div className="flex flex-col md:flex-row items-start w-full mx-auto max-w-5xl py-16 px-4 md:py-24 relative">
        <div className="w-56 h-full flex flex-col space-y-2 md:sticky top-16 md:top-32">
          <Link href="/blog" className="flex items-center gap-1">
            <ChevronLeftIcon className="w-4 h-4" />
            <span className="text-sm font-bold">back</span>
          </Link>
style: Add aria-label to improve accessibility of back link

          <h3 className="text-sm font-semibold text-gray-500 pt-8">
            <span className="text-black">Time</span>: {String(frontmatter.time)}
style: Use a more semantic HTML element for time, like

          </h3>
          <h3 className="text-sm font-semibold text-gray-500">
            <span className="text-black">Created</span>:{" "}
            {String(frontmatter.date)}
          </h3>
          <h3 className="text-sm font-semibold text-gray-500">
            <span className="text-black">Author</span>:{" "}
            {String(frontmatter.author)}
          </h3>
        </div>
        <article className="prose w-full h-full">
          <h1 className="font-bold text-sky-500 mt-16 md:mt-0">
            {String(frontmatter.title)}
          </h1>
          {content}
        </article>
      </div>
    </div>
  );
}
@@ -0,0 +1,156 @@
---
title: "Debugging RAG Chatbots and AI Agents with Sessions"
description: "How well do you understand your users' intents? At which point in the multi-step process does your model start hallucinating? Do you find consistent problems with a specific part of your AI agent workflow?"
author: "Lina Lam"
date: "Jul 2, 2024"
time: "5 minute read"
icon: "developer"
---

How well do you understand your users' intents?
At which point in the multi-step process does your model start hallucinating?
Do you find consistent problems with a specific part of your AI agent workflow?

![Debugging RAG Chatbots and AI Agents with Sessions](/static/blog/agent-cover.webp)


These are common questions developers face when building AI agents and Retrieval Augmented Generation (RAG) chatbots. Here's the truth: getting reliable responses and minimizing errors like hallucination is incredibly challenging without visibility into how users interact with your Large Language Model (LLM).

**But how can you improve AI responses if you can't measure them?** In this blog, we will discuss how Sessions can help you trace user conversations and pinpoint errors in your agent's task executions.

### Table of contents

- **<span style={{color: '#0ea5e9'}}><a href="#what-are-ai-agents">What are AI agents?</a></span>**
- **<span style={{color: '#0ea5e9'}}><a href="#how-do-ai-agents-work">How do AI agents work?</a></span>**
- **<span style={{color: '#0ea5e9'}}><a href="#challenges-of-debugging-ai-agents">Challenges of debugging AI agents</a></span>**
- **<span style={{color: '#0ea5e9'}}><a href="#what-are-sessions">What are Sessions?</a></span>**
- **<span style={{color: '#0ea5e9'}}><a href="#using-sessions-in-helicone">Using Sessions in Helicone</a></span>**
- **<span style={{color: '#0ea5e9'}}><a href="#how-different-industries-debug-agents-with-sessions">How different industries debug agents with Sessions</a></span>**
- **<span style={{color: '#0ea5e9'}}><a href="#are-ai-agents-the-future">Are AI agents the future?</a></span>**

First, let's talk about AI agents. If you are already familiar with the concept, skip ahead to **"Using Sessions in Helicone"**.



---

## What are AI agents?

An AI agent is a software program that interacts with its environment, collects data, and uses this data to perform tasks autonomously to achieve set goals. While we set the goals, the AI agent decides the best actions to reach them.

For example, an AI agent used in a Contact Center can handle customer queries by asking questions, searching internal documents, and providing sound solutions. If it can't resolve the issue on its own, it will escalate it to a human.


## How do AI agents work?

AI agents differ from regular software in that, once given predefined goals or instructions, they perform tasks autonomously based on rational decision-making principles.

![How AI Agents work](/static/blog/how-agents-work.webp)

They simplify and automate complex tasks by following a structured workflow:

1. **Setting goals**: AI agents are given specific goals by the user, which they break down into smaller, actionable tasks.
2. **Acquiring data**: They collect the information they need from their environment, such as data from physical sensors for robots or software inputs like customer queries for chatbots. They often access external sources on the internet or interact with other agents or models to gather data.
3. **Implementing the task**: With the acquired data, AI agents methodically carry out the tasks, evaluating their progress and adjusting as needed based on feedback and internal logs.

By analyzing this data, AI agents predict the optimal outcomes aligned with the preset goals and determine what actions to take next. For example, self-driving cars use sensor data to navigate obstacles effectively. This iterative process continues until the agent achieves the designated goal.
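The loop described above (observe, decide, act, repeat until the goal is met) can be sketched in a few lines of TypeScript. This is a toy illustration only: `Environment`, `createLineWorld`, `decide`, and `runAgent` are hypothetical names invented for this sketch, and a real agent would replace `decide` with a call to a model.

```ts
// Hypothetical types for illustration; not part of any real agent framework.
type Observation = { distanceToGoal: number };
type Action = "move" | "stop";

interface Environment {
  observe(): Observation;
  apply(action: Action): void;
}

// A toy environment: the agent starts some number of steps away from its goal.
function createLineWorld(start: number, goal: number): Environment {
  let position = start;
  return {
    observe: () => ({ distanceToGoal: goal - position }),
    apply: (action: Action) => {
      if (action === "move") position += 1;
    },
  };
}

// Decide the next action from the latest observation.
function decide(observation: Observation): Action {
  return observation.distanceToGoal > 0 ? "move" : "stop";
}

// Observe, decide, act; repeat until the goal is reached or the step budget runs out.
function runAgent(env: Environment, maxSteps: number): Action[] {
  const log: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = decide(env.observe());
    log.push(action); // the log is what you would later inspect when debugging
    if (action === "stop") break;
    env.apply(action);
  }
  return log;
}

const actions = runAgent(createLineWorld(0, 3), 10);
console.log(actions); // [ 'move', 'move', 'move', 'stop' ]
```

The `maxSteps` budget is a design choice worth keeping in any real loop: it stops an agent that never reaches its goal from running indefinitely.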



## Challenges of Debugging AI agents

1. **Complex decision-making**:
AI agents make decisions based on a multitude of inputs and data sources. Understanding the rationale behind each decision requires deep insight into how the agent processes and interprets this data, which can be intricate and multifaceted.
2. **Lack of visibility**:
Without structured ways to group related traces and data points, gaining a comprehensive view of an entire interaction or task execution flow is challenging. This lack of visibility hampers the ability to understand the context of errors, making debugging difficult.
3. **Model interpretability**:
Many AI models, especially deep learning models, act as "black boxes" with internal workings that are difficult to interpret. This opacity makes it challenging to understand why an agent made a particular decision, complicating the debugging process.

While it is incredibly difficult to understand the internal workings of a model, Sessions, a key feature offered by many LLM monitoring tools like Helicone, facilitates more effective debugging.

---

## What are Sessions?

**Sessions** provide a simple way to organize and group related LLM calls, making it easier for developers to trace nested agent workflows and visualize interactions between the user and the AI chatbot or agent.

Instead of looking at isolated data points, sessions allow you to see a comprehensive view of an entire conversation or interactive flow. This holistic approach helps you understand the context of each interaction, allowing you to:

- drill down on specific LLM calls to view your agent's flow of task execution.
- simplify debugging, since understanding the context of an error lets you identify issues more quickly.
- refine your chatbot's responses based on specific contexts.
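
To make the idea of grouping concrete, here is a minimal sketch of how isolated call logs become per-session views. The `LoggedCall` shape is an assumption made up for this example and is not Helicone's actual schema; the point is only that a shared session ID is what turns scattered records into an ordered conversation.

```ts
// Hypothetical log record shape, for illustration only; not Helicone's actual schema.
interface LoggedCall {
  sessionId: string;
  path: string;
  createdAt: number; // Unix timestamp in milliseconds
  prompt: string;
}

// Group calls by session ID, then order each session chronologically.
function groupBySession(calls: LoggedCall[]): Map<string, LoggedCall[]> {
  const sessions = new Map<string, LoggedCall[]>();
  for (const call of calls) {
    const group = sessions.get(call.sessionId) ?? [];
    group.push(call);
    sessions.set(call.sessionId, group);
  }
  sessions.forEach((group) => group.sort((a, b) => a.createdAt - b.createdAt));
  return sessions;
}

const calls: LoggedCall[] = [
  { sessionId: "s1", path: "/abstract", createdAt: 2, prompt: "Refine the abstract." },
  { sessionId: "s2", path: "/abstract", createdAt: 1, prompt: "Generate an abstract." },
  { sessionId: "s1", path: "/abstract", createdAt: 1, prompt: "Generate an abstract." },
];

const sessions = groupBySession(calls);
sessions.forEach((group, id) => {
  console.log(id, group.map((call) => call.prompt));
});
```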

## Using Sessions in Helicone

1. Add the `Helicone-Session-Id` header to start tracking your sessions.
2. Add the `Helicone-Session-Path` header to specify parent and child traces.

Here is an example in **TypeScript**:

```ts
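import OpenAI from "openai";
import { randomUUID } from "crypto";

// Route OpenAI requests through Helicone's proxy so each call is logged.
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});

// One ID per conversation or workflow; requests that share it are grouped into the same session.
const session = randomUUID();

// Top-level await assumes an ES module; otherwise, wrap the call in an async function.
await openai.chat.completions.create(
  {
    messages: [
      {
        role: "user",
        content: "Generate an abstract for a course on space.",
      },
    ],
    model: "gpt-4",
  },
  {
    // Per-request headers attach this call to the session and name its place in the trace.
    headers: {
      "Helicone-Session-Id": session,
      "Helicone-Session-Path": "/abstract",
    },
  }
);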
```
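
When an agent makes several calls in one session, it can help to build the two headers in one place. The sketch below assumes that child traces are expressed by nesting path segments under the parent path (for example `/abstract/outline` under `/abstract`); check the Sessions doc before relying on that. `sessionHeaders` is a hypothetical helper written for this post, not part of any SDK, and the `outline` segment is made up for illustration.

```ts
import { randomUUID } from "crypto";

// Hypothetical helper: builds the per-request headers used in the example above.
// Assumption: a child trace is addressed by appending a segment to its parent's path.
function sessionHeaders(sessionId: string, ...segments: string[]): Record<string, string> {
  return {
    "Helicone-Session-Id": sessionId,
    "Helicone-Session-Path": "/" + segments.join("/"),
  };
}

const sessionId = randomUUID();

// Parent step and a nested child step share the same session ID.
const parentHeaders = sessionHeaders(sessionId, "abstract");
const childHeaders = sessionHeaders(sessionId, "abstract", "outline");

console.log(parentHeaders["Helicone-Session-Path"]); // "/abstract"
console.log(childHeaders["Helicone-Session-Path"]); // "/abstract/outline"
```

Each object can be passed as the `headers` option of the corresponding `openai.chat.completions.create` call.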



## How Different Industries Debug Agents with Sessions

1. **Resolving errors in a multi-step process**

Travel chatbots assist users with booking flights, hotels, and rental cars. The booking process involves multiple steps and requires gathering details, which can be prone to errors. Developers can trace the entire booking process to identify where users are encountering issues.

**For example,** if users consistently report problems with flight booking confirmations, you can review each trace in the Session to identify where the process failed (e.g., incorrect data parsing or integration issues with the airline's API).

2. **Understanding user intents**

Health and fitness chatbots can provide personalized workout plans and dietary advice. These chatbots are useful if they can deliver personalized experiences, which can only be done with an adequate understanding of the user's intents.

Through Sessions, developers can see what users are asking about, understand their fitness goals and what they expect from the responses, and fine-tune the prompts so that responses align more closely with the user's preferences.


**For example,** if users often ask about specific types of workouts (e.g., strength training vs. cardio), developers can prompt the chatbot to offer more personalized plans and advice, thereby improving user satisfaction.

1. **Improving response accuracy**
syntax: This numbered item should be '3.' instead of '1.' to maintain the correct sequence.


Virtual assistants like Siri, Alexa, and Google Assistant rely on AI to handle diverse user queries effectively. By leveraging Session data, developers can track user interactions over time to ensure continuity and improve response accuracy.

**For example,** analyzing session logs allows developers to understand how users interact with the assistant across different queries and contexts. This insight helps in refining the assistant's algorithms to better anticipate user needs and provide more seamless and helpful assistance.
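
For the first scenario above (a multi-step booking flow where one step fails), the review can be as simple as ordering a session's traces and finding the earliest one that errored. The `Trace` shape, the paths, and the status codes below are all invented for this sketch and are not Helicone's actual schema or real data.

```ts
// Hypothetical trace shape for illustration; not Helicone's actual schema.
interface Trace {
  path: string;
  status: number; // HTTP-style status code recorded for the call
  createdAt: number; // Unix timestamp in milliseconds
}

// Return the earliest trace in the session that ended in an error, if any.
function firstFailure(traces: Trace[]): Trace | undefined {
  return [...traces]
    .sort((a, b) => a.createdAt - b.createdAt)
    .find((trace) => trace.status >= 400);
}

// Made-up session: the traces arrive out of order, as logs often do.
const bookingSession: Trace[] = [
  { path: "/booking/search-flights", status: 200, createdAt: 1 },
  { path: "/booking/confirm", status: 502, createdAt: 3 },
  { path: "/booking/select-seat", status: 200, createdAt: 2 },
];

const failed = firstFailure(bookingSession);
console.log(failed?.path); // "/booking/confirm"
```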


---

## Are AI agents the future?

AI agents are helpful because they can perform tasks and make decisions autonomously, without human intervention. As their responses become more accurate and tailored, AI agents have the potential to play a significant role in our future. They are already being used in fields such as customer service, healthcare, education, legal services, and autonomous driving.

However, the success of AI agents also depends on overcoming challenges like ensuring ethical use, improving reliability and accuracy, and addressing concerns related to job displacement across different industries. To tackle these challenges, it's increasingly important to adopt tools that monitor and maximize the visibility of AI agents and chatbot performance, to ensure they effectively address user inquiries and achieve successful widespread adoption.

## Resources

- Sessions doc: https://docs.helicone.ai/features/sessions
17 changes: 17 additions & 0 deletions bifrost/app/blog/page.tsx
@@ -19,6 +19,23 @@ export type BlogStructure = {
};

const blogContent: BlogStructure[] = [
  {
    title: "Debugging RAG Chatbots and AI Agents with Sessions",
    description:
      "How well do you understand your users' intents? At which point in the multi-step process does your model start hallucinating? Do you find consistent problems with a specific part of your AI agent workflow?",
    badgeText: "developer",
    date: "July 2, 2024",
    href: "/blog/debugging-chatbots-and-ai-agents-with-sessions",
    imageUrl: "/static/blog/agent-cover.webp",
    authors: [
      {
        name: "Lina Lam",
        imageUrl: "/static/blog/linalam-headshot.webp",
        imageAlt: "Lina Lam's headshot",
      },
    ],
    time: "5 minute read",
style: Consider making the reading time more specific, e.g., '5-minute read' instead of '5 minute read' for consistency with other entries.

  },
  {
    title: "Best Practices for AI Developers: Full Guide (June 2024)",
    description:
Binary file added bifrost/public/static/blog/agent-cover.webp
Binary file not shown.
Binary file added bifrost/public/static/blog/how-agents-work.webp
Binary file not shown.