Chat Bot Evaluation as Multi-agent Simulation is an approach that uses LangGraph to simulate user interactions with a chatbot in order to evaluate the bot automatically. Manual testing is time-consuming and hard to reproduce; by driving a multi-agent interaction between a simulated user and the chat bot, you can test the bot's behavior across different scenarios far more efficiently. The approach is especially useful for applications that must handle complex dialogues, such as customer-support bots.

Key Components

Simulated User Node: typically backed by an LLM (e.g. ChatOpenAI) that generates user input, combined with a prompt template to mimic realistic user behavior. For example, the simulated user might be an airline customer making a refund or booking request.
Chat Bot Node: implements the chatbot logic; it can call an external API or generate responses with an LLM. For example, the bot might be configured as an airline customer-support agent that answers requests according to specific policies.

Stop condition (should_continue): checks whether a termination criterion has been met (e.g. the message count exceeds a threshold, or the user sends "FINISHED").
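The stop condition boils down to two checks. Here is a minimal, dependency-free sketch, with the assumed simplification that messages are plain dicts rather than LangChain message objects (the function and constant names are illustrative, not from the original code):

```python
# Illustrative sketch of the termination check.
# Assumption: messages are plain dicts with a "content" key.
MAX_MESSAGES = 6

def check_stop(messages):
    # Stop once the conversation grows too long ...
    if len(messages) > MAX_MESSAGES:
        return "end"
    # ... or once the simulated user signals it is done.
    if messages and messages[-1]["content"] == "FINISHED":
        return "end"
    return "continue"
```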
A key challenge is telling the simulated user's messages apart from the chatbot's, since both are generated by an LLM (and therefore both start out as AI messages). LangGraph's solution:
- Assume HumanMessages represent the simulated user's messages and AIMessages represent the chatbot's messages, swapping roles before each side is invoked.
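The role swap can be illustrated with a tiny sketch, again under the assumed simplification of dict messages (the actual code swaps AIMessage and HumanMessage objects instead):

```python
# Sketch of the role swap: before invoking the simulated user, flip every
# message so the bot's "ai" turns look like "human" turns and vice versa.
def swap_roles(messages):
    flipped = {"ai": "human", "human": "ai"}
    return [{"role": flipped[m["role"]], "content": m["content"]}
            for m in messages]
```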
The complete source code follows:
from typing import List
from langchain_community.llms import Tongyi
###############################################################################################
# Define parameters
###############################################################################################
import f_common
import os
###############################################################################################
# Build the request messages and return the LLM's reply
###############################################################################################
# This is flexible, but you can define your agent here, or call your agent API here.
def my_chat_bot(messages: List[dict]) -> dict:
    system_message = {
        "role": "system",
        "content": "You are a customer support agent for an airline.",
    }
    messages = [system_message] + messages
    completion = f_common.myQwen_LLM.chat.completions.create(
        messages=messages, model="qwen-plus"
    )
    return completion.choices[0].message.model_dump()
# Smoke-test the function with a single call
my_chat_bot([{"role": "user", "content": "hi!"}])
###############################################################################################
# Simulate a user's request
###############################################################################################
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
system_prompt_template = """You are a customer of an airline company. \
You are interacting with a user who is a customer support person. \
{instructions}
When you are finished with the conversation, respond with a single word 'FINISHED'"""
prompt = ChatPromptTemplate.from_messages(
[
("system", system_prompt_template),
MessagesPlaceholder(variable_name="messages"),
]
)
instructions = """Your name is Harrison. You are trying to get a refund for the trip you took to Alaska. \
You want them to give you ALL the money back. \
This trip happened 5 years ago."""
prompt = prompt.partial(name="Harrison", instructions=instructions)
# The lines below chain the prompt template into the model
model = Tongyi(temperature=1)
#model = ChatOpenAI()
simulated_user = prompt | model
from langchain_core.messages import HumanMessage
messages = [HumanMessage(content="Hi! How can I help you?")]
simulated_user.invoke({"messages": messages})
###############################################################################################
# Define the chat bot node, which simulates the bot's side of the dialogue
###############################################################################################
from langchain_community.adapters.openai import convert_message_to_dict
from langchain_core.messages import AIMessage
def chat_bot_node(state):
    messages = state["messages"]
    # Convert from LangChain format to the OpenAI format, which our chatbot function expects.
    messages = [convert_message_to_dict(m) for m in messages]
    # Call the chat bot
    chat_bot_response = my_chat_bot(messages)
    # Respond with an AI Message
    return {"messages": [AIMessage(content=chat_bot_response["content"])]}
def _swap_roles(messages):
    new_messages = []
    for m in messages:
        if isinstance(m, AIMessage):
            new_messages.append(HumanMessage(content=m.content))
        else:
            new_messages.append(AIMessage(content=m.content))
    return new_messages
# Define the simulated-user node, which produces the user's side of the dialogue
def simulated_user_node(state):
    messages = state["messages"]
    # Swap roles of messages
    new_messages = _swap_roles(messages)
    # Call the simulated user
    response = simulated_user.invoke({"messages": new_messages})
    # This response is an AI message - we need to flip this to be a human message
    return {"messages": [HumanMessage(content=response)]}
# End the simulation once the conversation exceeds six messages, or when the
# simulated user replies "FINISHED".
def should_continue(state):
    messages = state["messages"]
    if len(messages) > 6:
        return "end"
    elif messages[-1].content == "FINISHED":
        return "end"
    else:
        return "continue"
from langgraph.graph import END, StateGraph, START
from langgraph.graph.message import add_messages
from typing import Annotated
from typing_extensions import TypedDict
class State(TypedDict):
    messages: Annotated[list, add_messages]
graph_builder = StateGraph(State)
graph_builder.add_node("user", simulated_user_node)
graph_builder.add_node("chat_bot", chat_bot_node)
# Every response from your chat bot will automatically go to the
# simulated user
graph_builder.add_edge("chat_bot", "user")
graph_builder.add_conditional_edges(
    "user",
    should_continue,
    # If the finish criteria are met, we will stop the simulation,
    # otherwise, the virtual user's message will be sent to your chat bot
    {
        "end": END,
        "continue": "chat_bot",
    },
)
# The input will first go to your chat bot
graph_builder.add_edge(START, "chat_bot")
simulation = graph_builder.compile()
# The main test: stream the simulation and print each turn of the dialogue
for chunk in simulation.stream({"messages": []}):
    # Print out all events aside from the final end chunk
    if END not in chunk:
        print(chunk)
        print("----")
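For intuition, the compiled graph amounts to an alternating loop between the two agents until the stop condition fires. The following dependency-free sketch shows that control flow; the lambda "agents" are illustrative stubs standing in for my_chat_bot and the simulated user, not the real LLM-backed nodes:

```python
# Sketch of the control flow the compiled graph implements:
# chat bot speaks first (START -> chat_bot), then the simulated user,
# repeating until the stop condition fires.
def run_simulation(chat_bot, simulated_user, max_messages=6):
    messages = []
    while True:
        messages.append({"role": "ai", "content": chat_bot(messages)})
        messages.append({"role": "human", "content": simulated_user(messages)})
        if len(messages) > max_messages or messages[-1]["content"] == "FINISHED":
            return messages

# Stub agents for illustration: a canned bot and a user who finishes
# after a couple of turns.
transcript = run_simulation(
    lambda ms: "How can I help you?",
    lambda ms: "FINISHED" if len(ms) >= 3 else "I want a refund",
)
```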
Illustration of the final output:

RA/SD Derivative AI Training Camp. Published by: 稻草人. Please credit the source when reposting: https://www.shxcj.com/archives/9610