Callbacks: Observe, Customize, and Control Agent Behavior

Introduction: What are callbacks and why use them?

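Callbacks are a core feature of ADK. They provide a powerful mechanism for hooking into an agent's execution process, so you can observe, customize, and control the agent's behavior at specific, predefined points without modifying the core ADK framework code.

What are they? At its core, a callback is a standard function that you define. You associate these functions with an agent when you create it, and the ADK framework calls them automatically at key stages so that you can observe or intervene. Think of them as checkpoints in the agent's process: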

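Before the agent starts its main work on a request, and after it finishes: when you ask an agent to do something (for example, answer a question), it runs its internal logic to work out a response.
- The Before Agent callback runs right before this main work begins for that specific request.
- The After Agent callback runs right after the agent has finished all of its steps for that request and prepared the final result, but just before the result is returned.
- This "main work" covers the agent's entire process for handling that single request. It can include deciding to call the LLM, actually calling the LLM, deciding to use a tool, using the tool, processing the results, and finally putting together the answer. These callbacks essentially wrap the whole sequence from receiving the input to producing the final output for that interaction.
Before sending a request to the large language model (LLM), or after receiving its response: these callbacks (Before Model, After Model) let you inspect or modify the data going to and coming from the LLM.
Before executing a tool (such as a Python function or another agent), or after it finishes: similarly, the Before Tool and After Tool callbacks give you control points around the execution of tools invoked by the agent.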
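
The ordering of these checkpoints can be pictured with a toy trace. The sketch below is plain Python and does not use ADK: `run_toy_turn` is a hypothetical stand-in for one agent turn in which the model asks for a single tool and then produces a final answer. Real turns may involve no tool call at all, or several model and tool calls.

```python
from typing import Callable, Dict, List, Optional

Hook = Callable[[], None]


def run_toy_turn(hooks: Optional[Dict[str, Hook]] = None) -> List[str]:
    """Record the order in which the six checkpoints fire in one toy turn."""
    hooks = hooks or {}
    trace: List[str] = []

    def fire(name: str) -> None:
        trace.append(name)
        hook = hooks.get(name)
        if hook is not None:
            hook()  # user code runs here; this toy ignores return values

    fire("before_agent")                      # wraps the whole turn
    fire("before_model")
    trace.append("(model call: requests a tool)")
    fire("after_model")
    fire("before_tool")
    trace.append("(tool call)")
    fire("after_tool")
    fire("before_model")                      # model sees the tool result
    trace.append("(model call: final answer)")
    fire("after_model")
    fire("after_agent")                       # just before the result is returned
    return trace


if __name__ == "__main__":
    for step in run_toy_turn():
        print(step)
```

The agent-level hooks appear once per turn and enclose everything else, while the model and tool hooks can fire several times inside it.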

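Why use them? Callbacks offer considerable flexibility and enable advanced agent capabilities:

Observe and debug: log detailed information at critical steps for monitoring and troubleshooting.
Customize and control: modify data flowing through the agent (such as LLM requests or tool results), or bypass certain steps entirely based on your own logic.
Implement guardrails: enforce safety rules, validate inputs and outputs, or prevent disallowed operations.
Manage state: read or dynamically update the agent's session state during execution.
Integrate and enhance: trigger external actions (API calls, notifications) or add features such as caching.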
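
As a rough illustration of the caching use case, a before/after pair can short-circuit repeated work. The sketch below is plain Python rather than ADK code: `run_cached_step`, `expensive_lookup`, `cache_before`, `cache_after`, and the module-level `_cache` dict are all hypothetical names, and the driver only imitates the "bypass a step" behavior described above.

```python
from typing import Dict, List, Optional

_cache: Dict[str, str] = {}   # hypothetical in-memory cache
calls: List[str] = []         # records every real (uncached) lookup


def expensive_lookup(query: str) -> str:
    """Stand-in for a slow tool or model call."""
    calls.append(query)
    return query.upper()


def cache_before(query: str) -> Optional[str]:
    """Before hook: return a cached result, or None to let the step run."""
    return _cache.get(query)


def cache_after(query: str, result: str) -> None:
    """After hook: store the fresh result and leave it unchanged."""
    _cache[query] = result


def run_cached_step(query: str) -> str:
    """Toy driver: a non-None value from the before hook skips the step."""
    cached = cache_before(query)
    if cached is not None:
        return cached
    result = expensive_lookup(query)
    cache_after(query, result)
    return result
```

Calling `run_cached_step` twice with the same query runs `expensive_lookup` only once; the second call is answered by the before hook.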

How are they added?

Python:

from google.adk.agents import LlmAgent
from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmResponse, LlmRequest
from typing import Optional

# --- Define your callback function ---
def my_before_model_logic(
    callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
    print(f"Callback running before model call for agent: {callback_context.agent_name}")
    # ... your custom logic here ...
    return None # Allow the model call to proceed

# --- Register it during Agent creation ---
my_agent = LlmAgent(
    name="MyCallbackAgent",
    model="gemini-2.0-flash", # Or your desired model
    instruction="Be helpful.",
    # Other agent parameters...
    before_model_callback=my_before_model_logic # Pass the function here
)

Java:

import com.google.adk.agents.CallbackContext;
import com.google.adk.agents.Callbacks;
import com.google.adk.agents.LlmAgent;
import com.google.adk.models.LlmRequest;
import java.util.Optional;

public class AgentWithBeforeModelCallback {

  public static void main(String[] args) {
    // --- Define your callback logic ---
    Callbacks.BeforeModelCallbackSync myBeforeModelLogic =
        (CallbackContext callbackContext, LlmRequest llmRequest) -> {
          System.out.println(
              "Callback running before model call for agent: " + callbackContext.agentName());
          // ... your custom logic here ...

          // Return Optional.empty() to allow the model call to proceed,
          // similar to returning None in the Python example.
          // If you wanted to return a response and skip the model call,
          // you would return Optional.of(yourLlmResponse).
          return Optional.empty();
        };

    // --- Register it during Agent creation ---
    LlmAgent myAgent =
        LlmAgent.builder()
            .name("MyCallbackAgent")
            .model("gemini-2.0-flash") // Or your desired model
            .instruction("Be helpful.")
            // Other agent parameters...
            .beforeModelCallbackSync(myBeforeModelLogic) // Pass the callback implementation here
            .build();
  }
}

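The Callback Mechanism: Interception and Control

When the ADK framework reaches a point where a callback can run (for example, just before calling the LLM), it checks whether you provided a corresponding callback function for that agent. If you did, the framework executes your function.

Context is key: your callback function is not called in isolation. The framework passes a special context object (CallbackContext or ToolContext) as an argument. These objects carry important information about the current state of the agent's execution, including invocation details, session state, and references to services such as artifacts or memory. You use them to understand the situation and to interact with the framework. (See the dedicated "Context Objects" section for details.)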
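
A minimal sketch of how a callback might use its context object follows. To keep it runnable without ADK installed, `FakeCallbackContext` is a hypothetical stand-in that mimics only two things the text mentions: the agent's name and a dict-like session state. The state key `model_call_count` is made up for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class FakeCallbackContext:
    """Hypothetical stand-in for a callback context (not the ADK class)."""
    agent_name: str
    state: Dict[str, Any] = field(default_factory=dict)


def count_model_calls(
    callback_context: FakeCallbackContext, llm_request: Any = None
) -> Optional[Any]:
    """Before-model style callback: observe, update state, change nothing."""
    count = callback_context.state.get("model_call_count", 0) + 1
    callback_context.state["model_call_count"] = count
    print(f"[{callback_context.agent_name}] model call #{count}")
    return None  # None means: carry on with the normal behavior


if __name__ == "__main__":
    ctx = FakeCallbackContext(agent_name="DemoAgent")
    count_model_calls(ctx)
    count_model_calls(ctx)
    print(ctx.state)
```

The callback keeps no globals of its own; everything it reads and writes goes through the context it is handed.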

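Controlling the flow (the core mechanism): the most powerful aspect of callbacks is how their return value affects what the agent does next. This is how you intercept and control the execution flow: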

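return None (allow the default behavior):
- The exact return value depends on the language. In Java, the equivalent is returning Optional.empty(). See the API documentation for language-specific guidance.
- This is the standard way to signal that your callback has finished its work (for example, logging, inspection, or minor changes to mutable input arguments such as llm_request) and that the ADK agent should continue with its normal operation.
- For before_* callbacks (before_agent, before_model, before_tool), returning None means the next step takes place (running the agent logic, calling the LLM, executing the tool).
- For after_* callbacks (after_agent, after_model, after_tool), returning None means the result just produced by the preceding step (the agent's output, the LLM's response, the tool's result) is used as is.
return <Specific Object> (override the default behavior):
- Returning an object of a specific type instead of None is how you override the ADK agent's default behavior. The framework uses the object you return and either skips the step that would normally follow or replaces the result that was just generated.
- before_agent_callback → types.Content: skips the agent's main execution logic (_run_async_impl / _run_live_impl). The returned Content object is immediately treated as the agent's final output for this turn. Useful for handling simple requests directly or enforcing access control.
- before_model_callback → LlmResponse: skips the call to the external large language model. The returned LlmResponse object is processed as if it were the actual response from the LLM. Ideal for input guardrails, prompt validation, or serving cached responses.
- before_tool_callback → dict or Map: skips the execution of the actual tool function (or sub-agent). The returned dict is used as the result of the tool call, which is then usually passed back to the LLM. Well suited to validating tool arguments, applying policy restrictions, or returning mocked or cached tool results.
- after_agent_callback → types.Content: replaces the Content that the agent's execution logic just produced.
- after_model_callback → LlmResponse: replaces the LlmResponse received from the LLM. Useful for sanitizing output, adding standard disclaimers, or changing the structure of the LLM's response.
- after_tool_callback → dict or Map: replaces the dict result returned by the tool. Allows post-processing or standardizing tool output before it is sent back to the LLM.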
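
The return-value contract above can be modeled in a few lines of plain Python. This is a toy model, not ADK's internal implementation: `run_with_hooks`, `echo_model`, `block_guardrail`, and `add_disclaimer` are hypothetical names, and strings stand in for Content, LlmResponse, and dict results. One assumption is baked in: when the before hook overrides the step, this sketch returns immediately without running the after hook, a detail the text above does not specify.

```python
from typing import Callable, Optional

Step = Callable[[str], str]
BeforeHook = Callable[[str], Optional[str]]
AfterHook = Callable[[str], Optional[str]]


def run_with_hooks(
    request: str,
    step: Step,
    before: Optional[BeforeHook] = None,
    after: Optional[AfterHook] = None,
) -> str:
    """Run one step, honoring the None-vs-object convention of its hooks."""
    if before is not None:
        override = before(request)
        if override is not None:
            return override          # step skipped, override used as result
    result = step(request)
    if after is not None:
        replacement = after(result)
        if replacement is not None:
            return replacement       # generated result replaced
    return result                    # default behavior: result used as is


def echo_model(request: str) -> str:
    return f"model says: {request}"


def block_guardrail(request: str) -> Optional[str]:
    return "blocked" if "BLOCK" in request.upper() else None


def add_disclaimer(result: str) -> Optional[str]:
    return result + " (unverified)"


if __name__ == "__main__":
    print(run_with_hooks("hello", echo_model, before=block_guardrail))
    print(run_with_hooks("please block this", echo_model, before=block_guardrail))
    print(run_with_hooks("hello", echo_model, after=add_disclaimer))
```

The same driver shape applies to all three pairs (agent, model, tool); only the types of the request and the result differ.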

Conceptual code example (guardrail):

This example shows a common pattern for a guardrail that uses before_model_callback.

Python:

# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from google.adk.agents import LlmAgent
from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmResponse, LlmRequest
from google.adk.runners import Runner
from typing import Optional
from google.genai import types 
from google.adk.sessions import InMemorySessionService

GEMINI_2_FLASH="gemini-2.0-flash"

# --- Define the Callback Function ---
def simple_before_model_modifier(
    callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
    """Inspects/modifies the LLM request or skips the call."""
    agent_name = callback_context.agent_name
    print(f"[Callback] Before model call for agent: {agent_name}")

    # Inspect the last user message in the request contents
    last_user_message = ""
    if llm_request.contents and llm_request.contents[-1].role == 'user':
        if llm_request.contents[-1].parts:
            # Fall back to "" when the first part carries no text, so .upper() below is safe
            last_user_message = llm_request.contents[-1].parts[0].text or ""
    print(f"[Callback] Inspecting last user message: '{last_user_message}'")

    # --- Modification Example ---
    # Add a prefix to the system instruction
    original_instruction = llm_request.config.system_instruction or types.Content(role="system", parts=[])
    prefix = "[Modified by Callback] "
    # Ensure system_instruction is Content and parts list exists
    if not isinstance(original_instruction, types.Content):
        # Handle case where it might be a string (though config expects Content)
        original_instruction = types.Content(role="system", parts=[types.Part(text=str(original_instruction))])
    if not original_instruction.parts:
        # parts may be None or empty; assign a fresh list instead of appending to None
        original_instruction.parts = [types.Part(text="")]

    # Modify the text of the first part
    modified_text = prefix + (original_instruction.parts[0].text or "")
    original_instruction.parts[0].text = modified_text
    llm_request.config.system_instruction = original_instruction
    print(f"[Callback] Modified system instruction to: '{modified_text}'")

    # --- Skip Example ---
    # Check if the last user message contains "BLOCK"
    if "BLOCK" in last_user_message.upper():
        print("[Callback] 'BLOCK' keyword found. Skipping LLM call.")
        # Return an LlmResponse to skip the actual LLM call
        return LlmResponse(
            content=types.Content(
                role="model",
                parts=[types.Part(text="LLM call was blocked by before_model_callback.")],
            )
        )
    else:
        print("[Callback] Proceeding with LLM call.")
        # Return None to allow the (modified) request to go to the LLM
        return None


# Create LlmAgent and Assign Callback
my_llm_agent = LlmAgent(
        name="ModelCallbackAgent",
        model=GEMINI_2_FLASH,
        instruction="You are a helpful assistant.", # Base instruction
        description="An LLM agent demonstrating before_model_callback",
        before_model_callback=simple_before_model_modifier # Assign the function here
)

APP_NAME = "guardrail_app"
USER_ID = "user_1"
SESSION_ID = "session_001"

# Session and Runner
async def setup_session_and_runner():
    session_service = InMemorySessionService()
    session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
    runner = Runner(agent=my_llm_agent, app_name=APP_NAME, session_service=session_service)
    return session, runner


# Agent Interaction
async def call_agent_async(query):
    content = types.Content(role='user', parts=[types.Part(text=query)])
    session, runner = await setup_session_and_runner()
    events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)

    async for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)

# Note: In Colab or another notebook, you can use 'await' at the top level, as below.
# In a standalone Python script, top-level 'await' is a syntax error; use
# asyncio.run(call_agent_async("write a joke on BLOCK")) instead.
await call_agent_async("write a joke on BLOCK")

Java:

import com.google.adk.agents.CallbackContext;
import com.google.adk.agents.LlmAgent;
import com.google.adk.events.Event;
import com.google.adk.models.LlmRequest;
import com.google.adk.models.LlmResponse;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.Part;
import io.reactivex.rxjava3.core.Flowable;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class BeforeModelGuardrailExample {

  private static final String MODEL_ID = "gemini-2.0-flash";
  private static final String APP_NAME = "guardrail_app";
  private static final String USER_ID = "user_1";

  public static void main(String[] args) {
    BeforeModelGuardrailExample example = new BeforeModelGuardrailExample();
    example.defineAgentAndRun("Tell me about quantum computing. This is a test.");
  }

  // --- Define your callback logic ---
  // Looks for the word "BLOCK" in the user prompt and blocks the call to LLM if found.
  // Otherwise the LLM call proceeds as usual.
  public Optional<LlmResponse> simpleBeforeModelModifier(
      CallbackContext callbackContext, LlmRequest llmRequest) {
    System.out.println("[Callback] Before model call for agent: " + callbackContext.agentName());

    // Inspect the last user message in the request contents
    String lastUserMessageText = "";
    List<Content> requestContents = llmRequest.contents();
    if (requestContents != null && !requestContents.isEmpty()) {
      Content lastContent = requestContents.get(requestContents.size() - 1);
      if (lastContent.role().isPresent() && "user".equals(lastContent.role().get())) {
        lastUserMessageText =
            lastContent.parts().orElse(List.of()).stream()
                .flatMap(part -> part.text().stream())
                .collect(Collectors.joining(" ")); // Concatenate text from all parts
      }
    }
    System.out.println("[Callback] Inspecting last user message: '" + lastUserMessageText + "'");

    String prefix = "[Modified by Callback] ";
    GenerateContentConfig currentConfig =
        llmRequest.config().orElse(GenerateContentConfig.builder().build());
    Optional<Content> optOriginalSystemInstruction = currentConfig.systemInstruction();

    Content conceptualModifiedSystemInstruction;
    if (optOriginalSystemInstruction.isPresent()) {
      Content originalSystemInstruction = optOriginalSystemInstruction.get();
      List<Part> originalParts =
          new ArrayList<>(originalSystemInstruction.parts().orElse(List.of()));
      String originalText = "";

      if (!originalParts.isEmpty()) {
        Part firstPart = originalParts.get(0);
        if (firstPart.text().isPresent()) {
          originalText = firstPart.text().get();
        }
        originalParts.set(0, Part.fromText(prefix + originalText));
      } else {
        originalParts.add(Part.fromText(prefix));
      }
      conceptualModifiedSystemInstruction =
          originalSystemInstruction.toBuilder().parts(originalParts).build();
    } else {
      conceptualModifiedSystemInstruction =
          Content.builder()
              .role("system")
              .parts(List.of(Part.fromText(prefix)))
              .build();
    }

    // This demonstrates building a new LlmRequest with the modified config.
    llmRequest =
        llmRequest.toBuilder()
            .config(
                currentConfig.toBuilder()
                    .systemInstruction(conceptualModifiedSystemInstruction)
                    .build())
            .build();

    System.out.println(
        "[Callback] Conceptually modified system instruction is: '"
            + llmRequest.config().get().systemInstruction().get().parts().get().get(0).text().get()
            + "'");

    // --- Skip Example ---
    // Check if the last user message contains "BLOCK"
    if (lastUserMessageText.toUpperCase().contains("BLOCK")) {
      System.out.println("[Callback] 'BLOCK' keyword found. Skipping LLM call.");
      LlmResponse skipResponse =
          LlmResponse.builder()
              .content(
                  Content.builder()
                      .role("model")
                      .parts(
                          List.of(
                              Part.builder()
                                  .text("LLM call was blocked by before_model_callback.")
                                  .build()))
                      .build())
              .build();
      return Optional.of(skipResponse);
    }
    System.out.println("[Callback] Proceeding with LLM call.");
    // Return Optional.empty() to allow the (modified) request to go to the LLM
    return Optional.empty();
  }

  public void defineAgentAndRun(String prompt) {
    // --- Create LlmAgent and Assign Callback ---
    LlmAgent myLlmAgent =
        LlmAgent.builder()
            .name("ModelCallbackAgent")
            .model(MODEL_ID)
            .instruction("You are a helpful assistant.") // Base instruction
            .description("An LLM agent demonstrating before_model_callback")
            .beforeModelCallbackSync(this::simpleBeforeModelModifier) // Assign the callback here
            .build();

    // Session and Runner
    InMemoryRunner runner = new InMemoryRunner(myLlmAgent, APP_NAME);
    // InMemoryRunner automatically creates a session service. Create a session using the service
    Session session = runner.sessionService().createSession(APP_NAME, USER_ID).blockingGet();
    Content userMessage =
        Content.fromParts(Part.fromText(prompt));

    // Run the agent
    Flowable<Event> eventStream = runner.runAsync(USER_ID, session.id(), userMessage);

    // Stream event response
    eventStream.blockingForEach(
        event -> {
          if (event.finalResponse()) {
            System.out.println(event.stringifyContent());
          }
        });
  }
}

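Understanding this mechanism of returning None versus returning a specific object lets you control the agent's execution path precisely, which makes callbacks an essential tool for building sophisticated and reliable agents with ADK.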