
How to track token usage

Prerequisites

This guide assumes familiarity with the following concepts:

Chat models
This page goes over how to track your token usage for specific calls.

Using AIMessage.response_metadata

A number of model providers return token usage information as part of the chat generation response. When available, this is included in the AIMessage.response_metadata field. Here's an example with OpenAI:

npm install @langchain/openai
import { ChatOpenAI } from "@langchain/openai";

const chatModel = new ChatOpenAI({
  model: "gpt-4-turbo",
});

const res = await chatModel.invoke("Tell me a joke.");

console.log(res.response_metadata);

/*
  {
    tokenUsage: { completionTokens: 15, promptTokens: 12, totalTokens: 27 },
    finish_reason: 'stop'
  }
*/

API Reference: ChatOpenAI from @langchain/openai
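
If you just need the numbers, you can read them straight off the metadata. Here's a minimal sketch based on the OpenAI-style shape shown above; other providers nest the counts under different field names:

import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ model: "gpt-4-turbo" });
const res = await model.invoke("Tell me a joke.");

// OpenAI-style metadata nests the counts under `tokenUsage`.
const { promptTokens, completionTokens, totalTokens } =
  res.response_metadata.tokenUsage;
console.log(`prompt: ${promptTokens}, completion: ${completionTokens}, total: ${totalTokens}`);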

And here's an example with Anthropic:

npm install @langchain/anthropic
import { ChatAnthropic } from "@langchain/anthropic";

const chatModel = new ChatAnthropic({
  model: "claude-3-sonnet-20240229",
});

const res = await chatModel.invoke("Tell me a joke.");

console.log(res.response_metadata);

/*
  {
    id: 'msg_017Mgz6HdgNbi3cwL1LNB9Dw',
    model: 'claude-3-sonnet-20240229',
    stop_sequence: null,
    usage: { input_tokens: 12, output_tokens: 30 },
    stop_reason: 'end_turn'
  }
*/

API Reference: ChatAnthropic from @langchain/anthropic
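
Note that Anthropic reports counts under usage with input_tokens and output_tokens rather than tokenUsage, so code that reads the metadata needs to match each provider's shape. A minimal sketch based on the output above:

import { ChatAnthropic } from "@langchain/anthropic";

const model = new ChatAnthropic({ model: "claude-3-sonnet-20240229" });
const res = await model.invoke("Tell me a joke.");

// Anthropic-style metadata nests the counts under `usage`.
const { input_tokens, output_tokens } = res.response_metadata.usage;
console.log(`input: ${input_tokens}, output: ${output_tokens}, total: ${input_tokens + output_tokens}`);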

Using callbacks

You can also use the handleLLMEnd callback to get the full output from the LLM, including token usage for supported models. Here's an example of how you could do that:

import { ChatOpenAI } from "@langchain/openai";

const chatModel = new ChatOpenAI({
  model: "gpt-4-turbo",
  callbacks: [
    {
      handleLLMEnd(output) {
        console.log(JSON.stringify(output, null, 2));
      },
    },
  ],
});

await chatModel.invoke("Tell me a joke.");

/*
  {
    "generations": [
      [
        {
          "text": "Why did the scarecrow win an award?\n\nBecause he was outstanding in his field!",
          "message": {
            "lc": 1,
            "type": "constructor",
            "id": [
              "langchain_core",
              "messages",
              "AIMessage"
            ],
            "kwargs": {
              "content": "Why did the scarecrow win an award?\n\nBecause he was outstanding in his field!",
              "tool_calls": [],
              "invalid_tool_calls": [],
              "additional_kwargs": {},
              "response_metadata": {
                "tokenUsage": {
                  "completionTokens": 17,
                  "promptTokens": 12,
                  "totalTokens": 29
                },
                "finish_reason": "stop"
              }
            }
          },
          "generationInfo": {
            "finish_reason": "stop"
          }
        }
      ]
    ],
    "llmOutput": {
      "tokenUsage": {
        "completionTokens": 17,
        "promptTokens": 12,
        "totalTokens": 29
      }
    }
  }
*/

API Reference: ChatOpenAI from @langchain/openai
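
Because handleLLMEnd fires after every completed call, it's also a convenient place to accumulate usage across multiple invocations. A minimal sketch, assuming the OpenAI-style llmOutput.tokenUsage shape shown above:

import { ChatOpenAI } from "@langchain/openai";

let totalTokens = 0;

const chatModel = new ChatOpenAI({
  model: "gpt-4-turbo",
  callbacks: [
    {
      handleLLMEnd(output) {
        // `llmOutput` is provider-specific; OpenAI-style models report
        // a `tokenUsage` object here. Guard against its absence.
        totalTokens += output.llmOutput?.tokenUsage?.totalTokens ?? 0;
      },
    },
  ],
});

await chatModel.invoke("Tell me a joke.");
await chatModel.invoke("Tell me another joke.");

console.log(`Total tokens across both calls: ${totalTokens}`);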

Next steps

You've now seen a few examples of how to track chat model token usage for supported providers.

Next, check out the other how-to guides on chat models in this section, like how to get a model to return structured output or how to add caching to your chat models.

