
How to track token usage

Prerequisites

This guide assumes familiarity with the following concepts:

Chat models
This page goes over how to track your token usage for specific calls.

Using AIMessage.response_metadata

A number of model providers return token usage information as part of the chat generation response. When available, this is included in the AIMessage.response_metadata field. Here's an example with OpenAI:

npm install @langchain/openai
import { ChatOpenAI } from "@langchain/openai";

const chatModel = new ChatOpenAI({
  model: "gpt-4-turbo",
});

const res = await chatModel.invoke("Tell me a joke.");

console.log(res.response_metadata);

/*
  {
    tokenUsage: { completionTokens: 15, promptTokens: 12, totalTokens: 27 },
    finish_reason: 'stop'
  }
*/

API Reference: ChatOpenAI from @langchain/openai
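
If you just need the numbers, you can read them straight off the metadata. Here's a minimal sketch based on the OpenAI-style shape shown above; other providers nest the counts under different field names:

import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ model: "gpt-4-turbo" });
const res = await model.invoke("Tell me a joke.");

// OpenAI-style metadata nests the counts under `tokenUsage`.
const { promptTokens, completionTokens, totalTokens } =
  res.response_metadata.tokenUsage;
console.log(`prompt: ${promptTokens}, completion: ${completionTokens}, total: ${totalTokens}`);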

And here's an example with Anthropic:

npm install @langchain/anthropic
import { ChatAnthropic } from "@langchain/anthropic";

const chatModel = new ChatAnthropic({
  model: "claude-3-sonnet-20240229",
});

const res = await chatModel.invoke("Tell me a joke.");

console.log(res.response_metadata);

/*
  {
    id: 'msg_017Mgz6HdgNbi3cwL1LNB9Dw',
    model: 'claude-3-sonnet-20240229',
    stop_sequence: null,
    usage: { input_tokens: 12, output_tokens: 30 },
    stop_reason: 'end_turn'
  }
*/

API Reference: ChatAnthropic from @langchain/anthropic
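
Note that Anthropic reports counts under usage with input_tokens and output_tokens rather than tokenUsage, so code that reads the metadata needs to match each provider's shape. A minimal sketch based on the output above:

import { ChatAnthropic } from "@langchain/anthropic";

const model = new ChatAnthropic({ model: "claude-3-sonnet-20240229" });
const res = await model.invoke("Tell me a joke.");

// Anthropic-style metadata nests the counts under `usage`.
const { input_tokens, output_tokens } = res.response_metadata.usage;
console.log(`input: ${input_tokens}, output: ${output_tokens}, total: ${input_tokens + output_tokens}`);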

Using callbacks

You can also use the handleLLMEnd callback to get the full output from the LLM, including token usage for supported models. Here's an example of how you could do that:

import { ChatOpenAI } from "@langchain/openai";

const chatModel = new ChatOpenAI({
  model: "gpt-4-turbo",
  callbacks: [
    {
      handleLLMEnd(output) {
        console.log(JSON.stringify(output, null, 2));
      },
    },
  ],
});

await chatModel.invoke("Tell me a joke.");

/*
  {
    "generations": [
      [
        {
          "text": "Why did the scarecrow win an award?\n\nBecause he was outstanding in his field!",
          "message": {
            "lc": 1,
            "type": "constructor",
            "id": [
              "langchain_core",
              "messages",
              "AIMessage"
            ],
            "kwargs": {
              "content": "Why did the scarecrow win an award?\n\nBecause he was outstanding in his field!",
              "tool_calls": [],
              "invalid_tool_calls": [],
              "additional_kwargs": {},
              "response_metadata": {
                "tokenUsage": {
                  "completionTokens": 17,
                  "promptTokens": 12,
                  "totalTokens": 29
                },
                "finish_reason": "stop"
              }
            }
          },
          "generationInfo": {
            "finish_reason": "stop"
          }
        }
      ]
    ],
    "llmOutput": {
      "tokenUsage": {
        "completionTokens": 17,
        "promptTokens": 12,
        "totalTokens": 29
      }
    }
  }
*/

API Reference: ChatOpenAI from @langchain/openai
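
Because handleLLMEnd fires after every completed call, it's also a convenient place to accumulate usage across multiple invocations. A minimal sketch, assuming the OpenAI-style llmOutput.tokenUsage shape shown above:

import { ChatOpenAI } from "@langchain/openai";

let totalTokens = 0;

const chatModel = new ChatOpenAI({
  model: "gpt-4-turbo",
  callbacks: [
    {
      handleLLMEnd(output) {
        // `llmOutput` is provider-specific; OpenAI-style models report
        // a `tokenUsage` object here. Guard against its absence.
        totalTokens += output.llmOutput?.tokenUsage?.totalTokens ?? 0;
      },
    },
  ],
});

await chatModel.invoke("Tell me a joke.");
await chatModel.invoke("Tell me another joke.");

console.log(`Total tokens across both calls: ${totalTokens}`);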

Next steps

You've now seen a few examples of how to track chat model token usage for supported providers.

Next, check out the other how-to guides on chat models in this section, like how to get a model to return structured output or how to add caching to your chat models.

