Limit tokens

I'm using the --chat option constantly. After some days, I got the error:
RateLimitError: Error code: 429 -
{'error':
{'message': 'Request too large for gpt-4o in organization org-78asdf87asdf9aaa8976 on tokens per min (TPM):
Limit 30000, Requested 31538.
The input or output tokens must be reduced in order to run successfully.
Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}}
The only thing that helps here is to start a new chat:
sgpt --chat myNEWlongchat "Please calculate 2+2?"
But with this change I lose all the chat context, which I would like to avoid.
Instead of capping the chat HISTORY length, limit the input tokens: when the token count is above the quota, delete the oldest message(s) from the chat history.
Example
Something like this, added to class ChatSession:
import json

import tiktoken

TOKEN_LIMIT = 30000

def _limit_tokens(self, chat_id: str) -> None:
    # Drop the oldest question/answer pair until the estimate fits the quota.
    while self._count_tokens(chat_id) > TOKEN_LIMIT:
        messages = self._read(chat_id)
        # Remove the 2nd and 3rd message (the oldest question and answer),
        # keeping the system prompt at index 0.
        truncated_messages = messages[:1] + messages[3:]
        if truncated_messages == messages:
            break  # nothing left to remove; avoid an infinite loop
        self._write(truncated_messages, chat_id)

def _count_tokens(self, chat_id: str) -> int:
    file_path = self.storage_path / chat_id
    parsed_cache = json.loads(file_path.read_text())
    text_to_encode = " ".join(
        message["content"] for message in parsed_cache if "content" in message
    )
    tokenizer = tiktoken.get_encoding("o200k_base")  # encoding used by gpt-4o
    return len(tokenizer.encode(text_to_encode))
Call _limit_tokens() from ChatSession.wrapper().
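For illustration, a minimal sketch of that call site, assuming wrapper() receives the chat id as a keyword argument; the actual ChatSession.wrapper() in shell_gpt may be shaped differently:

# Hypothetical call site, not the actual shell_gpt source: assumes the
# decorated function receives chat_id as a keyword argument.
def __call__(self, func):
    def wrapper(*args, **kwargs):
        chat_id = kwargs.get("chat_id")
        if chat_id:
            # Trim the stored history to TOKEN_LIMIT before the request
            # for this chat is assembled and sent.
            self._limit_tokens(chat_id)
        return func(*args, **kwargs)
    return wrapper

Trimming before the request is sent means an oversized history never reaches the API, so the 429 above should no longer occur from history growth alone (a single very large prompt could still exceed the TPM limit).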