Inconsistent response lengths even when using the same prompt

prabhu 20 Reputation points
2025-03-20T17:17:50.76+00:00

I'm using Azure OpenAI GPT-4o for text generation in a chatbot, but I'm noticing inconsistent response lengths even when using the same prompt. Sometimes the output is too short, even though I set a high max_tokens value.

I've tried adjusting temperature, top_p, and frequency penalty, but the issue persists. Is there a way to ensure more consistent and longer responses without forcing a fixed output length? Would fine-tuning help, or is there a better approach using Azure AI settings?


Accepted answer
Azar 28,155 Reputation points MVP
2025-03-20T17:45:19.01+00:00

Hi there prabhu,

Thanks for using the Q&A platform.

Try increasing max_tokens to allow for longer replies, and adjust temperature (e.g., 0.5) and top_p for better control over randomness. Providing explicit instructions in your prompt, such as "Give a detailed explanation with examples," can also improve response length. Make sure stop sequences aren't unintentionally cutting off responses. If context is lacking, use Azure AI Search to retrieve relevant information before querying the model. A sketch of these settings follows below.
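
As an illustration, here is a minimal sketch using the openai Python SDK (v1+) against an Azure OpenAI deployment. The endpoint, key, API version, and deployment name are placeholders you would replace with your own; checking finish_reason shows whether a short reply was truncated by max_tokens or the model simply chose to stop.

```python
import os
from openai import AzureOpenAI

# Placeholder resource details; substitute your own endpoint, key, and version.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",  # assumption: any current chat-completions GA version works
)

response = client.chat.completions.create(
    model="gpt-4o",  # your Azure *deployment* name, not the base model name
    messages=[
        # Explicit length guidance in the system message nudges longer output.
        {"role": "system", "content": "Give a detailed explanation with examples."},
        {"role": "user", "content": "Explain how token limits affect chatbot replies."},
    ],
    max_tokens=1500,   # leave room for an extended reply
    temperature=0.5,   # lower randomness -> more consistent length and content
    top_p=0.9,
    stop=None,         # ensure no stop sequence truncates the answer
)

choice = response.choices[0]
print(choice.message.content)

# "length" means the reply hit max_tokens; "stop" means the model ended naturally.
print("finish_reason:", choice.finish_reason)
```

If short replies come back with finish_reason "stop", the model is ending on its own, so prompt wording matters more than raising max_tokens further.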

If this helps, kindly accept the answer, thanks.

    1 person found this answer helpful.
