Streaming AskAsync response #625
8 comments · 7 replies
-
+1, that's on the list. I agree it's highly needed for a responsive UI. We don't have a timeline yet, but it's high on the priority list.
-
Hear, hear. Since virtually all of the LLMs stream by default, it would be nice to just add AskStreamingAsync and return IAsyncEnumerable. I think only one method needs to be added to ITextGenerator, and then the included implementations updated.
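For illustration, a minimal sketch of what that could look like; AskStreamingAsync, GenerateTextStreamingAsync, and the exact interface shapes here are assumptions, not the library's current API:

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch, not Kernel Memory's actual contracts.
public interface ITextGenerator
{
    // The one addition suggested above: token-by-token generation that
    // each included connector (OpenAI, Azure OpenAI, ...) would implement
    // by forwarding the provider's native streaming.
    IAsyncEnumerable<string> GenerateTextStreamingAsync(
        string prompt,
        CancellationToken cancellationToken = default);
}

// Surfaced on the client as the proposed AskStreamingAsync.
public interface IStreamingMemoryClient
{
    IAsyncEnumerable<string> AskStreamingAsync(
        string question,
        CancellationToken cancellationToken = default);
}
```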
-
@dluc We also need this. GPT-4 takes 10+ seconds, which is not a good user experience; people expect streaming results. I may be able to assist with the implementation. I've analysed the solution and I think it's a bit more than just adding a method to ITextGenerator as @JohnGalt1717 suggested, but I may be wrong. The pieces that would need changes:
- SearchClient
- The API endpoint
- The WebClient

Lots of moving parts, but it would be awesome to have that feature! See the endpoint sketch below.
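To make the API-endpoint piece concrete, here is a rough sketch using ASP.NET Core minimal APIs and server-sent events; the route, the AskRequest type, and AskStreamingAsync are all assumptions carried over from the interface sketch above:

```csharp
var builder = WebApplication.CreateBuilder(args);
// Assumes an IStreamingMemoryClient implementation is registered in DI.
var app = builder.Build();

app.MapPost("/ask/stream", async (HttpContext ctx, IStreamingMemoryClient memory, AskRequest req) =>
{
    ctx.Response.ContentType = "text/event-stream";
    await foreach (string token in memory.AskStreamingAsync(req.Question, ctx.RequestAborted))
    {
        // One SSE "data" frame per token; flush so the client sees it immediately.
        await ctx.Response.WriteAsync($"data: {token}\n\n", ctx.RequestAborted);
        await ctx.Response.Body.FlushAsync(ctx.RequestAborted);
    }
});

app.Run();

public record AskRequest(string Question);
```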
-
@dluc Any update on this feature? It seems that SearchClient.cs:AskAsync() already consumes an IAsyncEnumerable<string>, but buffers the text to include citations etc. As @roldengarm mentioned above, it might not be the best solution to send the relevant sources with each part of the stream. Maybe a good solution would be to just return the answer (IAsyncEnumerable<string>), since the relevant sources can be fetched beforehand by the calling code using SearchClient.cs:SearchAsync().
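In code, that calling pattern would be something like the following; SearchAsync and SearchResult exist today, while AskStreamingAsync is still the hypothetical method from the sketch above and the citation properties may differ:

```csharp
// 1) Fetch the relevant sources up front with the existing search API.
SearchResult sources = await memory.SearchAsync("How do I configure pipelines?");
foreach (var citation in sources.Results)
{
    Console.WriteLine($"Source: {citation.SourceName}");
}

// 2) Then stream only the answer text (hypothetical API).
await foreach (string token in memory.AskStreamingAsync("How do I configure pipelines?"))
{
    Console.Write(token);
}
```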
-
@JonathanVelkeneers I should be able to look at this next week - been busy with the new file download feature (just merged)
-
To be honest, I'm not sure it makes sense to keep the idea of returning an object at all. We cannot have a value for NoResult, for example, since we just don't know whether there will be one. So I'm pretty much with @JonathanVelkeneers on that: if you need the sources (and cannot tell the LLM to include them directly in the response), you might need to fetch them separately.
-
I now use the kernel directly: I search first, then splice the results into a prompt and call InvokeStreamingAsync. It would be better if Ask could support streaming.
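That workaround might look roughly like this, assuming Semantic Kernel's InvokePromptStreamingAsync as the streaming call and Kernel Memory's SearchAsync for retrieval; the prompt splicing is illustrative:

```csharp
using System.Linq;

// Retrieve grounding text with Kernel Memory...
SearchResult search = await memory.SearchAsync(question);
string facts = string.Join("\n", search.Results
    .SelectMany(citation => citation.Partitions)
    .Select(partition => partition.Text));

// ...splice it into a prompt...
string prompt = $"Facts:\n{facts}\n\nQuestion: {question}\nAnswer:";

// ...and stream the completion through Semantic Kernel.
await foreach (var chunk in kernel.InvokePromptStreamingAsync(prompt))
{
    Console.Write(chunk); // each chunk renders as partial answer text
}
```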
-
Hi all, FYI response streaming is now available in
-
Hi,
We're using your library in our project. It got us up to speed pretty fast on these new AI topics, so thank you for that :).
I was wondering if there is an option to receive a streaming response from our semantic memory service. This could improve the user experience: receiving the first tokens of the response instead of waiting for the whole thing. OpenAI's ChatGPT works this way.
BR,
Dawid