Replies: 3 comments 7 replies
-
One opportunity is if you have a tool call that fetches data from an API endpoint you are building out. You could do RAG there and return data, which the LLM would then use for inference.
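A minimal sketch of that pattern in Python. The endpoint URL, response shape, and function names are all illustrative stand-ins, and the fetch is stubbed rather than a real HTTP call:

```python
import json

# Stand-in for an HTTP GET against an API endpoint you control
# (in practice e.g. urllib.request.urlopen); URL and payload are made up.
def fetch_endpoint(url: str) -> str:
    fake_responses = {
        "https://api.example.com/articles": json.dumps([
            {"title": "Intro to MCP", "body": "MCP connects LLMs to tools."},
            {"title": "RAG basics", "body": "RAG grounds answers in retrieved text."},
        ])
    }
    return fake_responses[url]

# The tool handler: fetch from the endpoint, then do retrieval over the
# response (here, naive keyword-overlap ranking) before returning data.
def search_articles(query: str) -> list[dict]:
    articles = json.loads(fetch_endpoint("https://api.example.com/articles"))
    words = query.lower().split()
    return sorted(articles, key=lambda a: -sum(w in a["body"].lower() for w in words))
```

The LLM would call such a function as a tool and use the ranked results during inference; a real implementation would swap the keyword ranking for embedding search.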
-
Yes, RAG could be represented either through resource templates or tools. MCP can also be useful to standardize RAG ingest in addition to retrieval. We haven't yet felt the need for any specific RAG-related features in the spec or SDKs, but if you think some are necessary, we'd love to hear more.
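To illustrate the ingest-plus-retrieval point, a hedged sketch of two tool handlers as plain Python functions. The tool names and the in-memory index are illustrative, not anything from the spec or SDKs:

```python
# In-memory stand-in for a server-side document index or vector store.
_index: dict[str, str] = {}

def ingest_document(doc_id: str, text: str) -> str:
    """Hypothetical ingest tool: add a document to the index."""
    _index[doc_id] = text
    return f"indexed {doc_id}"

def retrieve(query: str, top_k: int = 3) -> list[str]:
    """Hypothetical retrieval tool: naive keyword-overlap ranking."""
    words = query.lower().split()
    ranked = sorted(
        _index.items(),
        key=lambda kv: sum(w in kv[1].lower() for w in words),
        reverse=True,
    )
    return [text for _, text in ranked[:top_k]]
```

Standardizing both sides would let one MCP server own the whole RAG lifecycle rather than only query-time retrieval.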
-
I think there's an argument for RAG to be supported, even if indirectly through some kind of context enrichment/transformation mechanism. RAG is really context-dependent context. Could this be represented with a tool? Yes, but so could resources and prompts in principle.

So if RAG is to be supported, the best fit is some kind of primitive that takes a context (a message array?) and returns some kind of injection or mutation. A delta would be most efficient but adds a lot of server complexity, so perhaps (context, additional_params) to (context_addition, hint) could work instead; it moves the complexity to the client. This capability could facilitate RAG by providing a standard way to enrich context. It doesn't work for RAG that happens post-inference, but tools can already do that.
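As a concrete (purely hypothetical) shape for that (context, additional_params) to (context_addition, hint) primitive, sketched in plain Python; none of these names exist in MCP:

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str
    content: str

@dataclass
class EnrichResult:
    context_addition: list[Message]  # messages for the client to splice in
    hint: str                        # where/how the client should apply them

def enrich_context(context: list[Message], additional_params: dict) -> EnrichResult:
    """Hypothetical server handler: retrieve against the last user message."""
    corpus = {
        "mcp": "MCP standardizes how clients connect LLMs to tools and data.",
        "rag": "RAG retrieves documents at query time to ground generation.",
    }
    query = context[-1].content.lower()
    hits = [text for key, text in corpus.items() if key in query]
    addition = [Message("system", f"Retrieved: {t}") for t in hits]
    return EnrichResult(context_addition=addition, hint="append-before-last-user")
```

The client applies context_addition according to hint, which is where the complexity lands: the server only ever computes an addition, never rewrites the conversation itself.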
-
Discussion Topic
I've been having a hard time understanding how RAG is supposed to work with MCP. My best guess is that a server that facilitates RAG would expose one resource template like
Is this how one would do RAG through MCP? I didn't see any reference to RAG in the documentation. I would have expected to find some given that
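For illustration only, here is one guess at how a server might resolve a search-style resource template; the template URI, corpus, and handler are my own inventions, not the SDK API:

```python
from urllib.parse import unquote

# Illustrative corpus the server retrieves over.
CORPUS = {
    "mcp": "MCP standardizes connections between LLM clients and servers.",
    "rag": "RAG augments generation with retrieved documents.",
}

def read_resource(uri: str) -> str:
    """Resolve a read against a hypothetical docs://search/{query} template."""
    prefix = "docs://search/"
    if not uri.startswith(prefix):
        raise ValueError(f"unknown resource: {uri}")
    query = unquote(uri[len(prefix):]).lower()
    hits = [text for key, text in CORPUS.items() if key in query]
    return "\n".join(hits) or "(no matches)"
```

Under this reading, the client performs RAG by reading a templated resource URI whose expansion carries the query, and the server returns the retrieved text as the resource contents.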