.Net: Bug: Some vector DBs return scores that do not match the definition of the distance function #10103

Open
westey-m opened this issue Jan 7, 2025 · 0 comments
Labels: bug (Something isn't working), memory connector, .NET (Issue or Pull requests regarding .NET code), sk team issue (A tag to denote issues that were created by the Semantic Kernel team, i.e., not the community)

Comments

Contributor

westey-m commented Jan 7, 2025

Describe the bug
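For the purposes of this bug description, the following two vectors have been used. The Identical columns below compare the first vector with itself; the Opposite columns compare the first vector with the second: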
[1, 0, 0, 0]
[-1, 0, 0, 0]

Examples:

| DB | DistanceFunction | IdenticalActual | IdenticalExpected | OppositeActual | OppositeExpected |
|---|---|---|---|---|---|
| AzureAISearch | CosineSimilarity | 1 | 1 | 0.3333334 | -1 |
| AzureAISearch | DotProductSimilarity | 1 | 1 | 0.2 | -1 |
| AzureAISearch | EuclideanDistance | 1 | 0 | 0.05882353 | 2 |
| Pinecone | EuclideanDistance | 0 | 0 | 4 | 2 |
| Redis | DotProductSimilarity | 0 | 1 | 2 | -1 |
| Redis | EuclideanDistance | 0 | 0 | 4 | 2 |
| Weaviate | DotProductSimilarity | -1 | 1 | 1 | -1 |
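
The Expected columns follow directly from the textbook definitions of the three distance functions. The sketch below recomputes them for the two vectors used in this issue. It is plain Python written only to illustrate the arithmetic; the function names are hypothetical and it does not call Semantic Kernel or any connector.

```python
import math


def dot_product_similarity(a, b):
    # Sum of element-wise products.
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes.
    norm_a = math.sqrt(dot_product_similarity(a, a))
    norm_b = math.sqrt(dot_product_similarity(b, b))
    return dot_product_similarity(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    # Square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


BASE = [1, 0, 0, 0]
OPPOSITE = [-1, 0, 0, 0]

FUNCTIONS = {
    "CosineSimilarity": cosine_similarity,
    "DotProductSimilarity": dot_product_similarity,
    "EuclideanDistance": euclidean_distance,
}

# name -> (score against the identical vector, score against the opposite vector)
EXPECTED = {
    name: (fn(BASE, BASE), fn(BASE, OPPOSITE)) for name, fn in FUNCTIONS.items()
}

for name, (identical, opposite) in EXPECTED.items():
    print(f"{name}: identical={identical}, opposite={opposite}")
```

The resulting pairs match the IdenticalExpected and OppositeExpected columns for each distance function.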

We should test all distance functions for all connectors to ensure that they return the expected values.
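
As a starting point for such tests, it may help to note that several of the Actual values in the examples are numerically consistent with simple transformations of the expected score. The sketch below checks three candidate transformations (one minus the dot product, the negated dot product, and the squared Euclidean distance) against the reported numbers. The pairing of each transformation with a connector is an assumption inferred from the reported values, not confirmed connector behavior, and the AzureAISearch rows are deliberately left out because no equally simple candidate is proposed for them here.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


BASE = [1, 0, 0, 0]
OPPOSITE = [-1, 0, 0, 0]

# (connector, distance function) -> (IdenticalActual, OppositeActual) as reported.
OBSERVED = {
    ("Redis", "DotProductSimilarity"): (0, 2),
    ("Weaviate", "DotProductSimilarity"): (-1, 1),
    ("Pinecone", "EuclideanDistance"): (0, 4),
    ("Redis", "EuclideanDistance"): (0, 4),
}

# Candidate transformations; these pairings are assumptions, not documented facts.
CANDIDATES = {
    ("Redis", "DotProductSimilarity"): lambda a, b: 1 - dot(a, b),
    ("Weaviate", "DotProductSimilarity"): lambda a, b: -dot(a, b),
    ("Pinecone", "EuclideanDistance"): squared_euclidean,
    ("Redis", "EuclideanDistance"): squared_euclidean,
}


def candidate_matches(key):
    # True when the candidate reproduces both reported values for this row.
    fn = CANDIDATES[key]
    return (fn(BASE, BASE), fn(BASE, OPPOSITE)) == OBSERVED[key]


RESULTS = {key: candidate_matches(key) for key in OBSERVED}

for (connector, function), ok in RESULTS.items():
    print(f"{connector} {function}: candidate matches reported values = {ok}")
```

If a candidate holds for a given connector, converting its raw score back to the defined value would be a one-line inverse (for example, taking the square root of a squared Euclidean distance), but that should be verified against each database's documentation first.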

To Reproduce

Create a Redis vector store collection using DotProductSimilarity. Add a record with the vector [1, 0, 0, 0], then run a vector search using the same vector. The score returned is -1.

Expected behavior
The score returned is 1.

Platform

  • Language: C#
westey-m added the .NET, bug, memory connector, and sk team issue labels Jan 7, 2025
westey-m self-assigned this Jan 7, 2025
github-actions bot changed the title from "Bug: Some vector store connectors return DotProductDistance/Negative DotProduct scores when choosing DotProductSimilarity" to ".Net: Bug: Some vector store connectors return DotProductDistance/Negative DotProduct scores when choosing DotProductSimilarity" Jan 7, 2025
markwallace-microsoft moved this to Sprint: Planned in Semantic Kernel Jan 7, 2025
westey-m changed the title from ".Net: Bug: Some vector store connectors return DotProductDistance/Negative DotProduct scores when choosing DotProductSimilarity" to ".Net: Bug: Some vector DBs return scores that do not match the definition of the distance function" Jan 8, 2025
markwallace-microsoft moved this from Sprint: Planned to Sprint: In Progress in Semantic Kernel Jan 8, 2025
github-merge-queue bot pushed a commit that referenced this issue Jan 10, 2025
…10144)

### Motivation and Context

Adds tests for verifying the scores as mentioned in #10103
Only adding tests for passing connectors right now.

### Description

Adding a common base test class with integration tests for checking the
returned vector scores.
Adding subclasses for InMemory, Qdrant and Postgres, since these are
passing end to end.
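
The shape of such a shared base test can be sketched as follows. This is an illustrative Python outline only, not the C# integration tests added by the commit: `InMemoryStore`, `VectorScoreTestsBase`, and `create_store` are hypothetical names, and the in-memory store is a brute-force stand-in rather than a real connector.

```python
import math
import unittest


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# distance function name -> (definition, expected identical, expected opposite)
DEFINITIONS = {
    "CosineSimilarity": (cosine, 1.0, -1.0),
    "DotProductSimilarity": (dot, 1.0, -1.0),
    "EuclideanDistance": (euclidean, 0.0, 2.0),
}


class InMemoryStore:
    """Hypothetical stand-in for a connector: scores every stored vector."""

    def __init__(self, distance_function):
        self._score = DEFINITIONS[distance_function][0]
        self._records = {}

    def upsert(self, key, vector):
        self._records[key] = vector

    def search(self, query):
        return {key: self._score(query, vec) for key, vec in self._records.items()}


class VectorScoreTestsBase:
    """Shared checks; each connector subclass only supplies create_store."""

    def create_store(self, distance_function):
        raise NotImplementedError

    def test_scores_match_definition(self):
        for name, (_, identical, opposite) in DEFINITIONS.items():
            with self.subTest(distance_function=name):
                store = self.create_store(name)
                store.upsert("identical", [1, 0, 0, 0])
                store.upsert("opposite", [-1, 0, 0, 0])
                scores = store.search([1, 0, 0, 0])
                self.assertAlmostEqual(scores["identical"], identical)
                self.assertAlmostEqual(scores["opposite"], opposite)


class InMemoryVectorScoreTests(VectorScoreTestsBase, unittest.TestCase):
    def create_store(self, distance_function):
        return InMemoryStore(distance_function)
```

Keeping the base class out of the `unittest.TestCase` hierarchy means only the concrete subclasses are collected, so adding another connector is a matter of adding one subclass that overrides `create_store`.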

### Contribution Checklist

<!-- Before submitting this PR, please make sure: -->

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄