
Picking the right embedding model is crucial.

Can't go to warp without a good vectorization engine!

Choosing Embedding Models

Qdrant supports dense, sparse, and multivector embeddings. However, the vectors' performance largely depends on the embedding model and its ability to capture meaning.

What's the best model for you?

1️⃣ Look at benchmarks like MTEB or BEIR. But remember: the top model might not perform the same in your setup. You need to experiment!

2️⃣ Ask the community & experts. On Qdrant's Discord, I am seeing nomic-embed-text perform better than OpenAI's text-embedding-3-small for text retrieval. Try to learn from the experiences of others.
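
Here's one way the "you need to experiment" advice could look in practice. This is a minimal sketch, not a full benchmark: it scores two candidate models by recall@k on a handful of your own labeled queries. The names `model_a` and `model_b`, the helper `recall_at_k`, and the tiny 2-dimensional vectors are all made up for illustration. In a real test you'd swap in the embeddings your candidate models actually produce for your documents and queries.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_at_k(query_vecs, doc_vecs, relevant, k=1):
    """Fraction of queries whose labeled relevant doc is in the top-k results."""
    hits = 0
    for qid, qvec in query_vecs.items():
        # Rank all document ids by similarity to this query, best first.
        ranked = sorted(
            doc_vecs, key=lambda d: cosine(qvec, doc_vecs[d]), reverse=True
        )
        if relevant[qid] in ranked[:k]:
            hits += 1
    return hits / len(query_vecs)


# Ground truth from YOUR data: which document should each query retrieve?
relevant = {"q1": "d1", "q2": "d2"}

# Toy stand-ins for the vectors two hypothetical models would produce.
embeddings = {
    "model_a": {
        "docs": {"d1": [1.0, 0.0], "d2": [0.0, 1.0]},
        "queries": {"q1": [0.9, 0.1], "q2": [0.2, 0.8]},
    },
    "model_b": {
        "docs": {"d1": [1.0, 0.0], "d2": [0.0, 1.0]},
        "queries": {"q1": [0.9, 0.1], "q2": [0.7, 0.3]},
    },
}

scores = {
    name: recall_at_k(vecs["queries"], vecs["docs"], relevant, k=1)
    for name, vecs in embeddings.items()
}
print(scores)  # {'model_a': 1.0, 'model_b': 0.5}
```

Even a few dozen labeled query/document pairs from your own domain can tell you more than a leaderboard position, because the ranking is measured on the data you actually care about.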

💬 What's your go-to embedding model? Are you using e5, nomic, OpenAI's embeddings, or something else? Let me know in the comments!

P.S. Cohere Embed 3 just dropped: a multimodal model that handles complex docs and catalogs better than CLIP. Worth a try?