A Practical Guide to Deploying LMM-Powered Apps with CLIP and pgvector
In this article we’ll show how we built an image search demo in Aiven Apps. The demo uses CLIP, a Large Multimodal Model (LMM), to turn a user’s text prompt into a vector. That vector is compared with precomputed vectors for a corpus of images, so the user can find images that match their text. In this example the LMM input (the text prompt) comes from the user, but the principle is the same for an internally generated query.
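To make the comparison step concrete, here is a minimal sketch of ranking precomputed image vectors against a query vector by cosine similarity. It is an illustration under stated assumptions, not the demo's actual code: the three-dimensional vectors are made-up stand-ins for real CLIP embeddings (which are much longer), and the names `cosine_similarity`, `rank_images`, `IMAGE_VECTORS` and `QUERY_VECTOR` are hypothetical. In a deployment like the one described here, this comparison would typically be done by pgvector inside PostgreSQL rather than in application code.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(query_vector, image_vectors, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in image_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Assumption: toy 3-dimensional vectors standing in for precomputed
# CLIP image embeddings. Real embeddings would come from the model.
IMAGE_VECTORS = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}

# Assumption: toy vector standing in for the embedding of a text prompt.
QUERY_VECTOR = [0.8, 0.2, 0.1]

results = rank_images(QUERY_VECTOR, IMAGE_VECTORS)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

The text prompt and the images only need to be embedded into the same vector space; after that, search reduces to finding the stored vectors closest to the query vector.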