📝 A curated list about Personalized Multimodal Models, Personalized Representation Learning~ 📚
Problem Settings: Using 3-5 images of a novel concept/subject (e.g., a pet named <bo> ), can we personalize Large Multimodal Models so that:(1) They retain their original capabilities (e.g., Describe a dog) while (2) Enabling tailored their capabilities for the novel concept? (e.g., Describe <bo> ) |
Please feel free to create pull requests or an issue to add/ correct anything. I really appreciate any help or clarification!
* 🙋♀️ Personalization has been extensively explored in AI/ML/CV… It’s now time for personalizing Large Multimodal Models! 🙋♀️* |
---|
Over the years, we’ve witnessed the evolution of personalization across various tasks (e.g., object segmentation, image generation). Now, with the rise of Large Multimodal Models (LMMs) – We have opportunities to personalizing these generalist, large-scale AI systems. It’s time to take the leap and bring personalization into the realm of Large Multimodal Models, making them not only powerful but also user-specific! |
^ Above caption are actually generated by GPT-4o, I feed it the figure and asked it to generate a caption, haha! |
(This figure is created by me. If there is anything incorrect, please feel free to correct me! Thank you!)
⚠️ Minor Note: The listed works below are specified for settings where users provide 3-5 images, and the system needs to learn about those concepts. There is research on other subtopics (e.g., role-playing, persona, etc.). For these topics, this repo might provide better coverage.
Title | Venue | Year | Link/ Code |
---|---|---|---|
Personalized Representation from Personalized Generation | arXiv | 2024 | Code |
“This is my unicorn, Fluffy”: Personalizing frozen vision-language representations | ECCV | 2024 | Code |
Name | Year | # Concepts | Link | Notes |
---|---|---|---|---|
ConCon-Chi | 2024 | 20 | GitHub | with ConCon-Chi |
PODS | 2024 | 100 | GitHub | with personalized-rep |
MC-LLaVA | 2024 | – | GitHub | with MC-LLaVA, multiple concepts |
Yo’LLaVA | 2024 | 40 | GitHub | with Yo’LLaVA, single concept |
MyVLM | 2024 | 29 | GitHub | with MyVLM, single concept |
⣶⣶⣶⣶⣶⣖⣒⡄⠀⣶⡖⠲⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢠⣤⠠⡄⠀⠀⠀⠀ ⠙⠛⣿⣿⣿⡟⠛⠃⢀⣿⣿⣆⣦⣴⠂⠤⠀⠀⠀⣠⣤⣴⣆⠠⢄⠀⠀⠀⣤⡤⢤⣤⣤⠤⢄⠀⠀⢻⣿⣦⡇⢀⣤⢤⠀ ⠀⢀⣿⣿⣿⡇⠀⠀⢸⣿⣿⣿⠛⣿⣷⣄⡇⠀⣼⣿⣿⡟⢿⣷⡄⣣⠀⢘⣿⣿⣿⠿⣿⣧⣈⡆⠀⢹⣿⣿⣷⣾⣧⣴⠀ ⠀⢰⣿⣿⣿⠀⠀⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⠙⠛⣻⣧⣾⣿⣿⡷⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⢸⣿⣿⣿⣿⣿⡇⠀ ⠀⢸⣿⣿⣿⠀⠀⠀⢸⣿⣿⡿⠀⣿⣿⣿⠃⠀⣰⣾⣿⡿⣿⣿⣿⣟⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⢸⣿⣿⣿⣿⡏⢇⠀ ⠀⣼⣿⣿⣿⠀⠀⠀⣸⣿⣿⣟⢠⣿⣿⣿⠀⠀⣿⣿⡟⣇⣾⣿⣿⣯⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⢼⣿⣿⣿⣿⣷⡈⡀ ⠀⠻⠿⠿⠟⠀⠀⠀⠻⠿⠿⠏⠸⣿⣿⣿⠀⠀⢿⣿⣿⣿⣿⣿⣿⡇⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⣿⣿⣿⡟⢻⣿⣧⣇ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠀⠀⠉⠉⠀⠀⠀⠉⠉⠁⠀⠉⠉⠉⠀⠀⠘⠙⠋⠁⠈⠋⠛⠉ ⠀⠀⠀⠀⠀⠀⢀⣠⣤⡀⠀⢀⣀⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣤⡤⠠⡄⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⢹⣿⣄⠱⣠⣿⣧⣴⠀⠀⣠⣤⣤⣀⣀⡀⠀⠀⢀⣤⠤⡀⢀⣠⡤⢄⠀⠈⣿⣿⣦⡇⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠈⢿⣿⣷⣿⣿⣿⡏⠀⣾⣿⣿⣿⣶⣄⡉⡄⠀⣿⣿⣤⣝⢸⣿⣦⣼⠀⠀⣿⣿⣿⡇⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⢿⣿⣿⣿⠏⠀⠐⣿⣿⣿⠉⣿⣿⣷⡇⠀⣽⣿⣿⣯⢸⣿⣿⣿⠀⠀⢹⣿⣿⡇⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⢸⣿⣿⣿⠀⠀⢠⣿⣿⣿⠀⣿⣿⣿⡇⠀⣻⣿⣿⡷⢸⣿⣿⣿⠀⠀⢸⣿⣿⠇⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⢸⣿⣿⣿⠀⠀⠀⢿⣿⣿⣄⣿⣿⣿⠇⠀⢹⣿⣿⣿⣸⣿⣿⣿⠀⠀⢠⣽⣧⡄⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠛⠛⠋⠀⠀⠀⠈⠛⠛⠛⠛⠛⠉⠀⠀⠈⠛⠛⠛⠋⠛⠛⠋⠀⠀⠈⠛⠛⠁⠀⠀⠀⠀⠀⠀⠀
And good luck with your research! 🤗✨