Multimodal LLM with a robot arm, SDXL 1.0, HealthScribe by Amazon, OverflowAI, Generative-AI based virtual room styler by Wayfair and more

Greetings and welcome to this week's AI Brews for a concise roundup of the week's major developments in AI.

In today’s issue (Issue #25):

AI Pulse: Weekly News & Insights at a Glance
AI Toolbox: Product Picks of the Week
AI Skillset: Learn & Build

AI Pulse: Weekly News & Insights at a Glance

Stability AI released SDXL 1.0, the next iteration of their open text-to-image generation model. SDXL 1.0 has one of the largest parameter counts of any open access image model: a 3.5B parameter base model and a 6.6B parameter ensemble pipeline that combines the base model with a refiner [Details].
Amazon introduced AWS HealthScribe, an API to create transcripts, extract details and create summaries from doctor-patient discussions that can be entered into an electronic health record (EHR) system. The transcripts from HealthScribe can be converted into patient notes by the platform’s machine learning models [Details].
Researchers from Nvidia and Stanford, among others, unveiled VIMA, a multimodal LLM with a robot arm attached. VIMA is an embodied AI agent that perceives its environment and takes actions in the physical world, one step at a time [Details].
Stack Overflow announced its own generative AI initiative, OverflowAI. It includes generative AI-based search and an assistant built on their database of 58 million Q&As, complete with sources cited in the answers. A Visual Studio Code extension will also be released [YouTube Demo | Details].
Google researchers present Med-PaLM M, a large multimodal generative model fine-tuned for biomedical applications. It interprets biomedical data including clinical language, imaging, and genomics with the same set of model weights [Paper].
Meta AI introduced Open Catalyst Demo, a service to expedite materials science research. It allows researchers to simulate the reactivity of catalyst materials about 1000 times faster than current methods through AI [Details].
Poe, the chatbot app from Quora, added three new bots based on Meta’s Llama 2: Llama-2-70b, Llama-2-13b, and Llama-2-7b. Developers experimenting with fine-tuning Llama who want to use Poe as a frontend can reach out at developers@poe.com [Twitter Link].
Researchers from CMU built WebArena, a self-hosted simulated web environment for building autonomous agents [Details].
Stability AI introduced FreeWilly1 and FreeWilly2, open access Large Language Models. The former is the original LLaMA 65B fine-tuned using a synthetic dataset, and the latter builds on Llama 2 70B [Details].
Wayfair launched Decorify, a generative AI tool for virtual room styling. By uploading a photo, users can see shoppable, photorealistic images of their spaces in new styles [Details].
Cohere introduced Coral, a conversational knowledge assistant for enterprises with 100+ integrations across CRMs, collaboration tools, databases, and more [Details].
Amazon's Bedrock platform for building generative AI-powered apps now supports conversational agents and new third-party models, including Anthropic’s Claude 2 and SDXL 1.0 [Details].
Stability AI released StableSwarmUI, an open-source modular Stable Diffusion web user interface with an emphasis on making power tools easily accessible [Link].
As actors strike for AI protections, Netflix is offering as much as $900,000 for a single AI product manager [Details].
Google researchers have developed a new technique to recreate music from brain activity recorded through fMRI scans [Details].
Australian researchers, who previously demonstrated a Petri-dish cultured cluster of human brain cells playing "Pong," received a $600,000 grant to investigate AI and brain cell integration [Details].
Sam Altman's Worldcoin, a cryptocurrency project that uses eye scans to verify identities with the aim of differentiating between humans and AI, has officially launched [Details].
Microsoft is rolling out Bing’s AI chatbot on Google Chrome and Safari [Details].
Anthropic, Google, Microsoft and OpenAI are launching the Frontier Model Forum, an industry body focused on ensuring safe and responsible development of frontier AI models [Details].
OpenAI has shut down its AI text-detection tool over inaccuracies [Details].
ChatGPT for Android is now available for download in the US, India, Bangladesh, and Brazil, with rollout to additional countries planned over the next week [Link].

AI Video Leveled Up Again: A look at the latest update of Runway ML's Gen-2 that enables generation of video from an initial image [YouTube Link].
The NeverEnding Game: How AI will create a new category of games [Link].
Opportunities in AI: areas where startups utilizing generative AI have the biggest advantage [Link].
ShortGPT: an open-source AI framework for automated short/video content creation [GitHub Link].

AI Toolbox: Product Picks of the Week

Rewind: Rewind captures everything you’ve seen on your Mac and iPhone and makes it searchable with AI.
Eden AI: Seamlessly merging the top AI APIs into one, Eden AI provides a unique API connected to the best AI engines.
3D tool by CSM: Generates a 3D model from an image. 3D outputs are public and open source on the trial version.

AI Skillset: Learn & Build

What We Know About LLMs (Primer) [Link].
Extended Guide: Instruction-tune Llama 2 [Link].
Building Generative AI Applications with Gradio, a new course by DeepLearning.AI and Hugging Face [Link].
Fine-Tuning LLaMA 2 Models using a single GPU, QLoRA and AI Notebooks [Link].

✨ 📌 If you find value in AI Brews, you can support the work behind it on Patreon. Thanks for reading and have a nice weekend! 🎉 Mariam.
