Multimodal LLM with a robot arm, SDXL 1.0, HealthScribe by Amazon, OverflowAI, Generative-AI based virtual room styler by Wayfair and more

Greetings and welcome to this week's AI Brews for a concise roundup of the week's major developments in AI.

In today’s issue (Issue #25):

  1. AI Pulse: Weekly News & Insights at a Glance

  2. AI Toolbox: Product Picks of the Week

  3. AI Skillset: Learn & Build

  1. Stability AI released SDXL 1.0, the next iteration of their open text-to-image generation model. SDXL 1.0 has one of the largest parameter counts of any open access image model, built on a new architecture composed of a 3.5B parameter base model and a 6.6B parameter ensemble pipeline (base model plus refiner) [Details].

  2. Amazon introduced AWS HealthScribe, an API to create transcripts, extract details and create summaries from doctor-patient discussions that can be entered into an electronic health record (EHR) system. The transcripts from HealthScribe can be converted into patient notes by the platform’s machine learning models [Details].

  3. Researchers from Nvidia and Stanford, among others, unveiled VIMA, a multimodal LLM with a robot arm attached. VIMA is an embodied AI agent that perceives its environment and takes actions in the physical world, one step at a time [Details].

  4. Stack Overflow announced its own generative AI initiative, OverflowAI. It includes generative AI-based search and an assistant built on their database of 58 million Q&As, with sources cited in the answers. A Visual Studio Code extension will also be released [YouTube Demo | Details].

  5. Google researchers presented Med-PaLM M, a large multimodal generative model fine-tuned for biomedical applications. It interprets biomedical data including clinical language, imaging, and genomics with the same set of model weights [Paper].

  6. Meta AI introduced Open Catalyst Demo, a service to expedite materials science research. It uses AI to let researchers simulate the reactivity of catalyst materials about 1000 times faster than current methods [Details].

  7. Poe, the chatbot app from Quora, added three new bots based on Meta’s Llama 2: Llama-2-70b, Llama-2-13b, and Llama-2-7b. Developers experimenting with fine-tuning Llama who want to use Poe as a frontend can reach out at developers@poe.com [Twitter Link].

  8. Researchers from CMU built WebArena, a self-hosted simulated web environment for building autonomous agents [Details].

  9. Stability AI introduced FreeWilly1 and FreeWilly2, open access Large Language Models. FreeWilly1 is fine-tuned from the original LLaMA 65B using a synthetic dataset, and FreeWilly2 builds on Llama 2 70B [Details].

  10. Wayfair launched Decorify, a generative AI tool for virtual room styling. By uploading a photo, users can see shoppable, photorealistic images of their spaces in new styles [Details].

  11. Cohere introduced Coral, a conversational knowledge assistant for enterprises with 100+ integrations across CRMs, collaboration tools, databases, and more [Details].

  12. Amazon's Bedrock platform for building generative AI-powered apps now supports conversational agents and new third-party models, including Anthropic’s Claude 2 and SDXL 1.0 [Details].

  13. Stability AI released open-source StableSwarmUI - a Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible [Link].

  14. As actors strike for AI protections, Netflix is offering as much as $900,000 for a single AI product manager [Details].

  15. Google researchers have developed a new technique to recreate music from brain activity recorded through fMRI scans [Details].

  16. Australian researchers, who previously demonstrated a Petri-dish cultured cluster of human brain cells playing "Pong," received a $600,000 grant to investigate AI and brain cell integration [Details].

  17. Sam Altman's Worldcoin, a cryptocurrency project that uses eye scans to verify identities with the aim of differentiating between humans and AI, has officially launched [Details].

  18. Microsoft is rolling out Bing’s AI chatbot on Google Chrome and Safari [Details].

  19. Anthropic, Google, Microsoft and OpenAI are launching the Frontier Model Forum, an industry body focused on ensuring safe and responsible development of frontier AI models [Details].

  20. OpenAI has shut down its AI text-detection tool over inaccuracies [Details].

  21. ChatGPT for Android is now available for download in the US, India, Bangladesh, and Brazil, with rollout to additional countries over the next week [Link].

  1. AI Video Leveled Up Again: A look at the latest update of Runway ML's Gen-2 that enables generation of video from an initial image [YouTube Link].

  2. The NeverEnding Game: How AI will create a new category of games [Link].

  3. Opportunities in AI: areas where startups utilizing generative AI have the biggest advantage [Link].

  4. ShortGPT: an open-source AI framework for automated short-form video content creation [GitHub Link].

  1. Rewind: Rewind captures everything you’ve seen on your Mac and iPhone and makes it searchable with AI.

  2. Eden AI: Seamlessly merging the top AI APIs into one, Eden AI provides a unique API connected to the best AI engines.

  3. 3D tool by CSM: Generates a 3D model from an image. 3D outputs are public and open source on the trial version.

  1. What We Know About LLMs (Primer) [Link].

  2. Extended Guide: Instruction-tune Llama 2 [Link].

  3. Building Generative AI Applications with Gradio: a new course by DeepLearning.AI and Hugging Face [Link].

  4. Fine-Tuning LLaMA 2 Models using a single GPU, QLoRA and AI Notebooks [Link].

✨ 📌 If you find value in AI Brews, you can support the work behind it on Patreon. Thanks for reading and have a nice weekend! 🎉 Mariam.
