Fire in your pants

AI Blog

Issue #26

27/09/2023

Competition leads to announcements

OpenAI made a series of announcements last week; here we cover the two most important ones.

It looks like we will finally have the full multimodal capabilities of GPT-4 at our disposal, as OpenAI is starting to roll out its new voice and image capabilities in ChatGPT.

It’s not exactly an update to GPT-4, since we already knew it was multimodal when it was announced. When it was released, however, these features were not available to the general public; as far as we know, only Be My Eyes had access to them.

What does this mean?

ChatGPT will be able to accept images and sound as input and produce sound as output.

A few days ago OpenAI announced the upcoming arrival of Dall-E 3, which will be available in the next few weeks (no exact date was given).

The key takeaways from the announcement are:

It will be able to include text in images, something most text-to-image AI tools struggle with, with the exception of Ideogram.
It won’t require particularly detailed prompting; a simple description of what you want will suffice.
It’s designed to decline requests for images in the style of living artists. Artists can also opt their images out of being used to train future models.

Why does that matter?

Google announced last week that its new AI model, Gemini, will be multimodal. At the same time, Anthropic, a solid competitor to OpenAI, is steadily gathering funding, enterprise partnerships and user traffic. On the image generation side, Midjourney remains consistently the best text-to-image AI, and new players in the space, such as Ideogram, are generating steadily growing hype, while Dall-E 2 seems to have failed to capture user interest over the last four months.

We believe, therefore, that these circumstances have put “fire in OpenAI’s breeches” and led it to respond immediately in order to increase its competitive advantage and stay ahead in the AI race.

Side note: Sam Altman stated that GPT-5 and 6 will increase reliability and personalization, but they are far from AGI (Artificial General Intelligence).

YouTube & AI for content creation

YouTube launches 4 new AI tools for content creators.

AI insights for Creators: generates content ideas based on trending data.
Aloud: an AI dubbing tool that translates your video into other languages.
Creator Music tool: you enter a description of your video and it suggests songs that you can use and that it thinks are a good fit.
Dream Screen: creates videos and backgrounds from text prompts.

Why does that matter?

As AI tools enter every platform that deals with content creation, how quickly you adapt and take advantage of their potential can give you a significant advantage whether you’re a company or a content creator.

Resources

AI Insider 📰

Anthropic has secured $4 billion in funding from Amazon; the first tranche will be $1.25 billion for a minority stake, and the rest will come in due course. Anthropic is now 2nd on the list of well-funded generative AI unicorn startups, just behind OpenAI.
Spotify is piloting a new translation feature for podcasts. Essentially, the AI tool, which is powered by OpenAI’s Whisper, will automatically translate the podcast into other languages using the host’s own voice. It is similar to what Mr. Beast does with his videos, except that he hires voice actors for the task, whereas podcasters will now be able to do the same for free.
Getty Images has announced its own AI image generator. Trained on the huge collection of stock images in its library, the tool is promoted as “commercially safer”, much as Adobe promotes its own. It’s interesting how the company went from lawsuits to developing its own AI tool.
Microsoft’s “SwiftKey” keyboard gets a new AI editor, AI camera lenses and stickers. More importantly, Microsoft announced Microsoft Copilot. This is the well-known Copilot, but it now comes as a unified solution across the company’s ecosystem. It’s available on Windows 11, in Microsoft 365 and in web browsers through Edge and Bing. Copilot will begin rolling out in its initial form with the free Windows 11 update, starting on 26/09/2023. Bing, Edge, and Microsoft 365 will receive it this fall. In addition, Microsoft Designer will support
Amazon adds generative AI to Alexa.

Learning Bytes 🧐

Interview with Max Tegmark, author of the open letter about halting AI development, in which he talks about the letter’s goal. Spoiler alert! His goal was to legitimize the discussion around the potential risks of AI and attract more public attention to it.
Surprising or not, ChatGPT can’t turn non-coders into good programmers, according to research conducted by the DiverSE team.
AI-generated images have started to appear in Google’s top search results.

Cool Finds 🤯

Greek podcast about AI from RealFM, titled “Artificial Intelligence: Conversing with the Future” and hosted by Alexandros Kontis. In the first episode, ChatGPT introduces itself and you can listen to it here.
The Nasher Museum of Art at Duke University decided to conduct an experiment to see if AI could take on the role of a museum curator. ChatGPT collaborated with the museum staff to create the exhibition “Dreams of Tomorrow: Utopian and Dystopian Visions”. It chose the title of the exhibition, wrote the introductory text (the one seen in the image) and the labels accompanying the artworks, selected the artworks themselves, and determined the order in which they would be placed. Here you can see some of the works that were chosen.