The Quiet-STaR AI Technique

This week in AI: Mimicking the human mind in AI with Quiet-STaR, Amazon-Anthropic partnership, 0G AI blockchain startup, Adobe Firefly Services, and I launched a GenAI crash course!

Apr 01, 2024

Don’t have time to read? Watch the briefing on YouTube.

Advancement #1: Language Models Can Teach Themselves to Think Before Speaking With a New AI Technique Called Quiet-STaR

This week, I came across a fascinating advancement in AI research that is just starting to circulate: training AI to think like humans.

Known as "Quiet-STaR," it’s a new method that instructs AI systems to pause and engage in an "inner monologue," contemplating multiple rationales before responding.

In contrast to traditional AI chatbots like ChatGPT, which reply immediately without pausing or weighing future conversation turns, the Quiet-STaR method lets AI anticipate and adapt to where the dialogue is heading. The approach (detailed in a paper awaiting peer review) improves AI's decision-making by producing a variety of predictions, both with and without underlying rationales, and then keeping the most accurate ones while discarding the less accurate ones.
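To make the "with and without rationales" idea concrete, here is a toy sketch in Python. It is not the paper's training code: the `Candidate` class, the `select_helpful` function, and the probabilities are all made up for illustration. It only shows the selection logic described above, where a rationale is kept if it makes the correct continuation more likely than having no rationale at all.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    # None stands for "answer directly, with no inner monologue".
    rationale: Optional[str]
    # Hypothetical probability the model assigns to the correct continuation.
    p_correct: float


def select_helpful(
    candidates: List[Candidate],
) -> Tuple[float, List[Candidate], List[Candidate]]:
    """Split rationales into those that beat the no-rationale baseline and those that do not."""
    baseline = next(c.p_correct for c in candidates if c.rationale is None)
    with_rationale = [c for c in candidates if c.rationale is not None]
    kept = [c for c in with_rationale if c.p_correct > baseline]
    dropped = [c for c in with_rationale if c.p_correct <= baseline]
    return baseline, kept, dropped


# Made-up numbers for a question like "what color is the sky?"
candidates = [
    Candidate(None, 0.30),
    Candidate("Sunlight scatters in the atmosphere, so think about daytime color.", 0.55),
    Candidate("The question might be about the ocean.", 0.20),
]

baseline, kept, dropped = select_helpful(candidates)
print("baseline:", baseline)
print("kept:", [c.rationale for c in kept])
print("dropped:", [c.rationale for c in dropped])
```

In this toy version the "training" is just a filter. In a real system the comparison would act as a learning signal, so that the model gradually produces more of the rationales that help and fewer of the ones that do not.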

Tested on the large language model Mistral 7B, Quiet-STaR led to a significant improvement: the AI scored 47.2% on a reasoning test, up from 36.3% before Quiet-STaR training. It still struggled on a school-level math test, scoring 10.9%, but that was a noteworthy improvement over its initial 5.9%, showcasing the potential of this training method in enhancing AI reasoning capabilities.
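For a sense of scale, the two score changes above can be expressed as both percentage-point gains and relative gains. The short Python snippet below just does that arithmetic on the four numbers quoted; the `gains` helper is my own name, not anything from the paper.

```python
def gains(before: float, after: float) -> tuple:
    """Return (percentage-point gain, relative gain in %), each rounded to one decimal."""
    absolute = after - before
    relative = absolute / before * 100
    return round(absolute, 1), round(relative, 1)


reasoning = gains(36.3, 47.2)  # reasoning test
math_test = gains(5.9, 10.9)   # school-level math test

print("reasoning:", reasoning)  # (10.9, 30.0)
print("math:", math_test)       # (5.0, 84.7)
```

So the reasoning score rose by about 30% relative to where it started, while the math score nearly doubled, even though it remains low in absolute terms.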

Quiet-STaR is not the only AI training technique that is attempting to teach AI systems to think like the human mind. Other similar approaches, like Microsoft's Algorithm of Thoughts (AoT) and the "deep distilling" method inspired by brain studies, are pushing the boundaries of AI's problem-solving and learning capabilities. These methods not only improve AI's efficiency but also aim to make AI's logic more precise and more similar to human reasoning.

My Initial Thoughts: The best way I can explain the Quiet-STaR method is this. When you are engaging with an AI model like ChatGPT, the model immediately begins to respond regardless of how difficult the question is.

For example, if I as a human were given two questions, “what color is the sky” vs. “explain to me the ongoing scientific questions that concern the nature of dark matter,” my response would be instant on the color of the sky, but I would need to stop and think about the ongoing scientific questions about dark matter. The problem currently is that AI models do not have that “pause,” and answer right away without considering all of the different options.

The Quiet-STaR method aims to emulate this human style of thinking, and to allocate more (or fewer) resources depending on the complexity of the question.

Advancement #2: The Amazon-Anthropic Partnership Gets More Serious

Amazon’s investment in Anthropic hit a total of $4 billion as of last week – the largest investment in its 30-year history. Mirroring the influential partnership between OpenAI and Microsoft, Amazon is solidifying its ownership stake in Anthropic in a similar fashion.

This signals Amazon’s confidence in Anthropic (especially after the launch of Claude 3, which allegedly outperforms GPT-4) and has positioned Amazon as a key player in the AI industry at a time when its role in the AI race has been weirdly quiet and unclear.

Anthropic is using Amazon Web Services (AWS) as its main cloud provider for critical tasks, including safety research and development of new AI models. Using AWS's Trainium and Inferentia chips (which compete with Nvidia's), Anthropic will build, train, and launch its next-generation models on Amazon’s infrastructure. The partnership also gives AWS customers worldwide access to these models through Amazon Bedrock.

This investment showcases the immense potential Amazon sees in Anthropic's capabilities, particularly following the debut of Claude 3. This latest AI model not only promises to outshine competitors like OpenAI's GPT-4 and Google’s Gemini Ultra but also positions the partnership of Amazon-Anthropic as a pivotal team in the AI space.

My Initial Thoughts: Finally, Amazon has entered the chat.

All of my startups are on AWS (I much prefer it to Google Cloud and Microsoft Azure) but honestly, AWS has been so severely behind in this AI race in terms of infrastructure that I have really had to experiment outside of its ecosystem (mainly using Azure, which I hate). I have no sense of loyalty to OpenAI and have been quite intrigued by Anthropic for the last few months, so this deepened partnership put a lot of faith back into AWS for me. I worried for a while they weren’t going to catch up.

Google - DeepMind
Microsoft - OpenAI
Amazon - Anthropic

Things are definitely heating up.

Advancement #3: The ZeroGravity AI Blockchain Startup Announces $35 Million in Pre-Seed Funding

0G Labs, also known as ZeroGravity, is pioneering a modular AI blockchain designed to address common challenges in on-chain AI applications within the web3 ecosystem, such as improving speed and cost efficiency. Announcing this week that they raised $35 million in a pre-seed round, their modular approach allegedly will allow developers to customize blockchain systems or applications to their specific needs: a stark contrast to the monolithic structure of blockchains like Ethereum, which lacks flexibility and customization.

0G hopes to make blockchain technology as performant and affordable as traditional web2 applications through its modularity, enabling scalability and efficient data storage for broader use. With claims of superior speed and cost-effectiveness compared to its competitors, 0G Labs focuses on security and high throughput, aiming for a network capability of 50 Gbps in comparison to the competition’s 1.5 Mbps.

0G Labs is set to explore a variety of use cases, from combating deepfakes in AI to fostering decentralized models and supporting high-performance applications on the blockchain, with the goal of benefiting the public good and serving humanity in diverse ways. Currently, 0G does not have its own cryptocurrency token, but it has indicated plans for a future token release; further details remain undisclosed at this time.

My Initial Thoughts: With the recent rise in crypto prices, interest in AI blockchain technology inevitably comes with it. But $35 million in pre-seed funding (pre-seed meaning they don’t necessarily even have an MVP yet) is absolute insanity. Maybe this tech can help the development of a decentralized marketplace for AI models and algorithms, where developers can buy, sell, or lease AI solutions in a secure, transparent, and efficient manner. It could potentially revolutionize how AI and machine learning resources are distributed and monetized in the web3 ecosystem.

But whoever their salesperson is, I’d really love to take a lesson. Like, man.

Advancement #4: Adobe Releases a Suite of Generative APIs for Enterprise Developers

Adobe has unveiled Firefly Services, a comprehensive suite of over 20 generative and creative APIs, tools, and services (derived from the AI capabilities of its Creative Cloud offerings like Photoshop) aimed at empowering enterprise developers. This initiative is designed to enhance content creation within custom workflows or enable the development of entirely new solutions.

Alongside this announcement, Adobe also introduced Custom Models, which are integrated within Adobe's GenStudio and allow businesses to tailor Firefly's AI models to their specific needs. Firefly Services seeks to automate and streamline workflows through generative AI and creative APIs, offering features for image enhancement, text layer editing, content tagging, and application of Lightroom presets. Aimed at facilitating faster content production for brands while addressing concerns about brand safety, Adobe positions Firefly as a secure alternative in the generative AI space.

My Initial Thoughts: Not much to say on this one, but for those working on creative applications (or using a lot of computer vision), if you can afford it, this is a good suite to be able to plug into. However, Adobe is notorious for being extremely expensive, and I suspect the pricing of this suite (especially given that it targets enterprise developers) will make it highly inaccessible for many.

Announcements

I have launched a GenAI Crash Course on Udemy!

Do you know all those AI buzzwords you hear all of the time, but have no idea what they mean? Well, it's time to change that. This course was designed by me (with two AI awards just in the last 6 months, including TAUS AI Revolutionary of the Year and Slator's Language AI 50 Under 50), and is intended for learners who are trying to understand how AI and Generative AI really work.

This course is for everyone: software background or not. Complex technical problems are broken down into things that everyone can understand. Transformers, neural networks, MoE, vector databases, RAG, you name it. You'll understand all of it by the end.

You will even walk through some lesser-known topics in the Generative AI sphere, including questions of copyright, data privacy protection, how government entities should start to approach AI, and how to set up your own AI initiatives to become "AI Ready".

You can find it at the below link. For the first 50 students, I am offering 30% off. Use code WELCOME50

https://www.udemy.com/course/gen-ai-crash-course/referralCode=CF12E2E9C12C217B0514