[D66] STaR: 'Bootstrapping Reasoning With Reasoning'
René Oudeweg
roudeweg at gmail.com
Wed Mar 6 09:22:51 CET 2024
L.S.
A Kantian critique of pure (rational) reason is probably lost on
Zelikman and the Google folks/OpenAI...
It is no coincidence that "treason" and "reason" differ by only one
letter in English.
The more pre-AGI systems can teach themselves to reason, the less we can
rely on reason. Regulating them through legislation is impossible,
because you simply cannot regulate our radiant AI sun in the sky, any
more than you can regulate the Gods...
R.O.
--
https://arxiv.org/pdf/2203.14465.pdf
STaR: Self-Taught Reasoner
Bootstrapping Reasoning With Reasoning
Eric Zelikman∗1, Yuhuai Wu∗1,2, Jesse Mu1, Noah D. Goodman1
1 Department of Computer Science, Stanford University
2 Google Research
{ezelikman, yuhuai, muj, ngoodman}@stanford.edu
Abstract
Generating step-by-step "chain-of-thought" rationales improves language
model performance on complex reasoning tasks like mathematics or
commonsense question-answering. However, inducing language model
rationale generation currently requires either constructing massive
rationale datasets or sacrificing accuracy by using only few-shot
inference. We propose a technique to iteratively leverage a small number
of rationale examples and a large dataset without rationales, to
bootstrap the ability to perform successively more complex reasoning. This
technique, the "Self-Taught Reasoner" (STaR), relies on a simple loop:
generate rationales to answer many questions, prompted with a few
rationale examples; if the generated answers are wrong, try again to
generate a rationale given the correct answer; fine-tune on all the
rationales that ultimately yielded correct answers; repeat. We show
that STaR significantly improves performance on multiple datasets
compared to a model fine-tuned to directly predict final answers, and
performs comparably to fine-tuning a 30× larger state-of-the-art
language model on CommonsenseQA. Thus, STaR lets a model improve itself
by learning from its own generated reasoning.
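
To make the loop in the abstract concrete, here is a minimal Python
sketch of the idea. It is only an illustration of the procedure as
described above, not the authors' released code; generate_rationale,
answers_match, and fine_tune are hypothetical placeholders standing in
for model sampling, answer checking, and supervised fine-tuning.

def star(model, dataset, few_shot_prompt,
         generate_rationale, answers_match, fine_tune, n_iterations=3):
    """One possible reading of the STaR loop from the abstract."""
    for _ in range(n_iterations):
        training_examples = []
        for question, correct_answer in dataset:
            # Step 1: few-shot prompt the current model to produce a
            # rationale plus a candidate answer for the question.
            rationale, answer = generate_rationale(
                model, few_shot_prompt, question)

            if not answers_match(answer, correct_answer):
                # Step 2 ("rationalization"): if the answer was wrong, try
                # again with the correct answer given as a hint, so the
                # model only has to justify it rather than find it.
                rationale, answer = generate_rationale(
                    model, few_shot_prompt, question, hint=correct_answer)

            if answers_match(answer, correct_answer):
                # Step 3: keep only rationales that ultimately led to the
                # correct answer (the hint itself is not kept).
                training_examples.append(
                    (question, rationale, correct_answer))

        # Step 4: fine-tune on the collected rationales, then repeat the
        # whole loop with the improved model.
        model = fine_tune(model, training_examples)
    return model
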
--
digitaltrends.com
What is Project Q*? The mysterious AI breakthrough, explained | Digital
Trends
By Luke Larsen November 24, 2023
What is OpenAI Q*? The mysterious breakthrough that could ‘threaten
humanity’
Among the whirlwind of speculation around the sudden firing and
reinstatement of OpenAI CEO Sam Altman, there’s been one central
question mark at the heart of the controversy. Why was Altman fired by
the board to begin with?
We may finally have part of the answer, and it has to do with the
handling of a mysterious OpenAI project with the internal codename, “Q*”
— or Q Star. Information is limited, but here’s everything we know about
the potentially game-changing developments so far.
Before moving forward, it should be noted that all the details about
Project Q*, including its existence, come from a handful of fresh reports
following the drama around Altman's firing. Reporters at Reuters said on
November 22 that they had been given the information by "two people
familiar with the matter," providing a peek behind the curtain of what
was happening internally in the weeks leading up to the firing.
According to the article, Project Q* was a new model that excelled at
learning and performing mathematics. It was still reportedly only at the
level of solving grade-school mathematics, but as a starting point, it
looked to the researchers involved like a promising demonstration of a
previously unseen kind of intelligence.
Seems harmless enough, right? Well, not so fast. The existence of Q* was
reportedly scary enough to prompt several staff researchers to write a
letter to the board to raise the alarm about the project, claiming it
could “threaten humanity.”
On the other hand, other attempts at explaining Q* aren't quite as novel,
and certainly aren't so earth-shattering. The chief AI scientist at Meta,
Yann LeCun, tweeted that Q* has to do with replacing "auto-regressive
token prediction with planning" as a way of improving LLM (large language
model) reliability. LeCun says this is something all of OpenAI's
competitors have been working on, and that OpenAI made a specific hire to
address this problem.
LeCun's point doesn't seem to be that such a development isn't important,
but that it's something other AI researchers are already discussing, not
some unknown breakthrough. Then again, in the replies to this tweet,
LeCun is dismissive of Altman, saying he has a "long history of
self-delusion" and suggesting that the reporting around Q* doesn't
convince him that a significant advancement has been made on the problem
of planning in learned models.
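
For readers wondering what "replacing auto-regressive token prediction
with planning" could mean in practice, here is a toy Python sketch. It is
purely illustrative and makes no claim about how Q* works:
sample_continuation and score_candidate are hypothetical stand-ins for an
LLM's sampling step and for a separate verifier, and the "planning" here
is just the simplest possible search, generating several full candidates
and keeping the best-scoring one.

import random

def sample_continuation(prompt):
    # Hypothetical stand-in for one autoregressive sample from an LLM:
    # returns a complete candidate solution for the prompt.
    return "candidate %.3f for: %s" % (random.random(), prompt)

def score_candidate(candidate):
    # Hypothetical verifier or value function rating how promising a
    # candidate looks (e.g. by checking its steps or final answer).
    return random.random()

def greedy_decode(prompt):
    # Plain autoregressive use: commit to the first sampled continuation.
    return sample_continuation(prompt)

def plan_then_pick(prompt, n_candidates=8):
    # Minimal "planning" layer: explore several candidates before
    # committing, then keep the one the verifier rates highest.
    candidates = [sample_continuation(prompt) for _ in range(n_candidates)]
    return max(candidates, key=score_candidate)

if __name__ == "__main__":
    print(greedy_decode("What is 12 * 13?"))
    print(plan_then_pick("What is 12 * 13?"))

Real systems would replace the random stubs with model calls and could
search over partial reasoning steps rather than whole answers, but the
contrast between committing immediately and searching first is the core
of the idea.
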
Was Q* really why Sam Altman was fired?
From the very beginning of the speculation around the firing of Sam
Altman, one of the chief suspected reasons was his approach to AI safety. Altman
was the one who pushed OpenAI to turn away from its roots as a
non-profit and move toward commercialization. This started with the
public launch of ChatGPT and the eventual roll-out of ChatGPT Plus, both
of which kickstarted this new era of generative AI, causing companies
like Google to go public with their technology as well.
The ethical and safety concerns around this technology being publicly
available have always been present, despite all the excitement behind
how it has already changed the world. Larger concerns about how fast the
technology was developing have been well-documented as well, especially
with the jump from GPT-3.5 to GPT-4. Some think the technology is moving
too fast without enough regulation or oversight, and according to the
Reuters report, “commercializing advances before understanding the
consequences” was listed as one of the reasons for Altman’s initial firing.
Although we don't know whether Altman was specifically named in the
letter about Q* mentioned above, the letter is also being cited as one of
the reasons for the board's decision to fire Altman, a decision that has
since been reversed.
It’s worth mentioning that just days before he was fired, Altman
mentioned at an AI summit that he was “in the room” a couple of weeks
earlier when a major “frontier of discovery” was pushed forward. The
timing checks out that this may have been in reference to a breakthrough
in Q*, and if so, would confirm Altman’s intimate involvement in the
project.
Putting the pieces together, it seems like concerns about
commercialization have been present since the beginning, and his
treatment of Q* was merely the final straw. The fact that the board was
so concerned about the rapid development (and perhaps Altman’s own
attitude toward it) that it would fire its all-star CEO is shocking.
To douse some of the speculation, The Verge was reportedly told by “a
person familiar with the matter” that the supposed letter about Q* was
never received by the board, and that the “company’s research progress”
wasn’t a reason for Altman’s firing.
We’ll need to wait for some additional reporting to come to the surface
before we ever have a proper explanation for all the drama.
Is it really the beginning of AGI?
AGI, which stands for artificial general intelligence, is where OpenAI
has been headed from the beginning. Though the term means different
things to different people, OpenAI has always defined AGI as “autonomous
systems that surpass humans in most economically valuable tasks,” as the
Reuters report says. Nothing in that definition refers to "self-aware
systems," which is often what people presume AGI means.
Still, on the surface, advances in AI mathematics might not seem like a
big step in that direction. After all, we’ve had computers helping us
with math for many decades now. But the capabilities attributed to Q* go
beyond those of a calculator. Achieving literacy in math requires
humanlike logic and reasoning, and researchers seem to think it's a big
deal. With writing
and language, an LLM is allowed to be more fluid in its answers and
responses, often giving a wide range of answers to questions and
prompts. But math is the exact opposite, where often there is just a
single correct answer to a problem. The Reuters report suggests that AI
researchers believe this kind of intelligence could even be “applied to
novel scientific research.”
Obviously, Q* seems to still be in the beginnings of development, but it
does appear to be the biggest advancement we’ve seen since GPT-4. If the
hype is to be believed, it should certainly be considered a major step
in the road toward AGI, at least as it’s defined by OpenAI. Depending on
your perspective, that’s either cause for optimistic excitement or
existential dread.
But again, let's not forget the remarks from LeCun mentioned above.
Whatever Q* is, it's probably safe to assume that OpenAI isn't the only
research lab attempting this kind of development. And if it ends up not
actually being the reason for Altman's firing, as The Verge's report
insists, maybe it's not as big of a deal as the Reuters report claims.
--
technologyreview.com
Unpacking the hype around OpenAI’s rumored new Q* model
Melissa Heikkilä
Ever since last week’s dramatic events at OpenAI, the rumor mill has
been in overdrive about why the company's chief scientist, Ilya
Sutskever, and its board decided to oust CEO Sam Altman.
While we still don’t know all the details, there have been reports that
researchers at OpenAI had made a “breakthrough” in AI that had alarmed
staff members. Reuters and The Information both report that researchers
had come up with a new way to make powerful AI systems and had created a
new model, called Q* (pronounced Q star), that was able to perform
grade-school-level math. According to the people who spoke to Reuters,
some at OpenAI believe this could be a milestone in the company’s quest
to build artificial general intelligence, a much-hyped concept referring
to an AI system that is smarter than humans. The company declined to
comment on Q*.
Social media is full of speculation and excessive hype, so I called some
experts to find out how big a deal any breakthrough in math and AI would
really be.
Researchers have for years tried to get AI models to solve math
problems. Language models like ChatGPT and GPT-4 can do some math, but
not very well or reliably. We currently don’t have the algorithms or
even the right architectures to be able to solve math problems reliably
using AI, says Wenda Li, an AI lecturer at the University of Edinburgh.
Deep learning and transformers (a kind of neural network), which are
what language models use, are excellent at recognizing patterns, but that
alone is likely not enough, Li adds.
Math is a benchmark for reasoning, Li says. A machine that is able to
reason about mathematics could, in theory, be able to learn to do other
tasks that build on existing information, such as writing computer code
or drawing conclusions from a news article. Math is a particularly hard
challenge because it requires AI models to have the capacity to reason
and to really understand what they are dealing with.
A generative AI system that could reliably do math would need to have a
really firm grasp on concrete definitions of particular concepts that
can get very abstract. A lot of math problems also require some level of
planning over multiple steps, says Katie Collins, a PhD researcher at
the University of Cambridge, who specializes in math and AI. Indeed,
Yann LeCun, chief AI scientist at Meta, posted on X and LinkedIn over
the weekend that he thinks Q* is likely to be “OpenAI attempts at planning.”
People who worry about whether AI poses an existential risk to humans,
one of OpenAI's founding concerns, fear that such capabilities might
lead to rogue AI. Safety concerns might arise if such AI systems are
allowed to set their own goals and start to interface with a real
physical or digital world in some ways, says Collins.
But while math capability might take us a step closer to more powerful
AI systems, solving these sorts of math problems doesn’t signal the
birth of a superintelligence.
“I don’t think it immediately gets us to AGI or scary situations,” says
Collins. It’s also very important to underline what kind of math
problems AI is solving, she adds.
“Solving elementary-school math problems is very, very different from
pushing the boundaries of mathematics at the level of something a Fields
medalist can do,” says Collins, referring to a top prize in mathematics.
Machine-learning research has focused on solving elementary-school
problems, but state-of-the-art AI systems haven’t fully cracked this
challenge yet. Some AI models fail on really simple math problems, but
then they can excel at really hard problems, Collins says. OpenAI has,
for example, developed dedicated tools that can solve challenging
problems posed in competitions for top math students in high school, but
these systems outperform humans only occasionally.
Nevertheless, building an AI system that can solve math equations is a
cool development, if that is indeed what Q* can do. A deeper
understanding of mathematics could open up applications to help
scientific research and engineering, for example. The ability to
generate mathematical responses could help us develop better
personalized tutoring, or help mathematicians do algebra faster or solve
more complicated problems.
This is also not the first time a new model has sparked AGI hype. Just
last year, tech folks were saying the same things about Google
DeepMind’s Gato, a “generalist” AI model that can play Atari video
games, caption images, chat, and stack blocks with a real robot arm.
Back then, some AI researchers claimed that DeepMind was “on the verge”
of AGI because of Gato’s ability to do so many different things pretty
well. Same hype machine, different AI lab.
And while it might be great PR, these hype cycles do more harm than good
for the entire field by distracting people from the real, tangible
problems around AI. Rumors about a powerful new AI model might also be a
massive own goal for the regulation-averse tech sector. The EU, for
example, is very close to finalizing its sweeping AI Act. One of the
biggest fights right now among lawmakers is whether to give tech
companies more power to regulate cutting-edge AI models on their own.
OpenAI’s board was designed as the company’s internal kill switch and
governance mechanism to prevent the launch of harmful technologies. The
past week’s boardroom drama has shown that the bottom line will always
prevail at these companies. It will also make it harder to make a case
for why they should be trusted with self-regulation. Lawmakers, take note.