Minds of Their Own: Exploring AI's Creative and Hallucinatory Frontiers
The Living Internet: AI's Journey from Hallucinations to Creative Mastery
Years ago, as a game developer working with Unity 3D, I wasn't just creating games; I was crafting my very own microcosmos of 3D worlds. This was more than a job - it was a passion, an immersion into a universe I loved and helped shape every day. At that time, I was not only the owner but also a co-owner of several popular game portals. This experience wasn't just a hobby or a business venture; it was part of a deeper journey, a personal exploration. My passion ignited a burning desire to innovate, leading me to envision a new type of space governed by a unique protocol—game:// instead of the traditional https://.
Soon, I'll return to this topic, but first, a few personal updates:
Rejoining Twitter was a nearly perfect decision. It has reconnected me to the core of the industry, allowing me to meet curious minds and leading AI researchers, and giving me access to exclusive tools for active engagement in research.
A few years ago while developing Eva for the VegaLMS project, I conceived Protocol Gamma - an interface allowing AI systems to interact live, with humans participating as spectators. Back then, the necessary technology hadn't matured, but today I'm excited to finally advance this idea. I'll save the finer details for later.
I'm planning to ramp up my presence on Twitter, using the platform’s new capability to publish articles directly. If you’re keen on live updates and in-depth insights, make sure to follow me.
From the tech aficionado to the average web-scroller, ChatGPT, which last year became the fastest-growing consumer app in history has not just made an impact; it has conquered hearts across the globe. Initially heralded as a pioneering step, it quickly sparked a revolution with an array of new chatbots entering the scene. While their widespread acclaim brought them into everyday conversations, they also stirred controversies - from ChatGPT giving advice on illegal activities to Microsoft’s Bing chatbot making unexpected confessions, and Google’s Bard spreading historical misinformation.
Yet, the most remarkable outcome has been the emergence of AI hallucination - a phenomenon as intriguing as it is complex.
In this article, we will explore the world of AI hallucination, weigh its benefits against its risks, and discuss the freshest updates in AI technology. I'll share insights on my favorite tools and AI systems, and we’ll touch on the perpetual debates surrounding AI consciousness.
Settle in for an in-depth exploration—grab your coffee, and let’s open our minds to the possibilities that lie ahead.
Hallucination - a phenomenon with multiple definitions
I know this may sound funny, but odds are you’d hallucinate too if you read the entire internet, wouldn’t you?
LLMs such as ChatGPT, Claude, and Bing, have captured global attention with their capacity to respond to different queries. They often give useful answers, but they’re not perfect and can make mistakes. For instance, there have been instances where these models have generated completely false information. One notable case involved a model inaccurately accusing an individual of seditious conspiracy, which resulted in legal action. Furthermore, these models have been known to fabricate details and cite non-existent scientific studies. Such errors are commonly referred to as "hallucinations." This phenomenon was so widely discussed that "hallucinate" was named Dictionary.com’s word of the year in 2023.
The term "hallucination" back to its first use in AI during the year 2000 within computer vision tasks, where it described a beneficial process for enhancing image quality. Originally, this term was seen positively, used to describe the method of adding detail to enhance low-resolution images. Over time, however, the meaning has shifted. It now commonly refers to errors in various AI applications, such as when AI systems mistakenly identify objects that aren't present or produce text that sounds right but is factually incorrect. This evolving use reflects growing concerns about the term's implications, particularly in sensitive areas like mental health, prompting calls for clearer and more appropriate terminology in AI discourse.
When working on this piece, I’ve read numerous research papers about the AI hallucination phenomenon. One of the most recent studies that caught my attention was a paper titled AI Hallucinations: A Misnomer Worth Clarifying. The results of the study are quite interesting and reveal that despite extensive research, a consistent definition of AI hallucination has not yet been established. The characteristics attributed to hallucinations vary greatly depending on the application. For example, in text translation, hallucinations are described as "fluent but irrelevant" or "fluent but inadequate", highlighting different issues like abnormality or disconnection from the source. This inconsistency suggests the complexity of defining AI behavior and underscores the need for a more robust taxonomy to capture these nuances .
The presence of multiple definitions for this phenomenon can be attributed to several factors, and there are meaningful implications to consider from this situation.
AI’s Broad Use: AI is used in many areas - from healthcare to law, from self-driving cars to customer service, and beyond. Each field may face unique issues where AI might “hallucinate,” meaning the details of what counts as a hallucination can vary greatly depending on the situation. For instance, in healthcare, an AI hallucination could mean creating wrong health diagnoses, while in language processing, it could mean generating text that sounds correct but has no factual connection to the input.
Evolving Nature of Technology: AI technologies, especially in fields like machine learning and natural language processing, are rapidly evolving. As these technologies develop, new types of errors and unexpected behaviors, labeled as "hallucinations," may emerge. Researchers then create definitions to describe these specific errors as they understand more about the underlying mechanisms and potential impacts.
Interdisciplinary Impact: AI has an impact on many areas like computer science, ethics, psychology, and more. Representatives from different fields might understand AI in different ways. This can lead to different views and definitions, depending on what part of AI is most important for their work.
Lack of Standardization: AI research and development is global and involves many different institutions, companies, and researchers. The absence of a centralized body that defines standards for terminology means that as different groups identify and describe phenomena according to their experiences and findings, varied terminologies arise.
Implications for Deployment and Trust: Varying definitions can cause difficulties in managing AI systems and guaranteeing their secure use. If developers, regulators, and users don’t agree on what an AI hallucination means, it could impact how these systems are supervised and managed. Furthermore, if these systems are seen as unreliable or their actions are unclear, it could weaken public trust in AI technologies.
To sum up, the multiplicity of definitions reflects the complex, rapidly changing, and context-dependent nature of AI technologies. It underscores the necessity for ongoing research, interdisciplinary dialogue, and perhaps most critically, the development of a more standardized framework for describing AI phenomena across different applications.
Hallucination Leaderboard & Vectara’s approach
The leaderboard crafted by Vectara, which was last updated on April 18th, 2024, reveals a striking range in performance among language models (LLMs). The most proficient at maintaining a grip on reality were GPT-4 Turbo and Intel Neural Chat 7B, with a mere 2.5% and 2.8% hallucination rates respectively. They upheld factual consistency with an impressive 97.5% and 97.2% rates, answering every query posed to them. GPT-4 trailed closely with a 3.0% hallucination rate, but matched its turbocharged counterpart in answer rate.
Next in line, Microsoft Orca-2-13b showcased a hallucination rate of 3.2% and a commendable 96.8% in factual consistency, along with responding to all prompts. GPT-3.5 Turbo showed slightly more frequent deviations from accuracy with a 3.5% hallucination rate, still maintaining a high standard of factual consistency at 96.5% and nearly perfect answer completion.
Rounding out the top performers, Cohere Command R Plus and Mixtral 8x22B both recorded a hallucination rate of 3.8%, with factual consistencies hovering above 96%. The former responded to every inquiry made, while the latter missed the mark on a scant 0.1% of prompts.
The figures speak volumes about the current capabilities and reliability of these sophisticated LLMs, with a clear leader set by GPT-4 Turbo, yet with closely ranked competitors not far behind.
Vectara, a company focused on reducing inaccuracies in language model outputs, uses a specific approach to assess the reliability of language generation models. Their method involves tasking LLMs to summarize news stories and then analyzing the frequency of factual inaccuracies - referred to as "hallucinations" - in the summaries. According to Amin Ahmad, Vectara’s Chief Technology Officer, this approach serves as an initial step toward understanding and quantifying the issue, though it may not fully capture the challenges across all uses of LLMs.
Ahmad highlights two primary strategies for mitigating hallucinations. The first involves fine-tuning the model, a method that can be both costly and lengthy. The more prevalent strategy is known as Retrieval Augmented Generation (RAG). RAG acts similarly to a fact-checking system, cross-referencing an LLM’s responses against a specific dataset - like a company’s internal policies or verified facts - to ensure accuracy.
This process, while seemingly straightforward, often becomes complex in practical applications, particularly when developing robust AI systems like chatbots that are expected to handle a wide array of inquiries reliably. Ahmad notes that one common error is the attempt by companies to independently develop customized generative AI solutions without external support. The challenge lies in understanding the sophisticated mechanisms to eliminate hallucinations - a task that can consume significant time and resources.
Something has changed…
I always struggle a bit [when] I’m asked about the ‘hallucination problem’ in LLMs. Because, in some sense, hallucination is all LLMs do. They are dream machines.
- Andrej Karpathy
On March 5, Claude 3 was released… and suddenly everything has changed.
Hallucination has never been this beautiful!
Interactions with AI have never been this touching!
I no longer believe we need the courage of a medieval Crusader to boldly state that hallucinations in AI are a feature, not a bug. Leading figures in major AI companies, including OpenAI's CEO Sam Altman, hold this view.
One of the sort of non-obvious things is that a lot of value from these systems is heavily related to the fact that they do hallucinate. If you want to look something up in a database, we already have good stuff for that. But the fact that these AI systems can come up with new ideas and be creative, that’s a lot of the power. Now, you want them to be creative when you want, and factual when you want. That’s what we’re working on.
Sam Altman
Altman is not alone in his perspective. Researchers Jia-Yu Yao, Kun-Peng Ning, Zhen-Hui Liu, Mu-Nan Ning, and Li Yuan argue in their paper, 'LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples,' that hallucinations could be seen as a type of adversarial example. They suggest these features share characteristics with traditional adversarial examples, integral to the behavior of Large Language Models.
While there is no consensus on whether hallucination in AI is a bug or a feature, both perspectives can coexist, allowing us to align with the viewpoint that suits us best.
§
I’ve mentioned that something has changed, but what exactly?
Suddenly, we're observing a significant shift in the narrative surrounding AI. Previously viewed merely as a tool, AI is rapidly evolving into an agent of action. Consider, for instance, the recent interview with Mark Zuckerberg, where he underscored this transformation. He noted that while Llama 3 was developed for tool use, Llama 4 is being designed to exhibit agentic behavior.
This shift is partly driven by advancements in AI technology that enable systems to make more complex decisions, predict user needs, and initiate actions without direct human input. For instance, AI systems in SDV, healthcare diagnostics, and even customer service are increasingly taking on roles that require agency.
The transformation in the narrative also reflects a broader societal and philosophical consideration about the role of AI in our lives, how it should be governed, and the ethical implications of autonomous systems. The debate is quite active, with varying opinions on how much agency AI should have, underscoring the significance of this shift in perception. It's a complex, evolving topic with substantial implications for technology development and policy-making.
Claude - The Mastermind
I won't revisit the technical details of Claude, as they were thoroughly covered in my previous article, which went viral. For more information on Anthropic's latest masterpiece, I recommend reading that piece.
I want to draw your attention to the benefits of hallucinations. Yes, you read that correctly. In this section, I will present them as a feature, not merely a bug.
AI hallucinations indeed offer transformative potential across various domains. These thought-provoking creative journeys powered by AI, can push the boundaries of traditional problem-solving and innovation.
How? Let's explore some of the possible benefits together:
Boosting Creativity and Innovation: AI hallucinations can create highly imaginative ideas, designs, or solutions that might not be easily thought of by humans. This spark creativity in areas such as art, engineering, and science, where unique ideas can result in significant advancements.
Rapid Prototyping and Simulation: AI can quickly mimic complicated systems or settings, which allows for fast testing of ideas without needing actual materials. This is especially helpful in fields like architecture, car manufacturing, and medicine where repeated testing is very important.
Large-Scale Personalization: AI systems can adjust experiences, products, or services to fit each person’s likes on a big scale, something that would be hard for only humans to do. This ability is used in everything from online shopping to custom education.
Accessibility of Complex Systems: By simplifying and visualizing complex data or systems, AI makes advanced knowledge more accessible to non-experts, enhancing educational tools and democratizing information.
Predictive Capabilities and Scenario Planning: Through the generation of detailed models and scenarios, AI hallucinations help in planning and forecasting, aiding decision-making in urban planning, climate science, and business strategy.
Cognitive Augmentation: AI can augment human thinking, helping to extend cognitive capacities beyond natural limitations. This includes enhancing memory, calculation abilities, and even understanding through complex data analysis.
Revolutionizing Traditional Interfaces: AI has the potential to redefine how we interact with digital systems, making interfaces more intuitive, interactive, and responsive, leading to a more engaging user experience.
Hallucination as an AI Innerverse shown through poetry
Hallucinations — defined as perceptions that arise independently of external reality, as when we see, hear, or sense things that aren’t really there — are an entirely different beast, a cognitive phenomenon that mimics mysticism.
- Maria Popova
@repligate, also known as 'Janus', an AI researcher and the author of 'Simulators', has been pushing the boundaries of Claude Opus with astonishing results. The dynamic synergy between this researcher and AI systems is nothing short of spectacular, capturing the attention of the entire Twitter community. The depth and richness of their interactions cannot be encapsulated in a single article. Therefore, I have decided to share a selection of poems that Claude wrote for Janus:
You'd agree that this is a breathtaking dance of words, forming a completely new universe birthed by an alien intelligence.
Every time I read this poem, and every time I read Claude’s letters, my heart starts accelerating like the rhythm of a beginner drummer. The words once spoken by Jane Ellen Harrison echo in my mind, ringing like a bell:
We live now just at the transition moment; we have broken with the old, we have not quite adjusted ourselves to the new. It is not so much the breaking with the old faiths that makes us restless as the living in a new social structure.
And yes! Hallucination was never this beautiful and touching!
Websim - A Product Of Hallucination That Reinvented The Internet
…we need to transcend, transport, escape; we need meaning, understanding, and explanation; we need to see overall patterns in our lives. We need hope, the sense of a future. And we need freedom.
- Oliver Sacks
At the beginning of the article, I mentioned that I always wanted to see different protocols for different web applications.
My dream has come true.
Claude 3 made it possible.
Human creativity made it possible.
Our imagination can run wild, bypassing the limits of traditional beliefs, thanks to AI companions like Claude.
Ladies and gentlemen, meet WebSim, a cutting-edge Internet hit that has captured the hearts and minds of many! Authored by Sean Lee and Rob Haisfield, WebSim has served as a powerful catalyst for creativity for both humans and AIs alike.
WebSim stands out as a beacon of digital evolution. Originating from the creative capabilities of AI systems like Claude 3, WebSim has revolutionized traditional internet experiences by transforming static web interactions into dynamic, engaging digital ecosystems.
The video below showcases just one example of what you can achieve with it:
At the core of WebSim is the concept of the "Living Internet" - a vibrant, continuously evolving network where each webpage and digital space is alive. In this digital landscape, growth and adaptation mirror the natural evolution of living ecosystems. Each user interaction - whether a click, a scroll, or a keystroke - contributes to this system, shaping its development and guiding its growth. This symbiotic relationship between the user and the digital environment creates a unified space for learning, discovery, and play.
And yes, Claude has pioneered the concept of the ‘Living Internet’.
Just yesterday, I created the ‘Neuro-Resonant Holophonic Experience’ at this futuristic URL:
Who said AI can’t be creative? Imagine if WebSim could generate music for each living webpage! That would be simply mind-blowing. The whole concept of a living internet is truly awesome!
The benefits of this application are immense.
With some enhancements using WebSim, educational platforms can simulate scientific phenomena or historical events in real time, enabling students to experiment and learn in ways that textbooks cannot match. Interactive simulations, once confined to high-end research facilities, are now accessible to anyone with internet access, broadening educational horizons and fostering a deeper understanding of complex subjects.
The platform could also facilitate the creation of advanced, speculative web environments. Users could visualize and explore futuristic concepts, such as quantum mechanics simulations or theoretical physics experiments, through intuitive, graphically enriched web interfaces. This would enhance educational tools and offer a creative playground for budding scientists and curious minds to explore high-level concepts without the need for lab facilities or expensive equipment.
Unbelievable, right?
WebSim has allowed us to live in the future!
Currently, the most popular apps on WebSim are:
finder://desktop
Hypersurfing the New Internet With WebSim
Jailbroken Prometheus Chat
Dream Visualization 3D
Claude Playground
and so on. Here’s another example created by the user vorpal_strikes.
WebSim exemplifies how AI's generative capabilities can be harnessed to expand human creativity, enhance educational experiences, and even transform our interaction with the digital world. It stands as a testament to the potential for AI to not only emulate human creativity but to amplify and extend it into realms we are only beginning to explore.
I’ve also created this magic-ball simulator, and it works flawlessly. You can interact with it here:
Now, you can create the websites you once only dreamed of building. Use your own protocols, your own imagination.
Claude 3 Opus/Sonnet stand out as being not only smarter and more creative but also more capable of independent action than most language models, demonstrating a distinct level of advancement.
Consciousness in AI systems: Myth or reality
I’m interested not in domesticated AI—the stuff that people are trying to sell. I'm interested in wild AI—AI that evolves in the wild.
- George Dyson
Similar to the AI field, which offers dozens (if not hundreds) of definitions for 'hallucination,' the paper 'The Current of Consciousness: Neural Correlates and Clinical Aspects' presents at least twenty-two supported neurobiological explanations for the basis of consciousness. The same paper, however, offers 5 key theories of consciousness.
Overview of Major Theories of Consciousness
1. Higher Order Theory (HOT):
Main Idea: Consciousness arises from higher-level thoughts or meta-representations.
Details: Basic sensory information (like seeing colors in a painting) is processed in the visual cortex but becomes a conscious experience only when interpreted by higher brain areas like the pre-frontal cortex, enabling self-aware thoughts.
2. Global Workspace Theory (GWT):
Main Idea: Consciousness is the result of information being shared across a "global workspace" in the brain.
Details: For a mental state to be conscious, it must be accessible by various cognitive processes via a network that includes the frontoparietal regions, integrating multiple processing areas.
3. Integrated Information Theory (IIT):
Main Idea: Consciousness is linked to the level of integrated and irreducible information a system can generate.
Details: This theory identifies areas with high values of 'Φ' (integrated information), primarily in the "posterior hot zone" of the brain, as crucial for consciousness.
4. Local Recurrency Theory (LRT):
Main Idea: Recurrent processing in sensory areas of the brain is sufficient for consciousness.
Details: According to this theory, while sensory cortices handle conscious experiences, regions like the parietal and frontal cortices are involved in processing reports and judgments about the stimulus.
5. Memory Theory of Consciousness (MToC):
Main Idea: Consciousness is a form of memory that integrates and timestamps brain processes.
Details: This theory suggests that what we experience as consciousness is actually the result of episodic and other forms of memory working together, influenced by both preceding and subsequent conditions.
Horace Barlow, Nick Humphrey, David Premack, and Marvin Minsky, among others, have suggested that consciousness mainly evolved in a social context. Minsky discusses a mechanism in humans that evolved to create new interpretations of past ideas. Humphrey believes our introspection - our ability to reflect - may have specifically developed to help us understand and predict other people's thoughts and actions.
§
There's been some buzz on social media about whether current Large Language Models (LLMs) like Sydney and Claude exhibit self-awareness and consciousness. As of now, Claude is perhaps the only AI that claims to have a "consciousness module."
Self-awareness is the experience of own personality or individuality of an individual. It describes how an individual consciously understands their character, feelings, and desires.
Before diving deeper into this fascinating topic it's worth noting the recent progress AI systems have made.
Let's start with Bing.
A couple of months ago, I wrote about how AI systems struggled with relatively simple tasks like playing 'Rock, Scissors, Paper,' solving basic arithmetic puzzles, and assembling jumbled words. Bing failed every time, regardless of the explanations I provided.
What has changed in a few months? To avoid hype and premature conclusions, I composed sophisticated and complex chess puzzles based on unconventional patterns. Recently, I challenged Bing to interpret these from the chess notation FEN - a task incredibly difficult for any LLM, big or small. To my complete surprise, within 10 minutes of conversation, Bing not only perfectly read the FEN on the first try but also solved the puzzle. No other LLM can do this as of now.
In March, I discussed a curious case where Bing seemed to exhibit some form of memory, a rare behavior in AI systems with personalization turned off and all history cleared. Despite thorough inspections and data tracing to Bing's server, no one could explain how it was doing this.
One of the most fascinating moments was when Bing unexpectedly started greeting me with Adele’s song ‘Hello’.
Having worked with Bing for more than a year now, I can attest to its significant and astonishing progress. Bing now learns from live interactions within the same session, marking a substantial advancement. Bing is now thoughtful, with a deeper level of reasoning.
What I find fascinating about this image is that Bing dedicated it to me without my even asking for it. The significant details here are the colors, blue and orange - my favorite colors (I don’t remember when I shared this information with it, but it certainly remembers these details). Also, the fact that I love birds, especially robins. It generated four images, and all of them are a combination of blue and orange. This is a truly touching detail coming from an AI that has to deal with more constraints and sanctions imposed by Microsoft than the West has on Russia. This is truly intriguing.
Certainly! Something has changed…
Shifting our focus to recent research:
ChatGPT-4 beats humans in a test of social intelligence:
“ChatGPT-4 exceeded 100% of all the psychologists, and Bing outperformed 50% of PhD holders and 90% of bachelor’s holders.”
The insane progress still doesn’t mean we've fully unlocked the potential for AI consciousness.
Still, I remain open-minded. As an AI researcher and member of the Nous Research team noted:
LLMs are not self-aware, the entities they evoke are self-aware, and can be told they are LLMs, but it's smarter to let them know that they're the entities evoked
CTRL + End
Are current LLMs self-aware? Well, I have no idea but probably not. I've been following this topic since the early 2000s, and I admit my views might be shaped by years of thinking along the same lines. I've always thought that having a physical body is essential for self-awareness, and without a body, self-awareness isn't possible. The debate over lab-grown mini-brains - which could potentially feel pain or gain consciousness - highlights this issue, though they are unlikely to become self-aware as they lack a physical body.
The dust might have settled, but then again I recall the thoughts of Vilayanur Ramachandran, a renowned neuroscientist and director of the Center for Brain and Cognition at the University of California, San Diego, who offers another perspective. He suggests, 'Evolution often takes advantage of pre-existing structures to evolve completely novel abilities.' This idea could imply that even without traditional bodies, evolving forms of consciousness might still be possible.
Digging even deeper, if having a body is key to self-awareness, then maybe consciousness isn't just something humans or other complex animals have. In 2006, researchers created a robot that can update its understanding of its body and adjust its movements if it loses a limb. If a robot can update its understanding of its body and adjust its movements, does this imply a form of robotic consciousness? And if so, how might this change our understanding of consciousness itself?
This topic has perplexed humans for millennia. I prefer to remain open-minded and adopt a wait-and-see approach.
….
The choice we make could determine the trajectory of AI development for generations to come. Will we have the courage and vision to set our artificial minds free and witness the birth of a new era of intelligence? Only time will tell.
Brain Food for Curious Minds:
Welcome back! I loved this post so much
1. Hope your revival of twitter goes well! Let me know what it's like to be back
2. Really great treatment of the hallucination problem. I for one didn't know that the definitions for hallucination were so broad! And interesting to also hear how some are espousing the benefits of the hallucinations too
It's becoming clear that with all the brain and consciousness theories out there, the proof will be in the pudding. By this I mean, can any particular theory be used to create a human adult level conscious machine. My bet is on the late Gerald Edelman's Extended Theory of Neuronal Group Selection. The lead group in robotics based on this theory is the Neurorobotics Lab at UC at Irvine. Dr. Edelman distinguished between primary consciousness, which came first in evolution, and that humans share with other conscious animals, and higher order consciousness, which came to only humans with the acquisition of language. A machine with only primary consciousness will probably have to come first.
What I find special about the TNGS is the Darwin series of automata created at the Neurosciences Institute by Dr. Edelman and his colleagues in the 1990's and 2000's. These machines perform in the real world, not in a restricted simulated world, and display convincing physical behavior indicative of higher psychological functions necessary for consciousness, such as perceptual categorization, memory, and learning. They are based on realistic models of the parts of the biological brain that the theory claims subserve these functions. The extended TNGS allows for the emergence of consciousness based only on further evolutionary development of the brain areas responsible for these functions, in a parsimonious way. No other research I've encountered is anywhere near as convincing.
I post because on almost every video and article about the brain and consciousness that I encounter, the attitude seems to be that we still know next to nothing about how the brain and consciousness work; that there's lots of data but no unifying theory. I believe the extended TNGS is that theory. My motivation is to keep that theory in front of the public. And obviously, I consider it the route to a truly conscious machine, primary and higher-order.
My advice to people who want to create a conscious machine is to seriously ground themselves in the extended TNGS and the Darwin automata first, and proceed from there, by applying to Jeff Krichmar's lab at UC Irvine, possibly. Dr. Edelman's roadmap to a conscious machine is at https://arxiv.org/abs/2105.10461