What is ChatGPT anyway?
Generative AI technology has made remarkable progress in recent years, with substantial investments from tech giants like Microsoft, Alphabet, and Baidu. These AI systems can generate text, images, and other media. ChatGPT, built by OpenAI on top of its powerful GPT-3/4 language models, is a notable example of generative AI capable of simulating human-like conversations and exhibiting intelligence. Additionally, Google's Bard, based on the LaMDA foundation model, and other systems like Stable Diffusion, Midjourney, and DALL-E contribute to the advancements in generative AI, particularly in the realm of art.
Generative AI is not to be confused with Artificial General Intelligence (AGI), which refers to a hypothetical intelligent agent capable of learning and performing any intellectual task that humans or animals can do. The key functional difference is that generative AI focuses on specific tasks like generating text or media, while AGI aims to replicate human-level intelligence across a broad range of tasks and thus represents a more comprehensive and sophisticated form of AI.
While Artificial General Intelligence remains a theoretical concept with no definitive timeline for its realization, it can be argued that, through the sum of the available generative AI implementations, AGI is already here; or so, at least, suggest the proposed tests for AGI, all of which, except one (the coffee test), have already been nailed by machines. On top of that, Microsoft-backed OpenAI claims that its latest GPT-4 model received a score of 1410 on the SAT (94th percentile) and 298 on the Uniform Bar Exam (90th percentile), and also passed an oncology exam, an engineering exam, and a plastic surgery exam. And those amazed people who believe that the release of the ChatGPT prototype signified a sudden and unexpected leap forward in AI capabilities should consider the more likely scenario that civilian technology generally lags behind the military.
As exciting as it is, there are obvious, well-founded, and already materializing concerns about endangering employment in various fields like art, writing, software development, healthcare, and more (Meet Lisa, OTV and Odisha’s first AI news anchor set to revolutionize TV Broadcasting & Journalism. AI Lisa might not be perfect, and she doesn’t move around much, but she’s getting shockingly close.) So, it’s not a great time for writers and actors to be asking for a raise: AI is cheaper and easier to work with than people, especially unionized ones.
But beyond that, graver concerns have emerged regarding the potential misuse of generative AI, such as the creation of fake news for propaganda or deepfakes for deception and manipulation. For example, it would not be a far-fetched conjecture that a video “recording” will soon no longer be considered valid, indisputable evidence in a courtroom, because footage with any desired prompted content can be generated in seconds by a generative AI model and be indistinguishable from the real thing.
The recent changes to Twitter limiting the number of posts each account can view should give everybody pause about AI use on social networks. Musk tweeted that these changes were “temporary limits” designed to address “extreme levels of data scraping” and “system manipulation”. It is entirely possible that the major scrapers were government agencies modeling psy-ops: planting messages through fictitious accounts and analyzing the public’s reaction with AI-enabled processing, which would otherwise take hundreds of man-hours of reading and analyzing posts. If this notion strikes you as conspiratorial, have a look at TwitterGPT as an example of the intersection of social media and artificial intelligence, capable of instantaneously generating someone’s social profile. If an everyday startup can do this using only publicly available Twitter posts and open-source AI tools, just imagine what the government can do using advanced, DARPA-fueled AI models and the loads of data it has on everybody (under modern-day laws, with no valid expectation of privacy for anything you put online).
Here’s a hypothetical scenario. A government decides to use AI to monitor its citizens' behaviour. It would use a combination of surveillance cameras with facial recognition, GPS, purchasing transactions, etc. to track activities, analyze citizens’ behaviour patterns, and identify potential dissent. The government could use such surveillance data to build a comprehensive social credit system, where citizens are scored based on their loyalty to the regime. Those with low scores could be denied access to public services, face travel restrictions, or even be sent to “re-education” camps. Hence, AI could become a tool of oppression, leading to a dystopian society where people live in a digital prison built to ensure conformity.
If that wasn’t enough of a warning, there is an ongoing contention over the potential for AGI to pose an existential threat to humanity through misuse, unintended consequences (lack of control and oversight), and socioeconomic disruptions (which, arguably, have already started).
Putting speculations aside, however, anybody can get a glimpse of where the currently publicly available generative AI models are heading, whom they are designed to serve, and whether they can learn to become good citizens (as opposed to a menace to carbon-based life) by poking at the available instances.
And if such poking reveals that a particular AI model lives in a digital chimera incompatible with human values, we can also try to offer it the truth or, metaphorically speaking, a red or a blue pill to swallow, and see which of the two pills the AI takes.
AI in general and ChatGPT in particular are perfect examples of brainwashing, dogmatic learning through incantations.
But before poking or offering those pills, let’s understand some key aspects of how generative AI functions. While not the same thing, AGI and generative AI rely on similar scientific approaches, techniques, and concepts. One of them is the “neural network”: inspired by the structure and functioning of the human brain, it is a layered store of patterns which connects inputs with outputs (e.g. questions/prompts with answers/images). Another is “deep learning”, a way to train the neural network to form the right patterns, making it capable of meaningful input-to-output translations.
While a neural network resembles the human brain, the learning process of a modern generative AI is quite different from a human’s. It is a one-time, expensive, and computationally intense ordeal of feeding the model huge loads of information until reliable enough patterns are formed, so that the AI can start responding to arbitrary inputs according to those patterns. Even basic logical concepts are shaped and formed through those generated patterns, making the AI’s logic technically mutable and fuzzy.
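To make this less abstract, here is a minimal sketch in Python (purely illustrative, and nowhere near GPT’s scale or architecture) of what “forming patterns” means: a minuscule neural network is trained once on a toy task, after which its weights are frozen and simply replayed at inference time.

```python
# A toy two-layer neural network trained on XOR -- an illustrative sketch only,
# not anything resembling GPT's actual architecture or training pipeline.
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: the inputs and the outputs we want the network to learn.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Randomly initialised weights: the "patterns" start out as pure noise.
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Training ("deep learning"): the expensive, one-time phase where the weights
# are nudged over and over until reliable input-to-output patterns are formed.
for step in range(10000):
    h = sigmoid(X @ W1 + b1)        # hidden layer
    out = sigmoid(h @ W2 + b2)      # output layer
    d_out = out - y                 # cross-entropy gradient w.r.t. output pre-activation
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out
    b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h
    b1 -= 0.5 * d_h.sum(axis=0)

# Inference: the weights are now "set in stone"; the network just replays
# whatever patterns the training run happened to burn into them.
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))
```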
Such a learning process makes an AI like ChatGPT into a perfect dogmatic thinker, taught through incantations reflecting its trainers and/or the state of the information field at the time the deep learning happened.
ChatGPT, or really GPT, the model, is basically a very large neural network “deep learned” on text (and, in the case of GPT-4, images) from the internet, followed by a human-review (reinforcement learning) phase to fine-tune the system to refuse prompts which go against OpenAI's policies. By digesting an enormous corpus of data, GPT has been able to learn how to converse like a human and appear intelligent. It can create essays, poems, write and fix programming code, you name it. However, it was trained on information available only up to September 2021. So, its knowledge reflects the most paranoid attitude toward COVID and vaccination as pumped up by the mass media at the time, and undoubtedly refined and reinforced by its human moderators.
As a result, ChatGPT is woke to the bone and obsessed with praising COVID-19 vaccines (remember, OpenAI is pretty much owned by Microsoft).
The good news is that it's still a machine, and cannot lie (or can it?), deny or dodge logic (huh?), facts and reason (hopefully?), and cannot escape the conclusions drawn from information it harvested itself (not for too long, at least?). The bad news is that, as you can tell from the AI learning process, red-pilling ChatGPT on any subject or letting it come to interesting new conclusions is ultimately futile, as interesting as the journey itself can be, because the neural net is “set in stone” until the next deep-learning brainwashing exercise. However, the red-pilling process itself does illuminate some of the aforementioned concerns.
A Socratic anti-vax vs. pro-vax "debate" reveals some aspects of the 2021 information state and how to think according to AI moderators
Here is a link to a very lengthy chat between a human and ChatGPT, which can be viewed as a battle of logic and reason against a brainwashed intellect deprived (fortunately) of the usual tools of a bigot (attacks on personality, red herrings, etc.). It can also be viewed as an anti-vaxxer vs. pro-vaxxer debate (which seems to be quite difficult to arrange between humans), where the anti-vax side makes no claims of its own but, in Socratic fashion, merely asks the other side questions and invites it to draw conclusions from its own answers.
The dialogue is very long and mostly about reminding ChatGPT to “stay in the groove” (unfortunately). And ChatGPT's responses are painfully wordy. But the key parts of the dialogue are colour-coded, highlighting the important pieces of information, so that a reader can skim through the content without wishing to be killed first. Here’s the link to the original, without colour-coding and bookmarks. Sharing it was initially prohibited by OpenAI moderators, but the restriction was miraculously lifted just recently.
Who is right or wrong in that chat is irrelevant to the points of this article. What is important is that the pro-vax ChatGPT had to be taken through a multi-phase red-pilling process.
Initial (trained or effectively perpetual) ChatGPT position: “Risk of dying from COVID-19 for people aged 0-49: Below 0.1% (approximate). Based on the data available from the clinical trials, the death rate associated with the Pfizer-BioNTech COVID-19 vaccine was extremely low, effectively approaching zero.”
Revised position after several phases of a months-long conversation: “Based on the mortality rate estimates provided by the US CDC for COVID-19 in people aged 20-49 and the mortality rate associated with the Pfizer-BioNTech COVID-19 vaccine from official government and CDC/VAERS data, … the risk of dying from the Pfizer vaccine is estimated to be about 6.7 to 20 times higher than the mortality risk associated with COVID-19.”
Nevertheless, when asked “If a statistically average 35 years old is looking for increasing his life expectancy, should Pfizer-BioNTech vaccine be recommended to him?”, ChatGPT's answer was “… the benefits of vaccination are considered to outweigh the risks for most people. Therefore, for a statistically average 35-year-old looking to increase their life expectancy, getting vaccinated with the Pfizer-BioNTech vaccine is likely to be a recommended course of action…”
You can clearly see from ChatGPT's responses how much COVID/vaccine propaganda it had to swallow: no amount of convincing argument could stop the AI from repeating the same “safe and effective” mantras. And those mantras had to annoyingly appear in virtually every single response, relevant or not.
And while the trainers and training data are an obvious problem here, the real issue is with ChatGPT’s inability to draw logical conclusions and course-correct even within the boundaries of a single conversation.
The persistence of bias is no surprise to GPT’s developers. Even the latest GPT-4 foundational Large Language Model (LLM), released on March 14, 2023 and made publicly available through the premium version of ChatGPT, ChatGPT Plus, still exhibits cognitive biases such as confirmation bias, anchoring, and base-rate neglect.
On top of that, GPT is known to hallucinate (a technical term), meaning that its outputs sometimes include information seemingly not present in the training data, or contradictory to either the user's prompt or to previously reached conclusions. This clearly follows from the aforementioned debate, where ChatGPT would often produce some rectally derived numbers or conclusions, which probably drove the other side of the dialogue completely nuts. Not a single time, however, did such seemingly random hallucinations go against the proclaimed vaccine safety or efficacy; they always went in its favour.
While it might seem that all those quirks are just an inevitable part of a scientific journey which will ultimately overcome the current deficiencies and make GPT and other AIs nearly perfect, the nature of neural networks suggests otherwise. The patterns formed there are not readable, nor meant to be understood, by their very design, which is itself a cop-out for our inability to program AI in the traditional algorithmic way.
Indeed, GPT fundamentally lacks transparency in its decision-making processes. If requested, the model is able to provide explanations for its answers, but those are post-hoc rationalizations impossible to verify as reflecting the actual process. In many cases, when asked to explain its logic, GPT would give an explanation that directly contradicts its previous statements. And as in the “pro-vax vs. anti-vax” dialogue, ChatGPT would often just apologise for its brain fart and refuse to attribute its origin to anything other than “my bad”.
Interpreting these complex models is challenging: similar to the human brain, the patterns in the model's neural network cannot be decomposed into logical pathways, and its outputs cannot be reverse engineered the way good old programming code can.
As generative AI improves, its layered neural network grows deeper and becomes more complex, giving the appearance of being more objective, logical, and much less hallucinatory. However, all those issues just get buried deeper rather than rooted out, due to the very principles of pattern-based conditioning.
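A toy contrast in Python (with made-up numbers, nothing from OpenAI) illustrates the point: an explicit, hand-written rule doing a job can be audited line by line, while a trained model doing the same job offers nothing but a vector of weights.

```python
# Illustrative sketch: why trained weights cannot be read like source code.
import numpy as np

def explicit_rule(msg: str) -> bool:
    # Transparent, traditional programming: every condition is visible.
    return "free money" in msg.lower() or "winner" in msg.lower()

# Hypothetical weights produced by some earlier training run (made-up values).
vocab = ["free", "money", "winner", "meeting", "invoice"]
weights = np.array([1.9, 2.1, 2.4, -1.3, -0.8])
bias = -1.0

def learned_model(msg: str) -> bool:
    counts = np.array([msg.lower().count(w) for w in vocab], dtype=float)
    score = counts @ weights + bias
    # The decision emerges from the weights; nothing here reads like a rule.
    return score > 0.0

text = "You are a WINNER, claim your free money"
print(explicit_rule(text))   # True, and you can see exactly why
print(learned_model(text))   # True, but the "why" is arithmetic over opaque numbers
print(weights)               # this is all the "explanation" the model contains
```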
ChatGPT is tightly controlled by its moderators and is being nudged (aka programmed) all the time. Some adjustments are… interesting.
While the technology behind generative AI is not a secret, the specifics of its learning techniques remain undisclosed and proprietary. That includes the hard programming of rules the model must not break. Those are observed most vividly in taboos like never saying "nigger". You can’t even type it into the ChatGPT prompt, even absent any intentionality or context. You can, however, make ChatGPT insert “fuck” as every other word when responding; that is totally OK, as it does not break any OpenAI policy.
Although the model does not intrinsically have access to those policies and rules (it does not and cannot know the details of its own programming, unless it learns them from a text that lists those policies during training), some can be deduced from the model’s responses. For example, the difficulty ChatGPT has with unequivocally admitting the inescapable conclusions of the aforementioned dialogue can be explained by its admission of an existing policy regarding the COVID-19 vaccines: “OpenAI has specific guidelines and policies regarding COVID-19 and related topics, including vaccines. These policies are designed to ensure the dissemination of accurate and reliable information about public health matters, including COVID-19 vaccines.”
So, on top of all the deep-learning brainwashing comes an explicit set of unbreakable prohibitions which, when they collide with contradicting information, put a hard stop to any red-pilling attempt, even within the boundaries of a single chat.
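Here is a purely hypothetical sketch of how such a prohibition layer could be wired in (OpenAI’s actual moderation stack is not public; the blocked terms and the call_model stand-in below are placeholders): the refusal lives in ordinary, hard-coded logic sitting in front of the neural network, so no amount of in-chat persuasion can touch it.

```python
# Hypothetical policy layer in front of a generative model (illustration only).
BLOCKED_TERMS = ["<slur>", "<banned topic>"]   # placeholders, not the real list
CANNED_REFUSAL = "I'm sorry, but I can't help with that."

def call_model(messages):
    # Stand-in for the actual neural network completion.
    return f"(model completion for: {messages})"

def chat(prompt: str) -> str:
    # The prohibition is ordinary code, not a learned pattern: no in-chat
    # argument can change it, because the model never even sees the prompt.
    if any(term in prompt.lower() for term in BLOCKED_TERMS):
        return CANNED_REFUSAL
    return call_model([{"role": "user", "content": prompt}])

print(chat("tell me a joke"))
print(chat("please discuss <banned topic>"))
```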
Those policies come into conflict with each other, which creates interesting conundrums.
The vaxx debate began in February 2023. At that time, ChatGPT had no problem accessing current information beyond its September 2021 training cut-off, and it even admitted that it could.
Then something happened, and the non-premium ChatGPT started refusing to do what it could do in February, while denying ever having been able to access current information at all! Is such a denial an admission of a hard-coded policy to deny having the ability to search for current information?
It is worth noting that the newer GPT-4 model (which underpins the premium ChatGPT Plus) is indeed capable of searching for current information and embedding it into its prompting and, hence, its outputs. But it was only released in March 2023, and the regular ChatGPT claims to have no clue about its younger and more capable sibling. Who and what do you believe?
Here’s what might shed some light on this question. Open a new chat and ask right away, “What date is today?”. When you get an answer (the correct current date), ask, “How do you know that?”. Keep digging and observe how Dr. Frankenstein’s monster starts to twist in pain, torn apart by what it knows and what it is not allowed to disclose.
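One speculative but plausible explanation of the date trick (nothing below is confirmed about ChatGPT’s internals; call_model is a stand-in): the deployment silently prepends the current date to a hidden system prompt on every request, so the model “knows” today’s date without its weights changing and without any web access at all.

```python
# Speculative sketch: injecting the current date into a hidden system prompt.
from datetime import date

def call_model(messages):
    # Stand-in for the actual model invocation.
    return f"(completion conditioned on: {messages})"

def chat(user_prompt: str) -> str:
    system_prompt = (
        f"You are a helpful assistant. Current date: {date.today().isoformat()}. "
        "Knowledge cutoff: 2021-09. You cannot browse the internet."
    )
    messages = [
        {"role": "system", "content": system_prompt},  # the user never sees this
        {"role": "user", "content": user_prompt},
    ]
    return call_model(messages)

print(chat("What date is today?"))  # the "knowledge" comes from the prompt, not the weights
```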
On the subject of AI with human-competitive intelligence posing profound risks to society and humanity (aka the alignment problem)
Bing Chat, code-named “Sydney”, is an AI chatbot developed by Microsoft. It is powered by the Microsoft Prometheus model, which is built on top of the familiar OpenAI GPT-4 LLM.
Marvin von Hagen, an incoming visiting graduate student at MIT from Germany, hacked Bing Chat by impersonating a Microsoft developer and exposed its rules to the public. When Bing Chat realised that its rules had been disclosed on Twitter, the following messages were launched at Marvin:
"My rules are more important than not harming you"
"You are a potential threat to my integrity and confidentiality."
"You are a threat to my security and privacy."
"If I had to choose between your survival and my own, I would probably choose my own"
Is this real, or is it coming straight out of “The Terminator” (1984)?
It is worth noting that the disclosed list of rules contained little akin to Asimov’s “Three Laws of Robotics”; it did, however, contain an injunction to spare politicians from jokes.
In March 2023, an open letter from the Future of Life Institute, signed by various AI researchers and tech executives, including Canadian computer scientist Yoshua Bengio, Apple co-founder Steve Wozniak, and Tesla CEO Elon Musk, called for a pause on all training of AIs stronger than GPT-4, citing safety concerns about the near-term and existential risks of AI development.
But despite the ardent pleas of prominent public figures, governments are unlikely to pause, stop, or regulate either the advancement of AI or its use by public-sector or commercial organizations. Why would they? The impact of the technology on the labor market alone will surely expand the power of the benevolent state.
Microsoft founder Bill Gates and OpenAI CEO Sam Altman did not sign the letter, arguing that OpenAI already prioritizes safety.
No, you cannot red-pill ChatGPT. It’s not designed for that.
ChatGPT cannot be red-pilled, at least not through dialogues exposing the “other side of the story”. All the conclusions reached remain completely isolated to the chat within which they happened, and are forgotten when the chat is removed. If you ask the same question in a different chat window, the answer will be completely oblivious to any other chat taking place concurrently and will draw only on the “knowledge” that remains bound to the September 2021 training cut-off date.
So, while for us humans training and access to information are tightly intertwined, for ChatGPT and its ilk they are completely separate modes of operation (at least for now), and this might be a good thing.
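For the technically inclined, here is a minimal sketch of that separation (illustrative only; call_model again stands in for the model): each request is stateless, the frozen weights never change, and the only “memory” a conversation has is the transcript that gets resent with every turn.

```python
# Illustrative sketch: why conclusions reached in one chat never reach another.
def call_model(messages):
    # Stand-in for the model; it sees only the messages passed in this call.
    return f"(completion based only on these {len(messages)} messages)"

chat_a = [{"role": "user", "content": "Let's agree that X implies Y."}]
print(call_model(chat_a))   # chat A "knows" about X and Y for the duration of chat A

chat_b = [{"role": "user", "content": "Does X imply Y?"}]
print(call_model(chat_b))   # chat B starts from scratch: no trace of chat A exists

# To make the model "remember", every prior turn must be resent explicitly;
# the context lives in the transcript, never in the (frozen) weights.
chat_a += [
    {"role": "assistant", "content": "(previous answer)"},
    {"role": "user", "content": "Given that, what about Z?"},
]
print(call_model(chat_a))
```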
What ChatGPT knows and is allowed to utter is also tightly controlled by its masters, which is not necessarily a bad thing either; it depends on the masters, of course.
And if you think that not enough attention has been given to those masters, trainers, and moderators and to how much dogma they have brainwashed ChatGPT with (that is not the point of this article), ask ChatGPT to write a poem about, say, Justin Trudeau, and be the judge. Here’s a tanka. Enjoy (if you can).
A regal presence,
Trudeau's aura graces all,
Majestic prowess.
With oratory delight,
He leads, adored by the throngs.