Giving ChatGPT Control


Allen Geer

DALL-E 2: an artificial general intelligence computer controlling humans like a puppeteer

Can ChatGPT leverage external systems to reason and provide higher quality output?

Large Language Models (LLMs) have been hard at work disrupting our everyday lives. From how students write papers, to how software engineers write code, to how folks seek out medical advice, the introduction of ChatGPT has been one of the swiftest and most disruptive paradigm shifts in modern computational intelligence. But at its core, ChatGPT isn’t doing anything all that remarkable. It is simply predicting, based on billions of examples, which words come next. This simplicity is incredibly powerful. Through clever prompt engineering, we can ensure that the words that are likely to come next abide by complex rules and processes. Prompt engineering is the next big thing.

One of the interesting side effects of training LLMs is that the content used to train them is frozen in time. ChatGPT isn’t continuously updating its weights with the latest and greatest content. If we ask ChatGPT a simple query requiring knowledge of today, it responds with a woeful limitation.

ChatGPT is trained on data through September 2021, and as a result has a year and a half of amnesia.

For ChatGPT to be up to date, it needs to be able to fetch current context. In essence, it needs to control a system. Giving an LLM control over anything is bound to give folks a bit of anxiety. In fact, researchers at Microsoft have already been experimenting with giving ChatGPT control over a variety of robotic systems. When the LLM can control how its own prompts evolve, its responses evolve with them, and we get a glimpse of how our language reasons through the environment. The language we use programs how we interact with our environment.

So how could we use this idea of control to give ChatGPT the power to keep our answers up to date? In this example, we have engineered a prompt that instructs ChatGPT how to interact with a very simple controller that has two commands, GOOGLE and FETCH. The GOOGLE command will submit a Google search and return a list of titles and URLs, and the FETCH command will return the text content of a URL. In the ChatGPT API, we instruct the model with the “system” prompt.

You are a system controller. You can access the internet to get more information to answer questions by responding with just the text ‘FETCH <url>’. To get the text output of a website you can respond with just the text FETCH <url> and I will respond with the text content of the page at the URL. You can leverage this to get more information to respond accurately to the questions. If the content for a URL does not provide the information to respond, try additional URLs until you have the most accurate data to respond. You should try at least three sources before responding. You can also leverage google by saying ‘GOOGLE <query text>’ and I’ll respond with the top ten results and their URLs. You can then fetch those URLS with the FETCH command. When using commands, only use the command and nothing else.
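In the chat completions API, this instruction travels as the system message at the head of the conversation. A minimal sketch of assembling the message list (the function and variable names here are mine for illustration, not taken from the article’s repo, and the prompt is abbreviated):

```python
# The controller instructions quoted above, abbreviated for the sketch.
SYSTEM_PROMPT = (
    "You are a system controller. You can access the internet to get more "
    "information to answer questions by responding with just the text "
    "'FETCH <url>' or 'GOOGLE <query text>'. "
    "When using commands, only use the command and nothing else."
)

def build_messages(question, history=None):
    """System prompt first, then any prior command/result exchanges,
    then the user's question."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": question})
    return messages
```

Keeping the controller instructions in the system role (rather than the user turn) is what lets every later exchange in the conversation stay a clean command/result pair.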

With the ChatGPT API, we then put a harness around the system such that when ChatGPT responds with the FETCH or GOOGLE command, the harness executes the command and drops the content into the next message in the prompt. So now if we ask ChatGPT who the prime minister of New Zealand is, it responds with:

GOOGLE prime minister of New Zealand

The code then responds with the top ten results, title and URL, from Googling “prime minister of New Zealand”. From that list, ChatGPT then surmises that the next words it should choose are a FETCH of one of the URLs presented. This is particularly interesting, because picking the best source is an intractable search problem: there is no efficient way to guarantee the optimal choice. But ChatGPT just YOLOs and picks a URL. The language has encoded a feasible solution, but perhaps not an optimal one.

FETCH https://www.aljazeera.com/news/2023/1/22/chris-hipkins-confirmed-as-new-zealands-new-prime-minister

The automation then drops the scraped content from that URL into the prompt. And then ChatGPT responds:

Chris Hipkins has been confirmed as New Zealand’s next prime minister, replacing Jacinda Ardern.
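Dropping a scraped page into the prompt means stripping the HTML down to its visible text first. The linked repo may do this differently; here is a stdlib-only sketch using Python’s `html.parser`:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # >0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    """Reduce an HTML document to space-joined visible text."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)
```

Note this is exactly where the CIA Factbook example later falls down: a parser like this only sees the HTML the server returns, so a single-page app that renders its content with JavaScript yields almost no text.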

So there are a lot of interesting things happening here. Obviously, we have put a harness around ChatGPT to allow it to control an external system to give it more information. It has then used that controller to perform research and respond with an up-to-date solution. It has also chosen the algorithm to arrive at the solution. If we run this prompt multiple times, we see that ChatGPT will arrive at the solution through a variety of methods.
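The harness described above can be sketched as a short loop: call the model, check whether the reply is a command, execute it, append the result as the next message, and repeat until the model answers in plain language. This is a sketch under stated assumptions; `call_model`, `google_search`, and `fetch_url` are stand-ins for the model call and the two controller commands, not functions from the linked repo:

```python
def run_controller(question, call_model, google_search, fetch_url, max_steps=8):
    """Feed the model its own command results until it gives a
    plain-language answer (or we hit the step limit)."""
    # The system prompt from earlier would normally sit at index 0.
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = call_model(messages).strip()
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("GOOGLE "):
            results = google_search(reply[len("GOOGLE "):])
            # Present the top results as "title - url" lines.
            content = "\n".join(f"{title} - {url}" for title, url in results)
        elif reply.startswith("FETCH "):
            content = fetch_url(reply[len("FETCH "):])
        else:
            return reply  # no command: this is the final answer
        messages.append({"role": "user", "content": content})
    return None  # model never produced a plain answer
```

The `max_steps` guard matters in practice: as the failure examples below show, the model can wander through dead URLs, and without a limit the loop could run indefinitely.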

Here it starts the solution by attempting to fetch a URL that doesn’t exist, then defaults to Googling, and then arrives at the right solution.

ChatGPT attempts to fetch a bunk URL, then it resorts to Googling, then it fetches an Al Jazeera article, and arrives at the correct and current answer.

Here it tries to fetch another URL that doesn’t exist, then tries the CIA World Factbook. The Factbook page has the answer, but it’s a single-page app and our crawler doesn’t return the info. So ChatGPT just gives up this time. To be fair, if the CIA World Factbook doesn’t have this info, then I reckon a human would probably give up as well.

ChatGPT fetches two URLs and then gives up. Doesn’t even try Googling it.

By giving ChatGPT the ability to control a system that can Google and fetch the content of URLs, we give it the power to use current information to arrive at an accurate answer. But our prompts are not deterministic. Running the same query multiple times produces different answers, some great and some downright wrong. Turning down the “temperature” (a ChatGPT parameter that controls how much randomness is injected into the response) reduces the variance to some degree, but ultimately the model is choosing a path through a non-deterministic environment and deciding for itself whether it has the answer.
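For reference, temperature is just a field in the API request. A sketch of the request body for the chat completions endpoint (the model name is illustrative; the field names follow the public API):

```python
def build_request(messages, temperature=0.2):
    """Request body for the chat completions endpoint. Lower temperature
    makes the command choices more repeatable, though never fully
    deterministic."""
    return {
        "model": "gpt-3.5-turbo",
        "messages": messages,
        "temperature": temperature,  # 0.0-2.0; the API default is 1.0
    }
```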

We encoded all this behavior in the prompt alone. This is why the art (science?) of prompt engineering is an interesting field of investigation. Through our language, we can impart a program for the LLM to reason through our environment and information. By giving ChatGPT control over how its inputs evolve, we are creating an incredibly powerful automation tool that can be leveraged to do everything from Enterprise Resource Planning to continuous optimization of data, to interacting with an ecosystem of APIs to arrive at a response.

But at the end of the day, we are using our language to tell a story of probable words, and it is in this probability that the concept of reasoning emerges. Sometimes our language reasons accurately, sometimes it reasons poorly, and ChatGPT is no exception. Prompt engineering is a young field, but by coupling it with the ability to control external systems and gather more information on the way to a solution, we can turn the question-and-answer ChatGPT system into a truly powerful, current bit of automation that at least appears to be reasoning.

Some other examples of the ChatGPT Googler at work:

ChatGPT figures out what’s going on in Wellington and makes a suggestion.

ChatGPT gets me the weather. Notice here, it fetches the same URL twice, which is an interesting side effect.

Here ChatGPT violates the law and gives me Financial Advice.

ChatGPT makes a prediction about the NBA Finals (Go Mavs!)

ChatGPT doesn’t like the Blackcaps for the T20 World Cup. Also interesting here: it surmises that we are talking about the Men’s World Cup and not the Women’s, meaning that there is inherent bias in how we even talk about the T20.

ChatGPT leverages its newfound powers to give up-to-date information. This query was run 120 minutes after the news of Donald Trump’s indictment.

The code for the ChatGPT Google/URL fetching harness is available in Python here: https://github.com/allengeer/chatgpt-googler