Notes on LLMs and AI

There has been a lot of press lately about AI, OpenAI, GPT-${X}, and how AI will affect the world. For the time being, I plan on not looking further into AI (unless my current or a future employer compels me to). I think that right now there is not enough diversity in the vendor population. I also have some thoughts on how it will affect things going forward.

I do not like to get too meta in my posts, but I have been writing this on and off for over a week, and I want to get it out before too much more time passes. I am still learning about this stuff, like what the difference is between models, weights, and datasets; some articles use a project name to refer to all three components. The LLaMA debacle is the textbook case: some stuff was released, some was leaked, there are projects based on Meta’s work and some that seem to be clean-room implementations, so how it all fits together is murky to me.
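
My rough mental model so far, as a hedged sketch in PyTorch-style Python (the file names and the tiny stand-in network are made up for illustration, not any particular project's layout):

    import torch

    # the "model" is the architecture: code that defines the network's shape
    model = torch.nn.Linear(8, 2)  # stand-in for a real transformer

    # the "weights" are the learned numbers, usually shipped as a file;
    # as I understand it, this is the part of LLaMA that leaked
    model.load_state_dict(torch.load("weights.pt"))

    # the "dataset" is the raw text the weights were learned from
    dataset = open("corpus.txt").read()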

GPT-${X} by OpenAI is taking the world by storm, particularly ChatGPT. It was the focus of a recent EmacsATX meeting. It is disruptive in the sense that it has capabilities beyond prior AI technology, and it will probably have a profound effect on society going forward. But in another sense, it is the opposite of disruptive; it consolidates power and influence in OpenAI. One of the owners of OpenAI is Microsoft, and for me that makes anything by OpenAI something to avoid. They are not doing this for you.

I think a lot of people do not realize that when they play around with the OpenAI prompts in ChatGPT, they are training the OpenAI models and making them better and more powerful. That is power that can be used by other users of the tool: not only the vendors, but also your competitors. There have been reports of confidential data and PII being put into ChatGPT, and then extracted by other users later. People need to be more careful. And stop making the rich and powerful richer and more powerful. A lot of people in corporate America might work at companies that are independent on paper, yet they all act like they want to be subsidiaries of Microsoft. Start looking out for your own company and your own career and your own life.

The GPT-${X} products were used in making GitHub Copilot. I mentioned Copilot when I posted that I was moving from GitHub to Codeberg. It does not respect licenses, which could put a company at legal risk, and sometimes it “solves” a problem while violating stated constraints. GPT-${X} has the same issues: Who owns the training data? Who owns the output?

It is good to automate things, but could relying on AI too much make people stupider? A good point was brought up in the discussion about why MIT dropped SICP: When you rely on a black box, how do you know you can rely on the black box? I think we might be coming close to fulfilling a prophecy from Dune:

Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.

I think we should collectively make an effort to avoid anything by OpenAI, and anything by Microsoft. I do not know how long Microsoft has been involved with OpenAI, but there are a few MS hallmarks: it is called “OpenAI” even though it is not open (they have been tight-lipped about their training data and methods), and when it is wrong it insists you are wrong. And when it is incorporated into MS products, it has started pushing ads.

There are a few alternatives out there. There is a company called Hugging Face that I think provides datasets, different models, and hosting for AI. I think you can provide your own data. There is a company called Lambda Labs which provides all your AI/GPU needs: cloud hosting, colocation, servers, workstations with terabytes of memory, and a very expensive and very nice looking laptop with Ubuntu pre-installed (a LOT more than System76, but it is nice to see more Linux laptops out there).
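
To give a flavor of what Hugging Face offers: pulling a small open model down from their hub and generating text looks roughly like this (a minimal sketch using their transformers library; gpt2 is just an example model, and the prompt is mine):

    from transformers import pipeline

    # download a small open model from the Hugging Face hub
    generator = pipeline("text-generation", model="gpt2")

    # generate a completion locally, no OpenAI account required
    print(generator("Emacs is", max_length=20)[0]["generated_text"])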

WRT software, there are some implementations of AI that are open source. NanoGPT can be run on a local system, although it might take a while. You can find the GitHub link here, and a link to what might be a fork on Codeberg here. It was started by Andrej Karpathy, who worked on autonomous driving at Tesla and also worked at OpenAI.

GPT is a type of artificial neural network known as a large language model, or LLM. Then Facebook/Meta released an LLM called Large Language Model Meta AI, or LLaMA, so now there are a few projects with names referring to South American camelids: llama.cpp (GitHub link here, Hacker News discussion here), and a fork of llama.cpp called alpaca.cpp (GitHub link here, Codeberg link here). Once they saw money going into someone else’s pockets, Stanford decided to get in on the act with their own LLaMA-based project, also called Alpaca. There is one called Vicuna (intro page here, GitHub link here). And, last but not least, Guanaco, which looks like a fork of Stanford’s Alpaca (GitHub repos here, page here). You would think AI researchers would come up with more original names rather than run a theme into the ground.
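
If you want to try one of these camelids locally, the third-party llama-cpp-python bindings for llama.cpp make it roughly this simple (a sketch under the assumption you already have a quantized weights file; the path here is made up):

    from llama_cpp import Llama

    # load a quantized LLaMA-family model from a local file
    llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")

    # everything runs on your own machine; no OpenAI involved
    out = llm("Q: Name a South American camelid. A:", max_tokens=16, stop=["Q:"])
    print(out["choices"][0]["text"])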

Note: I think Facebook/Meta did release some papers about LLaMA, and then some parts of it were leaked. The status of these projects is a bit unclear to me at the moment. Some of the projects mentioned cannot be used for commercial purposes. IANAL, but I think that llama.cpp and alpaca.cpp can, since they are clean-room implementations and were not created with any assistance from or collaboration with Meta. Stanford got some early access to LLaMA, so its project and Vicuna cannot be used for commercial purposes.

You can find some more info about open source AI here on Medium (archive here), and here on HN. I think the group EleutherAI is trying to be an open source counter to OpenAI.

There are a LOT of other AI projects out there, but a lot of them are just interfaces to ChatGPT or DALL-E or something else from OpenAI, as opposed to a program you can run for yourself. A lot of the forks and clean-room/non-OpenAI models require a LOT of memory. Some need at least 60 GB. The mini I got from System76 can have up to 64 GB. They have desktops that can go up to 1 TB of memory, and servers up to 8 TB. Granted, maybe something local will never catch up to OpenAI, but as a few comments in the HN discussion on llama.cpp pointed out, the open source models are becoming very efficient very quickly. Some of the commenters did say that AI might be out of reach for the hobbyist. But then, all this stuff is doing is simulating a human.
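
To put that 60 GB figure in perspective, here is some back-of-the-envelope arithmetic in Python (a minimal sketch; the fp16 and 4-bit sizes are approximations, and they only count the weights, not the working memory):

    # rough RAM needed just to hold a model's weights:
    # (params in billions) x (bytes per weight) = gigabytes
    def weights_gb(params_billions, bytes_per_weight):
        return params_billions * bytes_per_weight

    print(weights_gb(30, 2))    # ~60 GB: a 30B-parameter model at fp16
    print(weights_gb(30, 0.5))  # ~15 GB: the same model quantized to 4 bits

If I understand correctly, quantization like that is a big part of why the open source models are getting efficient enough to squeeze onto ordinary machines.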

So where does all this go next? Honestly, who knows, but I will share my thoughts anyway.

First off: I dismiss the doomsday scenario that AI will kill us all. As the Wikipedia page on “pessimism porn” states, a lot of people like to predict disaster because it makes them feel smart, even if years go by and their predictions never come to pass. There are a lot of people with blogs and YouTube channels who are always predicting a stock market collapse, or who think we are about to become Weimar Germany if the price of a gallon of milk goes up one cent. They dismiss you if you cannot offer irrefutable proof that the world will NOT end, yet they insist their predictions are to be regarded as self-evident. Granted, maybe those are not the best arguments against Skynet, but I have dealt with a lot of people who confuse the strength of their convictions for logic. Sometimes the best prediction is that things will mostly continue as they are, just with more of something you do (or do not) like.

Since this will be a major change, there will be an effect on jobs. Some jobs will be lost. But there might actually be more jobs due to AI. Scott McNealy pointed out that building a system used to be a master’s thesis, and systems were pretty limited. Now we have powerful software that is easy to install. We have packages (like the JDK, Golang, Elixir) that are powerful compilers and runtimes, far beyond what people thought was possible a few decades ago, yet they can be downloaded as tar or zip files that, once expanded, let people create robust, powerful software. Linux and these VMs have created a lot of technology jobs. I think AI might wind up, on net, creating more jobs than we have now.

Granted, it is possible that the jobs that get created are more soul-sucking than what we have. I joked on Mastodon that AI will not take your job; it will just take away the parts you like, leaving you with the parts you do not like.

I do hope all the More Bad Advice pinheads who all sound the same and think the answer to everything is to cut costs lose their jobs. I have had good and bad bosses, but honestly, a lot of people in Corporate America sound the same: asking when things will be done, going on and on about how important some arbitrary deadline they pulled out of thin air is, harping on about innovation yet only having the same tired ideas (piling on more work during the so-called “good times”, then cutting staff when things start looking shaky).

And there will be more people thinking the same. One thing that really grates on me is that we are told in technology that we have to be constantly learning new things. Yet the world is full of business pinheads who cannot conceive of not using Excel, and there are plenty of software developers who cannot conceive of doing something in a language that is not JavaScript. I have a bad feeling that OpenAI will become the Third Pillar of Technology Stupidity.

Sadly, maybe that will be the way to stay employed. Be a Microsoft drone, a JavaScript drone, or an OpenAI drone. I have met tech people older than me who said they could do things decades ago with Lisp and Smalltalk that most languages and runtimes still cannot match. I feel like we took a wrong turn somewhere.

That said, even if AI leads to more jobs, there could still be downsides. We are already seeing this: generative AI is already being used to craft more effective phishing emails. ChatGPT accused a law professor of sexual harassment (article here, HN discussion here). The HN comments have examples of AI making stuff up, but the professor gave a good summary: “ChatGPT falsely reported on a claim of sexual harassment that was never made against me on a trip that never occurred while I was on a faculty where I never taught. ChatGPT relied on a cited Post article that was never written and quotes a statement that was never made by the newspaper.” What if this is used for background checks and nobody verifies what the AI tells them? This could cause a lot of damage to people. Per the quote misattributed to Mark Twain, a lie can travel halfway around the world before the truth can get its boots on.

We should call AI “artificial inference”, because it mostly makes up stuff that sounds true. It just makes guesses about what seems logical. For a long time it was logical to think the earth is flat. Yet for some reason people think the output of AI is always true. Perhaps they are assuming that it must be true since it is based on large data and advanced technology. But sometimes the machine learning is just machine lying. Marissa Mayer said Google search results seem worse because the web is worse (articles here and here). People used to put content on the web to tell you things, and now they just want to sell you things. There is lots of junk on the web. I predict there will be a lot of junk in AI.

Microsoft is putting ads in Bing AI chat, which is already fostering distrust in AI (articles here and here). Unlike Google search ads, the ads in the chat are hard to distinguish from the rest of the results. If companies need to put ads in AI, then they should make them clearly separated, like Google ads. People realize that things need to be paid for. Intermingling ads with AI just ruins the AI. You do not need advanced AI to say something you are getting paid to say. Google has been able to serve ads based on user input since 2004.

I think AI will lead to a lot of artificial and misleading content. Not just text, but also audio and video. People might not be able to believe what they read, see or hear online. It could cause more cynicism and distrust in our society. Perhaps we will not get Skynet, just a slow decay and further fracturing of society.

AI could, of course, lead to massive job losses. A lot of people care more about cost than quality. And it is possible that after a time some of those jobs might come back. There is a post on Reddit (link here, HN discussion here) about a freelance writer who lost a gig to ChatGPT. (Another writer wrote an “AI survival guide”.) A few comments gave anecdotes of job applications that all sounded the same, which the HR people realized had all been done with AI. If more companies start using AI, a lot of websites will all start to sound the same. A lot of people hate it when an article “sounds/feels like it was written by AI”. Perhaps the human touch will make a comeback. There is a joke I read somewhere:

An LLM walks into a bar.
The bartender asks, "What will you have?"
The LLM says, "That depends. What is everyone else having?"
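
That joke is closer to the truth than it sounds: roughly speaking, the model's answer is whatever is most common in its training data, given the context. A toy illustration in Python (the "corpus" here is obviously made up):

    from collections import Counter

    # toy training data: what everyone else at the bar is having
    corpus = ["beer", "beer", "beer", "wine", "whiskey"]

    # the "LLM" answers with the most likely choice from past data
    drink, count = Counter(corpus).most_common(1)[0]
    print(f"I'll have the {drink}.")  # -> I'll have the beer.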

Granted, it might be a while before jobs lost to AI come back, assuming they ever do. And not all of the jobs might come back.

I think that people who understand concepts will do better in the long run than people who just know a tool. At least, that is how things have been. It could be different this time. On the other hand, could an AI come up with “Artisanal Bitcoin”?

Software used to be done in binary or assembly, and over time the languages became more powerful, and the number of jobs increased. Software was always about automation, and there was always something to automate. Has that cycle stopped?

I am worried, but I cannot just yet get on board the Doom Train. I remember working at Bank of America in the 00s/Aughts/Whatever that decade is called, and we all thought that all our jobs would go to India and there would be No More Software Made In ‘Merica. That did not happen.

Or maybe it is all a bubble that will burst.

Maybe the AI is not as advanced as the companies are telling us. OpenAI does not publicize it, but they used people in Kenya to filter out the bad stuff (Reddit discussions here, here and here, Time article here with archive here, Vice article here with archive here, another Vice article here with archive here). One major focus of the articles is that looking at all the toxic content was so traumatic for the workers that the company that got the contract ended it several months early. Looking at toxic content can wear on people. But isn’t the point of an AI to figure this stuff out?

My employer had us watch some videos on up-and-coming technology, and one of them was on AI. One of the people on the panel kept talking about how important it is to “train” and “curate” your data. They kept saying that over and over. And I had the same thought: isn’t that what the AI is supposed to do? They made it sound like AI was just a big fancy grep or SQL query.

Per the Vice articles, tech and social media companies have been using people in low-wage countries to flag content for years, while letting people think that their technology was so amazing. Perhaps ChatGPT is no different. I do not know if they have to re-flag everything for each version of GPT. I get the impression the model is trained once, up front, and from there it is just repeating what it figured out. Does it actually learn in real time the way a human can? Can an AI make new inferences and be an old dog learning new tricks the way a human can, or does it just keep reinforcing the same ideas and positions the longer it runs? What if you train your model and the world changes? What if neo-nazis stop using triple parentheses as an anti-Jewish code today, and your training data is from two years ago? I guess you are just back to web search.
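
As far as I can tell, the split looks something like this (a hedged sketch in PyTorch-style Python, not anyone's actual code):

    import torch

    def train(model, batches, optimizer):
        # training: the weights change in response to the data
        for inputs, targets in batches:
            loss = torch.nn.functional.cross_entropy(model(inputs), targets)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    @torch.no_grad()
    def chat(model, inputs):
        # deployment: the weights are frozen, which is why the
        # triple-parentheses example above is a real problem; the model
        # only "knows" what the world looked like when it was trained
        return model(inputs)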

I think part of what is going on is hype. As Charlie Stross pointed out, it does seem interesting that we see the AI hype just starting as the corrupt-o-currency hype is winding down. The vulture capitalists need something new to sell.

Another issue is: will this scale going forward? Technology does not always progress at the same rate. We could be headed for another AI winter. Research into AI for autonomous driving has hit a wall (no pun intended).

And how will this scale? The human brain still has about 1,000 times as many connections as GPT-4 has parameters. There is already a shortage forming of the chips used for AI. Is it worth it to burn the planet and use all that metal and electricity to chew through a lot of data…to do what? Simulate a human brain in a world with 8 billion people? Especially when a lot of the humans’ intelligence is not being used efficiently (see the penetration of Lisp vs Windows).

That said, I don’t think AI will go away. If I could have one thing, I would like to see alternatives to OpenAI, particularly open source. It might be possible to run LLMs locally. Do you really need an AI that knows about oceanography? Most of us do not. I do not think that AI will kill us all (it is not clear to me how we go from chatbot to Terminator). But corporate consolidation in AI would be a tragedy.

I just need a job where I can use Emacs and call people stupid.

You’re welcome.

Image from an 11th-century manuscript housed in the Topkapi Palace in Istanbul, via The Gabriel Millet Collection (collection page here), assumed allowed under public domain.