Maybe it’s like the dotcom bubble: there is genuinely useful tech that has recently emerged, but too many companies are trying to jump on the bandwagon.
LLMs do seem genuinely useful to me, but of course they have limitations.
We need to stop viewing it as artificial intelligence. The parts that are worth money are just more advanced versions of machine learning.
Being able to assimilate a few dozen textbooks and pass a bar exam is a neat parlor trick, but it is still just a parlor trick.
Unfortunately probably the biggest thing to come out of it will be the marketing aspect. If they spend enough money to train small models on our wants and likes it will give them tremendous amounts of return.
The key to using it in a financially successful manner is finding problems that fit the bill. Training costs are fairly high, and quality content generation is also rather expensive. There are sticky problems around training it from non-free data. Whatever you’re going to use it for needs to have a significant enough advantage to make the cost of training/data worth it.
I still think we’re eventually going to see education rise. The existing tools for small content generation are already good: Adobe’s use of it to fill in small areas is leaps and bounds better than the old content-aware patches. We’ve been using it for ages for speech recognition and speech generation. From there it’s relatively good at helper roles. Minor application development, copy editing, maybe some VFX generation eventually. Things where you still need a talented individual to oversee it but it can help lessen the workload.
There are lots of places where it’s being used where I think it’s a particularly poor fit. AI help desk chatbots, IVR scenarios: it’s as brain dead as the original phone trees and flow charts that we’ve been following for decades.
Machine learning is AI. I think the term you’re looking for is general artificial intelligence, and no one is claiming LLMs fall under that label.
If GPT-4o is still not what you would call AI, then what is? You can have conversations with it; the Turing test is completely irrelevant all of a sudden.
Hasn’t the Turing Test been irrelevant for a while now? Even before the new AI boom?
Artificial intelligence is a moving target. Every time a goal gets reached, they just move the goalposts, because “well, obviously this isn’t real intelligence”.
No, it was just suddenly completely irrelevant. The answers of the first chat bot that supposedly “beat” it are a complete joke. And yes, I just wrote exactly the same with the goal getting moved, next it has to invent relativity or it’s not intelligent. Absurd.
It’s a massive text predictor. It doesn’t solve problems, it applies patterns based on correlations it picked up during training. If someone talked about your topic online, it has been trained on those conversations. If a topic has two sides that don’t agree, ChatGPT might respond in a way that is biased towards one side or the other and you can easily get it to “switch” to the other side with follow-up prompts.
For what would be considered AI, think of the Star Trek computer or Data. The Star Trek computer could create simulations of warp core behaviour to push frontiers of knowledge or characters smart enough to defeat its own safeties (frankly, the computer was such a deus ex machina kinda thing that it was hard to suspend disbelief at times, like why did they even have humans doing the problem solving with computers that capable?). Data wouldn’t get confused about whether any countries in Africa start with K.
I don’t think the Turing test is an effective means of determining intelligence anyways. It came from a time when a conversational computer was barely thinkable. But I wouldn’t even say ChatGPT is there yet, since you can tell if you ask it the right things. It is very useful, don’t get me wrong, like a very powerful search engine. But it’s not intelligent.
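For what it’s worth, the “text predictor” idea can be shown with a toy sketch. This is only an illustration under loud assumptions: a bigram word counter in Python, where the names `train_bigrams` and `predict_next` and the tiny corpus are made up for this example. Real LLMs are neural networks working over tokens with long context, not a lookup table of word pairs, so this shows the flavor of “predict the next word from patterns in the training text” and nothing more.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)

print(predict_next(model, "the"))  # cat  ("cat" follows "the" twice, "mat" once)
print(predict_next(model, "dog"))  # None ("dog" never appears in the corpus)
```

The predictor can only echo pairings it has already counted, which is the limitation being argued about here. Whether that limitation carries over to models at LLM scale is exactly what the thread disagrees on.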
What of what you say does not apply to humans? They apply patterns of behavior in response to some input. Picked up by learning them. Including people talking online. They are always biased in some way. Some will acknowledge their bias and change it if you give them context.
GPT can literally create simulations. I have used it to do exactly that, specifically for 2D heat conduction with coupled mass transport and reaction kinetics.
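To give a rough idea of what the simplest piece of such a simulation looks like, here is a minimal sketch of 2D heat conduction only, using an explicit finite-difference (FTCS) scheme in plain Python. It is not the code from the comment above: the function names, grid size, and parameter values are all assumptions picked for illustration, and the coupled mass transport and reaction kinetics are left out entirely.

```python
def step_heat_2d(T, alpha, dx, dt):
    """Advance the temperature grid T by one explicit (FTCS) time step.

    Boundary cells are held at their current values (fixed-temperature edges).
    """
    r = alpha * dt / dx**2
    if r > 0.25:
        # Standard stability limit for the explicit scheme on a square 2D grid.
        raise ValueError("unstable: need alpha*dt/dx**2 <= 0.25")
    n_rows, n_cols = len(T), len(T[0])
    new_T = [row[:] for row in T]
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            # Five-point discrete Laplacian of the temperature field.
            laplacian = (T[i + 1][j] + T[i - 1][j]
                         + T[i][j + 1] + T[i][j - 1]
                         - 4.0 * T[i][j])
            new_T[i][j] = T[i][j] + r * laplacian
    return new_T


def simulate(n=11, steps=50, alpha=1.0, dx=1.0, dt=0.2):
    """Start with a single hot cell in the middle of a cold plate and let it spread."""
    T = [[0.0] * n for _ in range(n)]
    T[n // 2][n // 2] = 100.0
    for _ in range(steps):
        T = step_heat_2d(T, alpha, dx, dt)
    return T


grid = simulate()
print(round(grid[5][5], 3))  # centre temperature after the heat has diffused outward
```

A coupled model would add a concentration field updated in the same loop plus a reaction source term, but that goes beyond what this sketch assumes.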
Yeah, it does do some very human-like things, but it’s still missing some important parts.
It’s kinda like using a textbook for problem solving. It’s great at helping you solve instances of problems that have already been solved, but you won’t likely find the next big advancement in that field in a textbook.
Newton realized masses attracted each other, and through experimentation, came up with his laws of classical physics.
Einstein took the idea that the speed of light always seems to be the same despite relative motion to come up with special relativity, then realized that space-time itself was a physical thing that could be interacted with rather than just a medium, plus came up with field equations that were used to predict things like black holes before anyone had any kind of notion that they were real things.
ChatGPT is incapable of things like that. And sure, many humans never do anything like that, some might not even be capable even if they were motivated and had the right supports to try. But many humans do solve problems that they’ve never seen before. There are big names in academia but so many more that don’t get famous but still push the boundaries of human knowledge, creatively solving problems and answering questions every day.
I wouldn’t be surprised if an LLM is a piece of general AI if or when it comes, but there will be other parts that are currently missing. We don’t even know what consciousness is, let alone if any of our hardware is capable of creating/hosting one.
I listened to a podcast (This American Life, IIRC), where some researchers were talking about their efforts to determine whether or not AI could reason. One test they did was asking it to stack a random set of items (one it wouldn’t have come across in any data set: a plank of wood, 12 eggs, a book, a bottle, and a nail... probably some other things too) in a stable way. With ChatGPT 3, it basically just (as you would expect from a pure text predictor) said to put one object on top of another, and there was no way it would be stable.
However, with GPT-4, it basically said to put the wood down, and place the eggs in a 3 x 4 grid with the book on top (to stop them from rolling away), and then the bottle on top of that, with the nail on top (even noting you have to put the head side down because you couldn’t make it stable with the point down). It was certainly something that could work, and it was a novel solution.
Now I’m not saying this proves it can think, but I think this “well it’s just a text predictor” kind of hand-waves away the question. It also begs the question, and based on how often I hear people parroting the same exact arguments against AI thinking, I wonder how much we are simply just “text predictors.”
The sheer size of it and its training data makes it hard to really say what it’s doing. Like for an object that it wouldn’t have come across in its training data, a) how could they tell it was truly a new thing that had never been discussed anywhere on the internet where the training could have consumed it, and b) that any description provided for it didn’t map it to another object that would behave similarly when stacking.
Stacking things isn’t a novel problem. The internet will have many examples of people talking about stacking (including this one here, eventually). The “put the flat part down” advice for the nail could have been a direct quote, even. Putting a plank of wood at the bottom would be pretty common, and even the eggs and book thing has probably been discussed before.
I mean, I can’t rule out that it is doing something more complex, but examples like that don’t convince me that it is. It is capable of very impressive things, and even if it needs to regurgitate every answer it gives, few problems we want to solve day to day are truly novel, so regurgitating previous discussions plus a massive set of associations means that it can map a pretty large problem space to a large solution space with high accuracy.
I’m having trouble thinking of ways to even determine if it can really problem solve that won’t accidentally map to some similar discussion among nerds that like to go into incredible detail and are willing to speculate in any direction just for the sake of enjoying a thought experiment.
Like even known or suspected unsolvable problems have been discussed to greater levels of detail than I’ve likely considered them, so even asking it to do its best trying to solve the traveling salesman problem in polynomial time would likely impress me because computer science students and alums much smarter than I am have discussed it at length.
Sure, there is a chance the exact question had been asked before, and answered, but we are talking remote possibilities here.
If it has to say ‘this item is like that other item and thus I can use what I’ve learned about stacking that other item to stack this item’ then I would absolutely argue that it is reasoning and not just “predicting text” (or, again, predicting text might be the equivalent of reasoning).
Sure, stacking things is not a novel problem, which is why we have the word “stack” because it describes something we do. But stacking that list of things is (almost certainly) a novel problem. It’s just you use what you’ve learned and apply that knowledge to this new problem. A non-novel problem is if I say “2+2 = 4” and then turn around and ask you “what does 2 + 2 equal?” (Assuming you have no data set) If I then ask you “what’s 2 + 3?” that is a novel problem, even if it’s been answered before.
How are you convinced that humans are reasoning creatures? This honestly sounds like you could be describing 99.99% of human thought, meaning we almost never reason (if not actually never). Are we even reasonable?
So every human that does not come up with something entirely new that has never existed before is not intelligent? Are people with an IQ of 80 not intelligent anymore, just bio-machines? Seriously, where do you draw the line? You keep shifting the goal to harder and harder to reach things that at this point most people would not fit anymore. When GPT-5 then also does that, what will you say? That it did not invent the car? Come up with relativistic effects?
I could have full conversations with CleverBot a decade ago, but nobody was calling that AI then or even now. People generally recognized it for what it was: a heuristic model chatbot. These LLMs are just overgrown chatbots that still lack the capability of understanding anything they say to you other than how certain words relate to one another.
I can write a program that just replies “yes” to everything you say and you can have a conversation with that. Is that program AI?
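Something like this, as a minimal sketch (Python and the name `yes_bot` are my assumptions, the comment names no language):

```python
def yes_bot(message: str) -> str:
    """Reply "yes" to anything; the incoming message is ignored entirely."""
    return "yes"


# A short scripted "conversation" with made-up messages.
for message in ["Are you there?", "Is this AI?", "Do you understand me?"]:
    print(f"> {message}")
    print(yes_bot(message))  # always prints: yes
```

It holds up a “conversation” in the loosest sense, which is the point of the question: being able to exchange turns says nothing about understanding.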
“AI isn’t really AI and no one ever thought that AI was actually AI so it doesn’t matter if we call it AI” is the funniest level of tech bro cope these days.
AI has been the name of the field for 70 years at this point, it isn’t something Sam Altman came up with as a marketing wheeze.
Three dudes in a university somewhere referring to chatbots as AI does not redefine the word, even if they did it 70 years ago. 99.999% of the population has always meant AGI by “AI”. Trying to pretend they were always something different is COPE.
Magic Eightball
We’re hitting logarithmic scaling with model training. GPT-5 is going to cost 10x more than GPT-4 to train, but are people going to pay $200 / month for the GPT-5 subscription?
But it would use less energy afterwards? At least that was claimed with the 4o model for example.
4o is also not really much better than 4, they likely just optimized it, among other things by reducing the model size. IME the “intelligence” has somewhat degraded over time. Also a bigger model (which in the past was the deciding factor for better intelligence) needs more energy, and GPT-5 will likely be much bigger than 4 unless they somehow make a breakthrough with the training/optimization of the model…
4o is optimization of the model evaluation phase. The loss of intelligence is due to the addition of more and more safeguards and constraints by the use of adjunct models doing fine-tuning, or just rules that limit whole classes of responses.
Businesses might pay big money for LLMs to do specific tasks. And if chip makers invest more in NPUs then maybe LLMs will become cheaper to train. But I am just speculating because I don’t have any special knowledge of this area whatsoever.