Is this the death of all world-changing AI applications?

Johannes Stiehler
Random Rant

Since the earliest attempts to use machine learning on a large scale for truly critical tasks, such applications have repeatedly hit the same wall. The very thing that makes machine learning so magical also sometimes makes it useless: the outpourings of artificial intelligence are too often inexplicable in both the narrower and wider senses.

In saying this, I don't mean that we normal earthlings are too stupid to understand the complex data structures and algorithms behind the current AI hype (ChatGPT, LaMDA, Bard, and the Bing AI thing).
The problem is different, and to some extent inevitable: even the experts who wrote these applications can't understand why a particular input produces the respective output.

This is due to the nature of the algorithms involved: In classical, or symbolic, artificial intelligence, one relies on knowledge representations and formulas to draw comprehensible conclusions from that knowledge. In other words, the human programmer specifies the rules whereby a certain input leads to an output. This has the advantage of absolute traceability and consistency, but unfortunately it is not powerful enough for many types of problems - while at the same time requiring very high development effort.
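To make the traceability point concrete, here is a minimal sketch of a rule-based classifier in Python. The rules, their names and the `classify` helper are all hypothetical and only meant as an illustration; a real symbolic system would need far more rules, which is exactly the development effort mentioned above.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    name: str                       # human-readable label, doubles as the explanation
    applies: Callable[[str], bool]  # hand-written condition on the input text
    label: str                      # output if the condition holds


# Hypothetical, hand-written rules (assumption: not taken from any real system).
RULES = [
    Rule("mentions_contract", lambda text: "contract" in text.lower(), "relevant"),
    Rule("mentions_invoice", lambda text: "invoice" in text.lower(), "relevant"),
]


def classify(text: str) -> tuple[str, str]:
    """Return (label, name of the rule that produced it)."""
    for rule in RULES:
        if rule.applies(text):
            return rule.label, rule.name
    # No rule fired: the decision is still fully traceable.
    return "not relevant", "no_rule_matched"


print(classify("Please sign the attached CONTRACT by Friday."))
print(classify("Lunch at noon?"))
```

Every output comes with the name of the rule that caused it, so the same input always yields the same, explainable result. The price is that anything the programmer did not anticipate simply falls through to the default.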

For non-symbolic problems, such as image recognition and audio analysis, such methods are completely unsuitable. This is where machine learning can help: using training data, i.e., "labeled" samples (e.g., pre-categorized images), the AI derives "rules" by itself. Different methods of non-symbolic AI differ in the nature of the "derivation engine".
Machine learning with a large number of parameters (and correspondingly ginormous amounts of training data) produces models that are no longer predictable or explainable. This, of course, is what makes ChatGPT, for instance, so appealing: the inexplicable humanity of its output, the amount of information it can draw on - until it starts lying or even arguing and you can't do anything about it.

People also lie, deceive, argue. I can't even count how many times I've quoted a fellow human being, only to find out later that I had been fed utter rubbish.
This is damaging to my personal reputation, and I have increasingly taken to checking information I receive from third parties several times - e.g. through Internet research - before passing it on. But what if I now have to rely on a chatbot with a dubious sense of truth instead of a search engine to do so?

Certain professionals are expected to be highly reliable and truthful: Lawyers, teachers, scientists face this demand more than others, even if they often fail to meet it. But what result do we expect if we now want to rely on artificial intelligence with dubious factual knowledge, especially in the legal, educational and research domains?
In a previous professional life, I was CEO of a company that was trying to gain a foothold in the "eDiscovery" or "Legal AI" space in the US. I can clearly remember the discussions around the explainability and reliability of our software.

And these kinds of discussions were going on long before deep learning: the simplest application in eDiscovery is a kind of "hot-or-not classifier" whose only task is to decide whether a document could be relevant to a case or not. When up to 20 million company emails are potentially relevant, such machine classification is critical to whether the case can even proceed to trial. But what error rate is acceptable? Is it OK if the classifier only finds 80% of the essential documents? Would a lawyer have found more in this huge haystack? What if an additional 5 million documents are considered relevant when they are not? Who is to blame for the many billed hours of the lawyers who now have to read these documents?
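The questions above boil down to two standard measures: recall (how many of the essential documents were found) and precision (how many of the flagged documents are actually essential). Here is a rough sketch of the arithmetic in Python. The 80% and the 5 million come from the scenario above; the figure of one million truly relevant documents is purely an assumption for illustration, as is the helper name `review_metrics`.

```python
def review_metrics(truly_relevant: int, recall: float, false_positives: int) -> dict:
    """Back-of-the-envelope numbers for a relevance classifier."""
    true_positives = round(truly_relevant * recall)  # essential documents found
    missed = truly_relevant - true_positives         # essential documents overlooked
    flagged = true_positives + false_positives       # everything lawyers must read
    precision = true_positives / flagged if flagged else 0.0
    return {
        "true_positives": true_positives,
        "missed": missed,
        "flagged": flagged,
        "precision": precision,
    }


# Assumption: 1,000,000 truly relevant emails (hypothetical number).
metrics = review_metrics(truly_relevant=1_000_000, recall=0.8, false_positives=5_000_000)
print(f"Documents to review: {metrics['flagged']:,}")
print(f"Essential documents missed: {metrics['missed']:,}")
print(f"Precision: {metrics['precision']:.1%}")
```

Under these assumed numbers, lawyers would have to read 5.8 million documents, of which only about 14% matter, while 200,000 essential documents are never seen at all.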

And this involved algorithms whose errors and omissions were at least partially explicable, e.g., by pointing to specific example documents from the training phase that had led to misclassification.

This ceases completely with Large Language Models, the current show horse of the deep learning hype. The model cannot tell where it is getting a piece of information from. And the best a human can do is guess what input data was mined for a particular output, and even that assumes they know the training data very well, in which case they could just as well replace the AI.

So let's keep in mind: Deep Learning and especially Large Language Models are unpredictable, and their outputs are not comprehensible; they are proverbial black boxes. These technologies therefore cannot be used to produce software for use cases that require reliability, indication of sources, or full automation (no more humans in the loop). With this in mind, I am also convinced that the classic search engine is far from dead. The bias and falsehoods in the indexed documents are quite enough for me, even without yet another dialog engine adding its unpredictable two cents. The current attempts to marry e.g. GPT with Bing or Bard with the Google search index do weave source information and search results into the conversation, but this does not solve the original problem - as becomes painfully obvious after a few sample queries.

For detailed background on Large Language Models and their features, see our YouTube video on the topic.

Johannes Stiehler
Co-Founder, NEOMO GmbH
Johannes has spent his entire professional career working on software solutions that process, enrich and surface textual information.

