Ben Vigoda on the New Era of NLP

16 June 2025


Many apps and programs claim to be able to understand you and are at least capable of engaging in superficial interactions. Spend long enough talking to one of these programs, however, and you’ll no doubt see the hallmarks of imperfectly reproduced natural language. That technology has not yet caught up to the complexities of natural language is no surprise.

According to Gamalon founder Ben Vigoda, a fifteen-word sentence can express something like ten billion ideas. In his talk at ODSC East 2018, Vigoda argues that a new era of NLP needs better tools for accurately processing free-form text: businesses already receive far more unstructured language data than they can process, and the amount is only going to keep growing. As Vigoda puts it, “we need an interface between us and that data volume of messages…no way for institutions to handle all their incoming natural language data.”

Beyond Deep Learning

Current training models rely on Pavlovian conditioning, but Vigoda notes that this kind of training is not analogous to human language-learning: “You can’t just teach a system complex ideas by giving it stimulus, showing it a response, and training it to give that response.”

Deep-learning systems trained this way are capable of only a shallow level of interpretation. Even when the interface requires nothing more than recognizing a few keywords, current models typically rely on time-consuming, labor-intensive training processes, and the end result is, per Vigoda, never more than 65% accurate and cannot respond to questions outside a limited number of prompts. The typical idea trees used by voice recognition systems are easily confused and are prone to repeat themselves or return to menus when confronted with an unfamiliar idea.
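To make the failure mode concrete, here is a minimal Python sketch of a keyword-driven responder. It is not taken from Vigoda's talk or from any real product: the keywords, the responses, and the `reply` function are all invented for illustration. It only shows why a system that matches a few keywords falls back to its menu as soon as a request is phrased in an unfamiliar way.

```python
import re

# Hypothetical menu and keyword table, invented for this illustration.
MENU = "Main menu: say 'billing', 'hours', or 'agent'."
RESPONSES = {
    "billing": "Your balance is listed under Account > Billing.",
    "hours": "We are open 9am to 5pm, Monday to Friday.",
    "agent": "Connecting you to an agent.",
}


def reply(utterance: str) -> str:
    """Return the canned response for the first known keyword, else the menu."""
    # Lowercase the input and keep only alphabetic tokens.
    words = re.findall(r"[a-z']+", utterance.lower())
    for keyword, response in RESPONSES.items():
        if keyword in words:
            return response
    # No keyword recognized: the system can only repeat its menu.
    return MENU


if __name__ == "__main__":
    print(reply("What are your hours?"))
    print(reply("I think I was charged twice last month"))
```

The second request is plainly about billing, but because the word "billing" never appears, the sketch returns the menu, which is the looping behavior described above.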

A Model for the New Era of NLP

In creating his product at Gamalon, Vigoda decided to move away from the neural networks that had dominated the sphere of NLP. Instead, he and his team began experimenting with a different approach to machine learning. Their system uses Occam’s razor to demystify unstructured data and needs only a few training models built by their team of data scientists. The results have been revelatory: “One person teaches the model, not thousands; one processor, not a server farm; a handful of training examples per object instead of thousands, hours to train instead of days or months.” Essentially, the system “backsolves” from the unstructured data and fits it to the given model.
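The article does not describe how Gamalon's system applies Occam's razor, so the following Python sketch should be read as a generic toy, not as their method. It assumes a uniform prior over three hand-written candidate "models" (sets of numbers, all hypothetical) and scores each with the so-called size principle: a model that allows fewer possibilities gives each observed example a higher probability. The point is only to show how a handful of examples can be enough to favor the simplest model that still explains the data.

```python
from fractions import Fraction

# Hypothetical candidate models: each is just the set of values it allows.
HYPOTHESES = {
    "even numbers": {n for n in range(1, 101) if n % 2 == 0},
    "powers of two": {2, 4, 8, 16, 32, 64},
    "any number 1-100": set(range(1, 101)),
}


def posterior(examples):
    """Posterior over HYPOTHESES under a uniform prior (an assumption)."""
    scores = {}
    for name, members in HYPOTHESES.items():
        if all(x in members for x in examples):
            # Each example has probability 1/|model|, so tighter models
            # that still fit the data score higher.
            scores[name] = Fraction(1, len(members)) ** len(examples)
        else:
            # A model that cannot produce an example is ruled out.
            scores[name] = Fraction(0)
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no hypothesis explains the examples")
    return {name: score / total for name, score in scores.items()}


def best_hypothesis(examples):
    """Name of the hypothesis with the highest posterior probability."""
    post = posterior(examples)
    return max(post, key=post.get)


if __name__ == "__main__":
    print(best_hypothesis([2, 8, 64]))
    print(best_hypothesis([2, 8, 10]))
```

With the three examples 2, 8, and 64, every model fits, but the smallest one ("powers of two") wins; adding 10 rules that model out and the next-simplest consistent model takes over.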

Conclusion

Going forward, Vigoda hopes that language processing models will work collaboratively, sharing their idea trees with one another to build more accurate and realistic models for language processing. The billions of possible permutations available to humans are too much for any one company or team to account for in a single model, but massive amounts of unstructured data will make automation of language-processing tasks a necessity.
