Introduction
Machine Learning and Natural Language Processing are important subfields of Artificial Intelligence that have gained prominence in recent times. Both play a very important part in turning an artificial agent into an artificially ‘intelligent’ agent. Because of the advancement in Natural Language Processing, an Artificially Intelligent system can take in richer information from its environment and can act on that environment in a user-friendly manner.
Similarly, because of the adoption of Machine Learning techniques, an Artificially Intelligent system can process the information it receives and make better predictions to guide its actions.
Machine Learning gives a system the ability to learn from past experiences and examples. Traditional algorithms carry out a fixed set of steps exactly as they have been programmed, and they do not possess the ability to solve problems they were not written for. In the real world, most problems contain many unknown variables, which makes traditional algorithms much less effective. This is where machine learning comes to the fore. With the help of past examples, a machine learning algorithm is far better equipped to handle such unknown problems.
Spam mail detection is one of the classic examples. Deciding whether a mail is legitimate or spam involves many unknowns, and there are many ways in which spam filters can be evaded. For a traditional algorithm to work, every feature and variable has to be hardcoded, which is extremely difficult, if at all possible. A machine learning algorithm, on the other hand, can work in such an environment because of its ability to learn from examples and form a general rule.
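To make the idea of learning a general rule from past examples concrete, here is a minimal sketch of a Naive Bayes spam classifier written in plain Python. The six training mails, the label names, and the function names are all invented for illustration; a real filter would learn from a far larger set of mails and use richer features.

```python
import math
from collections import Counter

# Toy labelled examples, invented purely for illustration.
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("limited offer win cash now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the prior probability of the label.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING_DATA)
print(classify("claim your free cash prize", model))   # spam
print(classify("project meeting on monday", model))    # ham
```

Neither test mail appears in the training data. The classifier labels them from word statistics it has learned, not from rules written by hand.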
Deep Learning is a specialization of machine learning built on the Artificial Neural Network. In recent times, deep learning techniques have been widely adopted and have produced good results. The flexibility these techniques offer in deciding upon the architecture is one of the important reasons for their success. Deep learning techniques have been at the forefront of the machine learning techniques used for research in natural language processing.
Natural Language Processing, on the other hand, is the ability of a system to understand and process human languages. A computer system only understands the language of 0s and 1s; it does not understand human languages like English or Hindi. Natural Language Processing gives the computing system the ability to work with languages such as English or Hindi.
Natural Language Processing has seen large-scale adoption in recent times because of the level of user-friendliness it brings to the table. From choosing your music to controlling electronic appliances like air conditioners and ovens, and even ceiling fans and light bulbs, a great deal can now be done using your voice, which is what makes these electronic items ‘smart’. This is all possible because of Natural Language Processing.
Even as NLP has made it easier for users to interact with complex electronics, a lot of processing happens behind the scenes to make this interaction possible. Machine learning has played a very important role in this processing of the language.
Apart from playing a role in the processing of natural language itself, Machine Learning has played a very constructive role in the important applications of natural language processing as well. NLP applications like Sentiment Analysis, Chatbot Systems, Question Answering Systems, Information Retrieval Systems, Machine Translation, and Email Classification, among others, have all adopted machine learning techniques to work better.
This article was published as a part of the Data Science Blogathon.
Role of Machine Learning in Natural Language Processing
Processing natural language so that the machine can understand it involves many steps. These steps include Morphological Analysis, Syntactic Analysis, Semantic Analysis, Discourse Analysis, and Pragmatic Analysis. Generally, these analysis tasks are applied serially. Machine Learning acts as an important value addition in almost all these processes in some form or the other. Let us try to understand how.
1. Morphological Analysis
As already mentioned, the data received by the computing system is in the form of 0s and 1s. These 0s and 1s can be converted into characters using a character encoding such as ASCII. So, it can be said that a machine receives a bunch of characters when a sentence or a paragraph is provided to it. At the level of morphological analysis, the first task is to identify the words and the sentences. This identification is called tokenization. Many different Machine Learning and Deep Learning algorithms have been employed for tokenization, including the Support Vector Machine and the Recurrent Neural Network.
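As a rough illustration of what tokenization produces, here is a minimal rule-based sketch using Python's standard `re` module. It is not one of the learned tokenizers mentioned above. The function names and the splitting rules are assumptions made for this example, and they will mishandle cases such as abbreviations ("Dr. Rana").

```python
import re


def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tokenize_words(sentence):
    """Return words (keeping simple contractions together) and punctuation marks."""
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sentence)


text = "Machines read characters. Tokenization finds words!"
for sentence in split_sentences(text):
    print(tokenize_words(sentence))
# ['Machines', 'read', 'characters', '.']
# ['Tokenization', 'finds', 'words', '!']
```

Learned tokenizers become useful where fixed rules like these break down, for example in languages that do not separate words with spaces.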
Once the tokenization is complete, the machine has with it a bunch of words and sentences. Many of these words contain affixes. These affixes complicate matters for the machine, as having a word-meaning dictionary containing every word with all of its possible affixes is almost impossible. So, the next task at the morphological analysis level is removing these affixes, which can be done using either stemming or lemmatization. Machine Learning algorithms like the random forest and the decision tree have been quite successful in performing the task of stemming.
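The idea of stripping affixes can be sketched with a very crude suffix remover. This is a hand-written rule list, not a learned stemmer and not an established algorithm such as Porter's. The suffix list and the `min_stem_length` parameter are assumptions chosen for illustration.

```python
# Suffixes are checked in this order; longer ones first so "ing" wins over "s".
SUFFIXES = ("ing", "ed", "es", "s")


def strip_suffix(word, min_stem_length=3):
    """Remove the first matching suffix if enough of the word remains."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_stem_length:
            return lowered[: -len(suffix)]
    return lowered


for word in ["walking", "walked", "walks", "boxes", "running"]:
    print(word, "->", strip_suffix(word))
# walking -> walk
# walked -> walk
# walks -> walk
# boxes -> box
# running -> runn
```

The last line shows the weakness of fixed rules: "running" becomes "runn", not "run". Lemmatization and learned approaches aim to avoid errors of this kind.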
2. Syntactic Analysis
The next task in natural language processing is to check whether the given sentence follows the grammar rules of the language. To do this, the words are first tagged with their part of speech. This helps the syntactic parsers in checking the grammar rules. Machine learning and deep learning algorithms like the random forest and the recurrent neural network have been successfully implemented for this task. Machine learning algorithms like K-nearest neighbor have been used for implementing syntactic parsers as well.
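A small sketch can show how a part-of-speech tagger learns from examples. The code below is a most-frequent-tag baseline. It is far simpler than the random forest or recurrent neural network taggers mentioned above. The three tagged sentences, the tag names, and the default tag for unknown words are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# A tiny hand-tagged corpus, invented for illustration.
TAGGED_SENTENCES = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sees", "VERB"), ("the", "DET"), ("dog", "NOUN")],
]


def train_tagger(tagged_sentences):
    """Remember the most frequent tag seen for each word."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(words, model, default_tag="NOUN"):
    """Tag each word, falling back to default_tag for unseen words."""
    return [(word, model.get(word.lower(), default_tag)) for word in words]


model = train_tagger(TAGGED_SENTENCES)
print(tag(["The", "dog", "sleeps"], model))
# [('The', 'DET'), ('dog', 'NOUN'), ('sleeps', 'VERB')]
```

A baseline like this ignores context entirely. Taggers that do well on ambiguous words also look at the neighbouring words.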
3. Semantic Analysis
At this level, the word meanings are identified using word-meaning dictionaries. The problem encountered here is that the same word might have different meanings according to the context of the sentence. For example, the word ‘Bank’ might mean a Blood Bank, a Financial Bank, or even a River Bank / Shore. This creates ambiguity. Removing this ambiguity is one of the important tasks at this level of natural language processing, and it is called Word Sense Disambiguation.
Word sense disambiguation is one of the classical classification problems, and it has been researched with different levels of success. Machine learning algorithms like the random forest, gradient boosting, and decision trees have been successfully employed. In recent times, however, it is deep learning algorithms like the recurrent neural network, its long short term memory and gated recurrent unit variants, and the convolution neural network that have been researched and have produced very good results.
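The ‘Bank’ example can be sketched with a simplified dictionary-overlap approach in the spirit of the Lesk algorithm. This is a different and much simpler technique than the learned classifiers described above. The three glosses, the stopword list, and the sense names are hand-written assumptions for this example, not entries from a real dictionary.

```python
import re

# Hand-written glosses for three senses of "bank" (illustrative only).
SENSE_GLOSSES = {
    "financial_bank": "an institution that accepts deposits of money and gives loans",
    "river_bank": "the sloping land beside a river or stream of water",
    "blood_bank": "a place where donated blood is stored for patients",
}

STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "is", "for",
             "to", "in", "on", "at", "i", "we", "where", "beside"}


def content_words(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(sentence, glosses=SENSE_GLOSSES):
    """Pick the sense whose gloss shares the most words with the sentence.

    Ties are resolved in favour of the sense listed first.
    """
    context = content_words(sentence)
    overlaps = {sense: len(context & content_words(gloss))
                for sense, gloss in glosses.items()}
    return max(overlaps, key=overlaps.get)


print(disambiguate("I went to the bank to deposit money."))    # financial_bank
print(disambiguate("We sat on the bank of the river."))        # river_bank
print(disambiguate("The hospital bank stored donated blood.")) # blood_bank
```

Exact word overlap is brittle: "deposit" does not match "deposits" here. That brittleness is one reason learned models that capture context tend to do better.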
4. Discourse Analysis
There are instances where pronouns are used, or where certain subjects/objects are referred to, that lie outside the current purview of the analysis. In such cases, semantic analysis alone will not be able to give proper meaning to the sentence. This is another classical problem, known as reference resolution, which has been tackled with machine learning and deep learning algorithms.
5. Pragmatic Analysis
Many a time, sentences convey a deeper meaning than what the words can describe. That is, the machine has to set aside the word meaning understood after semantic analysis and capture the intended or implied meaning. This is easier said than done. For many years now, this area of natural language processing has intrigued researchers. One of the classic examples of pragmatic analysis is sarcasm detection.
Many, in fact almost all, of the different machine learning and deep learning algorithms have been employed with varied success for performing sarcasm detection, or for performing pragmatic analysis in general.
Role of Machine Learning in the Applications of Natural Language Processing
As with the processing tasks of natural language, machine learning and deep learning algorithms have played a very important role in almost all of the applications of natural language processing. In recent times there has been a renewed research interest in these fields because of the ease with which machine learning and deep learning algorithms can be implemented, and this is especially true for deep learning techniques.
Hence, almost all the deep learning techniques, including the Deep Neural Network, Autoencoders, the Restricted Boltzmann Machine, the Recurrent Neural Network, and the Convolution Neural Network, have been experimented with to get good accuracy in the different applications of Natural Language Processing.
The Recurrent Neural Network, with its variants the Long Short Term Memory and the Gated Recurrent Unit, and the Convolution Neural Network, along with its variants the Recurrent Convolution Neural Network and the Regional Convolution Neural Network, have all been extensively researched to produce good results for these applications. Let us have a look at some of these applications of Natural Language Processing where the deep learning techniques have had a very positive role to play.
1. Sentiment Analysis
Sentiment Analysis strives to analyze user opinions or sentiments about a certain product. It has become a very important part of Customer Relationship Management, since even a single negative opinion can hurt a product. Recent times have seen greater use of deep learning techniques for sentiment analysis. An interesting fact to note here is that new deep learning techniques have been devised especially for the analysis of sentiments. That is the level of research being conducted on sentiment analysis using deep learning.
2. Chatbot Systems
Chatbot systems are conversational agents or dialog systems that try to engage the user in a conversation. This conversation can be through voice or text. Personal assistants like Amazon’s Alexa and Google Assistant have popularised chatbot systems and have also showcased the ease with which user interaction can be carried out.
As easy as it may sound, the development of a true chatbot system that can replace a human agent is an extremely difficult task, one which requires both Natural Language Understanding and Natural Language Generation.
Recent frameworks like Google’s DialogFlow, IBM’s Watson AI, and Amazon’s Alexa AI provide an easy way of developing a chatbot system, and all these frameworks employ complex and proprietary deep learning architectures.
3. Question Answering Systems
As the name suggests, a question answering system is a system that tries to answer a user’s questions. Recent times have seen the thin line separating a dialog system and a question answering system getting blurred. Most of the time a chatbot system performs the question answering task, and the same is true the other way round as well. So, research works which set out to develop a chatbot system will, in all probability, be developing a question answering system within it as well.
A question answering system has three important components: Question Processing, Information Retrieval, and Answer Processing. Machine Learning and Deep Learning techniques have played a crucial role in all three components. Question Processing, especially, has attracted quite a lot of research. The idea here is that understanding the question is extremely important for better answer retrieval. The question processing task is treated as a classification problem, and many research works have experimented with deep learning techniques for better question classification.
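To show what treating question processing as a classification problem means in practice, here is a minimal rule-based sketch that maps a question to an expected answer type. It stands in for the deep learning classifiers mentioned above and is far cruder than them. The cue phrases and the answer-type labels are assumptions chosen for illustration.

```python
# Cue phrase at the start of a question -> expected answer type (illustrative).
ANSWER_TYPES = {
    "who": "PERSON",
    "where": "LOCATION",
    "when": "TIME",
    "how many": "NUMBER",
}


def classify_question(question):
    """Return the answer type for the first cue phrase the question starts with."""
    normalized = question.lower().strip()
    for cue, answer_type in ANSWER_TYPES.items():
        if normalized.startswith(cue + " "):
            return answer_type
    return "OTHER"


print(classify_question("Who wrote this article?"))        # PERSON
print(classify_question("How many steps are involved?"))   # NUMBER
print(classify_question("Explain tokenization."))          # OTHER
```

Knowing the expected answer type lets the later retrieval and answer processing components narrow down what they look for. A learned classifier replaces the fixed cue table with patterns inferred from labelled questions.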
4. Information Retrieval Systems
Information Retrieval is another important application of Natural Language Processing that tries to retrieve relevant information. Information retrieval systems act as the backbone of systems like chatbot systems and question answering systems.
The most basic way of retrieving information is the frequency method, where the frequency of keywords determines whether a particular piece of data is retrieved or not. Smart systems, however, process the query as well as the large body of available data to retrieve only the relevant information. This process is carried out using deep learning techniques.
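The frequency method described above can be sketched in a few lines of Python. The three documents, their identifiers, and the `top_k` parameter are invented for illustration. The scoring is a plain count of query keywords and does no weighting of any kind.

```python
from collections import Counter

# A tiny document collection, invented for illustration.
DOCUMENTS = {
    "doc_translation": "machine translation converts text from one language to another language",
    "doc_chatbot": "a chatbot holds a conversation with the user through text or voice",
    "doc_sentiment": "sentiment analysis studies user opinions about a product",
}


def keyword_score(query, document):
    """Sum how many times each query keyword occurs in the document."""
    term_counts = Counter(document.lower().split())
    return sum(term_counts[term] for term in query.lower().split())


def retrieve(query, documents, top_k=2):
    """Return up to top_k document ids with a non-zero score, best first."""
    scored = [(keyword_score(query, text), doc_id) for doc_id, text in documents.items()]
    matches = [item for item in scored if item[0] > 0]
    # Sort by descending score, then by id so that ties are deterministic.
    matches.sort(key=lambda item: (-item[0], item[1]))
    return [doc_id for _, doc_id in matches[:top_k]]


print(retrieve("language translation", DOCUMENTS))  # ['doc_translation']
print(retrieve("user text", DOCUMENTS))             # ['doc_chatbot', 'doc_sentiment']
```

Counting keywords cannot tell that "opinions" and "reviews" are related. Closing that gap is what the smarter, learned retrieval systems try to do.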
5. Machine Translation
A machine translation system strives to translate a text from one language to another with minimal or no human intervention. Applications like Google Translate are among the best examples of a machine translation system.
Having a translation system that translates word for word is not enough, as the construction of a sentence might vary from one language to another. For example, English follows the Subject-Verb-Object format whereas Hindi follows the Subject-Object-Verb format for sentence construction. Apart from this, there are many different rules which need to be followed. All these things make the task of machine translation difficult.
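The word-order difference can be illustrated with a tiny sketch that rearranges the roles of a clause. It does not translate any words. The clause, the role names, and the `reorder` function are assumptions made for this example. A real translation system has to work out these roles itself, and that is a large part of the difficulty.

```python
SVO = ("subject", "verb", "object")   # English-style order
SOV = ("subject", "object", "verb")   # Hindi-style order


def reorder(clause, target_order):
    """Arrange the parts of a clause according to the target word order."""
    return [clause[role] for role in target_order]


clause = {"subject": "Ram", "verb": "eats", "object": "a mango"}

print(" ".join(reorder(clause, SVO)))  # Ram eats a mango
print(" ".join(reorder(clause, SOV)))  # Ram a mango eats
```

A word-for-word translation keeps the first ordering. The second ordering shows where each translated word would need to sit in a Subject-Object-Verb sentence.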
The Recurrent Neural Network deep learning technique, along with its variants the Long Short Term Memory and the Gated Recurrent Unit, and their bi-directional forms, has been extensively experimented with for better machine translation. The reason for this is the ability of these neural networks to hold on to contextual information, which is crucial for proper translation. Even Convolution Neural Networks have been experimented with, with varied success.
So, it can be observed that Machine Learning and Deep Learning techniques are being extensively researched for their employment in the field of Natural Language Processing. These learning techniques are playing an important role in almost all of the natural language processing tasks as well as in almost all the applications of natural language processing.
All the different natural language processing tasks and the different applications of natural language processing are fields of research by themselves. Currently, in all these fields, Machine Learning and Deep Learning techniques are being researched extensively and with considerable success. In conclusion, it can be said that Machine Learning and Deep Learning techniques have been playing a very positive role in Natural Language Processing and its applications.