A Beginner’s Guide to Natural Language Processing (NLP)


Natural Language Processing, or simply NLP, is the field of Artificial Intelligence that focuses on processing text and speech so that machines can understand and work with human language. With the help of NLP, text that would otherwise be unusable can be turned into a source of information.

But before learning how NLP can turn unusable text into something informative, why do we even need to convert text? Of all the text available on the internet, only around 21% is formatted well enough to be directly useful. The rest does contain information, but it is present in scrambled form. That’s why NLP is used so widely these days.

Some companies are even using NLP to build impressive systems, e.g. IBM’s Project Debater, an AI system capable of debating with humans.

What Are The Different Techniques Used in NLP?

Syntactic and semantic analysis are the two main techniques of Natural Language Processing. Below is a brief introduction to how the two work.

1. Syntax

Syntax refers to the arrangement of words in a particular order so that they form a grammatical sentence. Syntactic analysis is the process of analyzing natural language against the rules of a formal grammar. Grammatical rules are applied to categories and groups of words, not to individual words. Syntactic analysis assigns a grammatical structure to text.

E.g. when forming a simple English sentence, we apply the general rule that it should have a subject (typically a noun), a verb, and an optional object.
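To make the idea concrete, here is a minimal sketch of such a rule in Python. The lexicon and the `is_grammatical` function are made up for this illustration, and the single pattern (noun, verb, optional noun) is a drastic simplification of real English grammar. Note that the rule looks only at word categories, never at the words themselves.

```python
# Toy lexicon mapping words to grammatical categories.
# (Hypothetical, hand-written for this illustration only.)
LEXICON = {
    "cows": "NOUN",
    "dogs": "NOUN",
    "grass": "NOUN",
    "eat": "VERB",
    "chase": "VERB",
}

def is_grammatical(sentence):
    """Return True if the sentence matches the pattern NOUN VERB [NOUN]."""
    words = sentence.lower().split()
    # Unknown words get the category None, which never matches the rule.
    categories = [LEXICON.get(word) for word in words]
    return categories in (["NOUN", "VERB"], ["NOUN", "VERB", "NOUN"])

print(is_grammatical("Cows eat grass"))  # True
print(is_grammatical("Eat cows grass"))  # False
```

A real syntactic analyzer uses a far richer grammar, but the principle of matching categories against rules is the same.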

2. Semantics

Semantic analysis checks whether a piece of text is meaningful. A noun and a verb together are not always enough to make sense. E.g. “cows flow supremely” is grammatically complete, as it has both a noun and a verb, but it doesn’t make any sense to the reader.

Semantic analysis therefore helps the computer produce something logical and meaningful.
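One very rough way to picture a semantic check is a table of which actions make sense for which subjects. The sketch below assumes a tiny hand-written table (`PLAUSIBLE_ACTIONS`) and a hypothetical helper `makes_sense`. Real systems learn this kind of knowledge from large amounts of text rather than from a fixed table.

```python
# Hypothetical table of actions each subject can plausibly perform.
PLAUSIBLE_ACTIONS = {
    "cows": {"eat", "walk", "sleep"},
    "rivers": {"flow"},
}

def makes_sense(subject, verb):
    """Return True if the verb is a plausible action for the subject."""
    return verb in PLAUSIBLE_ACTIONS.get(subject, set())

print(makes_sense("cows", "eat"))     # True
print(makes_sense("cows", "flow"))    # False
print(makes_sense("rivers", "flow"))  # True
```

Here “cows flow” passes a noun-plus-verb grammar check but fails the meaning check, which is exactly the gap semantic analysis is meant to close.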

[Figure: Syntax and semantics in Natural Language Processing]

A Brief Overview of How Natural Language Processing Works

Natural Language Processing is not a single-step process; it consists of several steps.

Step 1: Sentence Segmentation

Sentence segmentation means that the computer first breaks a paragraph into separate sentences. Consider the following paragraph:

  He knew what he was supposed to do. That had been apparent from the beginning. That was what made the choice so difficult. What he was supposed to do and what he would do were not the same. This would have been fine if he were willing to face the inevitable consequences, but he wasn’t.

In the above text, the computer will break the paragraph into sentences such as “He knew what he was supposed to do.”, “That had been apparent from the beginning.”, and so on.
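A minimal sketch of sentence segmentation, using only Python's built-in `re` module. The `split_sentences` function is a name chosen for this example, and the rule it uses (split after `.`, `!` or `?` followed by whitespace) is a simplifying assumption that would break on abbreviations such as “Dr.”.

```python
import re

def split_sentences(paragraph):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [part for part in parts if part]

paragraph = (
    "He knew what he was supposed to do. "
    "That had been apparent from the beginning. "
    "That was what made the choice so difficult. "
    "What he was supposed to do and what he would do were not the same. "
    "This would have been fine if he were willing to face the "
    "inevitable consequences, but he wasn't."
)

# Print one sentence per line.
for sentence in split_sentences(paragraph):
    print(sentence)
```

For the paragraph above this yields five sentences. Production libraries handle the tricky cases with trained models or much longer rule lists.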

Step 2: Tokenization

After sentence segmentation, the next step is tokenization. Tokenization means splitting a single sentence into its individual words, called tokens. Therefore “He knew what he was supposed to do” will get converted into “He”, “knew”, “what”, “he”, “was”, “supposed”, “to”, “do”.
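Tokenization can be sketched in a few lines with a regular expression. The `tokenize` function below is an illustrative helper, not a library API, and it simply drops punctuation. Real tokenizers usually keep punctuation as separate tokens and deal with many more edge cases.

```python
import re

def tokenize(sentence):
    """Return the words of a sentence, dropping punctuation."""
    # \w+ matches a run of letters, digits, or underscores;
    # the optional (?:'\w+)? keeps contractions such as "wasn't" together.
    return re.findall(r"\w+(?:'\w+)?", sentence)

print(tokenize("He knew what he was supposed to do."))
# ['He', 'knew', 'what', 'he', 'was', 'supposed', 'to', 'do']
```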

Step 3: Identifying part of speech

The next step is to identify the part of speech of each word. In the above example, “He” is a pronoun, “knew” is a verb, and so on.
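As a rough illustration, part-of-speech tagging can be pictured as a dictionary lookup. The `POS_LEXICON` table and the `tag` function below are hand-written assumptions that cover only our example sentence. Real taggers use statistical models that also look at the surrounding words, because many words can have more than one part of speech.

```python
# Coarse, hand-assigned tags for the words in the example sentence.
# (An assumption for this sketch, not the output of a real tagger.)
POS_LEXICON = {
    "he": "PRONOUN",
    "knew": "VERB",
    "what": "PRONOUN",
    "was": "VERB",
    "supposed": "VERB",
    "to": "PARTICLE",
    "do": "VERB",
}

def tag(tokens):
    """Pair each token with its tag, or 'UNKNOWN' if it is not in the lexicon."""
    return [(token, POS_LEXICON.get(token.lower(), "UNKNOWN")) for token in tokens]

tokens = ["He", "knew", "what", "he", "was", "supposed", "to", "do"]
for word, pos in tag(tokens):
    print(word, pos)
```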

Step 4: Dependency Parsing

The last step in this overview is dependency parsing. Dependency parsing is the process of identifying how the words in a sentence depend on one another.

The goal is to build a tree that assigns a single parent word to each word in the sentence. The root of the tree will be the main verb in the sentence. For our sentence, the root is “knew”, and “He” attaches to it as its subject.
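The sketch below shows one way to store and print such a tree in Python. The parent indices in `heads` are hand-annotated and represent one plausible analysis of the sentence, not the output of an actual parser, and the helper names are chosen for this example.

```python
tokens = ["He", "knew", "what", "he", "was", "supposed", "to", "do"]

# heads[i] is the index of the parent of tokens[i]; -1 marks the root.
# (Hand-annotated assumption: one plausible analysis of the sentence.)
heads = [1, -1, 7, 5, 5, 1, 7, 5]

def children_of(index):
    """Return the indices of all words whose parent is the given word."""
    return [i for i, head in enumerate(heads) if head == index]

def print_tree(index, depth=0):
    """Print the word at index, then its children indented beneath it."""
    print("  " * depth + tokens[index])
    for child in children_of(index):
        print_tree(child, depth + 1)

root = heads.index(-1)
print_tree(root)
```

Because every word stores exactly one parent, the structure is guaranteed to be a tree as long as there is a single root and no cycles. Here the root is the main verb “knew”, with “He” and “supposed” as its direct children.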

I hope you liked this article. If you have any questions, you can ask us anytime on our Instagram.

