A blog for computer science enthusiasts.

Sunday 3 March 2019

How to work with NLTK?

Hello Friends,

In the previous post, we saw how to install NLTK. In this post, we start with how to work with NLTK.

As we know, NLTK is a Python library for working with NLP. So, first of all, what is NLP?

NLP - Natural Language Processing - is about building applications that can understand human languages.
Some examples:

  • Amazon Alexa
  • Google Home Mini, etc.
Some of its applications are speech recognition, speech translation, understanding sentences, finding synonyms, etc.

Where is NLP used?
  • Search engines - search engines such as Google and Yahoo use NLP, for example to answer a query that we speak aloud.
  • Social website feeds - social websites such as Facebook use NLP in their news feeds.
  • Spam filters - Google uses NLP to filter spam emails.
These are all examples of NLP.


So, now let's start working with NLTK.
Let's tokenize sentences.
Sentence tokenization means that we have a paragraph and we split it into sentences. This process is known as sentence tokenization.

Example,

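# sent_tokenize needs the Punkt tokenizer data. If it is missing, download it
# once with nltk.download('punkt') (newer NLTK releases ask for 'punkt_tab').
from nltk.tokenize import sent_tokenize

sentence = "This is an example of NLTK. Let's start with sentences tokenizer"
print(sent_tokenize(sentence))  # prints a list with one string per sentence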

Output of the above code:

['This is an example of NLTK.', "Let's start with sentences tokenizer"]

Run the above code and it will return a list of sentences.
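To get a feel for what a sentence tokenizer does, here is a rough sketch that uses only Python's built-in re module. This is not how sent_tokenize works internally (NLTK relies on the trained Punkt model, which also copes with abbreviations), and naive_sent_tokenize is just a hypothetical helper name used for illustration.

```python
import re

def naive_sent_tokenize(text):
    # Hypothetical helper, NOT NLTK's algorithm: split wherever '.', '!' or '?'
    # is followed by whitespace. The lookbehind keeps the punctuation mark
    # attached to the sentence that it ends.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop empty strings (for example when the input is empty).
    return [part for part in parts if part]

paragraph = "This is an example of NLTK. Let's start with sentences tokenizer"
print(naive_sent_tokenize(paragraph))
```

For this simple paragraph the sketch gives the same two sentences as sent_tokenize. It will break on text like "Dr. Smith arrived.", because it treats every full stop as the end of a sentence, and that is the kind of case where NLTK is the better choice.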

Now, let's see an example of word_tokenize. Just as sent_tokenize returns a list of sentences from a paragraph, word_tokenize returns a list of words from sentences.
It splits the text into words, and punctuation marks come back as separate tokens.

Example,
from nltk.tokenize import word_tokenize

sentence = "This is an example of NLTK. Let's start with sentences tokenizer"
print(word_tokenize(sentence))  # prints a list of word and punctuation tokens

Output of the above code:

['This', 'is', 'an', 'example', 'of', 'NLTK', '.', 'Let', "'s", 'start', 'with', 'sentences', 'tokenizer']
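Notice that the full stop and the "'s" come out as their own tokens. A rough way to imitate this behaviour with only Python's re module is sketched below. It is an approximation for illustration, not NLTK's real tokenizer, and naive_word_tokenize is a hypothetical name.

```python
import re

def naive_word_tokenize(text):
    # Hypothetical helper, NOT NLTK's algorithm. The pattern tries, in order:
    #   '\w+     an apostrophe followed by letters, such as 's
    #   \w+      a plain run of letters, digits or underscores
    #   [^\w\s]  any single punctuation character
    return re.findall(r"'\w+|\w+|[^\w\s]", text)

sentence = "This is an example of NLTK. Let's start with sentences tokenizer"
print(naive_word_tokenize(sentence))
```

On this sentence the sketch produces the same tokens as the output shown above. It will differ from word_tokenize on other inputs, for example "don't", which NLTK splits into 'do' and "n't".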

So, start with NLTK and try to tokenize sentences and words. In the next post, we will learn more about NLTK.

Happy Coding... :)
