Overview of Artificial Intelligence and Role of Natural Language Processing in Big Data
Artificial Intelligence Overview
AI stands for ‘Artificial Intelligence’, which means making machines capable of performing tasks quickly, the way human beings do. AI performs automated tasks using intelligence.
The term Artificial Intelligence has two key components:
- Automation
- Intelligence
Goals & Applications of Artificial Intelligence
Evolution of Artificial Intelligence
Machine Learning
It is a set of algorithms used by intelligent systems to learn from experience.
Machine Intelligence
It is a more advanced set of algorithms used by machines to learn from experience, e.g. Deep Neural Networks. Artificial Intelligence technology is currently at this stage.
Machine Consciousness
It is self-learning from experience without the need for external data.
3 Types of Artificial Intelligence
Artificial Narrow Intelligence (ANI)
It comprises basic, narrowly defined tasks such as those performed by chatbots and personal assistants like Siri by Apple and Alexa by Amazon.
Artificial General Intelligence (AGI)
Artificial General Intelligence comprises human-level tasks and involves continual learning by the machines. Systems such as self-driving cars by Uber and Autopilot by Tesla are often cited as steps in this direction, although such systems are still task-specific.
Artificial Super Intelligence (ASI)
Artificial Super Intelligence refers to intelligence far smarter than humans.
What Makes System AI Enabled
Difference Between AI, NLP, ML, DL & Neural Networks
Artificial Intelligence (AI)
Building systems that can do intelligent things.
Natural Language Processing (NLP)
Building systems that can understand language. It is a subset of Artificial Intelligence.
Machine Learning (ML)
Building systems that can learn from experience. It is also a subset of Artificial Intelligence.
Neural Network (NN)
A biologically inspired network of artificial neurons.
Deep Learning (DL)
Building systems that use Deep Neural Networks on a large set of data. It is a subset of Machine Learning.
What is Natural Language Processing (NLP)?
Natural Language Processing (NLP) is the “ability of machines to understand and interpret human language the way it is written or spoken.”
The objective of NLP is to make computers/machines as intelligent as human beings in understanding language.
The ultimate goal of NLP is to fill the gap between how people communicate (natural language) and what the computer understands (machine language).
There are three different levels of linguistic analysis done before performing NLP:
- Syntax - What part of the given text is grammatically correct?
- Semantics - What is the meaning of the given text?
- Pragmatics - What is the purpose of the text?
NLP deals with different aspects of language such as:
- Phonology - It is the systematic organization of sounds in language.
- Morphology - It is the study of word formation and the relationship of words with each other.
Approaches of NLP for semantic analysis
- Distributional - It employs large-scale statistical tactics of Machine Learning and Deep Learning.
- Frame-Based - Sentences that are syntactically different but semantically the same are represented inside a data structure (frame) for the stereotyped situation.
- Theoretical - This approach builds on the idea that sentences refer to the real world (the sky is blue) and that parts of a sentence can be combined to represent the whole meaning.
- Interactive Learning - It involves a pragmatic approach in which the user is responsible for teaching the computer the language step by step in an interactive learning environment.
The real success of NLP lies in the fact that humans are deceived into believing that they are talking to humans instead of computers.
Importance of Natural Language Processing (NLP)
With NLP, it is possible to perform certain tasks like Automated Speech and Automated Text Writing in less time.
Given the large amount of text data around us, why not use the computer's untiring willingness and ability to run several algorithms to perform tasks in no time?
These tasks include other NLP applications like Automatic Summarization (generating a summary of a given text) and Machine Translation (translating one language into another).
Process of Natural Language Processing
If the input is speech, speech-to-text conversion is performed first.
The mechanism of Natural Language Processing involves two processes:
- Natural Language Understanding
- Natural Language Generation
Natural Language Understanding
NLU or Natural Language Understanding tries to understand the meaning of the given text. The nature and structure of each word inside the text must be known for NLU. To understand structure, NLU attempts to resolve the following ambiguities present in natural language:
- Lexical Ambiguity - Words have multiple meanings.
- Syntactic Ambiguity - A sentence has multiple parse trees.
- Semantic Ambiguity - A sentence has multiple meanings.
- Anaphoric Ambiguity - A phrase or word that was previously mentioned but has a different meaning.
Next, the sense of each word is understood by using lexicons (vocabulary) and a set of grammatical rules.
However, some different words have similar meanings (synonyms), and some words have more than one meaning (polysemy).
Natural Language Generation
It is the process of automatically producing text from structured data in a readable format with meaningful phrases and sentences. Natural language generation is a hard problem. It is a subset of NLP.
Natural language generation is divided into three proposed stages:
- Text Planning - The primary content in the structured data is ordered.
- Sentence Planning - Sentences are combined from the structured data to represent the flow of information.
- Realization - Grammatically correct sentences are finally produced to represent the text.
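The three stages above can be sketched as a small pipeline. This is only a minimal illustration under stated assumptions: the weather records, the `priority` field, and the sentence template are hypothetical and are not taken from any real NLG system.

```python
# Hypothetical structured data (an assumption for illustration only).
records = [
    {"city": "Paris", "temp_c": 21, "priority": 2},
    {"city": "Oslo", "temp_c": 9, "priority": 1},
]

def text_planning(data):
    """Stage 1: decide the order in which the content is presented."""
    return sorted(data, key=lambda r: r["priority"])

def sentence_planning(plan):
    """Stage 2: turn the ordered content into sentence-level specifications."""
    return [{"subject": r["city"], "value": r["temp_c"]} for r in plan]

def realization(specs):
    """Stage 3: produce grammatical sentences from the specifications."""
    return " ".join(
        f"The temperature in {s['subject']} is {s['value']} degrees Celsius."
        for s in specs
    )

def generate(data):
    """Run the three stages in sequence."""
    return realization(sentence_planning(text_planning(data)))

print(generate(records))
```

Real systems replace each stage with far richer logic, but the separation between ordering content, planning sentences, and producing surface text is the same.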
Difference Between NLP and Text Mining
Natural language processing is responsible for understanding the meaning and structure of a given text.
Text Mining or Text Analytics is the process of extracting hidden information inside text data through pattern recognition.
Natural language processing is used to understand the meaning (semantics) of given text data, while text mining is used to understand the structure (syntax) of given text data.
As an example: I found my wallet near the bank. The task of NLP is to figure out whether ‘bank’ refers to a financial institution or a ‘river bank.’
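One simple way to approach the ‘bank’ example is to compare the words around the ambiguous word with words typical of each sense, in the spirit of the Lesk algorithm. The sketch below is a heavily simplified version of that idea; the sense names and the hand-written signature word sets are hypothetical.

```python
import re

# Hypothetical, hand-written "signature" words for two senses of "bank".
SENSES = {
    "financial institution": {"money", "deposit", "deposited", "loan", "account", "cash"},
    "river bank": {"river", "water", "fishing", "shore", "boat", "stream"},
}

def disambiguate(sentence):
    """Return the sense whose signature overlaps most with the sentence,
    or None when the context gives no evidence either way (a tie)."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & signature) for sense, signature in SENSES.items()}
    best = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None

print(disambiguate("I deposited money at the bank"))
print(disambiguate("We went fishing on the bank of the river"))
print(disambiguate("I found my wallet near the bank"))
```

The wallet sentence on its own yields no decision, which illustrates why the surrounding context is needed to resolve this kind of lexical ambiguity.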
What is Big Data?
According to Dr. Kirk Borne, Principal Data Scientist, big data is everything, quantified and tracked.
Big Data For Natural Language Processing
Today, around 80% of total data is available in raw form. Big Data comes from information stored in big organizations as well as enterprises. Examples include information about employees, company purchases, sales records, business transactions, previous records of organizations, social media, etc.
Although human language is ambiguous and unstructured, which makes it hard for computers to interpret, NLP makes it possible to harness this large volume of unstructured data, uncover patterns in it, and better understand the information it contains.
NLP can solve significant problems of the business world by using Big Data, be it in retail, healthcare, or financial institutions.
What is a Chatbot?
Chatbots or Automated Intelligent Agents
These are computer programs you can talk to through messaging apps, chat windows, or voice calling apps.
These are intelligent digital assistants used to resolve customer queries in a cost-effective, quick, and consistent manner
Why Are Chatbots Essential For Business
Chatbots are critical to understanding changes in digital customer care services and to handling the routine queries that are asked most frequently.
Chatbots are useful in scenarios where client service requests are specific to one area and highly predictable, such as managing a high volume of similar requests with automated responses.
How Does A Chatbot Work?
Image Source - blog.wizeline.com
- Knowledge Base - It contains the database of information used to equip chatbots with the information needed to respond to customer queries.
- Data Store - It contains the interaction history of the chatbot with users.
- NLP Layer - It translates users' free-form queries into information that can be used for appropriate responses.
- Application Layer - It is the application interface that is used to interact with the user.
Chatbots learn each time they interact with a user, trying to match the user's queries with the information in the knowledge base using Machine Learning.
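The four components can be sketched in a few lines. This is a toy illustration, not a production design: the intents, keywords, and answers are hypothetical, and plain keyword overlap stands in for the Machine Learning matching described above.

```python
import re

# Knowledge base: hypothetical intents mapped to keywords and an answer.
KNOWLEDGE_BASE = {
    "opening_hours": ({"open", "hours", "close"}, "We are open 9am to 5pm, Monday to Friday."),
    "refund": ({"refund", "return", "money"}, "Refunds are processed within 5 business days."),
}

# Data store: interaction history of the chatbot with users.
data_store = []

def nlp_layer(query):
    """Translate a free-form query into an intent name, or None if nothing matches."""
    tokens = set(re.findall(r"[a-z]+", query.lower()))
    best_intent, best_overlap = None, 0
    for intent, (keywords, _answer) in KNOWLEDGE_BASE.items():
        overlap = len(tokens & keywords)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent

def application_layer(query):
    """Entry point that a messaging app or chat window would call."""
    intent = nlp_layer(query)
    reply = KNOWLEDGE_BASE[intent][1] if intent else "Sorry, I did not understand that."
    data_store.append({"query": query, "intent": intent, "reply": reply})
    return reply

print(application_layer("When do you open?"))
print(application_layer("Can I get a refund?"))
```

The recorded history in `data_store` is what a learning component could later use to improve the matching.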
Deep Learning For NLP
The traditional approach is rule-based and represents words as ‘One-Hot’ encoded vectors.
The traditional method focuses on syntactic representation instead of semantic representation.
Bag of words - A classification model based on it is unable to distinguish certain contexts.
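A small sketch makes the limitation concrete. With a hypothetical three-word vocabulary, two sentences with opposite meanings receive exactly the same bag-of-words vector, because word order is discarded.

```python
from collections import Counter

def one_hot(word, vocabulary):
    """Return a vector with a single 1 at the word's position in the vocabulary."""
    return [1 if word == v else 0 for v in vocabulary]

def bag_of_words(sentence, vocabulary):
    """Count how often each vocabulary word occurs, ignoring word order."""
    counts = Counter(sentence.lower().split())
    return [counts[v] for v in vocabulary]

# Hypothetical toy vocabulary, in alphabetical order.
vocabulary = ["bites", "dog", "man"]

print(one_hot("dog", vocabulary))                 # [0, 1, 0]
print(bag_of_words("dog bites man", vocabulary))  # [1, 1, 1]
print(bag_of_words("man bites dog", vocabulary))  # [1, 1, 1]
```

One-hot vectors for different words also share no components, so they carry no notion of similarity between words such as ‘dog’ and ‘puppy’.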
3 Capability Levels of Deep Learning Intelligence
- Expressibility - This quality describes how well a machine can approximate universal functions.
- Trainability - How well and quickly a Deep Learning system can learn its problem.
- Generalizability - How well the machine can perform predictions on data that it has not been trained on.
There are of course other capabilities that also need to be considered in Deep Learning, such as interpretability, modularity, transferability, latency, adversarial stability, and security, but these are the main ones.
Applications of Deep Learning in NLP
Deep Learning Algorithms and their NLP Usage
- Neural Network - NN (feed-forward): Part-of-Speech Tagging, Tokenization, Named Entity Recognition, Intent Extraction
- Recurrent Neural Networks (RNN): Machine Translation, Question Answering System, Image Captioning
- Recursive Neural Networks: Parsing sentences, Sentiment Analysis, Paraphrase detection, Relation Classification, Object detection
- Convolutional Neural Network (CNN): Sentence/Text classification, Relation extraction and classification, Spam detection, Categorization of search queries, Semantic relation extraction
Difference Between Classical NLP & Deep Learning NLP
Image Source - blog.aylien.com
NLP For Log Analysis and Log Mining
What is a Log?
A log is a collection of messages from different network devices and hardware in time sequence. Logs may be directed to files on hard disks or sent over the network as a stream of messages to a log collector.
Logs provide a way to maintain and track hardware performance, parameter tuning, emergency and recovery of systems, and optimization of applications and infrastructure.
What is Log Analysis?
Log analysis is the process of extracting information from logs, considering the different syntax and semantics of messages in the log files and interpreting their context within the application, in order to compare log files coming from various sources for anomaly detection and finding correlations.
What is Log Mining?
Log Mining or Log Knowledge Discovery is the process of extracting patterns and correlations in logs to reveal knowledge and detect anomalies, if any, inside log messages.
Natural Language Processing Techniques
Different methods used for performing log analysis are described below
Pattern recognition
This technique involves comparing log messages with messages stored in a pattern book to filter out messages.
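A pattern book can be approximated with a table of regular expressions. In this minimal sketch the pattern names, the expressions, and the sample log messages are all hypothetical; messages that match a known pattern are filtered out and the rest are kept for inspection.

```python
import re

# Hypothetical pattern book: regular expressions for known message types.
PATTERN_BOOK = {
    "login_ok": re.compile(r"user \w+ logged in"),
    "heartbeat": re.compile(r"heartbeat ok"),
}

def match_pattern(message):
    """Return the name of the first pattern that matches, or None."""
    for name, pattern in PATTERN_BOOK.items():
        if pattern.search(message):
            return name
    return None

def filter_unknown(messages):
    """Keep only the messages that match no entry in the pattern book."""
    return [m for m in messages if match_pattern(m) is None]

logs = [
    "user alice logged in",
    "heartbeat ok",
    "disk /dev/sda1 failure detected",
]
print(filter_unknown(logs))
```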
Normalization
Normalization of log messages is done to convert different messages into the same format. This is done when log messages coming from various sources, like applications or operating systems, use different terminology but have the same interpretation.
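As a rough sketch of normalization, the function below lowercases a message, masks numbers, and maps source-specific terms to one canonical term. The synonym table and the two sample messages are hypothetical.

```python
import re

# Hypothetical synonym table: source-specific terms mapped to one canonical term.
CANONICAL_TERMS = {
    "err": "error",
    "failure": "error",
    "warn": "warning",
}

def normalize(message):
    """Lowercase the message, mask numbers, and unify terminology."""
    text = message.lower()
    text = re.sub(r"\d+", "<num>", text)
    # Keep the <num> placeholder as one token; drop other punctuation.
    tokens = re.findall(r"<num>|\w+", text)
    return " ".join(CANONICAL_TERMS.get(t, t) for t in tokens)

print(normalize("ERR: connection lost after 30 seconds"))
print(normalize("Failure - connection lost after 45 seconds"))
```

Both sample messages normalize to the same string, so they can be counted and analyzed as one kind of event.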
Classification & Tagging
Classification & Tagging of different log messages involves ordering the messages and tagging them with various keywords for later analysis.
Artificial Ignorance
It is a technique that uses Machine Learning algorithms to discard uninteresting log messages. It is also used to detect anomalies in the ordinary working of systems.
Role of NLP in Log Analysis & Log Mining
Natural Language Processing techniques are widely used in Log Analysis and Log Mining.
Different techniques such as tokenization, stemming, lemmatization, parsing, etc. are used to convert log messages into structured form.
Once logs are available in a well-structured form, log analysis and log mining are performed to extract useful information and discover knowledge from it.
One example is the case of an error log caused by a server failure.
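As a rough illustration of converting a log message into structured form, the sketch below parses one error line with a regular expression and tokenizes its free-text part. The log line and its field layout are assumptions made for this example, not a format defined by the article.

```python
import re

# Hypothetical error log line; the field layout is an assumption.
LINE = "2024-01-15 10:32:07 ERROR web-01 Server failed: connection refused"

LOG_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<host>\S+) "
    r"(?P<message>.*)"
)

def parse_line(line):
    """Convert one raw log line into a dict of fields plus message tokens."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Tokenize the free-text message so later steps can analyze it.
    record["tokens"] = re.findall(r"\w+", record["message"].lower())
    return record

print(parse_line(LINE))
```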
Diving into Natural Language Processing
Natural language processing is a complex field and is the intersection of Artificial Intelligence, computational linguistics, and computer science.
Getting started with Natural Language Processing
The user needs to import a file containing written text. Then the user should perform the following steps for natural language processing:
Sentence Segmentation
Input: Mark met the president. He said: “Hi! What’s up - Alex?”
Output: Sentence 1 - Mark met the president. Sentence 2 - He said: “Hi! What’s up - Alex?”
Tokenization
Input: My phone tries to ‘charging’ from ‘discharging’ state.
Output: [My] [phone] [tries] [to] [‘] [charging] [’] [from] [‘] [discharging] [’] [state] [.]
Stemming/Lemmatization
Input: Drinking, Drank, Drunk
Output: Drink
Part-of-Speech Tagging
Input: If you build it he will come.
Output: IN (preposition or subordinating conjunction), PRP (personal pronoun), VBP (verb, non-3rd person singular present form), PRP (personal pronoun), MD (modal verb), VB (verb, base form)
Parsing
Input: Mark and Joe went into a bar.
Output: (S (NP (NP Mark) and (NP Joe)) (VP went (PP into (NP a bar))))
Named Entity Recognition
Input: Let’s meet Alice at 6 am in India
Output: Alice - Person, 6 am - Time, India - Location
Coreference Resolution
Input: Mark went into the mall. He thought it was a shopping mall.
Output: Mark went into the mall. He thought it was a shopping mall. (‘He’ refers to ‘Mark’; ‘it’ refers to ‘the mall’.)
- Sentence segmentation - It identifies sentence boundaries in the given text, i.e., where one sentence ends and where another sentence begins. Sentences often end with the punctuation mark ‘.’
- Tokenization - It identifies different words, numbers, and other punctuation symbols.
- Stemming - It strips the endings of words, e.g. ‘eating’ is reduced to ‘eat.’
- Part of speech (POS) tagging - It assigns each word in a sentence its own part-of-speech tag, such as designating a word as a noun or adverb.
- Parsing - It involves dividing the given text into different categories, to answer questions such as which part of a sentence modifies another part of the sentence.
- Named Entity Recognition - It identifies entities such as persons, locations, and times within the documents.
- Co-Reference resolution - It is about defining the relationship of a given word in a sentence with the previous and the next sentence.
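The first three steps can be tried out with the standard library alone. The functions below are deliberately naive sketches: the splitting rule, the token pattern, and the suffix list are simplifying assumptions, and a real NLP library handles many more cases (for example, the quoted “Hi!” in the segmentation example above would be split incorrectly by this rule).

```python
import re

def segment_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace (naive rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    """Separate words, numbers, and punctuation symbols."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def stem(word):
    """Strip a few common English endings (far cruder than a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

text = "Mark went into the mall. He was eating lunch."
for sentence in segment_sentences(text):
    tokens = tokenize(sentence)
    print(tokens, [stem(t.lower()) for t in tokens])
```

Part-of-speech tagging, parsing, named entity recognition, and co-reference resolution need trained models or grammars and are not attempted in this sketch.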
Key Application Areas of Natural Language Processing
Apart from its use in Big Data, Log Mining, and Log Analysis, NLP has other significant application areas.
Although the term ‘NLP’ is not as popular as ‘big data’ or ‘machine learning’, we use NLP every day.
Automatic summarizer
Given the input text, the task is to write a summary of the text, discarding irrelevant points.
Sentiment analysis