ANALYZING TEXT IN THE BIG DATA ENVIRONMENT. A presentation by W H Inmon. Much of the information found in the Big Data environment is text. The queries of the world can be divided into two classes: simple search and analytical query.
There are many differences between a simple search and an analytical query. One of those differences is the state of the data that the query is made against.
The tone of the message is good. Another carrier that is mentioned is Singapore Air.
The tone of the message is very bad. The message mentions a late flight. The message is not written in proper English; it reads like a tweet or an IM.
This cryptic message must first be expanded into understandable words. There is no tone to this message; it is purely informational. Other types of data that have been extracted include flight number, city, activity, operand, and claim number. Note that two cities have been used to determine that the flight type is a US domestic flight.
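The expansion step can be illustrated with a small dictionary lookup. This is only a sketch under assumptions: the shorthand table, the sample message, and the function name `expand_message` are hypothetical and do not come from the presentation, which does not say how expansion is performed.

```python
import re

# Hypothetical shorthand table; a real one would be built for the domain.
ABBREVIATIONS = {
    "flt": "flight",
    "dep": "departed",
    "arr": "arrived",
    "pax": "passengers",
}

def expand_message(text):
    """Replace each known shorthand token with its full word.

    Tokens that are not in the table (city codes, numbers) are left as-is.
    """
    def replace(match):
        word = match.group(0)
        return ABBREVIATIONS.get(word.lower(), word)

    # Only alphabetic runs are candidates for expansion.
    return re.sub(r"[A-Za-z]+", replace, text)

expanded = expand_message("FLT 212 dep DEN arr SFO")
print(expanded)
```

Only after a step like this can the remaining extraction (flight number, city, activity) work on ordinary words.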
This message, written in French, has both a good tone and a very good tone. Where there are two tones, the higher is designated as the official tone of the message. Other types of information found include city and a reference to personnel. Note that two cities are used to determine that this is an overseas flight.
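The "higher tone wins" rule can be sketched as a maximum over an ordered scale. The labels "very bad", "good", and "very good" appear in the slides; the intermediate labels "bad" and "neutral", and the function name `official_tone`, are assumptions added for illustration.

```python
# Tones ordered from lowest to highest. "bad" and "neutral" are assumed
# labels; the slides only show "very bad", "good", and "very good".
TONE_SCALE = ["very bad", "bad", "neutral", "good", "very good"]

def official_tone(tones):
    """Return the highest tone found in a message, or None if it has no tone."""
    if not tones:
        return None  # purely informational message
    return max(tones, key=TONE_SCALE.index)

print(official_tone(["good", "very good"]))
```

A message with no detected tone returns `None`, which matches the purely informational case described earlier.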
This message is written in Spanish. The tone is very bad. City, service, on time, and connection are found in the message. A lawsuit is mentioned. Note that two cities are used to determine that this is an international flight but not an overseas flight.
The tone of this message is very bad. The flight number, date, and city are mentioned. City is used twice to determine that this was a domestic US flight.
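The messages above derive a flight type from the two cities mentioned. One way this inference could work is sketched below. The rule itself is an assumption (same country means domestic, same region but different country means international, different regions means overseas), and the city table and the name `flight_type` are hypothetical.

```python
# Hypothetical lookup: city -> (country code, region).
CITY_INFO = {
    "Denver": ("US", "North America"),
    "Dallas": ("US", "North America"),
    "Mexico City": ("MX", "North America"),
    "Paris": ("FR", "Europe"),
}

def flight_type(city_a, city_b):
    """Classify a flight from the two cities mentioned in a message."""
    country_a, region_a = CITY_INFO[city_a]
    country_b, region_b = CITY_INFO[city_b]
    if country_a == country_b:
        # Both cities are in the same country.
        return "domestic US" if country_a == "US" else "domestic"
    if region_a == region_b:
        # Different countries, same region: international but not overseas.
        return "international"
    return "overseas"

print(flight_type("Denver", "Dallas"))
print(flight_type("Dallas", "Mexico City"))
print(flight_type("Denver", "Paris"))
```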
After transformation occurs, the results are placed in a standard relational database.
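As a rough illustration of this loading step, the sketch below writes extracted values into an in-memory SQLite database. The table name `message_facts`, its columns, and the sample rows are assumptions; the presentation does not specify a schema.

```python
import sqlite3

# In-memory database stands in for the "standard relational database".
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per value extracted from a message.
conn.execute(
    "CREATE TABLE message_facts ("
    "message_id INTEGER, attribute TEXT, value TEXT)"
)

# Sample rows, invented for illustration only.
rows = [
    (1, "tone", "very bad"),
    (1, "flight_type", "domestic US"),
    (2, "tone", "good"),
]
conn.executemany("INSERT INTO message_facts VALUES (?, ?, ?)", rows)
conn.commit()

fact_count = conn.execute("SELECT COUNT(*) FROM message_facts").fetchone()[0]
print(fact_count)
```

An attribute/value layout is used here because each message yields a different set of extracted items; a fixed-column table would work equally well if the set of attributes were known in advance.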
Once a standard relational database is created, there are literally hundreds of analytical tools that can operate against it.
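Once the text is in relational form, an analytical query becomes an ordinary SQL aggregate. The sketch below counts "very bad" messages per flight type. The `messages` table, its columns, and the rows are hypothetical sample data, not results from the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY, tone TEXT, flight_type TEXT)"
)

# Invented sample rows for illustration only.
conn.executemany(
    "INSERT INTO messages (tone, flight_type) VALUES (?, ?)",
    [
        ("very bad", "domestic US"),
        ("very bad", "domestic US"),
        ("good", "overseas"),
        ("very bad", "international"),
    ],
)

# Analytical query: how many very bad messages per flight type?
bad_by_type = conn.execute(
    "SELECT flight_type, COUNT(*) FROM messages "
    "WHERE tone = 'very bad' "
    "GROUP BY flight_type ORDER BY flight_type"
).fetchall()
print(bad_by_type)
```

A simple search could find messages containing a given word, but a grouped count like this depends on the tone and flight type having already been extracted into columns.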
If you are an analyst, what kind of data do you want to operate on?