Supporting Decision Making. A Framework for IS Management.
Introduction (2) • Most computer systems support decision making because all software programs involve automating decision steps that people would take • Decision making is a process that involves a variety of activities, most of which handle information • A wide variety of computer-based tools and approaches can be used to confront the problem at hand and work through its solution
Introduction (3) • Computer technologies that support decision making • Decision support systems (DSSs) • Data mining • Executive information systems (EISs) • Expert systems (ESs) • Agent-based modeling • Multidisciplinary foundations for DS technologies • Database research, artificial intelligence, statistical inference, human-computer interaction, simulation methods, software engineering, etc.
Case Example---A Problem-Solving Scenario • Using an EIS to discover a sales shortfall in one region • Investigate several possible causes • Economic conditions • Competitive analysis • Written sales reports • A data mining analysis • Result: no clear problems revealed
Decision Support Systems---History • Two contributing areas of research in the 1950s-1960s • Organizational decision making at CMU • Interactive computer systems at MIT • Mid-1970s: single-user and model-oriented DSS • Mid- and late 1980s: EIS, GDSS, ODSS • 1990s: Data warehousing and OLAP • Late 1990s-2000s • Data mining • Web-based analytical applications
What is a DSS? • A DSS aims to use IT to relieve humans of some decision making or help us make more informed decisions • Systems that support, not replace, managers in their decision-making activities • DSSs are defined as: • Computer-based systems • That help decision makers • Confront ill-structured problems • Through direct interaction • With data and analysis models
DSS Architecture (2) • The Dialog Component • Linking the user to the system • The Data Component • Data sources --- use all the important data sources within and outside the organization in the form of summarized data (DW & DM) • The Model Component • Models provide the analysis capabilities for a DSS • Using a mathematical representation of the problem, algorithmic processes are employed to generate information to support decision making
A Taxonomy of DSS • Using the mode of assistance as the criterion • A model-driven DSS • A communication-driven DSS • A data-driven DSS or data-oriented DSS • A document-driven DSS • A knowledge-driven DSS
Executive Information System (1) • The emphasis of EIS is on graphical displays and easy-to-use user interfaces • EIS can be viewed as a DSS that: • Provides access to summary performance data • Uses graphics to display and visualize the data in an easy-to-use fashion, and • Has a minimum of analysis for modeling beyond the capability to "drill down" in summary data to examine components
Executive Information System (2) • EISs aim to provide both internal and external information relevant to meeting the strategic goals of the organization • Gauge company performance • Scan the environment • EIS and data warehousing technologies are converging in the marketplace • The term EIS has lost popularity in favor of Business Intelligence
Data Mining: Motivations • The explosive growth of data: from TB to PB • Data collection and data availability • Automated data collection tools, database systems, Web, computerized society • Major sources of abundant data • Business: Web, e-commerce, transactions, stocks, … • Science: remote sensing, bioinformatics, … • Society and everyone: news, digital cameras, YouTube • We are drowning in data, but starving for knowledge! • “Necessity is the mother of invention”—Data mining—Automated analysis of massive data sets
What Is Data Mining? • Data mining (knowledge discovery from data) • Extraction of interesting patterns or knowledge from huge amounts of data • Alternative names • Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc. • Watch out: Is everything “data mining”? Not everything qualifies, for example: • Simple search and query processing • (Deductive) expert systems
Knowledge Discovery (KDD) Process • Data mining: core of the knowledge discovery process • [Figure: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation → Knowledge]
Architecture: A Typical Data Mining System • [Figure, top to bottom: Graphical User Interface; Pattern Evaluation; Data Mining Engine; Database or Data Warehouse Server; data cleaning, integration, and selection; data sources (Database, Data Warehouse, World-Wide Web, Other Info Repositories). A Knowledge-Base is shown alongside these components]
Data Mining: Confluence of Multiple Disciplines • [Figure: Data Mining at the center, surrounded by Database Technology, Statistics, Machine Learning, Pattern Recognition, Algorithm, Visualization, and Other Disciplines]
Why Not Traditional Data Analysis? • Tremendous amount of data • Algorithms must be highly scalable to handle TB of data • High dimensionality of data • Micro-array data may have tens of thousands of dimensions • High complexity of data • Data streams and sensor data • Time-series data, temporal data, sequence data • Structured data, graphs, social networks and multi-linked data • Heterogeneous databases and legacy databases • Spatial, spatiotemporal, multimedia, text and Web data • Software programs, scientific simulations • New and sophisticated applications
Multi-Dimensional View of Data Mining (1) • Data to be mined • Relational, data warehouse, transactional, stream, object-oriented/relational, active, spatial, time-series, text, multi-media, heterogeneous, legacy, WWW • Knowledge to be mined • Characterization, discrimination, association, classification, clustering, trend/deviation, outlier analysis, etc. • Multiple/integrated functions and mining at multiple levels
Multi-Dimensional View of Data Mining (2) • Techniques utilized • Database-oriented, data warehouse (OLAP), machine learning, statistics, visualization, etc. • Applications adapted • Retail, telecommunication, banking, fraud analysis, bio-data mining, stock market analysis, text mining, Web mining, etc.
Data Mining Functionalities (1) • Multidimensional concept description: characterization and discrimination • Generalize, summarize, and contrast data characteristics, e.g., dry vs. wet regions • Frequent patterns, association, correlation vs. causality • Diaper → Beer [support 0.5%, confidence 75%] • Classification and prediction • Construct models (functions) that describe and distinguish classes or concepts for future prediction • E.g., classify countries based on climate, or classify cars based on gas mileage • Predict some unknown or missing numerical values
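The two numbers attached to the Diaper/Beer rule are conventionally read as support and confidence; that reading is assumed here. The following is a minimal sketch of how both measures could be computed. The basket data and the function names `support` and `confidence` are made up for illustration and do not come from the slides.

```python
from typing import FrozenSet, List


def support(transactions: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Estimate of P(consequent | antecedent) from the transactions."""
    ante = support(transactions, antecedent)
    both = support(transactions, antecedent | consequent)
    return both / ante if ante > 0 else 0.0


# Hypothetical market baskets, invented for this example.
baskets = [
    frozenset({"diaper", "beer", "milk"}),
    frozenset({"diaper", "beer"}),
    frozenset({"diaper", "bread"}),
    frozenset({"milk", "bread"}),
]

rule_support = support(baskets, frozenset({"diaper", "beer"}))
rule_confidence = confidence(baskets, frozenset({"diaper"}), frozenset({"beer"}))
print(f"diaper -> beer [support {rule_support:.0%}, confidence {rule_confidence:.0%}]")
```

A high confidence only says the items co-occur; it does not establish that one purchase causes the other, which is the correlation vs. causality caveat on the slide.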
Data Mining Functionalities (2) • Cluster analysis • Class label is unknown: Group data to form new classes, e.g., cluster houses to find distribution patterns • Maximizing intra-class similarity & minimizing interclass similarity • Outlier analysis • Outlier: Data object that does not comply with the general behavior of the data • Noise or exception? Useful in fraud detection, rare events analysis • Trend and evolution analysis • Trend and deviation: e.g., regression analysis • Periodicity analysis
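As a rough illustration of cluster analysis with unknown class labels, the sketch below runs a one-dimensional k-means over hypothetical house prices. The slides do not name a clustering algorithm, so k-means is an assumption. The data, the starting centers, and the function name `kmeans_1d` are also assumptions.

```python
from typing import List, Sequence, Tuple


def kmeans_1d(values: Sequence[float],
              centers: Sequence[float],
              iterations: int = 10) -> Tuple[List[float], List[List[float]]]:
    """Alternate between assigning values to the nearest center and
    moving each center to the mean of its assigned values."""
    centers = list(centers)
    clusters: List[List[float]] = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical house prices (in thousands); no class labels are given.
prices = [100, 110, 120, 300, 310, 320]
centers, clusters = kmeans_1d(prices, centers=[100, 320])
print(centers)
print(clusters)
```

Each value ends up closer to its own cluster center than to the other one, which is the intra-class similarity and inter-class dissimilarity goal stated on the slide.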
Major Issues in Data Mining (1) • Mining methodology • Mining different kinds of knowledge from diverse data types, e.g., bio, stream, Web • Performance: efficiency, effectiveness, and scalability • Pattern evaluation: the interestingness problem • Incorporation of background knowledge • Handling noise and incomplete data • Parallel, distributed and incremental mining methods • Integration of the discovered knowledge with existing knowledge: knowledge fusion
Major Issues in Data Mining (2) • User interaction • Data mining query languages and ad-hoc mining • Expression and visualization of data mining results • Interactive mining of knowledge at multiple levels of abstraction • Applications and social impacts • Domain-specific data mining & invisible data mining • Protection of data security, integrity, and privacy
Artificial Intelligence (1) • AI is a group of technologies that attempt to mimic our senses and emulate certain aspects of human behavior such as reasoning and communication • 1956, a conference at Dartmouth College • John McCarthy, Marvin Minsky, Allen Newell and Herbert Simon (MIT, CMU and Stanford) • 1965, H. A. Simon: "machines will be capable, within twenty years, of doing any work a man can do" • 1967, Marvin Minsky: "Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved" • Heavily funded by DARPA
Artificial Intelligence (2) • Early AI researchers had failed to recognize the difficulty of some of the problems they faced: • The lack of raw computing power • The intractable combinatorial explosion of their algorithms • The difficulty of representing commonsense knowledge and doing commonsense reasoning • The incredible difficulty of perception and motion • The failings of logic • First AI Winter • In 1974, DARPA cut off all undirected, exploratory research in AI
Artificial Intelligence (3) • In the early 80s, the field was revived by the commercial success of expert systems • By 1985 the market for AI had reached more than a billion dollars. • Minsky and others warned the community that enthusiasm for AI had spiraled out of control and that disappointment was sure to follow • Second AI Winter • The collapse of the Lisp Machine market in 1987
Artificial Intelligence (4) • In the 90s AI achieved its greatest successes • Artificial intelligence was adopted throughout the technology industry, providing the heavy lifting for • Data mining • Logistics • Medical diagnosis • …
Expert System • An expert system is an automated type of analysis or problem-solving model that deals with a problem the way an "expert" does • The process involves consulting a base of knowledge or expertise to reason out an answer based on the characteristics of the problem
Architecture of an ES • [Figure: the User exchanges a description of a problem and advice and explanation with the system through the User Interface, which connects to the Inference Engine and the Knowledge Base]
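The interplay of knowledge base, inference engine, and explanation can be sketched with a tiny forward-chaining loop. This is only one possible inference strategy, and the slides do not say which one is used. The rules, fact names, and the function `forward_chain` are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (conditions, conclusion)


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Tuple[Set[str], List[str]]:
    """Fire every rule whose conditions are all known until nothing new
    can be derived. Returns the known facts and an explanation trace."""
    known = set(facts)
    trace: List[str] = []
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and conditions <= known:
                known.add(conclusion)
                trace.append(f"{' AND '.join(sorted(conditions))} => {conclusion}")
                changed = True
    return known, trace


# Hypothetical knowledge base, invented for this example.
RULES: List[Rule] = [
    (frozenset({"sales_down", "competitor_price_cut"}), "price_pressure"),
    (frozenset({"price_pressure"}), "review_pricing"),
]

# The description of the problem supplied by the user.
known, trace = forward_chain({"sales_down", "competitor_price_cut"}, RULES)
print("advice:", "review_pricing" in known)
print("explanation:", trace)
```

The trace plays the role of the explanation returned to the user: it lists which rules fired and in what order.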
Knowledge Representation • In AI, the primary aim of knowledge representation is to store knowledge so that programs can process it and achieve the verisimilitude of human intelligence • The representation theory has its origin in cognitive science • Knowledge can be represented in a number of ways • Case-based reasoning • Artificial neural networks • Stored as rules
Case-based Reasoning (1) • Case-based reasoning • The process of solving new problems based on the solutions of similar past problems • A case consists of a problem, its solution, and, typically, annotations about how the solution was derived
Case-based Reasoning (2) • Case-based reasoning as a four-step process • Retrieve: given a target problem, retrieve cases from memory that are relevant to solving it • Reuse: map the solution from the previous case to the target problem • Revise: test the new solution and, if necessary, revise it • Retain: after the solution has been successfully adapted to the target problem, store the resulting experience as a new case in memory
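The four steps above can be sketched as a single function. This is a minimal illustration under several assumptions: problems are described by numeric features, similarity is Euclidean distance, reuse simply copies the retrieved solution, and revision falls back to a default when a caller-supplied check rejects it. The names `Case`, `distance`, and `solve`, and the help-desk cases, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Case:
    problem: Dict[str, float]  # feature name -> value
    solution: str


def distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Euclidean distance over the union of feature names."""
    keys = set(a) | set(b)
    return sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys) ** 0.5


def solve(memory: List[Case],
          problem: Dict[str, float],
          is_acceptable: Callable[[Dict[str, float], str], bool],
          fallback: str) -> str:
    # Retrieve: find the most similar stored case.
    nearest = min(memory, key=lambda c: distance(c.problem, problem))
    # Reuse: here the old solution is copied unchanged.
    proposed = nearest.solution
    # Revise: test the proposal and replace it if it fails.
    final = proposed if is_acceptable(problem, proposed) else fallback
    # Retain: store the new experience as a case.
    memory.append(Case(problem, final))
    return final


# Hypothetical help-desk cases.
memory = [
    Case({"paper_jam": 1.0, "low_toner": 0.0}, "clear paper path"),
    Case({"paper_jam": 0.0, "low_toner": 1.0}, "replace toner"),
]
answer = solve(memory, {"paper_jam": 1.0, "low_toner": 0.2},
               is_acceptable=lambda problem, solution: True,
               fallback="escalate to technician")
print(answer, len(memory))
```

Real systems typically adapt the retrieved solution rather than copying it, and they may retain only cases that add something new to memory.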
Supervised vs. Unsupervised Learning • Supervised learning • Supervision: The training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations • New data is classified based on the training set • Unsupervised learning • The class labels of the training data are unknown • Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data
Artificial Neural Network (1) • An interconnected group of artificial neurons • Uses a mathematical or computational model for information processing based on a connectionist approach to computation • An adaptive system that changes its structure based on external or internal information that flows through the network • ANNs can be used to model complex relationships between inputs and outputs or to find patterns in data • Non-linear statistical data modeling or decision-making tools
Artificial Neural Network (2) Training set: (1) high salary, owns a house, has a dog, [profitable customer] (2) less than 3 years on job, prior bankruptcy, owns a dog, [deadbeat] ......
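To make the training-set idea concrete, the sketch below trains a single artificial neuron (a perceptron, not a full multi-layer network) on the two examples listed. The binary feature encoding, the feature names, and the learning rule are assumptions chosen for illustration; the slides do not specify them.

```python
from typing import List, Sequence, Tuple

# Assumed binary encoding of the attributes mentioned in the training set.
FEATURES = ["high_salary", "owns_house", "owns_dog", "short_tenure", "prior_bankruptcy"]

TRAINING_SET: List[Tuple[List[int], int]] = [
    ([1, 1, 1, 0, 0], 1),  # (1) profitable customer
    ([0, 0, 1, 1, 1], 0),  # (2) deadbeat
]


def predict(weights: Sequence[float], bias: float, x: Sequence[int]) -> int:
    """Step activation: output 1 when the weighted sum is positive."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train(samples: List[Tuple[List[int], int]],
          epochs: int = 10,
          learning_rate: float = 1.0) -> Tuple[List[float], float]:
    """Perceptron rule: nudge the weights toward each misclassified example."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_SET)
for name, w in zip(FEATURES, weights):
    print(f"{name}: {w:+.1f}")
```

With only these two examples, the weight for `owns_dog` settles at zero because both customers own a dog, so the attribute carries no information about the class.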
Rule-based Systems (1) • Knowledge stored as rules • The most commonly used form of rules is the if-then statement • e.g. IF some condition THEN some action • A rule-based inference model: decision tree • Each internal node (non-leaf node) denotes a test on an attribute • Each branch represents an outcome of the test • Each leaf node holds a class label
Rule-based Systems (2) Training dataset for decision tree buys_computer
Rule-based Systems (3) • Decision tree for buys_computer • [Figure: the root node tests age? with branches <=30, 31..40, and >40; internal nodes test student? (no / yes) and credit rating? (excellent / fair); leaf nodes are labeled yes or no]
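A decision tree of this kind can be stored as nested nodes and read off as if-then rules. The figure's branch-to-leaf wiring is not recoverable from the text here. The leaf assignments in this sketch are therefore an assumption, following the commonly cited textbook version of the buys_computer example. The names `TREE`, `classify`, and `extract_rules` are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Node = Union[str, Dict[str, object]]  # a leaf label or an internal test node

# Assumed tree structure; leaves are buys_computer class labels.
TREE: Node = {
    "attribute": "age",
    "branches": {
        "<=30": {"attribute": "student", "branches": {"no": "no", "yes": "yes"}},
        "31..40": "yes",
        ">40": {"attribute": "credit_rating",
                "branches": {"excellent": "no", "fair": "yes"}},
    },
}


def classify(tree: Node, record: Dict[str, str]) -> str:
    """Follow the branch matching each tested attribute until a leaf."""
    node = tree
    while isinstance(node, dict):
        node = node["branches"][record[node["attribute"]]]
    return node


def extract_rules(tree: Node, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Turn every root-to-leaf path into one IF ... THEN rule."""
    if not isinstance(tree, dict):
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN buys_computer = {tree}"]
    rules: List[str] = []
    for value, child in tree["branches"].items():
        test = f"{tree['attribute']} = {value}"
        rules.extend(extract_rules(child, conditions + (test,)))
    return rules


for rule in extract_rules(TREE):
    print(rule)
```

Each internal node is a test on an attribute, each branch an outcome, and each leaf a class label, so one path through the tree corresponds to exactly one rule.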
Agent-based Modeling • Simulate the behavior that emerges from the decisions of a large number of distinct individuals • Computer generated agents, each making decisions typical of the decisions an individual would make in the real world • Trying to understand the mysteries of why businesses, markets, consumers, and other complex systems behave as they do
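A small threshold model shows how aggregate behavior can emerge from individual decisions. In this sketch, each simulated consumer adopts a product once enough others have adopted it. The adoption rule, the threshold values, and the function name `simulate` are assumptions for illustration and are not taken from the slides.

```python
from typing import List, Sequence


def simulate(thresholds: Sequence[int], max_steps: int = 100) -> List[int]:
    """Each agent adopts once the current number of adopters reaches its
    personal threshold. Returns the adopter count after each step that
    changed something."""
    adopted = [False] * len(thresholds)
    history: List[int] = []
    for _ in range(max_steps):
        count = sum(adopted)
        updated = [a or count >= t for a, t in zip(adopted, thresholds)]
        if updated == adopted:
            break  # nobody changed their mind, so the system is stable
        adopted = updated
        history.append(sum(adopted))
    return history


# Agent i needs i earlier adopters: adoption cascades through everyone.
full_cascade = simulate(list(range(10)))

# Raise one agent's threshold from 1 to 2 and the cascade stalls at once.
stalled = simulate([0, 2, 2, 3, 4, 5, 6, 7, 8, 9])

print(full_cascade)
print(stalled)
```

The two populations are almost identical, yet one reaches full adoption and the other stops after a single adopter. Outcomes like this are hard to predict from averages alone, which is part of the case for simulating individual agents.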
Toward the Real-Time Enterprise • The essence of the phrase real-time enterprise is that organizations can know how they are doing at the moment • Digitization and automation of some crucial enterprise activities traditionally completed by people • Esp. information analysis • Better sense-and-response
Real-time Reporting • Real-time reporting is occurring on a whole host of fronts including: • Enterprise nervous systems • A network that connects people, applications and devices • To coordinate company operations • Straight-through processing • To reduce distortion in supply chains • Real-time CRM • To automate decision making relating to customers, and • Communicating objects • To gain real-time data about the physical world • E.g. radio frequency identification device (RFID)
The Dark Side of Real Time • Object-to-object communication could compromise privacy • Knowing the exact location of a company truck every minute of the day is an invasion of the driver's privacy • In the era of speed, a situation can become very bad very fast • E.g., the "circuit breaker" used to stop deep dives on the NYSE