Lecture 2 ADMS 3020 - Canadian Business Culture and Management Skills Prof. Dawid Kasperowicz http://www.yorku.ca/dkasper
Learning Objectives • Information Systems • More on DSS • Supervised learning • Unsupervised learning • DSS vs. MIS • DSS Components • More on AI Systems • Types of AI systems • Application Exercises ADMS 3020 - Canadian Business Culture and Management Skills – Lecture 2
More on Decision Support Systems • Why? • Information and decision support are the lifeblood of today’s organizations • Every organization needs effective decision makers • DSS allow all levels of management in an organization to obtain useful information in real time • ERP systems include DSS • My field of interest
More on Decision Support Systems • A career in IT is largely viewed as a problem-solving career • Problem solving is a critical activity for any business organization • After identifying a problem, the process of solving the problem begins with decision making • Problem solving is composed of two phases: • Decision-making phase • Problem-solving phase
More on Decision Support Systems • Decision-Making Phase • Divided into three stages: • Intelligence – Identify and define potential problems or opportunities • Investigate resource and environmental constraints • E.g., A Hawaiian farmer during this stage would explore possibilities of shipping tropical fruit from their farm in Hawaii to stores in Toronto. The perishability of the fruit and the maximum price that consumers in Toronto are willing to pay for the fruit are problem constraints
More on Decision Support Systems • Decision-Making Phase • Divided into three stages: • Design – Develop alternative solutions to the problem and evaluate their feasibility • E.g., A Hawaiian farmer during this stage would consider the alternative methods of shipment, including the transportation times and costs associated with each method. Shipping by freighter to California and then by truck to Toronto might not be feasible because the fruit would spoil
More on Decision Support Systems • Decision-Making Phase • Divided into three stages: • Choice – Select a course of action • E.g., A Hawaiian farmer during this stage would select shipping fruit by air from their Hawaiian farm to Toronto as the solution
More on Decision Support Systems • Problem-Solving Phase • Divided into two stages: • Implementation – A solution is put into effect • Monitoring – Decision makers evaluate the implementation to determine whether the anticipated results were achieved and to modify the process in light of new information • E.g., American Airlines monitored its decisions to use probability analysis to reduce inventory levels and shipping costs for airline maintenance equipment and in-flight service items. The value of this inventory can be over $1 billion a year on average. The airline used a decision-making technique called decision tree analysis that outlined major decisions and possible outcomes from those decisions
More on Decision Support Systems Herbert Simon’s Problem Solving Model
More on Decision Support Systems • Problem Solving – Decision Tree Induction • A supervised learning (classification/inductive learning) technique in machine learning; it learns from data that were collected in the past and that represent past experiences in some real-world application • Most frequently used mining technique in both practical data mining and Web mining • Supervised learning is also called classification: it aims to learn a classification function (called a classifier) from data that are labeled with pre-defined classes or categories. The resulting classifier is then applied to classify future data instances into these classes • Decision tree induction is one type of classifier • Uses an eager learning method, as the system tries to generalize from the data before seeing any queries
More on Decision Support Systems • Supervised Learning – Basic Concepts • A dataset used in the learning task consists of a set of data records, which are described by a set of attributes A = {A1, A2, …, A|A|}, where |A| denotes the number of attributes or the size of the set A • The dataset also has a special target attribute C, which is called the class attribute; it is considered separately from the attributes in A due to its special status (C is not in A), and has a set of discrete values C = {C1, C2, …, C|C|}, where |C| is the number of classes and |C| ≥ 2 • A dataset for learning is simply a relational table
More on Decision Support Systems • Supervised Learning – Basic Concepts • Each data record describes a piece of “past experience”, and in the machine learning and data mining literature, a data record is also called an example, instance, case, or vector • A dataset basically consists of a set of examples • Given a dataset D, the objective of learning is to produce a classification/prediction function that relates values of the attributes in A to classes in C, in order to predict the class value of future data • The function is also called a classification model, predictive model, or classifier
More on Decision Support Systems • Supervised Learning – Basic Concepts • A loan application data set • Has four attributes: Age, Has_job, Own_house, Credit_rating • Last column is the Class attribute • We want to learn a classification model from this dataset that can be used to classify future loan applications • Called supervised learning because the class labels (Yes and No values) are provided in the data
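To make the notation concrete, here is a minimal sketch of how such a dataset could be held in Python. Only the attribute names (Age, Has_job, Own_house, Credit_rating) and the Yes/No class values come from the slide; the four rows and the helper names `split_record` and `class_distribution` are invented for illustration, since the slide's table is not reproduced in this text.

```python
from collections import Counter

# Attribute set A (names from the slide) and the separate class attribute C.
ATTRIBUTES = ["Age", "Has_job", "Own_house", "Credit_rating"]
CLASS_ATTRIBUTE = "Class"

# Hypothetical rows, invented for illustration only.
dataset = [
    {"Age": "young", "Has_job": False, "Own_house": False, "Credit_rating": "fair", "Class": "No"},
    {"Age": "young", "Has_job": True, "Own_house": False, "Credit_rating": "good", "Class": "Yes"},
    {"Age": "middle", "Has_job": False, "Own_house": True, "Credit_rating": "good", "Class": "Yes"},
    {"Age": "old", "Has_job": False, "Own_house": False, "Credit_rating": "fair", "Class": "No"},
]

def split_record(record):
    """Separate the attribute values (in A) from the class label (in C)."""
    features = {name: record[name] for name in ATTRIBUTES}
    return features, record[CLASS_ATTRIBUTE]

def class_distribution(records):
    """Count how many examples carry each class label."""
    return Counter(record[CLASS_ATTRIBUTE] for record in records)
```

Each dictionary is one example (one row of the relational table), and the class label is kept apart from the attributes because it is what the learned model has to predict.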
More on Decision Support Systems • Problem Solving – Decision Tree Induction • One of the most widely used techniques for classification • Accuracy is competitive with other learning methods, and it is very efficient • Terminology: • Root node – The topmost node in a decision tree • Parent node – A node that has children • Leaf node – A node that indicates a class • Decision node – A node where a decision between at least two possible alternatives can be made; indicated by small squares • Branch – A possible decision direction that can be taken
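The terminology above can be illustrated with a small sketch. The tree below is hypothetical (it is not one of the trees shown on the slides), and the nested-dictionary layout and the function name `classify` are assumptions chosen for brevity.

```python
# A decision tree as nested dicts: a decision node names the attribute it
# tests and maps each attribute value (a branch) to a subtree; a leaf node
# is simply a class label. This particular tree is hypothetical.
tree = {
    "attribute": "Own_house",            # root node
    "branches": {
        True: "Yes",                     # leaf node
        False: {
            "attribute": "Has_job",      # decision node (child of the root)
            "branches": {True: "Yes", False: "No"},
        },
    },
}

def classify(node, instance):
    """Follow branches from the root until a leaf (class label) is reached."""
    while isinstance(node, dict):
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node

# A made-up loan applicant: no house, but has a job.
applicant = {"Age": "young", "Has_job": True, "Own_house": False, "Credit_rating": "fair"}
print(classify(tree, applicant))  # prints Yes
```

Classifying an instance only requires walking one path from the root to a leaf, which is part of why decision trees are efficient to apply.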
More on Decision Support Systems Problem Solving – Decision Tree Induction First Potential Tree; Second Potential Tree
More on Decision Support Systems Problem Solving – Decision Tree Induction Mini-Application Exercise! 1. Using both trees presented in slide 15, predict the class of the following new instance, which describes a new loan applicant. 2. Which of the two trees was easier to use? 3. What knowledge about decision trees do we get from the answer to question 2?
More on Decision Support Systems • Problem Solving – Decision Tree Induction • We want to have a decision tree that is as small and accurate as possible because: • Smaller decision trees are more general • Tend to be more accurate • Easier to understand – important for humans to understand the classifier • E.g., In some medical applications, doctors want to understand the model that classifies whether a person has a particular disease. It is not satisfactory to simply produce a classification because without understanding why the decision is made, the doctor may not trust the system and/or does not gain useful knowledge
More on Decision Support Systems • Problem Solving – Decision Tree Induction: Challenges and Solutions • The algorithm recursively partitions the data until there is no impurity or there is no attribute left • May result in trees that are very deep, with many leaves covering very few training examples • This type of tree will have high accuracy when classifying the training dataset, but generally has low accuracy on unseen datasets – it does not generalize well • This is known as overfitting; a classifier f1 overfits the data if there is another classifier f2 such that f1 achieves a higher accuracy on the training data than f2, but a lower accuracy on the unseen test data than f2 • Usually caused by noise (wrong class values/labels and/or wrong values of attributes), but it may also be due to the complexity and randomness of the application domain
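The recursive partitioning described above can be sketched as follows. The slide does not say how impurity is measured or how the attribute to split on is chosen; this sketch assumes entropy as the impurity measure and picks the attribute with the lowest weighted entropy after the split. The data rows and the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(records, class_key="Class"):
    """Impurity of a set of records: zero when all share one class."""
    counts = Counter(r[class_key] for r in records)
    total = len(records)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def majority_class(records, class_key="Class"):
    return Counter(r[class_key] for r in records).most_common(1)[0][0]

def build_tree(records, attributes, class_key="Class"):
    """Recursively partition until a subset is pure or no attribute is left."""
    if entropy(records, class_key) == 0 or not attributes:
        return majority_class(records, class_key)  # leaf node

    def weighted_impurity(attr):
        groups = Counter(r[attr] for r in records)
        return sum(
            (n / len(records)) * entropy([r for r in records if r[attr] == v], class_key)
            for v, n in groups.items()
        )

    best = min(attributes, key=weighted_impurity)  # assumed selection rule
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {r[best] for r in records}:
        subset = [r for r in records if r[best] == value]
        branches[value] = build_tree(subset, remaining, class_key)
    return {"attribute": best, "branches": branches}

# Hypothetical training data, invented for illustration.
data = [
    {"Has_job": True, "Own_house": False, "Class": "Yes"},
    {"Has_job": False, "Own_house": True, "Class": "Yes"},
    {"Has_job": False, "Own_house": False, "Class": "No"},
    {"Has_job": True, "Own_house": True, "Class": "Yes"},
]
tree = build_tree(data, ["Has_job", "Own_house"])
```

Because the recursion only stops at pure subsets (or when attributes run out), a noisy label in the training data gets its own branch, which is exactly how the deep, overfitted trees mentioned above arise.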
More on Decision Support Systems • Problem Solving – Decision Tree Induction: Challenges and Solutions • To reduce overfitting, we prune the tree (perform pruning) – deleting some branches and replacing them with leaves of the majority classes • Two methods to perform pruning • Pre-pruning – Stops growing the tree earlier, before it perfectly classifies the training set • Shown to be more dangerous because it is not clear what will happen if the tree is extended further • Post-pruning – Allows the tree to perfectly classify the training set, and then prunes the result • Shown to be more effective because after the tree is extended to the fullest, it becomes clearer which branches may overfit the data • Idea is to estimate the error of each tree node; if the estimated error for a node is less than that of its extended branch, the branch is pruned
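A minimal sketch of the post-pruning idea follows. The slide does not specify how the error of a node is estimated; this sketch assumes it is counted on a held-out set of labelled records, and it assumes each decision node stores the majority class of its training examples under a hypothetical `"majority"` key. The tree and the records are invented for illustration.

```python
def classify(node, instance):
    """Walk from the given node down to a leaf (a class label)."""
    while isinstance(node, dict):
        node = node["branches"].get(instance[node["attribute"]], node["majority"])
    return node

def error_count(node, records):
    """Number of held-out records that the (sub)tree misclassifies."""
    return sum(1 for r in records if classify(node, r) != r["Class"])

def post_prune(node, records):
    """Bottom-up pruning: replace a branch with a majority-class leaf when
    the leaf's estimated error is less than that of the branch.
    Note: the tree is modified in place."""
    if not isinstance(node, dict):
        return node
    for value, child in list(node["branches"].items()):
        subset = [r for r in records if r[node["attribute"]] == value]
        node["branches"][value] = post_prune(child, subset)
    leaf = node["majority"]
    if error_count(leaf, records) < error_count(node, records):
        return leaf
    return node

# Hypothetical fully grown tree whose lower branch fits noise.
fully_grown = {
    "attribute": "Own_house", "majority": "Yes",
    "branches": {
        True: "Yes",
        False: {
            "attribute": "Credit_rating", "majority": "No",
            "branches": {"good": "Yes", "fair": "No"},
        },
    },
}
# Hypothetical held-out records used only to estimate errors.
held_out = [
    {"Own_house": True, "Credit_rating": "good", "Class": "Yes"},
    {"Own_house": False, "Credit_rating": "good", "Class": "No"},
    {"Own_house": False, "Credit_rating": "fair", "Class": "No"},
    {"Own_house": False, "Credit_rating": "good", "Class": "No"},
]
pruned = post_prune(fully_grown, held_out)
```

In this example the Credit_rating branch makes two errors on the held-out records while a plain "No" leaf makes none, so the branch is replaced by the leaf; the root is kept because replacing it would increase the error.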
More on Decision Support Systems • Problem Solving – K-Nearest Neighbour Learning (kNN) • A supervised learning technique • Uses a lazy learning method, as no model is learned from the training dataset, and learning only occurs when a test example needs to be classified • Extremely simple and yet effective in many applications, such as text classification • Algorithm (for a test instance d and training dataset D): • Compute the distance between d and every example in D • Choose the k examples in D that are nearest to d, and denote that set by P (⊆ D) • Assign d the class that is the most frequent class in P (or the majority class)
More on Decision Support Systems • Problem Solving – K-Nearest Neighbour Learning (kNN) • Algorithm variable explanations • D is the training dataset and nothing is done on the training examples • When a test instance d is presented, the algorithm compares d with every training example in D to compute the similarity or distance between them • The k most similar (closest) examples in D are then selected. This set of examples is called the k nearest neighbours of d • d then takes the most frequent class among the k nearest neighbours • k = 1 is usually not sufficient for determining the class of d due to noise and outliers in the data
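The three steps of the kNN algorithm can be sketched directly. The training points, the function names, and the choice of Euclidean distance as the default are assumptions made for this illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(D, d, k, distance=euclidean):
    """D: list of (vector, class) training examples; d: test vector."""
    # Step 1: compute the distance between d and every example in D.
    ranked = sorted(D, key=lambda example: distance(example[0], d))
    # Step 2: P is the set of the k nearest examples.
    P = ranked[:k]
    # Step 3: assign the most frequent class in P.
    return Counter(label for _, label in P).most_common(1)[0][0]

# Hypothetical training data: two numeric attributes and a class label.
D = [((1.0, 1.0), "No"), ((1.5, 2.0), "No"), ((5.0, 5.0), "Yes"),
     ((6.0, 5.5), "Yes"), ((5.5, 6.5), "Yes")]
print(knn_classify(D, (5.2, 5.1), k=3))  # prints Yes
```

Note that nothing happens until a test instance arrives: all the work is done inside `knn_classify`, which is what makes this a lazy learning method. In this sketch a tie between classes is broken in favour of the class of the nearer neighbour, which is a design choice rather than part of the algorithm as stated.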
More on Decision Support Systems • Problem Solving – K-Nearest Neighbour Learning (kNN) • Key component of a kNN algorithm is the distance/similarity function that is chosen based on the application and the nature of the data • Two popular choices: • Euclidean distance – For relational data • Cosine similarity – For text documents • The number of nearest neighbours k is usually determined by using a validation set, or through cross validation on the training data; a range of k values is tried, and the k value that gives the best accuracy on the validation set is selected
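The two distance/similarity functions and the selection of k can be sketched as below. The function names are hypothetical, and `choose_k` assumes the caller supplies a `classify(x, k)` function (for example, a kNN classifier bound to its training data) together with a validation set of (instance, class) pairs.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (e.g. term-count vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

def choose_k(candidates, classify, validation):
    """Try each candidate k and keep the one with the best validation accuracy."""
    def accuracy(k):
        return sum(classify(x, k) == y for x, y in validation) / len(validation)
    return max(candidates, key=accuracy)
```

Euclidean distance is smaller for more similar instances, whereas cosine similarity is larger, so a kNN implementation has to sort in the opposite direction when it switches from one to the other.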
More on Decision Support Systems Problem Solving – K-Nearest Neighbour Learning (kNN) Example and Mini-Application Exercise!
More on Decision Support Systems • Problem Solving – K-Nearest Neighbour Learning (kNN) • Positives • Simple concept • As strong and accurate as more elaborate methods • Very flexible • Able to work with arbitrarily shaped decision boundaries • Negatives • Slow at classification time because there is no model building and each test instance is compared with every training example at classification time • Does not produce an understandable model, so it is not applicable if an understandable model is required
More on Decision Support Systems • Unsupervised Learning – Basic Concepts • In unsupervised learning the classes are not known in advance, and the learning algorithm needs to generate them automatically • Clustering is the process of organizing data instances into groups whose members are similar in some way • A cluster is a collection of data instances which are similar to each other and are dissimilar to data instances in other clusters • Clustering is one of the most commonly used data analysis techniques when the data have no class attributes but there is still a need to find structure in them • This technique has a long history, being used in almost every domain
More on Decision Support Systems • Unsupervised Learning – Basic Concepts • Example and Mini-Application Exercise!
More on Decision Support Systems • Unsupervised Learning – Basic Concepts • Two main types of clustering exist: • Partitional – Dividing like data into similarity groups • Hierarchical – Organizing data into a hierarchical structure • Clustering requires a similarity function to measure how similar two data points are, or alternatively a distance function to measure the distance between two data points • Goal of clustering is to discover the natural grouping of the input data through the use of a clustering algorithm and a distance function
More on Decision Support Systems Unsupervised Learning – Basic Concepts Example for Partitional Clustering A company wants to conduct a marketing campaign to promote its products. The most effective strategy is to design a set of personalized marketing materials for each individual customer according to their profile and financial situation. However, this is too expensive for a large number of customers. At the other extreme, the company designs only one set of marketing materials to be used for all customers. This one-size-fits-all approach, however, may not be effective. The most cost-effective approach is to segment the customers into a small number of groups according to their similarities and design some targeted marketing materials for each group. This segmentation task is commonly done using clustering algorithms, which partition customers into similarity groups. Example for Hierarchical Clustering Every day, news agencies around the world generate a large number of news articles. If a website wants to collect these news articles to provide an integrated news service, it has to organize the collected articles according to some topic hierarchy. The question is: What should the topics be, and how should they be organized? One possibility is to employ a group of human editors to do the job. However, manual organization is costly and very time consuming, which makes it unsuitable for news and other time-sensitive information. Throwing all the news articles at the readers with no organization is clearly not an option. Although classification is able to classify news articles according to predefined topics, it is not applicable here because classification needs training data, which have to be manually labeled with topic classes. Since news topics change constantly and rapidly, the training data would need to change constantly as well, which is infeasible via manual labeling. Clustering is clearly a solution for this problem because it automatically groups a stream of news articles based on their content similarities.
More on Decision Support Systems • Problem Solving – K-Means Clustering • Best known partitional clustering algorithm • Most widely used among clustering algorithms because of its simplicity and efficiency • It iteratively partitions the data into k clusters based on a distance function
More on Decision Support Systems • Problem Solving – K-Means Clustering • Explanation of the algorithm: • One has a set of data points D = {x1, x2, …, xn}, where xi = (xi1, xi2, …, xir), and r is the number of attributes in the data • Partition the data into k clusters, where each cluster has a centroid: the central point of the cluster, computed as the mean of all data points in the cluster and used as its representative • At the beginning, the algorithm randomly selects k data points as the seed centroids, and computes the distance between each seed centroid and every data point • Each data point is assigned to the centroid that is closest to it • Once all data points are assigned, the centroid for each cluster is re-computed using the data points in the current cluster • Process is repeated until a stopping criterion is met
More on Decision Support Systems • Problem Solving – K-Means Clustering • Stopping criteria: • No re-assignments of data points to different clusters • No change of centroids • Minimum decrease in the sum of squared error (SSE) – a measure of the difference between the data and an estimation model; here, the sum of the squared distances between each data point and the centroid of its cluster • Example and Mini-Application Exercise!
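The k-means steps and the SSE measure can be sketched as follows. This is a minimal illustration, not a production implementation: it assumes Euclidean distance, uses "no re-assignments" as the stopping criterion, and simply leaves a centroid unchanged if its cluster becomes empty. The function names, the `seed` parameter, and the data points are all hypothetical.

```python
import math
import random

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(D, k, seed=0, max_iterations=100):
    rng = random.Random(seed)
    centroids = rng.sample(D, k)  # k random data points as seed centroids
    assignment = None
    for _ in range(max_iterations):
        # Assign each data point to the closest centroid.
        new_assignment = [
            min(range(k), key=lambda j: distance(x, centroids[j])) for x in D
        ]
        if new_assignment == assignment:  # no re-assignments: stop
            break
        assignment = new_assignment
        # Re-compute each centroid from the points currently in its cluster.
        for j in range(k):
            members = [x for x, a in zip(D, assignment) if a == j]
            if members:  # an empty cluster keeps its old centroid here
                centroids[j] = mean(members)
    return centroids, assignment

def sse(D, centroids, assignment):
    """Sum of squared distances between each point and its cluster centroid."""
    return sum(distance(x, centroids[a]) ** 2 for x, a in zip(D, assignment))

# Hypothetical data: two visibly separate groups of points.
D = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignment = k_means(D, k=2)
```

Running `k_means` with different `seed` values is a simple way to observe how the choice of seed centroids affects the result, and comparing the resulting `sse` values gives a way to pick the better of two runs.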
More on Decision Support Systems Problem Solving – K-Means Clustering Example and Mini-Application Exercise!
More on Decision Support Systems • Problem Solving – K-Means Clustering • Positives • Simplicity and efficiency • Easy to understand and implement • Its running time is linear in the number of data points, so it is considered relatively fast • No clear evidence that any other clustering algorithm performs better • Negatives • Empty clusters can be formed during the clustering process when no data point is assigned to a centroid • To solve, choose another data point as the centroid so that the cluster is no longer empty • User needs to specify the number of clusters k in advance • Sensitive to outliers, i.e., data points that are very far away from other data points
More on Decision Support Systems • Problem Solving – K-Means Clustering • Negatives • Sensitive to initial seeds as different seeds may result in different clusters that are not optimal • Example and Mini-Application Exercise!
More on Decision Support Systems • Problem Solving – K-Means Clustering • Negatives • Sensitive to initial seeds as different seeds may result in different clusters that are not optimal • Another Example and Mini-Application Exercise!
More on Decision Support Systems • Hierarchical Clustering • Two main types: • Agglomerative (bottom up) – Builds the hierarchy from the bottom level, and merges the most similar (or nearest) pair of clusters at each level to go one level up. Process continues until all data points are merged into a single cluster (i.e., the root cluster) • More popular • Divisive (top down) – Starts with all data points in one cluster (the root), which is then split into a set of child clusters. Each child cluster is recursively divided further until each cluster contains only a single point • Less popular
More on Decision Support Systems • Hierarchical Clustering • Unlike the k-means algorithm, which uses only the centroids in distance computation, hierarchical clustering may use any one of several methods to determine the distance between two clusters: • Single-Link – The distance between two clusters is the distance between their two nearest data points (the minimum pair-wise distance); the two clusters with the smallest such distance are merged • Complete-Link – The distance between two clusters is the distance between their two furthest data points (the maximum pair-wise distance); the two clusters with the smallest such distance are merged • Average-Link – The distance between two clusters is the average of all pair-wise distances between the data points in the two clusters • Strengths • Able to take any form of distance or similarity function • The hierarchy of clusters enables the user to explore clusters at any level of detail
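The three cluster-distance methods, and the agglomerative (bottom-up) procedure that uses them, can be sketched as below. Euclidean distance between points is an assumption, as are the function names; the code favours clarity over efficiency and recomputes all pair-wise distances at every level.

```python
import math
from itertools import product

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_link(c1, c2):
    """Distance between the two nearest points of the two clusters."""
    return min(distance(a, b) for a, b in product(c1, c2))

def complete_link(c1, c2):
    """Distance between the two furthest points of the two clusters."""
    return max(distance(a, b) for a, b in product(c1, c2))

def average_link(c1, c2):
    """Average of all pair-wise distances between the two clusters."""
    return sum(distance(a, b) for a, b in product(c1, c2)) / (len(c1) * len(c2))

def agglomerative(points, linkage=single_link):
    """Merge the closest pair of clusters until one root cluster remains.
    Returns the merges in the order in which they happened."""
    clusters = [[p] for p in points]  # bottom level: one cluster per point
    merges = []
    while len(clusters) > 1:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return merges

# Hypothetical points: two close together and one far away.
merges = agglomerative([(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)])
```

The list of merges is the hierarchy read from the bottom up; cutting it off after any number of merges gives the clusters at that level of detail.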
More on Decision Support Systems • Hierarchical Clustering • Strengths • Agglomerative hierarchical clustering often produces better clusters than the k-means method • Agglomerative hierarchical clustering can find clusters of arbitrary shapes • Weaknesses • Single-link method may suffer from the chain effect: it is sensitive to noise in the data and can produce straggly clusters • Complete-link method is sensitive to outliers • Time complexity and space requirements are at least quadratic in the number of data points; very inefficient and not practical for large datasets
More on Decision Support Systems • Programmed vs. Nonprogrammed Decisions • In the choice stage, one factor that influences the decision maker’s selection is whether the decision can be programmed • Programmed decisions – Made using a rule, procedure, or quantitative method • Easy to computerize • Improves forecasting accuracy and reduces the possibility of manufacturing the wrong types of inventory, saving money • E.g., To say that inventory should be ordered when inventory levels drop to 100 units is a programmed decision because it adheres to a rule • Nonprogrammed decisions – Deal with unusual situations that are difficult to quantify • Examples • Determining the appropriate training program for a new employee • Deciding whether to develop a new product line • Weighing the benefits and drawbacks of installing an upgraded pollution control system
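The inventory example shows why programmed decisions are easy to computerize: the whole decision fits in one rule. A minimal sketch follows; the 100-unit threshold comes from the slide, while the function and constant names are hypothetical.

```python
REORDER_POINT = 100  # units, taken from the slide's example

def should_reorder(inventory_level, reorder_point=REORDER_POINT):
    """Programmed decision: a fixed rule that needs no judgment."""
    return inventory_level <= reorder_point

print(should_reorder(250))  # prints False
print(should_reorder(100))  # prints True
```

A nonprogrammed decision, such as whether to develop a new product line, has no comparable rule that could be written down in advance.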
More on Decision Support Systems • Optimization, Satisficing, and Heuristic Approaches • DSS can either optimize or satisfice • Optimization model – Finds the best solution, usually the one that will best help the organization meet its goals • E.g., Finding the best route to ship products to markets • Uses problem constraints; e.g., a limited number of available work hours in a manufacturing facility • Satisficing model – Finds a good, but not necessarily the best solution to a problem • Used when modeling the problem properly to get an optimal decision would be too difficult, complex, or costly
More on Decision Support Systems • Optimization, Satisficing, and Heuristic Approaches • Satisficing model – Finds a good, but not necessarily the best solution to a problem (continued!) • Does not look at all possible solutions but only at those likely to give good results • Heuristics – Commonly accepted guidelines or procedures that usually find a good solution • E.g., Ordering four months’ supply of inventory when the inventory level drops to 20 units or less; although this might not minimize total inventory costs, it can serve as a good rule to avoid shortages without maintaining excessive inventory
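The difference between optimizing and using a heuristic can be seen on the shipping-route example. The sketch below compares an exhaustive search over all round trips with a nearest-neighbour rule of thumb. The distance table is invented for illustration, and the nearest-neighbour rule is just one possible heuristic, not one named in the lecture.

```python
from itertools import permutations

# Hypothetical symmetric distances between a depot (0) and three markets.
DIST = [
    [0, 1, 3, 3],
    [1, 0, 1, 4],
    [3, 1, 0, 10],
    [3, 4, 10, 0],
]

def route_length(route):
    return sum(DIST[a][b] for a, b in zip(route, route[1:]))

def optimal_route(start=0):
    """Optimization: examine every possible ordering and keep the shortest."""
    others = [i for i in range(len(DIST)) if i != start]
    return min(((start, *p, start) for p in permutations(others)), key=route_length)

def nearest_neighbour_route(start=0):
    """Heuristic: always travel to the closest market not yet visited."""
    route, unvisited = [start], set(range(len(DIST))) - {start}
    while unvisited:
        nxt = min(unvisited, key=lambda j: DIST[route[-1]][j])
        route.append(nxt)
        unvisited.remove(nxt)
    return tuple(route + [start])

print(route_length(optimal_route()))            # prints 11
print(route_length(nearest_neighbour_route()))  # prints 15
```

The heuristic finds a usable route without examining every ordering, but here it is longer than the optimum. With only three markets the exhaustive search is trivial; the number of orderings grows factorially with the number of markets, which is when a good-enough answer becomes attractive.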
More on Decision Support Systems • Capabilities of a DSS • Support for problem-solving phases, because most DSS are designed to assist decision makers with the phases of problem solving discussed earlier • Support for various decision frequencies, ranging from ad-hoc decisions (one-of-a-kind decisions made on demand) to repetitive decisions made on a regular basis • Support for various problem structures, such as structured problems that are straightforward, requiring known facts and relationships, and semi-structured and unstructured problems, where the relationships among data are not always clear • Support for various decision-making levels, such as upper and lower management
More on Decision Support Systems DSS vs. MIS
More on Decision Support Systems • DSS Components • The database – The DBMS allows managers and decision makers to perform qualitative analysis on the company’s data • Example of Qualitative Analysis • A hospital has a large amount of patient data. This kind of analysis would go through this data, analyze it, and produce accurate predictions of when people would need to be admitted to the hospital for medical procedures and care. • The model base – Allows managers and decision makers to perform quantitative analysis on both internal and external data • Used to turn data into future products, services, and profits • The user interface/The dialogue manager – Used by the user to interact with the DSS
More on Artificial Intelligence Systems • Artificial intelligence – The ability of computer-based information systems (CBIS) to mimic or duplicate the functions of the human brain • AI has come a long way: • IBM developed an AI system named Watson that was able to soundly defeat two prior champions of the TV game show Jeopardy! • Able to process human speech, search vast databases for possible answers, and reply in a human voice • Use of AI can improve daily lives. E.g., Doctors could use AI to make faster, more accurate diagnoses for patients; medical researchers could use AI to make medical breakthroughs
More on Artificial Intelligence Systems • AI exhibits intelligent behaviour • Learning from experience and applying the knowledge acquired from experience • Handling complex situations • Solving problems when important information is missing • Determining what is important • Reacting quickly and correctly to a new situation • Understanding visual images • Processing and manipulating symbols • Being creative and imaginative • Using heuristics (rules of thumb arising from experience)
More on Artificial Intelligence Systems • Major Branches of AI • Expert systems – Explained in Lecture 1 • Robotics – Mechanical or computer devices that can paint cars, make precision welds, and perform other tasks that require a high degree of precision or are tedious or hazardous for human beings • Vision systems – Hardware and software that permit computers to capture, store, and manipulate visual images • Natural language processing and voice recognition – Computers understanding and reacting to statements and commands made in a “natural” language, such as English • Learning systems – Combination of software and hardware that allows a computer to change how it functions or how it reacts to situations based on feedback it receives • Neural networks – Computer system that can act like or simulate the functioning of a human brain
Application Exercises Describe problem solving and all its stages What is the difference between supervised learning and unsupervised learning? Compare and contrast a supervised learning method with an unsupervised learning method What is the difference between programmed and nonprogrammed decisions? What is the difference between the optimization, satisficing, and heuristic approaches? What is the main difference between a DSS and an MIS? What do AI systems have to do in order to be considered a true AI system? Describe the branches of AI
The End Questions? Lecture 2 ADMS 3020 - Canadian Business Culture and Management Skills Prof. Dawid Kasperowicz http://www.yorku.ca/dkasper