Sybase Update • Joe Shaffner • Regional Technical Manager • jschaff@sybase.com
Agenda • The Challenges of Data Warehousing • Sybase’s Approach • Recent Proof Points • Customer Examples / Success Stories
Agenda • Introduction • Traditional Approaches to Data Warehousing • Sybase Approach • Data Archiving
Data to Information to Knowledge • There is no shortage of information but there is a shortage of useful information • Data • enables an enterprise to record an event • Information • enables an enterprise to respond to an event • Knowledge • enables an enterprise to anticipate an event
The Dimensions of Liquidity • Flow • “The data is there, I just can’t access it, can’t get to it.” • “The data is locked up in silos.” • Quality • “I don’t always have the right data to make a decision.” • “The data isn’t presented in the right context that I can use.” • Speed • “I need the information in minutes, not months.” • “I have to wait until I’m back in the office to get data.”
BI System Purposes and Functions (DM Review, April 2001) • More ad hoc use than ever!
Why Do You Need to Do Data Warehousing on the Fly? • Saving money and time is what it’s all about now • Decision cycles have compressed significantly • Need answers now, not tomorrow • Each answer creates a new question • Questions are ad hoc by nature • Project timelines to build a data warehouse have been reduced
Lots of Queries, Lots of Users, Lots of Data

-- Query displays service types by month
-- Valid values are 1-12 for month_key
select month_key, service_key, count(*)
from telco_facts
where month_key = 1
group by month_key, service_key ;

-- A look at customers who have the following service:
-- call waiting, caller id, and voice mail by fiscal period,
-- i.e. Q1, Q2, Q3, Q4 for year = 1998
select service_key, fiscal_period, count(*)
from telco_facts T, month M
where T.month_key = M.month_key
and service_key = 4
group by fiscal_period, service_key
order by fiscal_period, service_key ;

-- Female customers in Massachusetts that do not have caller id
select distinct(C.customer_key), C.customer_first_name,
       C.customer_last_name, C.phone_number
from residential_customer C, service, telco_facts
where C.customer_key = telco_facts.customer_key
and telco_facts.service_key = service.service_key
and caller_id_flag = 'N'
and state = 'MA'
and customer_gender = 'F' ;

-- Find prospects for voice mail based on the criteria
-- that customers with call waiting and caller id are
-- good prospects for voice mail
select state, count(*)
from telco_facts T, service S, residential_customer C, month M
where T.service_key = S.service_key
and T.customer_key = C.customer_key
and T.month_key = M.month_key  -- source read T.month_key = T.month_key, which never joins month
and call_waiting_flag = 'Y'
and caller_id_flag = 'Y'
and voice_mail_flag = 'N'
and state in ('NY','NJ','PA')
and fiscal_period = 'Q1'
group by state ;

-- Find customers that had ISDN service in February 1998
select customer_last_name, customer_first_name
from residential_customer R, telco_facts T, service S, month M
where M.month_text = 'February '
and M.year = 1998
and S.isdn_flag = 'Y'
and M.month_key = T.month_key
and S.service_key = T.service_key
and R.customer_key = T.customer_key ;

-- Look at the local call minutes to see if they
-- have increased after adding call waiting
select fiscal_period, count(*), sum(local_call_minutes)
from residential_customer R, telco_facts T, status S, month M
where S.call_waiting_status = 'Added'
and state = 'OH'
and M.month_key = T.month_key
and S.status_key = T.service_key
and R.customer_key = T.customer_key
group by fiscal_period
order by fiscal_period ;

-- Look at the call usage for customers with call waiting
-- ("service type 2") compared with customers with both call
-- waiting and caller id, for Q4, for customers in CA
select fiscal_period, T.service_key, sum(local_call_minutes),
       sum(local_call_count), count(*)
from telco_facts T, residential_customer C, service S, month M
where T.customer_key = C.customer_key
and T.service_key = S.service_key
and T.month_key = M.month_key
and fiscal_period = 'Q4'
and T.service_key in (02,03)
and state = 'CA'
group by fiscal_period, T.service_key ;
Traditional Row Based DBMS • Designed to support online transaction processing (OLTP) • Good at: • Getting data in quickly • Assuring referential integrity • Not good at: • Getting data out quickly • Supporting ad hoc queries • Requires table scans • Storing data efficiently • Requires many indexes • May require pre-aggregations • Have been retrofitted to support data warehouses • Still run into the same “old” limitations • Examples: MSSQL, Sybase ASE, Oracle 8.1, DB2
Traditional Data Warehouse Databases • Were designed for Data Warehousing • Are good at: • Ad hoc analysis • Supporting many users • Loading data • But are: • Challenging to implement • Costly to maintain • Expensive to own • May not scale well as users are added • Examples: Red Brick, Teradata
ASIQ – What is it? • Adaptive Server IQ is a Relational DB System • developed in 1993 • employs patented bitwise indexes for fast query response • Adaptive Server IQ was designed specifically for Data Warehousing • Has the look and feel of a typical RDBMS • But under the covers it bears little resemblance • 2,500 Customers Worldwide • Mission Critical applications • VLDB • IQ customers grew by 125% last year
Traditional Solution - Parallelism • Example query: calculate average sales for “A” stores in New York • 800 bytes per row • 16K page size • 20 million rows • 500,000 I/Os needed • Parallelism won’t solve this problem • More hardware, slightly faster • Only 1 query can run at a time • Very expensive and inflexible to ad hoc queries
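The I/O count on this slide can be approximated with back-of-the-envelope arithmetic. The sketch below assumes a full table scan, rows that never span pages, and a 16K page of 16,384 bytes. The slide does not say how its 500,000 figure was derived, so the `pages_per_io=2` setting that reproduces it is purely an assumption, and `scan_io_count` is a hypothetical helper.

```python
import math

def scan_io_count(row_bytes, page_bytes, row_count, pages_per_io=1):
    """Estimate the I/Os needed to scan a row-store table end to end."""
    # Whole rows that fit on one page (assumes rows never span pages).
    rows_per_page = page_bytes // row_bytes
    # Pages needed to hold every row.
    pages = math.ceil(row_count / rows_per_page)
    # Each I/O may fetch more than one page (assumption, not from the slide).
    return math.ceil(pages / pages_per_io)

# Figures from the slide: 800-byte rows, 16K pages, 20 million rows.
pages_to_scan = scan_io_count(800, 16 * 1024, 20_000_000)
ios_two_page_reads = scan_io_count(800, 16 * 1024, 20_000_000, pages_per_io=2)

print(pages_to_scan, ios_two_page_reads)
```

Whichever I/O size is assumed, the scan still touches every row of the table, which is consistent with the slide's point that more hardware makes the query only slightly faster.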
ASIQ Architectural Summary • Diagram labels: Read Full Row; Read Relevant Columns; Read Bitmap • Bitmap structures are built on all fields • Bitmaps further reduce the amount of data read • Small number of bits rather than entire field • ANDing and ORing bitmaps is very efficient with today’s processors • Note that even vertically stored data is not read • Data structure storage and manipulation is transparent to applications and administrators. New values are handled automatically.
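A minimal sketch of how ANDing bitmaps can answer a query such as "average sales for 'A' stores in New York" without reading full rows. It uses Python integers as bitmaps over made-up data; the column names, values, and helper functions are all hypothetical, and it illustrates the general bitmap-index idea rather than ASIQ's patented internal structures.

```python
from collections import defaultdict

def build_bitmaps(values):
    """Map each distinct value to an int bitmap; bit i is set when row i holds that value."""
    bitmaps = defaultdict(int)
    for row_id, value in enumerate(values):
        bitmaps[value] |= 1 << row_id
    return dict(bitmaps)

def matching_rows(bitmap):
    """Yield the row ids whose bit is set in the bitmap."""
    row_id = 0
    while bitmap:
        if bitmap & 1:
            yield row_id
        bitmap >>= 1
        row_id += 1

# Hypothetical column-wise data for six stores.
store_class = ["A", "B", "A", "C", "A", "B"]
state = ["NY", "NY", "CA", "NY", "NY", "OH"]
sales = [100.0, 250.0, 80.0, 40.0, 300.0, 90.0]

class_bitmaps = build_bitmaps(store_class)
state_bitmaps = build_bitmaps(state)

# One AND over two small bitmaps replaces a scan of every full row.
hits = class_bitmaps["A"] & state_bitmaps["NY"]
rows = list(matching_rows(hits))

# Only the sales column is touched, and only for the matching rows.
average_sales = sum(sales[r] for r in rows) / len(rows)
print(rows, average_sales)
```

The predicate columns are never read as values here: the filter is resolved entirely on bits, and the sales column is read only for the surviving rows.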
Distinctive Advantages of IQ • Storage Efficiency • 20-50% of raw data vs. 300-900% • Query Efficiency • 10-100x faster than traditional RDBMS • Database Loading • Load & Index while reading • Scalability • Billions of rows, thousands of users, hundreds of nodes • 3 million internet users • Disk Input-Output Efficiency • 60x less I/O than traditional DBMS • Simple to Administer • DBA load is 25% of traditional DBMS • Any schema (Multidimensional, etc.) IQ Data Store “IQ deserves the attention of ANYONE evaluating data warehouse DBMS options.” Rich Finkelstein, Performance Computing Inc.
Gartner Measurement - Amount of Detailed Data: Managing Large Amounts of Detailed Data
INPUT DATA: 1 TB (source: flat files, ETL, replication, ODS)
Conventional DBMS after LOAD: base table (“raw data”, no indexes) 0.9-1.1 TB; indexes 0.5-3 TB; summaries/aggregates 1-2 TB; total 2.4-6 TB
IQ Multiplex after LOAD: base table (FP) 0.2-0.5 TB; indexes 0.05-0.3 TB; aggregates/summaries 0-0.1 TB; total 0.25-0.9 TB
Same input data: the “conventional DW” is 3x-10x larger than the IQ-M DW
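The totals in this figure can be cross-checked by adding up the component ranges. The sketch below assumes each labeled range belongs to the component named next to it in the figure (base table, indexes, summaries) for 1 TB of input data; the dictionary names and `total_range` are illustrative only.

```python
# Component size ranges in TB as (low, high), read off the figure for 1 TB of input.
conventional = {
    "base table (raw data)": (0.9, 1.1),
    "indexes": (0.5, 3.0),
    "summaries/aggregates": (1.0, 2.0),
}
iq_multiplex = {
    "base table (FP)": (0.2, 0.5),
    "indexes": (0.05, 0.3),
    "aggregates/summaries": (0.0, 0.1),
}

def total_range(components):
    """Sum the low ends and the high ends of every component range."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    # Round to two decimals to hide floating-point noise.
    return round(low, 2), round(high, 2)

conventional_total = total_range(conventional)
iq_total = total_range(iq_multiplex)
print(conventional_total, iq_total)
```

Under that reading the conventional components sum to 2.4-6.1 TB, which the figure appears to round to 2.4-6 TB, and the IQ Multiplex components sum to the stated 0.25-0.9 TB.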
Elements of TCO for DW “Over a 5 year period the cost of managing a data warehouse is typically 3X the initial budget.” Hardware consumes a large portion of the budget Source: Meta Group 2/98
ASIQ Multiplex – What Is It? • Version of ASIQ • Purpose is to Extend ASIQ Scalability • Extends Single Database Access across Multiple Computer Nodes • Allows Mixing of Unix and NT Nodes • NOT an MPP Solution • Much Simpler Implementation • Much Simpler Management
You build your Warehouse … It’s successful! It’s very successful! It’s too successful! At some point you hit the wall with performance. (Diagram label: IQ)
IQ Multiplex Configuration • Each IQ node has its own: CPUs, local temp space (disk), memory, and catalog • All data and indexes are stored in the shared IQ database, which is on Fibre Channel or EMC-type storage systems • Individual nodes can have different configurations (CPUs, memory, disk) • Diagram: several Unix servers, each running IQ with IQ Multiplex functions, attached to the shared IQ database
Scaling to More Users or More Data • Diagram labels: Write Node; Read Only Nodes; Unix servers running IQ with IQ Multiplex functions; VLM Unix Server • No data redistribution • No change in schema • Replication of catalog for logins • Very little I/O contention (1/10 of Oracle Parallel Server)
Adaptive Server IQ Multiplex: Exclusive Prepackaged Scalability • Figure labels: VLDB, Enterprise, Entrepreneurial, Web Portals; Adaptive Server IQ Multiplex; 1000’s of Users, 100’s of Users, 10’s of Users
Adaptive Server IQ Index Types: • Low_Fast (LF): ideal for columns that have a very low number of unique values (under 1000) • Example: Gender, Yes/No, State • High_Group (HG): typically used to process equality and GROUP BY operations on high-cardinality data (recommended for more than 1000 distinct values) • High_Non_Group (HNG): add an HNG index when you need to do range searches, the number of unique values is high (greater than 1000), and you don't need to do GROUP BY on the column • Compare (CMP): an index on the relationship between two columns
Adaptive Server IQ Index Types: • Containment (WD): stores the words from a column string of CHAR or VARCHAR data • Use a WD index for the fastest access to columns that contain a list of keywords (for example, in a bibliographic record or Web page) • Date (DATE), Time (TIME), and Datetime (DTTM): three index types used to process queries involving date, time, or datetime quantities • A DATE index is used on columns of data type DATE to process certain queries involving date quantities • A TIME index is used on columns of data type TIME to process certain queries involving time quantities • A DTTM index is used on columns of data type DATETIME or TIMESTAMP to process certain queries involving datetime quantities
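The selection rules on these two index slides can be summarized as a small decision function. This is a sketch of the slides' rules of thumb only: `suggest_index` and its parameters are hypothetical, the treatment of a column with exactly 1000 distinct values is an assumption (the slides say "under 1000" and "more than 1000"), and the two-column CMP index is left out because it does not describe a single column.

```python
def suggest_index(data_type, distinct_values, needs_group_by=False,
                  needs_range=False, keyword_search=False):
    """Return the index type the slides' rules of thumb point to for one column."""
    data_type = data_type.upper()
    # Date and time columns get their dedicated index types.
    if data_type == "DATE":
        return "DATE"
    if data_type == "TIME":
        return "TIME"
    if data_type in ("DATETIME", "TIMESTAMP"):
        return "DTTM"
    # Keyword lists stored in character columns: Containment index.
    if keyword_search and data_type in ("CHAR", "VARCHAR"):
        return "WD"
    # Very low cardinality (under 1000 unique values): Low_Fast.
    if distinct_values < 1000:
        return "LF"
    # High cardinality, range searches, no GROUP BY: High_Non_Group.
    if needs_range and not needs_group_by:
        return "HNG"
    # High cardinality with equality or GROUP BY: High_Group.
    # Assumption: exactly 1000 distinct values is treated as high cardinality.
    return "HG"

print(suggest_index("CHAR", 2))                          # e.g. a gender column
print(suggest_index("INTEGER", 5_000_000, needs_group_by=True))
print(suggest_index("NUMERIC", 250_000, needs_range=True))
```

A real design would often place more than one index on a column, so the single return value is a simplification chosen to keep the rules readable.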
IQ Hardware and Software Partners • PLATFORMS: Sun Solaris, HP-UX, IBM AIX (Simplex), NT, Linux 32 bit • WEB ANALYSIS APPL.: Compudigm, Web Trends, Informatica • CRM ANALYSIS APPL.: Industry Warehouse Studio * • ADVANCED VISUALIZATION: Compudigm • ETL / DATA MOVEMENT: Informatica *, Sybase Replication Server • WEB REPORTING APPL.: Actuate e.Reporting Suite
IQ Hardware and Software Partners • ANALYSIS TOOLS: Business Objects, Cognos, Brio, Micro Strategy, Easy Ask, Whitelight, Hummingbird • ANALYSIS TOOLS: MS Access, SAS, Group 1, MineSet (SGI), PowerDesigner *, Warehouse Control Center *, PowerBuilder *
OLAP Tools And IQ: A Beautiful Thing • Demonstrates IQ well • OLAP Tools do ad hoc queries • OLAP Tools can bring a traditional RDBMS to its knees. • OLAP tools can be used by large numbers of people • The success of a Data Warehouse is largely determined by its acceptance by end users • If the OLAP tool becomes popular • More data will be requested • More users
References: Amount of Detailed Data Nielsen Media Research Business Issues • Leader in TV ratings business • 5 to 10 years of TV viewer history • Cost and logistics becoming costly • Need same data on multiple databases to scale • Difficult to deliver new services to customers “The big advantage of the Sun Sybase Reference Architecture is that it provides the advance knowledge that this solution will work.” “We are able to deliver one data warehouse for all our applications, at one-third the storage of conventional technologies, while seeing performance gains as advertised with IQ Multiplex.” Kim Ross CIO Nielsen Media Research Results • Sun/Sybase delivered Reference Architecture • 35 TB benchmark and best practices guide • 12 TB detailed input data in production • Fast access and data load • Linear scalability to 108 CPUs • Architected for 100s of TB on Hitachi SNA
References: Query Complexity U.S. DOT: Bureau of Transportation Statistics Business Issues • Congressional legislation required a consolidated, single point of access to all transportation statistics • Needed to deliver over the Web • More than 250 databases of source data Sybase IQ reduced loading and indexing from 30 minutes to 2.5 to 3 minutes. Query speeds were 20 to 50 times faster than Oracle. Time to add a column was reduced from 4 hours with Oracle to 15 minutes with IQ. Jeff Butler, Assistant Director, Office of Statistical Computing, Department of Transportation Bureau of Transportation Statistics Results • 2.5 TB of detailed input data compressed to 1 TB • Query complexity with 18-way joins • Reduced data gathering time • Easy linkages across many data sets allow new insights on transportation safety • The new website is aimed at transportation researchers and analysts • Website gets 15,000 hits per day
BizRate gains Economical Data Management with Sun and Sybase Business Issues • Delivers analysis of internet utilization • Leading online customer survey producer • Cost and logistics becoming unwieldy • Microsoft SQL Server could not scale • Simplify data delivery and analysis for sellers Business Results • Manage 15 million customer data sets • Tight integration through Reference Architecture • Scalable solution that will grow • Delivered on Sun Fire V880 Together, Sun and Sybase have created a solution that packs an extraordinary amount of data processing and analytical power into a small footprint that represents a realistic investment for small and mid-sized firms. Sybase’s tight architectural integration with Sun technology provides us with the assurance we need that the technology foundation of our data warehouse will scale to meet our growing needs in the future. Henri Asseily, Chief Technology Officer and Founder of BizRate.com
References: Amount of Detailed Data & Query Complexity Internal Revenue Service Business Issues • Analysis virtually impossible • Lost productivity • Loss of potential billions in revenue • VLDB management “The primary technology challenge was to build a system that could manage such large volumes of data and yet was sufficiently open to facilitate queries from various off-the-shelf products. We selected Sybase ASIQ as the data-management server, based on its strength with decision support type queries.” Jeff Kmonk, Manager, Office of Research, Compliance Research Division, Internal Revenue Service Results • 10+ TB detailed input data (10 yrs of taxpayer records) fits in 5 TB of storage • Query complexity with 14-way joins • Average 120 ad-hoc analysis users • Modeled entire population of commercial tax returns • Supports advanced analysis like data mining • Revenue protection & fraud detection • ROI of $250 Million • Portal-enabled
References: Amount of Data and # of Concurrent Users NC: Dept. of Health and Human Services Business Issues • System required to provide information for federally mandated reports. Also used for fraud detection for USDA and food stamp programs Non-disclosure reference Results • Now serving 1,200 users on 4.5TB, to grow to 28TB • Recently used to uncover $3.5 million in Medicaid claims saving • Approx. $18 million in storage savings
References: Amount of Detailed Data & Number of Users & Complex Data Models American Express Global Fraud Detection Business Issues • Unable to perform advanced analysis of fraud patterns for credit card transactions with competing solutions due to performance issues, query complexity limitations • Needed solutions to handle 700-column table to describe every transaction, at least one year of transactions online Non-disclosure reference Results • Advanced fraud analysis possible for last 4 years • Over 1,600 users worldwide • 6TB of input data; 10 billion records (last 13 months) of credit card transactions online • All fraud managers worldwide use IQ Multiplex system in AZ • Over 90% of database is fraud detection information
References: Query Complexity Bank of Montreal Business Issues • Identify and retain most profitable customers • Increase effectiveness of marketing programs • Attract new customers • Access to multiple information systems and “touch points” • Cutting edge technology and architecture “We felt you should develop the data warehouse component by component because that allows you to apply what you learn.” “Sybase was truly committed to ensuring that we used technology in a way that really impacted the business.” Carl A. Touchie Sr. Manager Electronic Financial Services Bank of Montreal Results • 1 Terabyte data warehouse • Avg. query complexity with 18-way joins • IRR over 100% • Average credit card volume up 59% • Average credit card balances up 129% • Market share up 60 basis points • System up in 4 months • Component architecture enables flexibility
Distinctive Advantages of IQ • Query Efficiency • 10-100x faster than traditional RDBMS • Database Loading • 600+ million records/300+ GB of data per week, 2+ billion rows a month • Scalability • Billions of rows, thousands of users, hundreds of nodes • 3 million internet users • Storage Efficiency • 20-70% of raw data vs. 300-900% • Disk Input-Output Efficiency • 60x less I/O than traditional DBMS • Simple to Administer • DBA load is 25% of traditional DBMS • Any schema (Multidimensional, etc.)
Sybase IQ Proof of Concept Overview for Large Southern Bank, October 2nd, 2003
Your Requirements • Reduce EMC Storage costs • Simple and fast implementation • Faster query speeds
Who is using IQ today for Archiving? • Nielsen Media: Loaded 10 years of detailed TV viewer patterns from mainframe archive. • http://syberstatic.sybase.com/bid/pdfsforweb/nielsen_research.pdf • U.S. Internal Revenue Service: Keeps years’ worth of all U.S. tax returns on disk. • http://syberstatic.sybase.com/bid/pdfsforweb/iq_ss_l01105.pdf • EMI Music Germany: Stores 10 years of historical data and delivers responses to ad-hoc queries within seconds. • http://syberstatic.sybase.com/bid/pdfsforweb/emi_ss.pdf • North Carolina Department of Health and Human Services: Serving 1,200 users • 2 TB of input/raw data, IQ-M: 1.5 TB • Recently used to uncover $3.5 million in Medicaid claims saving • Approx. $18 million in storage savings
TIME LINE
DATE: AUGUST; SEPTEMBER 3rd, 8th, 15th, 16th; OCTOBER 1st, 2nd
TASK: Initial IQ Presentation; Scoping Meeting; Pre POC Preparations; Installation, configuration, load; Load completion; Tabulation of Results; (Two Days!); Complete data movement; Run sample queries; Presentation (TODAY)
Stumbling Blocks • Dr Watson prevented final table load on day 1 • Special characters in data found in extract prevented load completion
Current Environment • Diagram labels: ARCHIVE; SOURCE; BCP; BCP
POC Environment • Diagram labels: ARCHIVE or FROM ASE; BCP; ODBC; JDBC; OPEN CL; VIEW CREATED; IQ Clients • SYBASE IQ server: GOOBER, Sun E 6500, 20-way, 7GB RAM • Tables: P3Tradsssss_2000 (16M Rows); Xxx History_2001 (16M Rows); P3ssssss_2002; Ssss _hist_Archive (89M Rows)
Storage Costs Analysis Ref Wachovia Cost Analysis.xls
Your Requirements…… The Results • Reduce EMC Storage costs = demonstrated • Simple and fast implementation = three days for four tables • Faster query speeds = all