New NYC Business Incorporation 2005-2013: An Exploration of Non-Minority and Minority-Owned Enterprise Creation
By Shelby Ahern (stahern@gmail.com)
NYC Data Science Academy Student Demo Day, 07-21-2014
R005: Data Science by R (Beginner level)
Objective
• Explore
  • New Business Incorporation in NYC between 2005 and 2013, and
  • New Business Incorporation, by Minority and Non-Minority Ownership
• Data Sources
  • Active New York Corporations: Beginning in 1800 [1]
    • Entity Type:
      • Domestic Business Corporation
      • Domestic Cooperative Corporation
      • Domestic Professional Corporation
  • NYC Online Directory of Certified Businesses: Minority-Owned Business Enterprises (MBE) [2, 3]
  • U.S. Census Population Estimates [4]
• Parameters and Notes
  • 2005-2013 (9 years)
  • Borough = County (i.e., Manhattan = New York County, Brooklyn = Kings County, Queens = Queens County, Bronx = Bronx County, Staten Island = Richmond County)
Process/Methodology
• Create Data Frames of Data from Each Source
• Run Summary Statistics for Validation
• Split by Borough and Combine Data Frames from Different Sources
• Perform Calculations, e.g., New Incorporations per Capita
• Data Viz!
• Test: “Density” of New MBE Corps per Minority Population ≠ “Density” of New Non-MBE Corps per Non-Minority Population
Initial Exploration: Corporation Filings
An initial review of the summaries of the corporation data and the MBE-certified corporations shows:
• A major disparity between the number of incorporations per year and the number of MBEs established in that year.
• Why?
  • Data quality: change in ownership structure, restrictions on MBE certification, and/or filing lag
  • Key point: what the data actually represent, given the purpose and process of the MBE application
Collating and Calculating Data
After merging data from the different data frames, we can calculate the number of new corporations filed per capita on a yearly basis. Further, we calculate the number of new corporations filed per capita for specific populations: MBEs per Minority population and Non-MBEs per Non-Minority population.
Figure: Example Data Frame, Manhattan
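The merge-then-divide step described above can be sketched as follows. This is a minimal illustration in plain Python, not the R code used for the presentation. The column names `NwCorpsperCap` and `NwMBECorpsperCap` come from the slides; `MBEperMinorityCap`, `NonMBEperNonMinorityCap`, the helper `per_capita_table`, and every number are hypothetical. It also assumes that `NwMBECorpsperCap` divides by total population, which the slides do not state.

```python
# Hypothetical yearly records for one borough. Every number here is made up
# for illustration and is NOT taken from the presentation's data.
filings = [
    {"year": 2005, "new_corps": 1000, "new_mbe_corps": 10},
    {"year": 2006, "new_corps": 1200, "new_mbe_corps": 12},
]
population = {
    2005: {"total": 100_000, "minority": 40_000},
    2006: {"total": 120_000, "minority": 48_000},
}


def per_capita_table(filings, population):
    """Join filings to population estimates by year and add per-capita columns."""
    rows = []
    for rec in filings:
        pop = population[rec["year"]]  # the "merge" key is the year
        non_mbe = rec["new_corps"] - rec["new_mbe_corps"]
        non_minority = pop["total"] - pop["minority"]
        rows.append({
            "year": rec["year"],
            # Assumption: both slide columns are per head of total population.
            "NwCorpsperCap": rec["new_corps"] / pop["total"],
            "NwMBECorpsperCap": rec["new_mbe_corps"] / pop["total"],
            # Hypothetical names for the population-specific ratios.
            "MBEperMinorityCap": rec["new_mbe_corps"] / pop["minority"],
            "NonMBEperNonMinorityCap": non_mbe / non_minority,
        })
    return rows


table = per_capita_table(filings, population)
for row in table:
    print(row)
```

Repeating this for each of the five boroughs and nine years would yield the borough-by-year values that the later slides plot and test.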
Incorporation Activity per Capita
Figure: Incorporations per Capita and MBE Incorporations per Capita, 2005-2013 (panels: $NwCorpsperCap, $NwMBECorpsperCap)
MBE Incorporation Activity per Capita
Figure: MBE Incorporations per Capita, 2005-2013 (panels: $NwCorpsperCap, $NwMBECorpsperCap)
Incorporation Activity per Capita, cont.
Findings:
• The per-capita incidence of incorporations increased across all boroughs from 2005 to 2013.
• Manhattan, Queens, and Brooklyn had the highest per-capita incorporations.
• Queens appears to have the steepest increase in corporation filings.
• MBE incorporations per capita are thousands of times smaller than the general level of per-capita incorporation.
• The per-capita incidence of MBE incorporations varied by borough (led by Manhattan) and trended downward after 2009.
Does the Frequency of Incorporations Vary between Minority and Non-Minority Populations?
Hypothesis: The number of MBE incorporations per non-white person is not equal to the number of non-MBE incorporations per white person.
The approach:
1. Select the value to test:
  • MBE Corps per Minority capita
  • Non-MBE Corps per Non-Minority capita
  • Utilize data from all years and boroughs (5 boroughs x 9 years x 2 categories = 90 obs.)
2. Evaluate which test(s) to conduct.
  • Parametric vs. Non-parametric
  • Means test vs. Other
3. Conduct the test and analyze the results.
Reviewing the MBE-Minority Data Set
Figure: Histogram, MBE Incorporations per Minority capita
Figure: QQ plot, MBE Incorporations per Minority capita
Reviewing the Non-MBE – Non-Minority Data Set
Figure: Histogram, Non-MBE Incorporations per Non-Minority capita
Figure: QQ plot, Non-MBE Incorporations per Non-Minority capita
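For readers unfamiliar with how a normal QQ plot is built, the coordinates behind one can be sketched as below. This is an illustrative Python sketch, not the plotting code behind the slides. The helper `qq_points` and the sample values are hypothetical, and the plotting positions (i - 0.5) / n are one common convention among several.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    xs = sorted(sample)
    n = len(xs)
    # Normal distribution with the sample's own mean and standard deviation.
    fitted = NormalDist(mean(xs), stdev(xs))
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    return [(fitted.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]


# Made-up, strongly right-skewed values, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
points = qq_points(sample)
for theoretical, observed in points:
    print(f"{theoretical:10.3f} {observed:10.3f}")
```

If the data were roughly normal, the pairs would fall close to the line where theoretical and observed values are equal; systematic curvature away from that line is the kind of departure that motivates a non-parametric test.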
Testing for Similarity/Dissimilarity
Neither the MBE nor the Non-MBE per-capita data appear to be normally distributed. Hence, we consider the following two non-parametric tests:
Mood’s Median Test
A nonparametric test of the null hypothesis that the medians of the populations from which two or more samples are drawn are identical. (Wikipedia)
H0: The medians of MBE per Minority capita and Non-MBE per Non-Minority capita are equivalent.
H1: The medians of MBE per Minority capita and Non-MBE per Non-Minority capita are NOT equivalent.
Mann-Whitney-Wilcoxon Test
A nonparametric test of the null hypothesis that two populations are the same against an alternative hypothesis, especially that a particular population tends to have larger values than the other. (Wikipedia)
H0: MBE per Minority capita and Non-MBE per Non-Minority capita could be representative of the same set of data.
H1: MBE per Minority capita and Non-MBE per Non-Minority capita could NOT be representative of the same set of data.
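Both tests can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the analysis code behind the slides: the function names and the two samples are hypothetical, Mood's test uses a Pearson chi-square on the 2x2 table without continuity correction (values tied with the grand median count as "not above"), and the Mann-Whitney p-value uses the tie-corrected normal approximation without continuity correction.

```python
import math
from statistics import median


def moods_median_test(a, b):
    """Two-sample Mood's median test. Returns (chi2, two-sided p-value)."""
    grand = median(list(a) + list(b))
    above = [sum(x > grand for x in a), sum(x > grand for x in b)]
    sizes = [len(a), len(b)]
    table = [above, [sizes[0] - above[0], sizes[1] - above[1]]]
    n = sum(sizes)
    chi2 = 0.0
    for row in table:
        for j in range(2):
            # Expected count; zero if no value lies above the grand median,
            # in which case the test is undefined and this division fails.
            expected = sum(row) * sizes[j] / n
            chi2 += (row[j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test. Returns (U for sample a, p-value)."""
    pooled = sorted([(x, 0) for x in a] + [(x, 1) for x in b])
    n1, n2 = len(a), len(b)
    n = n1 + n2
    rank_sum_a, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_a += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_a - n1 * n2 / 2) / math.sqrt(var_u)
    return u_a, math.erfc(abs(z) / math.sqrt(2))


# Made-up samples, for illustration only.
mbe_per_minority_cap = [1, 2, 3, 4, 5]
non_mbe_per_non_minority_cap = [6, 7, 8, 9, 10]
chi2, p_median = moods_median_test(mbe_per_minority_cap, non_mbe_per_non_minority_cap)
u, p_mwu = mann_whitney_u(mbe_per_minority_cap, non_mbe_per_non_minority_cap)
print(chi2, p_median, u, p_mwu)
```

In each case a small p-value is evidence against H0. With only five values per group the normal approximation is rough; the 45 observations per group described in the approach would suit it better.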
Findings
• In both tests of parity, the null hypothesis is rejected; thus we find that the incidence of new business incorporations per capita differs between the two populations.