CENTER FOR INNOVATION, RESEARCH AND COMPETENCE IN THE LEARNING ECONOMY. Longitudinal Data Analysis – methods and applications in Innovation Studies. Martin Andersson, CIRCLE, Lund University.
OUTLINE • Part I: WHY? - identification problems in Innovation Studies and social sciences more broadly • Part II: WHAT? - introducing panel data analysis • Part III: HOW? - lab session on panel data
Part I: Identification problems in Innovation Studies and social sciences more broadly
Identification • The main goal of regression analysis is often to learn about causal relationships from micro-data capturing non-experimental economic behavior. • Studies ask "treatment effect" questions of the form: what is the effect of X on Y? • What is the effect of R&D investment on a firm’s productivity? • Does the local milieu of a firm affect its innovativeness? • What is the effect of general purpose technologies on growth? • What is the effect of a local university on new firm formation? • Does entrepreneurship influence regional economic growth?
Identification • Does the estimated $\beta$ parameter reflect a causal effect of X on Y? • How to ”isolate” the effect of X on Y?
Identification • Identification is closely linked to consistency: • $\beta$ is the ”true” parameter • $\hat{\beta}$ is our estimate • Ideally, we choose a model (right technique, right variables and right assumptions) such that $\hat{\beta}$ is consistent, i.e. it converges to $\beta$ when N is large • This is essentially what identification is all about
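The notion of consistency used on this slide can be written out formally. The notation is an assumption made here for illustration: $\hat{\beta}_N$ is the estimate obtained from a sample of size N and $\beta$ is the true parameter.

```latex
% Consistency: the estimate converges in probability to the true parameter
\operatorname*{plim}_{N \to \infty} \hat{\beta}_N = \beta
\quad\Longleftrightarrow\quad
\lim_{N \to \infty} \Pr\left( \left| \hat{\beta}_N - \beta \right| > \epsilon \right) = 0
\quad \text{for every } \epsilon > 0 .
```

An estimator that stays biased however much data is collected, for example because of uncontrolled selection, fails this condition.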
Identification • Manski (2003): the selection problem ”The researcher wants to compare the outcomes that people would experience if they were to receive alternative treatments. However, treatments are mutually exclusive. At most, the researcher can observe the outcome that each person experiences under the treatment that this person actually receives. The researcher cannot observe the outcomes that people would have experienced under other treatments. These other outcomes are counterfactual. Hence, data on treatments and outcomes cannot by themselves reveal treatment effects.” • IDEAL: “treated” individuals selected randomly • When is random treatment selection appropriate? • in the analysis of data from classical randomized experiments. This is the main reason why randomized experiments are valued so highly. The assumption of random treatment selection is usually suspect in non-experimental settings, where observed treatments may be self-selected or otherwise chosen purposefully.
Identification • Example 1.1 in Jurada (2007): • Suppose that you are interested in the effect of military service on subsequent earnings. You can look at the mean difference in the outcome between veterans and non-vets. • …. but, inside this number hides not only a causal effect of the service, but also the composition of other causal variables in each group, both observed and unobserved. • Are there variables that affect both participation in the program and the outcome? Are the vets earning more because of the military service or are the high-earners more likely to enroll in the army?
Identification • One solution is to control for factors that may drive selection. • Typical procedure in most empirical papers • we are interested in x but to isolate its effect we control for z. • when is controlling for observable factors enough to identify a causal effect? => when is “selection on observables” plausible? • When is it plausible that, conditional on Z, assignment to treatment is “ideal”, i.e. as good as random? • Example: applicants to a college are screened based on Z but, conditional on passing the Z test, they are accepted based on a random draw. • IMPORTANT TO THINK ABOUT THE DATA GENERATING PROCESS (DGP)
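A minimal simulation sketch of the college-style example, under a purely hypothetical data generating process: a binary observable Z raises both the probability of treatment and the outcome, and treatment is a random draw conditional on Z. All names and parameter values (true effect 1.0, treatment probabilities 0.8 and 0.2) are assumptions made for illustration, not taken from the slides.

```python
import random
from statistics import mean

def simulate(n=20000, true_effect=1.0, seed=1):
    """Hypothetical DGP: Z drives both treatment assignment and the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                         # observable screening variable
        treated = rng.random() < (0.8 if z else 0.2)   # random draw, conditional on Z
        y = true_effect * treated + 2.0 * z + rng.gauss(0, 1)
        rows.append((z, treated, y))
    return rows

def diff_in_means(rows):
    """Mean outcome of the treated minus mean outcome of the untreated."""
    treated = [y for _, d, y in rows if d]
    control = [y for _, d, y in rows if not d]
    return mean(treated) - mean(control)

def stratified_effect(rows):
    """Treated-vs-untreated difference within each Z stratum, weighted by stratum size."""
    total = 0.0
    for z_val in (False, True):
        stratum = [r for r in rows if r[0] == z_val]
        total += diff_in_means(stratum) * len(stratum) / len(rows)
    return total

rows = simulate()
naive = diff_in_means(rows)          # mixes the causal effect with selection on Z
adjusted = stratified_effect(rows)   # compares like with like, conditional on Z
print(f"naive: {naive:.2f}, conditional on Z: {adjusted:.2f}")
```

In this setup the naive comparison overstates the effect because treated units are more often high-Z units. Conditioning on Z works here only because the simulated assignment really is random given Z, which is exactly the assumption that has to be defended in applied work.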
Identification • An issue of ”selection” vs. ”learning” • Applies to several different topics in Innovation Studies • The roles of selection and learning are typically of great conceptual interest and policy relevance • usually we think of ”learning” as reflecting a causal effect • Three examples from the literature: • Persistence of Innovation • Exporting and productivity • Urban Wage Premium
Example 1: Persistence of Innovation • Bettina Peters: • Persistence of Innovation: stylised facts and panel data evidence, Journal of Technology Transfer, 2009 • German manufacturing and services firms 1994-2002: • Is innovation persistent? => Yes! • What drives this?
Example 1: Persistence of Innovation • 1: “True” state dependence. • a causal behavioral effect: the decision to innovate in one period in itself enhances the probability to innovate in the subsequent period. • (i) success breeds success (Mansfield 1968) • (ii) innovations involve dynamic increasing returns (Nelson and Winter 1982 and Malerba and Orsenigo 1993) • (iii) sunk costs in R&D investments (Sutton 1991)
Example 1: Persistence of Innovation • 2: Selection on time-invariant characteristics • Innovating firms may have characteristics which make them particularly ”innovation-prone” • If these characteristics themselves show persistence over time, they will induce persistence in innovation behavior. • If these are not appropriately controlled for, past innovation may appear to affect future innovation merely because it picks up the effect of the persistent characteristics. • In contrast to true state dependence this phenomenon is therefore called spurious state dependence
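Spurious state dependence can be illustrated with a small simulation sketch. The setup is hypothetical: each firm innovates independently in every period with a fixed firm-specific probability (0.8 for "innovation-prone" firms, 0.2 for the rest), so by construction past innovation has no causal effect on future innovation.

```python
import random

def simulate_firms(n_firms=5000, n_periods=6, seed=2):
    """Hypothetical DGP with NO true state dependence: innovation in each
    period is an independent draw with a time-invariant, firm-specific probability."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_firms):
        p = 0.8 if rng.random() < 0.5 else 0.2   # persistent firm characteristic
        panel.append([rng.random() < p for _ in range(n_periods)])
    return panel

def transition_rates(panel):
    """Share of innovators in period t among previous innovators and among
    previous non-innovators, pooled over all firms and periods."""
    after_yes, after_no = [], []
    for history in panel:
        for prev, curr in zip(history, history[1:]):
            (after_yes if prev else after_no).append(curr)
    return sum(after_yes) / len(after_yes), sum(after_no) / len(after_no)

p_after_innov, p_after_none = transition_rates(simulate_firms())
print(f"P(innovate | innovated before)     = {p_after_innov:.2f}")
print(f"P(innovate | did not innovate before) = {p_after_none:.2f}")
```

The pooled data show strong persistence even though none exists at the firm level: past innovation merely signals which type of firm is being observed.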
Example 2: Exporting and Productivity • Stylized fact that exporters are more productive • In Sweden, persistent exporters are about 20 % more productive than are non-exporters (Andersson et al 2008) • Why? • Learning-by-exporting: • Causal effect from exporting on productivity • Knowledge accumulation through interaction with foreign customers may stimulate innovation and productivity • Export markets more competitive and stimulate reduction of X-inefficiencies and adoption of ‘best-practice’ routines • Self-selection: • Exports associated with entry costs, implying productivity thresholds that only more productive firms can overcome (Bernard and Jensen 2004, Greenaway and Kneller 2007, Wagner 2007) • Simply analyzing the relationship between exports and productivity without further controls and/or study of time sequences (ex ante vs. ex post) tells us nothing about the relevance of the different explanations.
Example 3: Urban Productivity Premium • Wages (and productivity) generally higher in larger regions
Example 3: Urban Productivity Premium • Selection • the “best” and the “brightest” move to the cities • Learning • Causal effect “from the environment” on productivity • operating in a dense agglomeration stimulates a worker’s productivity
Example 3: Urban Productivity Premium Alfred Marshall on selection in 1890: ”In almost all countries there is constant migration towards the towns. The large towns and especially London absorb the very best blood from all the rest of England: the most enterprising, the most highly gifted, those with the highest physique and the strongest characters go there to find scope for their abilities”
Example 3: Urban Productivity Premium • Learning relates to ‘pure’ agglomeration effects and is conceptually rooted in the literature on agglomeration economies and localized human capital spillovers (Rauch 1993, Glaeser 2008). • Agglomerations as “innovation environments” (Glaeser 1999)
Example 3: Urban Productivity Premium • A large literature focuses on untangling the relative roles of selection and learning in explaining the urban wage premium (UWP). • Risk of overestimating learning (the causal effect of agglomeration on productivity) if selection is not appropriately controlled for
Identification • Thinking in terms of ”selection” and ”learning” is important for identification • but, • … selection and learning effects are, at least conceptually, seldom mutually exclusive • and, • their relative roles often bear on theory as well as policy
Identification • Example UWP: • Theory and conceptualizations: • if selection is the dominant source of the city productivity premium, then theory should focus on why cities attract more productive workers rather than why cities are more productive (Glaeser and Maré 2001) • Policy • learning effects provide support for policies stimulating the growth of large city agglomerations.
Identification • Example persistence of innovation: • Theory and conceptualizations: • Endogenous growth models: • Romer (1990) assumes that innovation behaviour is persistent at the firm level to a very large extent. • Aghion and Howitt (1992) suggest that the process of creative destruction leads to a perpetual renewal of innovators. • Empirical knowledge about the dynamics in firms’ innovation behaviour is a tool to assess different endogenous growth models
Identification • Policy • If innovation is state dependent, innovation–stimulating policy measures such as government support programmes are supposed to have a more profound effect because they do not only affect current innovation activities but are also likely to induce a permanent change in favour of innovation. • If, on the other hand, individual heterogeneity induces persistent behaviour, support programmes are unlikely to have long–lasting effects and policy should concentrate more on measures which have the potential to improve innovation–relevant firm–specific factors.
Identification • SUMMARY: • For identification of the effect of X on Y, accounting for selection is imperative. • Selection vs. learning a key issue in many lines of inquiry in Innovation Studies, as well as in the social sciences more broadly • But how to account for selection?
Identification • Selection on observables: • Estimate effect of X on Y, while controlling for observable characteristics of the ”observational units”, such as individuals or firms. • Firms: productivity, employment size, location, capital stock, human capital, industry affiliation, ownership structure • Individuals: age, gender, education, place of residence, tenure, etc. • NOTE: one reason for the growing popularity of using micro-level datasets on individuals and firms is the potential for accounting for selection on observables
Identification • Problem! • We do not observe all relevant attributes of firms and individuals that may be of importance in explaining the phenomena we are interested in. • Firms: managerial abilities, organizational routines, attitudes towards risk, technological opportunities, etc • Individuals: IQ, skills, creativity, risk attitudes and all other sorts of innate abilities • What to do?
Identification • One option is to account for selection on “unobservables” • We can do this with panel data. • Many researchers maintain that the main advantage of panel data is that one can get rid of unobserved heterogeneity, since unobserved heterogeneity is considered ‘the’ problem of non-experimental research.
What is panel data? • Panel data are a form of longitudinal data, involving regularly repeated observations on the same individuals • Individuals may be people, households, firms, areas, etc • Repeated observations over time • repeated cross-sectional time-series
Terminology in panel data applications • A balanced panel has the same number of time observations (T) for each of the n individuals • An unbalanced panel has different numbers of time observations (Ti) on each individual • A compact panel covers only consecutive time periods for each individual – there are no “gaps” • Attrition is the process of drop-out of individuals from the panel, leading to an unbalanced (and possibly non-compact) panel • A short panel has a large number of individuals but few time observations on each • A long panel has a long run of time observations on each individual, permitting separate time-series analysis for each
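The balanced and compact definitions above can be checked mechanically. The following is a minimal sketch; the function name `describe_panel` and the input layout (pairs of unit id and integer period) are assumptions chosen for illustration.

```python
def describe_panel(observations):
    """Classify a panel given an iterable of (unit_id, period) pairs,
    where periods are integers such as years."""
    periods_by_unit = {}
    for unit, period in observations:
        periods_by_unit.setdefault(unit, set()).add(period)
    # Balanced: every unit has the same number of time observations.
    counts = {len(periods) for periods in periods_by_unit.values()}
    # Compact: each unit's periods form an unbroken run (no gaps).
    compact = all(max(p) - min(p) + 1 == len(p) for p in periods_by_unit.values())
    return {
        "balanced": len(counts) == 1,
        "compact": compact,
        "n_units": len(periods_by_unit),
    }

# Firm B is missing 2002, so the panel is both unbalanced and non-compact.
example = [("A", 2001), ("A", 2002), ("A", 2003), ("B", 2001), ("B", 2003)]
print(describe_panel(example))
# {'balanced': False, 'compact': False, 'n_units': 2}
```

A check of this kind is a sensible first step before estimation, since attrition shows up here as units whose observation count falls short of the rest.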
Benefits of panel data • They are more informative (more variability, less collinearity, more degrees of freedom), so estimates are more efficient. • They allow the study of individual dynamics • Some phenomena are inherently longitudinal (e.g. poverty persistence; unstable employment) • The ability to make causal inference is enhanced by temporal ordering • They make it possible to control for individual unobserved heterogeneity
A note on causal inference and panel data • This example is based on Brüderl (2005) • Let i denote an individual, t time, T treatment and C non-treatment. Y is the outcome we are interested in. • Optimal identification is: $\Delta = Y^{T}_{i,t_0} - Y^{C}_{i,t_0}$ => impossible! (clones not available) • Cross-sectional data: $\Delta = Y^{T}_{i,t_0} - Y^{C}_{j,t_0}$, where j is a different individual => compare treated with untreated • This only provides the “true causal effect” if the assumption of unit homogeneity (no unobserved heterogeneity) holds. Requires ’perfect’ controls • With panel data: $\Delta = Y^{T}_{i,t_1} - Y^{C}_{i,t_0}$ => ’within estimation’ • We observe the same individual before and after treatment. Unit homogeneity here is needed only in an intrapersonal sense!!
The issue with not accounting for unobserved ability • We want to analyze the effect of human capital x on a firm i’s innovation output, y. • We have panel data and set up the following model: $y_{it} = \alpha + \beta x_{it} + u_{it}$ • Despite panel data, all issues of selection are at work here as well. • It may be the managerial ability of the firms that matters. The estimated effect of x on y may be biased because high-ability managers tend to recruit more human capital (omitted variable bias).
The issue with not accounting for unobserved ability • Unobserved heterogeneity, such as managerial ability, ends up in the error term $u_{it}$. • But if high-ability managers hire more human capital, then $x_{it}$ and $u_{it}$ are correlated. • This violates the assumption of exogeneity • Endogeneity (an X-variable correlates with the error term) results in biased regression estimates. • Endogeneity can be a consequence of unobserved heterogeneity.
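The size of the bias can be made explicit with the standard omitted variable bias formula. The notation is assumed here for illustration: the error term is split into time-invariant managerial ability $\mu_i$ and an idiosyncratic part $\varepsilon_{it}$ that is uncorrelated with $x_{it}$.

```latex
% Pooled OLS when ability \mu_i is left in the error term u_{it} = \mu_i + \varepsilon_{it}
\operatorname{plim} \hat{\beta}_{\text{pooled}}
  = \beta + \frac{\operatorname{Cov}(x_{it}, \mu_i)}{\operatorname{Var}(x_{it})}
```

If high-ability managers hire more human capital, the covariance is positive and the pooled estimate overstates the effect of human capital on innovation output. The bias vanishes only when $x_{it}$ and $\mu_i$ are uncorrelated.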
How panel data can take care of unobserved heterogeneity • Panel data in themselves do not remedy the problem of unobserved heterogeneity, • but one can apply techniques using panel data that do. • The within transformation does the trick. • Panel data: • Variance across individuals (between variance) • Variance within individuals over time (within variance)
How panel data can take care of unobserved heterogeneity • Suppose the managerial ability of each firm i is time-invariant. • Denote this by $\mu_i$ (firm-specific fixed effects) • A model including (unobserved) managerial ability would read: (1) $y_{it} = \alpha + \beta x_{it} + \mu_i + \varepsilon_{it}$ • Take the average value over time for each i: (2) $\bar{y}_i = \alpha + \beta \bar{x}_i + \mu_i + \bar{\varepsilon}_i$ • We have ”taken away” the time dimension and have a cross-section with average values of the time periods. (between variation)
How panel data can take care of unobserved heterogeneity • Now subtract the second equation from the first: $y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$ • $\mu_i$ is gone! • Why? => it is constant over time, so its mean value over the periods for each i is the same: $\bar{\mu}_i = \mu_i$ • Time-constant unobserved heterogeneity is no longer a problem
How panel data can take care of unobserved heterogeneity • Within transformation means that the data is "time-demeaned". • Only the within variation is left, because we subtract the between variation. • The within-transformation made possible by panel data allows researchers to account for time-invariant unobserved heterogeneity • Better identification
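The effect of time-demeaning can be seen in a small simulation sketch. Everything here is a hypothetical setup chosen for illustration: outcome y depends on x with a true slope of 0.5 plus a time-invariant unobserved ability, and x is positively correlated with that ability. Only the Python standard library is used, and the slope is computed with a hand-written simple regression rather than a panel estimation package.

```python
import random
from statistics import mean

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_panel(n_firms=2000, n_periods=5, beta=0.5, seed=3):
    """Hypothetical DGP: y = beta * x + ability + noise, with x correlated with ability."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_firms):
        ability = rng.gauss(0, 1)            # time-invariant and unobserved
        for _ in range(n_periods):
            x = ability + rng.gauss(0, 1)    # high-ability firms hire more human capital
            y = beta * x + ability + rng.gauss(0, 1)
            rows.append((i, x, y))
    return rows

def within_transform(rows):
    """Time-demean x and y: subtract each firm's own mean over its periods."""
    by_firm = {}
    for i, x, y in rows:
        by_firm.setdefault(i, []).append((x, y))
    x_dm, y_dm = [], []
    for obs in by_firm.values():
        mx = mean(x for x, _ in obs)
        my = mean(y for _, y in obs)
        x_dm += [x - mx for x, _ in obs]
        y_dm += [y - my for _, y in obs]
    return x_dm, y_dm

rows = simulate_panel()
pooled = ols_slope([x for _, x, _ in rows], [y for _, _, y in rows])
within = ols_slope(*within_transform(rows))
print(f"pooled OLS slope: {pooled:.2f}, within slope: {within:.2f}")
```

The pooled slope absorbs the ability effect and lands well above 0.5, while the within slope stays close to the true value because ability is constant within each firm and drops out. The sketch also shows the limitation: the transformation removes only time-invariant heterogeneity, so a time-varying unobserved factor correlated with x would still bias the within estimate.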
Is unobserved heterogeneity empirically important? • Yes! • in many research papers the magnitude of the estimated effects of x on y depends to a large extent on whether one accounts for unobserved heterogeneity or not • Example: • Andersson, Klaesson and Larsson (2012), • ”Selection and Learning of Workers in Cities”
Is unobserved heterogeneity empirically important? • Main question: • How important are selection and learning, respectively, in explaining the urban wage premium? • Panel data of private sector workers 2001-2010
Identification of selection • Workers are heterogeneous: • education, age, gender: observed • innate ’abilities’: unobserved • Suppose wages are 10% higher in cities • how much of this is due to workers in cities being better educated and older? • how much of the wage difference remains after controlling for education and age? • if selection is important, we should observe that the wage premium drops as we account for worker heterogeneity.
Identification of selection • We run different models and test how sensitive the city wage premium is to observed and unobserved worker heterogeneity
Identification of learning • 1: indirectly quantified while accounting for selection • remaining wage gap after controlling for spatial sorting of workers • 2: identification of workers that move from urban to rural regions. • faster human capital accumulation in cities => the advantages of having worked in a large, dense city should remain after moving away. • we estimate the wage premium for workers that move away from dense agglomerations and test whether their wage drops or remains upon moving.
Wages, education levels and skills in the Swedish economic geography
Wages, education levels and skills in the Swedish economic geography How much of the wage premium in larger cities is due to selection?
Empirical model De = market potential measures (Harris 1954) In red: time effects, regional effects, region-specific ”shocks” In green: worker characteristics
Is unobserved heterogeneity empirically important? • The wage-density elasticity drops from 3.3% to 0.8% when accounting for worker fixed effects (within transformation)!! • Selection on observables relatively unimportant.