Some Suggestions on Research Design and Report Writing. Chi-Sum Wong (黃熾森), Professor, Department of Management, The Chinese University of Hong Kong. Address: Department of Management, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong. Email: cswong@baf.msmail.cuhk.edu.hk. March 2006.
Outline (1) Appropriate Informants; (2) Non-random Sample; (3) Non-response Bias; (4) Common Method Variance; (5) Multi-Sample Cross-Validation; (6) Aggregating Data to Groups; (7) Independence of Data; (8) Cross-Cultural Issues; (9) Papers on Research Method Issues; (10) Following the Standard Reporting Format.
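Appropriate Informants-1 We must ensure that the people providing the data are able to supply relevant data that are reliable and valid. At the individual level, constructs concerning personality and attitudes are probably best reported by the respondents themselves, whereas job performance is more appropriately rated by the immediate supervisor. For organization-level data, an ordinary employee does not necessarily know the organization's human resource policies or its business results. If we mail questionnaires asking organizations to supply such data, we need additional measures to ensure that the person who completes them is an appropriate informant.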
Appropriate Informants-2 …..a professor in Xiamen University helped us to distribute the survey to human resources and top-level managers in TNCs in Fujian, PRC, which constituted the final sample of this study….. we had good connections with the professor in Xiamen, Fujian, who could help us to collect quality localization data using his graduate students…..With the help of this professor in Xiamen, we sent out 180 questionnaires to current and graduated MBA students who are top or middle-level managers in TNCs in Fujian Province. These managers were asked to fill out the questionnaires themselves if they were the human resources manager of the company. They were asked to refer to their human resources manager for necessary information if they were top executives of the company. After distributing the questionnaires and one round of telephone follow-up, a total of 139 responses were received. The response rate was 77%. The key advantage of this sampling process is the personal relationships between the Xiamen professor and the respondents. Thus, we were sure that the appropriate executives completed the questionnaires instead of other employees who may not have sufficient information to answer the questions.
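Non-random Sample-1 In the vast majority of cases, research in organizational behavior and human resource management cannot achieve fully random sampling. Sampling that is too "casual" should nevertheless be avoided as far as possible. Wherever possible there should be a sampling frame or a set of sampling criteria. Report sample characteristics that may be relevant to the research question so that readers can judge whether the sample is biased; in the discussion of results, also acknowledge this limitation and its possible effect on the external validity of the study.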
Non-random Sample-2 Final participants in our validation sample were 139 human resources managers of TNCs operating in Fujian Province in the PRC. We chose TNCs from one single province in order to control for the differences in governmental regulations because the policies towards foreign investments vary among local governments in the PRC. Fujian was chosen as the province for our sample for two reasons. First, Fujian is one of the most developed provinces in Southern China because of its geographic location. There are a wide variety of TNCs in Fujian from various countries such as Taiwan, the U.S., and other parts of the world….The parent companies of these operations came from various parts of the world, including Taiwan (25.8%), Hong Kong (17.2%), European countries (18.8%), the United States (15.6%), South Korea (10.1%), Japan (1.6%), and others (10.9%). The majority of these operations were in the manufacturing industries (56.9%). There were 12.3% in the financial industry, 4.6% in transportation, 3.9% in wholesale and retailing, and the remaining 22.3% in other industries. The majority of them had been operating in Fujian for at least three years (94.9%). The average number of years of operation of these companies in Fujian is 7.41 with a standard deviation of 3.49.
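Non-response Bias-1 A sampling-related problem that often troubles researchers is a low response rate. For anonymous questionnaires that are not collected on the spot, response rates are typically only a little over 20 percent. Collect as much information as possible to assess whether the low response rate seriously threatens the validity of the study; for example, concentrate on following up a small subset of non-respondents and compare them with the respondents to see whether they differ substantially. If information about the sampling frame is available, we can also compare the responding sample with the frame as a whole to see whether they differ in important ways.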
Non-response Bias-2 Although the average response rate for each survey is 49.3%, the final sample size that was used to examine the job perception-job satisfaction relationship was only 196 due to attrition and participant job changes. We therefore conducted three preliminary analyses to check the representativeness of this final sample. First, although the research participants were quite homogeneous in their backgrounds, they may have differed in three aspects: gender, undergraduate major, and year of graduation. The proportions of the respondents in these three aspects for the three surveys are all very similar to the general population. For example, the percentage of males in the population, and in the three surveys are 51.0%, 52.0%, 49.5%, and 49.9%, respectively. Chi-square tests on these three aspects indicate no statistical differences (p>.10) for the three surveys. Second, we compared the level of job perception and job satisfaction in the first and second surveys for those who continued to respond in the next survey with those who did not. No significant differences are observed (p>.10). Thus, research participants who responded to all the three surveys did not appear to be a biased sample.
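The comparison of a responding sample with its sampling frame can be sketched as a chi-square goodness-of-fit test. This is only an illustrative sketch: the function name and the respondent counts are hypothetical, and only the 51% male share is borrowed from the excerpt. It handles a two-category attribute (df = 1), for which the p-value can be obtained from the standard library alone.

```python
import math

def chi_square_gof_two_categories(observed_a, observed_b, population_share_a):
    """Chi-square goodness-of-fit test for a two-category attribute (df = 1).

    observed_a, observed_b: respondent counts in the two categories.
    population_share_a: share of category A in the sampling frame.
    """
    n = observed_a + observed_b
    expected_a = n * population_share_a
    expected_b = n * (1.0 - population_share_a)
    chi2 = ((observed_a - expected_a) ** 2 / expected_a
            + (observed_b - expected_b) ** 2 / expected_b)
    # With df = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: 104 men and 96 women responded; the frame is 51% male.
chi2, p = chi_square_gof_two_categories(104, 96, 0.51)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A large p-value here only means that no difference was detected on this one attribute; attributes with more than two categories need a general chi-square distribution, which this sketch does not cover.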
Common Method Variance-1 [Figure: diagram with X, M (method), and Y]
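Common Method Variance-2 This problem arises in organizational behavior and human resource management research because many studies involve attitudinal and behavioral constructs. It must be dealt with; otherwise the internal validity of the study will be seriously questioned. We can try: (1) using factor analysis to examine the influence of a common factor underlying the measurement items that come from the same method; (2) measuring at the same time some irrelevant or specially chosen variables (such as negative affectivity), first partialling out their covariance with X and Y (regression is commonly used), and only then testing the relationship between X and Y.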
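Approach (2) can be sketched with a first-order partial correlation, which is equivalent to correlating the residuals of X and Y after each has been regressed on the marker variable. This is a minimal sketch under stated assumptions: the function names and all the scores are hypothetical, and a single marker variable is used.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_r(x, y, marker):
    """Correlation between x and y after partialling out the marker variable."""
    r_xy = pearson_r(x, y)
    r_xm = pearson_r(x, marker)
    r_ym = pearson_r(y, marker)
    return (r_xy - r_xm * r_ym) / math.sqrt((1 - r_xm ** 2) * (1 - r_ym ** 2))

# Hypothetical self-report scores from eight respondents.
job_perception = [3.2, 4.1, 2.8, 3.9, 4.5, 3.0, 3.6, 4.2]
job_satisfaction = [3.0, 4.3, 2.5, 3.7, 4.6, 3.2, 3.3, 4.0]
negative_affectivity = [2.9, 2.1, 3.4, 2.5, 1.8, 3.1, 2.7, 2.2]

print("zero-order r:", round(pearson_r(job_perception, job_satisfaction), 3))
print("partial r:   ", round(partial_r(job_perception, job_satisfaction,
                                        negative_affectivity), 3))
```

If the partial correlation stays close to the zero-order correlation, the marker variable accounts for little of the X-Y relationship.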
Common Method Variance-3 …..measures of this study are self-reported and so the effect of common method variances (Spector, 1992, 1994) or negative affectivity (Burke, Brief & George, 1993) may have affected the results. To deal with this potential threat, we have utilized the longitudinal design to examine the one-year, cross-lagged effects between job perception and job satisfaction. This design could help reduce the common method variances among the measures (Spector, 1992; 1994)……Nevertheless, we conducted the same LISREL analyses after partialling out some irrelevant variables, which should not have a direct relationship with both job perception and job satisfaction. In designing the study, we have included two irrelevant measures. The first one is a five-item measure (average coefficient alpha for the three surveys is .76) of need for achievement (Steers & Braunstein, 1976). The second one is a single item that requires the research participants to respond on a five point Likert-type scale (ranging from strongly agree to strongly disagree) to the statement: “I am an average worker”. After partialling out each of these irrelevant measures, the LISREL results are similar except that the path coefficients became smaller. Thus, common method variances did not appear to invalidate the results of this study.
Common Method Variance-4 In addition to the four subjective questions, we added an objective indicator of localization success. Specifically, we used a ratio of the “Number of local managers occupying positions originally occupied by expatriates” to the “Total number of positions occupied by expatriates when the PRC operations started.” The numerator is a measure of actual localization success, while the denominator is a comparison base of the starting number of expatriate positions. This variable is very important because it allows us to double check the validity of the subjective indicator of localization success. Also, unless the respondents deliberately lied to us, this objective indicator can be a good dependent variable that has little respondents’ biases with the independent variables.
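Multi-Sample Cross-Validation-1 Sampling error is pervasive. To increase the validity of a study and improve its quality, we should therefore consider, where possible, using different samples to cross-validate the results.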
Multi-Sample Cross-Validation-2 To investigate the distinctiveness and potential utility of the EI construct, we used a two-study-four-sample design with participants from Hong Kong and the People’s Republic of China (PRC). In Study One, we used confirmatory factor analyses to show that, when properly defined and measured, EI is related to yet distinct from personality dimensions and has incremental predictive power on life satisfaction. In Study Two, Multi-Trait Multi-Method (MTMM) analyses were utilized to examine the construct validity of self-reports and others’ ratings of EI. In a student sample, parents’ ratings explained additional variance in the students’ life satisfaction and feelings of powerlessness after controlling for the Big Five personality dimensions. The results of the MTMM analyses were cross-validated on an employee sample with peer ratings of EI. These peer ratings were found to be significant predictors of job performance ratings provided by supervisors, after controlling for the Big Five personality dimensions.
Multi-Sample Cross-Validation-3 We have collected data from two samples in this study. The first sample is a pilot study designed mainly for item generation. In drafting the pilot-testing questionnaire for the first sample, we held several meetings with four practicing human resource managers….. generate measurement items for these studies…..All these four human resource managers worked for companies that have operations in the PRC. After the survey questionnaires were finalized, they were sent out to members of the Employers’ Federation of Hong Kong and the American Chamber of Commerce……Forty-six member companies of the association who had operations in PRC completed and returned the pilot test questionnaire. The parent companies of these 46 respondents came from Hong Kong (16), U.S.A. (12), U.K. (7), Japan (5), France (1), Germany (1), Switzerland (1), and Denmark (1).
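Aggregating Data to Groups-1 Although HLM can handle cross-level constructs, its dependent variable must be at the lowest level; otherwise it cannot be applied. A study of teams: Y = team performance; X1 = the importance the organization attaches to the team; X2 = the ability of team members. Because Y is at the second level, HLM cannot be used. A study of job design: Y = the accident rate of each job category; X1 = the nature of the job (complexity); X2 = employee factors (level of safety knowledge). Because Y is a second-level construct, HLM cannot be used to address this research question.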
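Aggregating Data to Groups-2 Aggregate the first-level data to the second level: • take the mean ability score of all members of a team; • take the mean safety-knowledge score of employees doing the same type of job. The analysis can then be carried out entirely at the second level. Before doing so, we must first establish that these averaged scores have enough variance; otherwise there is no point in analyzing them further.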
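Aggregating Data to Groups-3 The intraclass correlation proposed by Bartko, J.J. (1976): first run an ANOVA on the first-level variable (for example, an ANOVA of ability scores with team as the grouping factor, or of safety-knowledge scores with job type as the grouping factor). After obtaining MSB and MSW, compute: ICC(1) = (MSB − MSW) / [MSB + (k − 1) × MSW], where k is the number of members per group (the average group size when groups are unequal). This ICC(1) coefficient represents the proportion of the variable's total variance that lies between groups, so the larger it is, the less problematic it is to aggregate the data within each group.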
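The ICC(1) computation can be sketched in a few lines of Python. The sketch assumes the commonly cited one-way ANOVA form, ICC(1) = (MSB − MSW) / [MSB + (k − 1) × MSW], with k taken as the average number of members per group; the function name and the ability scores are hypothetical.

```python
def icc1(groups):
    """ICC(1) from a one-way ANOVA.

    groups: list of lists, one inner list of individual scores per group.
    """
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((score - sum(g) / len(g)) ** 2 for g in groups for score in g)
    msb = ss_between / (n_groups - 1)
    msw = ss_within / (n_total - n_groups)
    k = n_total / n_groups  # average group size; exact when groups are equal in size
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical ability scores for three teams of three members each.
team_ability = [[4, 5, 4], [2, 3, 2], [5, 5, 4]]
print("ICC(1) =", round(icc1(team_ability), 3))
```

With unequal group sizes the simple average used for k is only an approximation, so the result should be read as indicative.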
Aggregating Data to Groups-4 Two reasons may explain the low intraclass correlation for the motivational value of the job. First, the intraclass correlation ignored all single-respondent jobs in its calculation. The amount of between job variance for the group with multiple respondents was less because there were more low level jobs. Second, the intraclass correlation depended on the number of respondents per job.…..Two analyses were conducted to further investigate the within job respondents’ agreement on the motivational value of the job. First, the intraclass correlation for the 23 jobs with more than two respondents was calculated. It was .40, similar to a recent study when the same analysis was conducted for a sample with about five respondents per job. Second, the two analyst ratings of the single-respondent jobs were added to the sample as if they were from two additional incumbents of each job. This allowed the high level single-respondent jobs to be included in the calculation. The intraclass correlation was .43, despite the average number of “respondents” of only 3.4. When the analyst ratings of the single- and two-respondent jobs were added to the sample, the intraclass correlation was .52 with an average number of respondents equal to 4.1. From these analyses, it appeared that there could be reasonable agreement between incumbents on the motivational value of their jobs, if a large enough number of incumbents existed or could be approximated.
Aggregating Data to Groups-5 In fact, ICC(1) can also be used in an HLM analysis to show that the second-level analysis is meaningful. For example, Liao and Chuang (2004) also computed this coefficient: Null model. Our hypotheses predict that both individual- and store-level variables would be significantly related to employee service performance. In order for these hypotheses to be supported, there had to be significant between-store variance in employee service performance. Thus, using HLM, we estimated a null model in which no predictors were specified for either the level 1 or level 2 function to test the significance level of the level 2 residual variance of the intercept (τ00=.35, p<.001). The ICC(1) was .12, indicating 12 percent of the variance in employee service performance resided between stores, and 88 percent of the variance resided within stores.
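Independence of Data-1 A common problem arises when supervisors provide their subordinates' job performance ratings: the same supervisor often rates several subordinates, while our analysis takes the subordinate as the unit, with each subordinate providing the independent variable (e.g., job attitude; X) and the supervisor providing the dependent variable for that subordinate (e.g., job performance; Y). The question is: if supervisors differ widely in how they view job performance, some being lenient and others strict, will the resulting non-independence of the dependent variable (Y) affect the validity of the findings? One way to check is to compute the interrater reliability coefficient (RWG) proposed by James, Demaree and Wolf (1984).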
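Independence of Data-2 Suppose that within one group there are several measured scores on a variable X (e.g., x1, x2, x3, x4, x5). Are these five scores related, and how strongly? RWG compares two variances: • the variance the scores should have if they are completely unrelated (σEU²); • the variance actually observed in the collected data (Sx²): RWG = 1 − (Sx² / σEU²). If RWG equals 1.00 (i.e., Sx² equals zero), the scores agree perfectly and are fully related; if RWG equals zero (i.e., Sx² equals σEU²), the scores are completely unrelated, that is, independent.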
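Independence of Data-3 RWG is generally used when we ask different raters to assess the same attribute and then judge from the agreement among their ratings whether the ratings are reliable, hence the name "interrater reliability"; in that use, the closer the coefficient is to 1.00 the better. However, when we apply RWG to test the independence of one supervisor's performance ratings of different subordinates, we hope to see a small RWG, because it shows that the supervisor's rating of one subordinate is not influenced by the ratings given to the other subordinates. So the more supervisors with a small RWG, the more independent the ratings on this variable.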
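The single-item RWG, defined as 1 − (Sx² / σEU²), can be sketched as follows. The sketch assumes a uniform null distribution, for which σEU² = (A² − 1) / 12 with A response options; other null distributions, such as a triangular one, would need a different expected variance. The function name and the ratings are hypothetical.

```python
from statistics import variance

def rwg_single_item(ratings, n_options):
    """Single-item RWG under a uniform null distribution.

    ratings: scores given within one group (e.g., one supervisor's ratings).
    n_options: number of response options on the rating scale (A).
    """
    observed_var = variance(ratings)            # Sx^2, the sample variance
    uniform_var = (n_options ** 2 - 1) / 12.0   # sigma_EU^2 for A equally likely options
    return 1.0 - observed_var / uniform_var

# Hypothetical: one supervisor rates five subordinates on a 5-point scale.
ratings_by_supervisor = [3, 4, 4, 5, 4]
print("RWG =", rwg_single_item(ratings_by_supervisor, 5))
```

When the observed variance exceeds the expected variance the coefficient becomes negative; this sketch leaves such values as they are rather than truncating them.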
Independence of Data-4 We conducted another preliminary analysis for the performance data because in our data, 41 supervisors rated the performance of more than one subordinate. Independence of the performance data may have created a problem in data analysis. Thus, we calculated the within-group inter-rater reliability for these 41 supervisors according to the formula provided by James, Demaree and Wolf (1984). To be conservative, we did not consider any response bias and assumed a triangular null distribution. The mean inter-rater reliability for the 41 groups of performance ratings was 0.65 and its standard deviation was 0.31. Over half of these reliability coefficients (53.7%) were less than 0.70. George and Bettenhausen (1990) argued that an inter-rater reliability greater than 0.70 could be considered as an indicator of good within group agreement. From this result, we believe that the performance ratings may be regarded as independent and the results will not be affected significantly.
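Cross-Cultural Issues-1 Because scientific research in organizational behavior and human resource management originated in Western societies and is dominated by North America (especially the United States), and many major academic journals are based in the United States, research conducted in non-Western societies is often regarded as "cross-cultural" research. This is of course not necessarily correct, but we may indeed face this misunderstanding and the difficulties it causes. Genuinely cross-cultural research naturally has its own requirements, for example whether culture is reasonably defined and whether samples from different cultural backgrounds are equivalent on other, non-cultural characteristics. How to conduct cross-cultural research is not the topic here, however; it deserves separate book-length treatment. The question here is how, when conducting scientific research in organizational behavior and human resource management in non-Western societies, we can avoid such misunderstanding and have the contribution of our research to scientific knowledge judged more fairly.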
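Cross-Cultural Issues-2 Studying universal constructs that are not culture-bound: in our research on emotional intelligence, all the samples were Chinese, and one reviewer questioned whether the results would also hold in other cultural settings. Our position was that emotional intelligence as we define it (a person's ability to handle emotions) is a universal human construct and that our theory does not involve cultural issues. We therefore disagreed with the reviewer, and in the end we convinced both the reviewer and the editor.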
Cross-Cultural Issues-3 Our second limitation is that all the data in this project were collected in Hong Kong and the PRC. Cross-cultural generalizability of the results may be a concern. We do not know whether EI would vary across different cultures. However, when we go back to the EI literature, we do not find any discussion of EI across cultural boundaries. Our position is that one’s ability to understand, regulate, and utilize one’s emotions in constructive ways are general human abilities. There is no immediate evidence that the validity of EI, as defined under our four-dimensional view, should vary across culture. While further studies may be needed to verify this position, we take the general scientific attitude that psychological and management phenomena are considered as universal unless there are theories or evidence showing their cross-cultural variations. While the EI construct may be universal, we agree that behaviors resulting from the EI of an individual may vary across culture. For example, a non-reactive quiet response by the subordinate when one’s boss is making unreasonable demands may reflect high EI among Chinese but probably not among non-Chinese. In this respect, our use of self-report measure of EI may, in fact, be a plus because we asked respondents about their final judgment of the EI of the target person irrespective of the assessment clues or methods they would use. By doing so, we may be able to avoid some cross-cultural differences in expressing emotions or diagnosing emotions because the assessors would be able to use the clues or methods that are appropriate for their specific culture. This issue may have to be considered when behavior- or outcome-oriented tests of EI, such as the MSCEIT, are used across cultural boundaries.
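Cross-Cultural Issues-4 Developing or applying indigenous constructs: research of this kind does build on a particular cultural background to develop new constructs, such as the Chinese concept of guanxi. To contribute to science, however, such research must do at least two things. • First, argue convincingly, both theoretically and empirically, that the construct differs from existing constructs developed in the West or elsewhere and already reported in the literature; • Second, demonstrate empirically its incremental validity over existing constructs, so that the new construct may also become applicable in other cultural settings.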
Cross-Cultural Issues-5 An example of the guanxi construct
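Cross-Cultural Issues-6 Chinese issues: state clearly that our concern is a Chinese issue to begin with. We can draw on theories and knowledge developed in the West or elsewhere, but we must start from the situation in China. For example, a theoretical model of how foreign-invested enterprises in China appoint local middle- and senior-level managerial and technical staff is unlikely to be exactly the same as a model for other countries.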
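Cross-Cultural Issues-7 Applicability of Western or other theories in China: here the research question is whether theories developed in the West or elsewhere are equally applicable in explaining phenomena in Chinese society. We should not test them blindly. It is best to start from the background of the existing theory and an analysis of the Chinese phenomenon, first propose clear hypotheses (for example, how the theory should be revised to describe Chinese phenomena more accurately), and only then carry out focused research to test those hypotheses.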
Cross-Cultural Issues-8 In contrast to the educational system in the US, high school students in Hong Kong are assigned to science or arts streams after Grade 9. Except in a small number of special schools, most students are strictly confined to subjects within their stream of study. For example, a science student can study only subjects such as physics or chemistry and is not allowed to study literature and history. Similarly, a student who is assigned to the arts stream is generally not allowed to take chemistry no matter how strong his/her interest is in the subject. As a result, students in Hong Kong are basically separated into two groups (science vs arts) and confined to subjects within their groups after Grade 9. This education system is a good example of the environmental variable discussed by Holland (1985: 15-18) that would affect the students’ formulation of vocational orientations.
Cross-Cultural Issues-9 The second characteristic of the education system in Hong Kong is its strong bias towards the science stream. Due to the design of the examination syllabus and teaching methods, it has been a long tradition in Hong Kong that students have a strong positive bias towards the science stream in their high school education. The arts stream is widely perceived by students as the course of study for those who are not smart enough or strong enough to deal with mathematical and scientific symbols. As a result, students in the science stream are usually academically brighter and quite homogeneous in their abilities and interests, while students in the arts are much more varied. These special environmental characteristics lead us to the hypothesis that science students are supposed to have an interest in realistic and investigative jobs while arts students will have interests in other jobs (i.e., artistic, social, enterprising, and conventional jobs) in Hong Kong.
Cross-Cultural Issues-10 A final characteristic of the Hong Kong environment is that Hong Kong is a society with a strong business culture and is strongly influenced by Chinese traditions. As a result, interpersonal relationships or guanxi play very important roles in the business world (Alston, 1989; Hwang, 1987; Law et al., 2000). Davies (1995) reported that 98 per cent of the 150 Hong Kong Chinese executives who responded to their survey confirmed that ‘personal connections with local Chinese organizations’ form an extremely important part of their business lives. As a result, we hypothesize that the enterprising and social types would be strongly and closely related to each other and would be even stronger than the R-I, E-C and S-A links.
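Papers on Research Method Issues-1 People often assume that research on methodological issues must be quantitative and centered on mathematical statistics, but this is not necessarily so. What seems to matter most is whether the research question posed is appropriate and important; the specific mathematical and statistical derivations are secondary.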
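Papers on Research Method Issues-2 Defining and quantifying constructs: structural equation modeling (SEM) uses factor analysis to estimate latent constructs, but not every multidimensional construct can be estimated this way; job characteristics and job satisfaction, both commonly used in organizational behavior and human resource management, are examples that cannot. Yet papers applying SEM at the time did not seem to notice this.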
Papers on Research Method Issues-3 Latent Model: constructs as underlying higher-order abstractions behind their dimensions, e.g., g-factor
Papers on Research Method Issues-4 Aggregate Model: constructs exist at the same level as their dimensions and are formed as a mathematical function of their dimensions, e.g., job satisfaction, job perception
Papers on Research Method Issues-5 Profile Model: constructs exist at the same level as their dimensions and are formed as various combinations of their dimensional characteristics, e.g., personality [Figure labels: High/Low, High/Low, High/Low]
Papers on Research Method Issues-6 …..leads to different conclusions about the relative effects of job perception and social interactions on job satisfaction. Under the factor (latent) model, both job perception and liking are significant predictors of job satisfaction. In contrast, only job perception is a significant predictor of job satisfaction under the composite (aggregate) model. In other words, liking would be concluded as an important antecedent of job satisfaction under the factor model, but not under the composite model.
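Papers on Research Method Issues-7 SEM can use a nonrecursive model to test a reciprocal causal relationship between two constructs. Although this is mathematically feasible, it violates basic logic: how can two constructs cause each other at the same point in time?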
Papers on Research Method Issues-8 Hunter & Gerbing (1982): “the problem that nonrecursive causation pose in cross-sectional models can be seen by considering the implications that a two-way arrow has for indirect effects. If X and Y have an effect on each other, then X has an impact on Y, which has an impact on X, which has an impact on Y…this cycling can go on (for) as many steps as can be imagined. In reality, there is no such instantaneous cycling process.” (pp. 288-289)
Papers on Research Method Issues-10 One of the simulation results clearly shows that when a nonrecursive model is used to test a reciprocal causal relationship with a sample size of 300, there is a 47 percent chance of reaching a wrong conclusion: “In the second example, X and Y in the true time-lagged model were set to be reciprocally related. The path coefficients from both X to Y and Y to X were set to be .25…..Finally, M3 (the nonrecursive model)….. gave very poor estimates of the true population parameters…..the point estimates of the path from X to Y and vice versa were .18 and .34. The probability of concluding a reciprocal relation between X and Y was only 53% when N was 300.”
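Following the Standard Reporting Format-1 When conducting scientific research in organizational behavior and human resource management and writing it up, avoid novelty for its own sake; being solid and rigorous should be the main consideration. The work of academic research is to present sufficient evidence to support our arguments and conclusions; we should state our arguments and report the evidence as concisely and clearly as possible, and then draw conclusions. Explain complex concepts and arguments simply wherever possible, rather than describing simple phenomena in difficult technical jargon.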
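Following the Standard Reporting Format-2 Scientific research has its own intrinsic standards: (1) construct theories and hypotheses that describe phenomena in a logical and rational way; (2) collect data with objective and systematic methods to test those theories and hypotheses. The first part of a report must state the research question and persuade readers that it is important and still unanswered, or at least unsettled, and then explain the constructs and hypotheses. The second part reports the methods, including at least the sampling procedure, the measures, and the analytical methods. The third part reports the results of the data analysis: normally the means, standard deviations, reliabilities, and correlations of the variables first, followed by the analyses used to test the hypotheses, usually supported by tables and figures. The final part discusses the results of the whole study and should generally include the conclusions on the research question, the limitations of the study, and suggestions for future research. This is basic craft, and we should make a habit of reporting in this way.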
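The descriptive statistics that usually open a results section can be sketched as follows for one multi-item scale: the mean and standard deviation of the scale score, plus coefficient alpha as the reliability estimate. This is a minimal sketch; the function name and the item scores are hypothetical.

```python
from statistics import mean, stdev, variance

def cronbach_alpha(items):
    """Coefficient alpha for one scale.

    items: one list of scores per item, with respondents in the same order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical three-item scale answered by five respondents.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 4],
]
scale_scores = [mean(scores) for scores in zip(*items)]  # mean item score per respondent
alpha = cronbach_alpha(items)
print(f"M = {mean(scale_scores):.2f}, SD = {stdev(scale_scores):.2f}, "
      f"alpha = {alpha:.2f}")
```

In a full report the same three figures would be listed for every variable, alongside the correlation matrix.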
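Following the Standard Reporting Format-3 Yan Gengwang (嚴耕望): "In the fifty or sixty years I have been doing academic work, the papers I have published have rarely been challenged. Why? Because when I have ten parts of evidence, I say only eight parts; I not only avoid overstating, I always leave some margin."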