DATA COLLECTION TECHNIQUES
GROUP MEMBERS:- • GYAN PRAKASH • Ram • POOJA YADAV
WHAT IS DATA? The word “data” comes from Latin and means “to give” or “those that are given”. It is a plural word; “datum” is its singular form. Data are the values of qualitative and quantitative variables. Some terms associated with data that we need to know:- 1. Data point 2. Data set 3. Variable 4. Observation
Data Set:- A collection of data is known as a data set. Data Point:- A single observation is known as a data point. Variable:- A quantity whose value varies. Observation:- The value assigned to a variable.
Quantitative Variables:- Variables that take only numerical values. Qualitative Variables:- Variables that do not take numerical values but describe a quality or category. Independent Variable:- The variable whose effect the experimenter is interested in. It is also known as the stimulus variable. Dependent Variable:- The variable that changes in response to variation in the independent variable. In y = a + bx, x is the independent variable and y is the dependent variable.
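The relation y = a + bx can be tried out with a few numbers. The sketch below is only an illustration: the intercept, slope and the "study hours" and "marks" names are hypothetical and do not come from the slides.

```python
def dependent_value(x, a, b):
    """Return y = a + b*x for a given value of the independent variable x."""
    return a + b * x

# Hypothetical intercept (a) and slope (b), chosen only for illustration.
a, b = 50, 2

study_hours = [0, 1, 2, 3]                                # independent variable x
marks = [dependent_value(x, a, b) for x in study_hours]   # dependent variable y

print(marks)
```

Each change in x produces a matching change in y, which is why y is called the dependent variable.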
Discrete Variables:- A discrete variable can take only isolated values and changes by finite jumps. For example, the number of rooms in a building will be 4 or 5 or some other whole number, never a fraction. Continuous Variables:- A continuous variable can take any value, including fractional values, within its range of possibilities. For example, the height of students.
Primary Data:- Data collected for the first time by the researcher are known as primary data. Primary data are like raw material, to which statistical techniques, methods and tools are applied to reach the final interpretation. Secondary Data:- Data that have already been collected by some person or agency, and that the researcher uses for his or her own research in order to save time, effort and money. At times it may not be possible for the researcher to collect the information personally. Secondary data are like a finished product that is ready for analysis.
Data Collection Techniques • Data collection usually takes place in an improvement process. It is an important part of the project and is formalized through a Data Collection Plan that comprises the following activities:-
• Collection activity: following the plans associated with the selected data collection technique. • Analysis of the data, by graphing as far as possible.
In data collection, the pre-collection activity is one of the most crucial steps in the process. Once it is completed, data collection in the field, by the various methods discussed later, can be done in a structured, systematic and scientific way. A formal data collection process is necessary because it ensures that the data gathered are both well defined and accurate, and that subsequent decisions based on arguments embodied in the findings are valid. The process of data collection provides a baseline from which we set targets for improvement.
CENSUS The word is of Latin origin. In ancient Rome, the census was a list that kept track of all adult males fit for military service. According to the modern definition, a census is the procedure of systematically acquiring and recording information about every member of a given population. The modern census is essential to international comparisons of any kind of statistics, and censuses collect data on many attributes of a population, not just how many people there are, although population estimates remain an important function of a census.
A census is a regularly occurring and official count of a particular population. Other common censuses:- • Housing Census • Agriculture Census • Business Census • Traffic Census
Advantages of collecting data through census:- 1. Accurate 2. Detailed
Disadvantages of collecting data through census:- 1. Costly 2. Time consuming 3. Constrained by geographical accessibility
SAMPLING Sampling is a data collection method that includes only part of the population. In other words, sampling is concerned with selecting a subset of individuals from a statistical population in order to estimate the characteristics of the whole population.
The main aim of the investigator in drawing a sample is to reduce an unmanageable, heterogeneous population to a handy one in which all types are fairly represented. The inferences are reliable because, under random sampling, the sample mean is an unbiased estimate of the population mean: averaged over all possible samples, it equals the population mean.
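The statement about the sample mean can be checked on a very small example. The sketch below assumes a made-up population of five values and lists every possible sample of size 2 drawn without replacement; it is an illustration, not a proof.

```python
from itertools import combinations

# Hypothetical population values, chosen only for illustration.
population = [2, 4, 6, 8, 10]
population_mean = sum(population) / len(population)

# Every possible sample of size 2 drawn without replacement.
samples = list(combinations(population, 2))
sample_means = [sum(s) / len(s) for s in samples]

# If every sample is equally likely, the average of the sample means
# is the expected value of the sample mean.
average_of_sample_means = sum(sample_means) / len(sample_means)

print(population_mean, average_of_sample_means)
```

Individual sample means differ from the population mean, but their average over all equally likely samples matches it.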
Intermediate steps during collection of data through sampling:- 1. Defining the population of concern. 2. Specifying a sampling method. 3. Implementing the sampling plan.
BROAD DIVISION OF SAMPLING 1. Subjective or purposive sampling:- The samples are drawn according to rules set by the investigator, so the personal element and the predetermined choice of the investigator come into play. For example, if an investigator has to select 20% of the towns from a list in order to study their characteristics and generalize the results to all towns, he or she may, for some reason or other (such as easier access), favour some towns and include them in the sample whether or not they represent the group or area.
2. Objective or probability sampling:- The sample is selected from the given set in such a way that each and every item or individual has an equal chance of being selected; hence it is also known as probability sampling. Samples are drawn by the chit or card method, or with the help of a random number table of triplets (three-digit random numbers). In the chit method, every item is written on a chit and the required number of chits is then drawn.
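The chit method can be imitated in a few lines of Python. This is a minimal sketch: the function name `draw_chits`, the list of towns and the seed are hypothetical, and the standard library's `random.Random.sample` plays the role of drawing chits without replacement.

```python
import random

def draw_chits(items, sample_size, seed=None):
    """Draw sample_size items without replacement, each equally likely."""
    rng = random.Random(seed)   # a seed only makes the draw repeatable
    return rng.sample(items, sample_size)

# Hypothetical list of 50 towns, one "chit" per town.
towns = [f"Town {i}" for i in range(1, 51)]
sample = draw_chits(towns, 10, seed=1)

print(sample)
```

Because the draw is without replacement, no town can appear twice in the sample.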
Demerits of random sampling:- Sometimes the inferences drawn from random sampling may be unreliable, because more units may be drawn from one type of field and fewer from another, and some classes may even remain unrepresented. So, in order to arrive at more correct inferences, it is advisable to divide the heterogeneous universe into homogeneous classes known as “STRATA”.
• Simple Random Sampling:- It does not use strata; that is, the heterogeneous population to be observed is not subdivided into homogeneous groups. • Stratified Random Sampling:- When sampling is done after subdividing the heterogeneous population into a few homogeneous groups known as “strata”, it is known as stratified random sampling.
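A stratified random sample can be sketched as follows. The assumptions here are not from the slides: the same sampling fraction is used in every stratum (proportional allocation), at least one unit is taken per stratum, and the field data and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Split units into strata, then draw a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for name in sorted(strata):
        members = strata[name]
        # Assumed rule: same fraction per stratum, never fewer than one unit.
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical fields: (crop type, field number). The crop type is the stratum.
fields = ([("wheat", i) for i in range(60)]
          + [("rice", i) for i in range(30)]
          + [("maize", i) for i in range(10)])

sample = stratified_sample(fields, lambda field: field[0], 0.1, seed=7)
print(sample)
```

Unlike a simple random sample of 10 fields, this draw cannot leave the small "maize" class unrepresented.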
SYSTEMATIC SAMPLING It relies on arranging the study population according to some ordering scheme and then selecting elements at regular intervals through that ordered list. Systematic sampling involves a random start and then proceeds with the selection of every Kth element from then onward, where K = population size / sample size. It is important that the starting point is not automatically the first in the list, but is instead chosen at random from the first to the Kth element.
As long as the starting point is randomized, systematic sampling is a type of probability sampling. The stratification it induces can make it more efficient, if the variable by which the list is ordered is correlated with the variable of interest. For example, suppose we wish to sample people from a long street that starts in a poor district (house no. 1) and ends in an expensive district (house no. 1000). A simple random selection of addresses from the street could easily end up with too many from the high end and too few from the low end (or vice versa), leading to an unrepresentative sample. Selecting every 10th house number ensures that the sample is spread evenly along the length of the street, representing all of these districts equally.
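The street example can be sketched in Python. This is a minimal illustration that assumes the sample size is no larger than the population and uses integer division for the interval K; the function name `systematic_sample` and the seed are hypothetical.

```python
import random

def systematic_sample(ordered_units, sample_size, seed=None):
    """Pick a random start among the first K units, then every Kth unit."""
    rng = random.Random(seed)
    k = len(ordered_units) // sample_size   # sampling interval K
    start = rng.randrange(k)                # 0-based position within the first K units
    return [ordered_units[start + i * k] for i in range(sample_size)]

# House numbers 1 to 1000 along the street, in order.
house_numbers = list(range(1, 1001))
sample = systematic_sample(house_numbers, 100, seed=3)

print(sample[:5])
```

With 1000 houses and a sample of 100, K is 10, so the selected houses are exactly 10 apart and cover the whole street.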
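Drawbacks of Systematic Sampling • Periodicity in the data: if the list has a repeating pattern whose period matches the sampling interval, the sample can be unrepresentative. • Arrangement of the units: the result depends on how the list is ordered.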
CLUSTER SAMPLING It is often more cost-effective to select respondents in groups (clusters). Sampling is then done on a geographical basis. For example, if we are surveying households in a city, we might choose to select 100 blocks and interview every household within the selected blocks. This can reduce travel and administration costs.
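The block example can be written as a short sketch. The city layout below (500 blocks of 20 households each), the function name `cluster_sample` and the seed are assumptions made only for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose whole clusters at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical city: 500 blocks, each with 20 households.
city = {
    f"block-{b}": [f"block-{b}/house-{h}" for h in range(1, 21)]
    for b in range(1, 501)
}

chosen_blocks, households = cluster_sample(city, 100, seed=11)
print(len(chosen_blocks), len(households))
```

Randomness enters only at the block level; inside a chosen block every household is interviewed, which is what keeps travel short.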
LINE INTERCEPT SAMPLING It is a method of sampling elements in a region in which an element is sampled if a chosen line segment (called a transect) intersects that element.
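A small geometric sketch may make the idea concrete. It assumes, purely for illustration, that each element is a circle (for example a shrub with a centre and a radius) and that the chosen line is a straight segment; an element is sampled when the distance from its centre to the segment is no more than its radius. All names and coordinates are hypothetical.

```python
import math

def distance_point_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point (px, py) to the segment from A to B."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:                      # segment is a single point
        return math.hypot(px - ax, py - ay)
    # Project the point onto the line and clamp to the segment ends.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def line_intercept_sample(elements, segment):
    """Return the names of circular elements that the segment touches."""
    (ax, ay), (bx, by) = segment
    return [
        name
        for name, (cx, cy, radius) in elements.items()
        if distance_point_to_segment(cx, cy, ax, ay, bx, by) <= radius
    ]

# Hypothetical shrubs: name -> (centre x, centre y, radius).
shrubs = {
    "A": (2.0, 0.5, 1.0),
    "B": (5.0, 3.0, 1.0),
    "C": (8.0, -0.2, 0.5),
    "D": (12.0, 0.0, 1.0),
}
chosen_line = ((0.0, 0.0), (10.0, 0.0))

sampled = line_intercept_sample(shrubs, chosen_line)
print(sampled)
```

Shrubs A and C lie across the line and are sampled; B is too far to the side and D lies beyond the end of the segment.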
Stratified sampling is more effective when these conditions are met:- 1. Variability within strata is minimized. 2. Variability between strata is maximized.
ADVANTAGES OVER OTHER SAMPLING METHODS: • Focuses on important subpopulations and ignores irrelevant ones. • Allows the use of different sampling techniques for different subpopulations. • Improves accuracy.
Disadvantages over other sampling methods: • Requires the selection of relevant stratification variables, which can be difficult. • It is not useful when there are no homogeneous subgroups. • Can be expensive to implement.
SURVEYS Survey methodology is a field of applied statistics that studies the sampling of individual units from a population and the associated data collection techniques, such as questionnaire construction and other methods for improving the accuracy of responses to surveys.
A single survey may focus on different types of topics, such as:- • Preference for a candidate • Opinion (for example, whether smoking or drinking alcohol is good or bad) • Behaviour (for example, smoking or alcohol use) • Factual information (for example, income)
Survey Methodology Topics • Identifying and selecting potential sample members. • Collecting data from those who are hard to reach. • Evaluating and testing questions. • Selecting the mode for posing questions and collecting responses. • Training and supervising interviewers.
Administering a Survey The choice of the mode of administering a survey is affected by the following factors:- • Cost • Coverage of the target population • Flexibility in asking questions • Respondents' willingness to participate • Response accuracy
Some common modes of administering a survey:- • Telephone • Mail • Online • Personal in-home surveys • Hybrids of the above
Factors that together make a survey method successful:- • Modes of data collection: the proper mode of questionnaire should be applied for each class of people to be surveyed. • Response formats: a survey contains a number of questions that the respondent has to answer in a set format. A distinction is made between open-ended and closed-ended questions.
An open-ended question asks the respondent to formulate his or her own answer, while a closed-ended question asks the respondent to pick an answer from a given number of options. The response options for a closed-ended question should be mutually exclusive and exhaustive, so that every respondent fits exactly one option.
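For numeric answers such as income, "mutually exclusive and exhaustive" can be checked mechanically. The sketch below assumes each option is a bracket (low, high) that includes `low` and excludes `high`, and that the options must cover everything from 0 upward; the bracket values and the function name are hypothetical.

```python
import math

def is_exclusive_and_exhaustive(brackets, lowest=0, highest=math.inf):
    """True if the brackets cover lowest..highest with no gaps or overlaps."""
    ordered = sorted(brackets)
    if not ordered:
        return False
    if ordered[0][0] != lowest or ordered[-1][1] != highest:
        return False                      # something below or above is uncovered
    # Each bracket must end exactly where the next one starts.
    return all(
        prev_high == next_low
        for (_, prev_high), (next_low, _) in zip(ordered, ordered[1:])
    )

# Hypothetical monthly income options for a closed-ended question.
good_options = [(0, 10000), (10000, 25000), (25000, math.inf)]
overlapping_options = [(0, 10000), (5000, 25000), (25000, math.inf)]
gapped_options = [(0, 10000), (15000, math.inf)]

print(is_exclusive_and_exhaustive(good_options))
print(is_exclusive_and_exhaustive(overlapping_options))
print(is_exclusive_and_exhaustive(gapped_options))
```

An overlap means some respondents fit two options (not mutually exclusive); a gap means some respondents fit none (not exhaustive).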
Non-Response Reduction:- The following ways have been recommended for reducing non-response:- • Advance letter:- A short letter is sent in advance to inform the sampled respondents about the upcoming survey. The style of the letter should be personalized. First, it announces that a phone call will be made or that an interviewer wants to make an appointment to do the survey face to face. Second, it describes the research topic. • The language used should be chosen carefully. • Interviewer effects should be taken into account.
TABULAR REPRESENTATION • The arrangement of data in a set of rows and columns is referred to as tabular representation. To tabulate the data, we first classify the given data on the basis of similarity. Classification provides the basis for tabulation.
Objectives of Tabular representation • Systematic representation of the given data. • Easy identification of desired value. • Easy identification of trends in data. • Basis for decision making.
Essential parts of a Statistical Table • Table Number:- Its purpose is to identify a particular table; it is used when a given discussion contains more than one table. • Stub:- The title given to the rows is called a stub. • Caption:- The title given to the columns is known as the caption. It is also called the Box-Head. There may be various sub-captions under one caption.
• Body:- The numerical information present in the rows and columns is called the body of the table. • Footnote:- An explanatory note written beneath the table to which it refers. Its purpose is to explain omissions. • Totals:- Where needed, totals and sub-totals for columns and rows can be given for the convenience of the reader.
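The parts of a table listed above can be put together in a small sketch. The function `build_table`, the column widths and the class data are hypothetical and only show where the table number, caption, stub, body, totals and footnote sit.

```python
def build_table(number, title, caption, rows, footnote):
    """Format a table as text; rows maps each stub entry to its body values."""
    column_totals = [sum(column) for column in zip(*rows.values())]

    lines = [f"Table {number}: {title}"]
    # Caption (column headings), with an extra column for row totals.
    lines.append(f"{'':<10}" + "".join(f"{c:>8}" for c in caption) + f"{'Total':>8}")
    # Stub (row heading) followed by the body values and the row total.
    for stub, values in rows.items():
        body = "".join(f"{v:>8}" for v in values)
        lines.append(f"{stub:<10}" + body + f"{sum(values):>8}")
    # Column totals and grand total.
    totals = "".join(f"{t:>8}" for t in column_totals)
    lines.append(f"{'Total':<10}" + totals + f"{sum(column_totals):>8}")
    lines.append(f"Note: {footnote}")
    return "\n".join(lines), column_totals

# Hypothetical data, invented only for illustration.
text, column_totals = build_table(
    number=1,
    title="Students by class and gender (hypothetical data)",
    caption=["Boys", "Girls"],
    rows={"Class A": [20, 15], "Class B": [18, 22]},
    footnote="Figures are invented for illustration.",
)
print(text)
```

Keeping the totals in the table itself saves the reader from adding up the body values by hand.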
Dos and Don'ts in Tabular representation • The table should be neat and simple. • It should avoid the use of abbreviations; if any are used, they should be explained in the footnote. • It should have a self-explanatory title. • Captions, sub-captions and stubs should all be clear and brief. • All numerical values should be rounded off to a common number of decimal places.
• Figures that need to be emphasized should be put between two thick lines or in a box. • For missing values, N.A. (NOT APPLICABLE) should be written, and it should be explained in the footnote.
DIAGRAMMATIC REPRESENTATION • There is a very old saying, “A picture is worth 10,000 words”. It is very true, and in statistics we try to achieve the same effect through the visual representation of data in diagrams and graphs.