Module-1
Introduction to Research
Research
Research is an art of scientific investigation. It is also the systematic design, collection, analysis and reporting of findings and solutions for the marketing problems of a company.
Objectives of research
1. It promotes better decision making.
2. Research is the basis for innovation.
3. It identifies the problem area.
4. It helps in forecasting, which is very useful for managers.
5. Research helps in the formulation of policies and strategies.
6. It helps in the development of new products and in understanding the competitive environment.
7. It helps in the optimal utilization of resources.
8. It helps in identifying marketing opportunities and constraints.
9. It helps in evaluating marketing plans.
Types of Business Research
Research may be classified as follows:
1. Pure Research: Pure research is undertaken for the sake of knowledge without any intention to apply it in practice. Pure research is also known as basic or fundamental research. Pure research helps to find the critical factors in a practical problem. It develops many alternative solutions and thus enables us to choose the best solution.
2. Applied Research: Applied research is conducted when a decision must be made about a specific real-life problem. It is thus problem oriented and action directed.
Contribution of Applied Research:
1) Applied research can contribute new facts.
2) Applied research can put theory to the test.
3) An applied research study offers an opportunity to test the validity of existing theory.
4) Applied research may aid in conceptual clarification.
5) Applied research may integrate previously existing theories.
3. Exploratory Research: Exploratory research is also known as formulative research. It is the first stage of a three-stage process of exploration, description and experimentation. Exploratory research is a preliminary study of an unfamiliar problem about which the researcher has little or no knowledge. It is similar to a doctor's initial investigation of a patient. The need for exploratory studies: exploratory research is necessary to gain initial insight into problems so that they can be formulated for more precise investigation, which is why it is called formulative research.
4. Descriptive Research: A descriptive study is a fact-finding investigation with adequate interpretation. It is the simplest type of research. It is more specific than an exploratory study. This study aims to identify the characteristics of a community. This study employs simple statistical techniques.
5. Causal Research: Causal research (also referred to as explanatory research) is the investigation of cause-and-effect relationships. In order to determine causality, it is important to observe variation in the variable that is assumed to cause the change in the other variables, and then measure the changes in the other variables.
6. Action Research: Action research is either research initiated to solve an immediate problem or a reflective process of progressive problem solving led by individuals working with others in teams or as part of a "community of practice" to improve the way they address issues and solve problems.
7. Conceptual Research: It is used by philosophers and thinkers and relates to abstract ideas or theory. The researcher carries out research to prove or disprove a hypothesis.
8. Historical Research: It is based on past records and data and is used to understand future trends. There is no direct observation; the researcher has to depend on conclusions or inferences drawn in the past.
Criteria or Features of a good research study
1. Empirical: Research is based on direct experience or observation by the researcher.
2. Logical: Research is based on valid procedures and principles.
3. Cyclical: Research is a cyclical process because it starts with a problem and ends with a problem.
4. Analytical: Research utilizes proven analytical procedures in gathering the data, whether historical, descriptive, experimental or case study.
5. Critical: Research exhibits careful and precise judgment.
6. Methodical: Research is conducted in a methodical manner, without bias, using systematic methods and procedures.
7. Replicability: The research design and procedures are replicated or repeated to enable the researcher to arrive at valid and conclusive results.
Marketing Research
Marketing research is objective, i.e. it attempts to provide accurate information that reflects the true state of affairs. It should be conducted impartially.
Limitations of marketing research
1. MR is not an exact science: Results are less accurate than those of the physical sciences because of the many uncontrollable variables.
2. Complex in nature: MR is carried out on human beings, who tend to behave artificially when they know they are being observed.
3. Inexperienced research staff: Subjectivity is an important limitation in MR. It is very difficult to verify the results, whereas verifiability is a main characteristic of physical science.
4. Limitations of time: MR generally takes a long time to conduct. By the time the results are presented, the market situation may have changed.
5. Subjectivity: However objective a person may be, a researcher may sometimes cheat or misrepresent facts, or may be misled by respondents. The researcher might also draw conclusions based on past experience.
6. Time frame: Top management may hold certain preconceived opinions on the outcome.
7. Benefits vs. cost: The benefits should outweigh the cost.
8. Availability of resources: Financial or human resources may be limited.
Scientific Research
Scientific research is research that yields the same results when repeated by different individuals. The scientific research method is one and the same across the branches of science and is used by scientists in all kinds of research. This method has attracted trained minds.
Characteristics of Scientific Research
1. Validity: The ability of a measuring instrument to measure what it is supposed to measure. E.g. a barometer vs. a questionnaire.
2. Reliability: The result remains the same even when the measurement is repeated any number of times.
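One common way to check reliability is to measure the same respondents twice and correlate the two sets of scores (a test-retest check). The sketch below is only an illustration: the function name pearson_r and the scores are hypothetical, not taken from these notes.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores from the same five respondents on two occasions.
first_round = [4, 5, 3, 2, 5]
second_round = [4, 4, 3, 2, 5]

# A value close to 1 suggests the instrument gives consistent results.
reliability = pearson_r(first_round, second_round)
print(round(reliability, 2))
```

A high correlation indicates consistency only; it says nothing about validity, since an instrument can measure the wrong thing consistently.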
Difficulties in applying scientific methods to marketing research
1. Complexity of the subject: MR deals with human beings, who are influenced by environmental factors and peers.
2. Difficulty of obtaining accurate measurements: The information obtained is qualitative in nature, and since the participants are human beings, subjectivity invariably creeps in.
3. Influence of measurement: When a respondent realizes that he or she is being measured, his or her responses and behavior change.
4. Difficulty in testing hypotheses: In MR it is almost impractical to carry out experiments in a controlled way, which affects the hypothesis being tested. It is also very difficult to make accurate predictions.
5. Role of investigators: Organizations are the clients of the researcher. Sometimes the investigator tries to make the results fit, manipulates the data, or does not conduct an exhaustive study.
6. Time pressure: MR must be conducted and completed within a given time. If the research takes too long, competitors might enter and capture the market.
Scientific methods vs.
Non scientific methods
Business Research
Business research involves establishing objectives and gathering relevant information to obtain the answer to a business issue.
Alternatively, business research can be defined as the systematic and objective process of gathering, recording and analyzing data to aid in making business decisions.
Importance of Marketing Research Roles
1. Marketing research serves two major functions: (i) it provides information for decision making and (ii) it develops new knowledge.
2. The use of information gathered by marketing research reduces the risks involved in decision making.
3. It influences decisions such as pricing of the product, scale of advertising, etc.
4. The information collected directly affects the planning of the product.
5. Market research is put to substantial use by firms that produce branded products and compete with other brands, in order to know and maintain the popularity of their products among consumers.
External organizations for conducting MR
1. Advertising agencies: Advertising agencies conduct research for their clients. They undertake media studies, group research, opinion research, market potential research, etc.
2. Trade associations: Trade associations also conduct MR. For instance, the Confederation of Engineering Industries conducts MR for various engineering products.
3. Manufacturers: Manufacturers in smaller industries join together and undertake MR for their mutual benefit. Ex: textile companies have the textile manufacturers' association, which conducts research on the potential for garments made in the country.
4. Retailers and wholesalers: Retailers have been predominantly concerned with shop location studies, special promotion studies, pricing, retail store investigations, sales research and so on. Retailers conduct MR less frequently since they are in close proximity to the customers. Wholesalers are interested in conducting MR on retailers' behavior. They are also interested in learning about the attitudes of retailers towards inventories, the service provided by the wholesalers, etc.
5. Governmental agencies: MR is also carried out by a few government agencies. Government departments collect information on subjects such as agricultural market surplus, consumer goods market surplus, price indices, imports and exports, etc. This helps them in formulating policies.
6. Universities and institutions: Universities and institutions also conduct MR. Ex: institutes like the IIMs and IIFT engage in marketing research for certain corporate entities.
Research process/steps involved in preparing a business research plan/proposal or designing research
The research process is the step-by-step procedure of developing one's research.
1. Management problem: This is the most important step, because only when a problem has been clearly and accurately identified can a research project be conducted properly. It asks what the decision maker needs to do. It is mostly action oriented and focuses on symptoms.
2. Defining the research problem: The research problem is a general statement of an issue meriting research. Its nature will suggest appropriate forms for its investigation. Problem definition involves stating the general marketing research problem and identifying its specific components.
3. Formulating the research hypothesis: A good hypothesis relates and explains the known facts. It should also predict new facts. It must be stated in such a way that we can test it by experimentation or further observation, or it is of no scientific value. It must also be stated in a way that would allow us to show if it is incorrect, i.e., it must be "falsifiable."
4. Developing the research proposal: A research proposal is a specific kind of document written for a specific purpose. Research involves a series of actions, and the proposal presents all of these actions in a systematic and scientific way. In this way, the research proposal is a blueprint of the study which outlines the steps that the researcher will undertake during the conduct of his/her study.
5. Research design formulation: A research design is a framework or blueprint for conducting the marketing research project. It details the procedures necessary for obtaining the required information, and its purpose is to design a study that will test the hypotheses of interest, determine possible answers to the research questions, and provide the information needed for decision making.
6. Design of data collection methods: Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes. Generally there are three types of data collection:
1) Surveys
2) Interviews
3) Focus groups
7. Sample design: Sampling is a means of selecting a subset of units from a target population for the purpose of collecting information. This information is used to draw inferences about the population as a whole.
8. Analysis and interpretation of data: Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.
9. Research report: The entire project is documented. The report addresses the specific research question identified and describes the approach, research design, data collection and data analysis methods, and major findings. Reports are presented in a comprehensible format.
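The sample design step above can be illustrated with a short sketch: a subset of units is drawn from a target population and its mean is used to infer the population mean. The population values, the sample size of 100 and the fixed seed are all assumptions made for illustration.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical target population: monthly spend of 1,000 customers.
population = [random.randint(100, 1000) for _ in range(1000)]

# Simple random sample of 100 units, drawn without replacement.
sample = random.sample(population, k=100)

population_mean = mean(population)
sample_mean = mean(sample)

# The sample mean is the estimate; in practice the population mean is unknown.
print(f"population mean = {population_mean:.1f}, sample mean = {sample_mean:.1f}")
```

Simple random sampling is only one possible sample design; it is used here because it is the easiest to show in a few lines.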
Research Design formulation
It is a framework or blueprint for conducting research. It details the procedure and designs the study to test the hypotheses of interest, determine possible answers to the research question, and provide the information needed for decision making.
It involves the following steps:
1. Secondary data analysis (based on secondary research): This is an essential step in problem definition. Secondary data are data collected for some purpose other than the problem at hand, whereas primary data are collected for the specific purpose of addressing the research problem. Secondary data include information made available by business and government sources, commercial MR firms, and computerized databases.
2. Qualitative research: When the information obtained by the above methods is not sufficient, the researcher may turn to this method. Qualitative research is unstructured, exploratory in nature, and based on small samples. Popular qualitative techniques are focus groups, word association, depth interviews, pilot surveys and case studies. All these tasks help the researcher to understand the environmental context of the problem.
3. Methods of collecting quantitative data: Quantitative data are collected through survey, observation, and experimentation. Proper selection, training, supervision, and evaluation of the field force minimize data collection errors.
4. Definition of the information needed: The company needs to know the extent of competition and the price and quality acceptance from the market. In this context, the following information is required:
1) Total demand and company sales
2) Distribution coverage
3) Market awareness, attitudes and usage
4) Marketing expenditure
5) Competitors' marketing expenditure
5. Measurement and scaling procedures: Measurement means assigning numbers or other symbols to characteristics of objects according to certain pre-specified rules. Scaling is an extension of measurement. Scaling involves creating a continuum on which measured objects are located.
6. Questionnaire design: A questionnaire is a paper-and-pencil instrument that is administered to the respondents. The usual questions found in questionnaires are closed-ended questions, which are followed by response options. However, there are questionnaires that ask open-ended questions to explore the answers of the respondents.
7. Sampling process and sample size: Sampling is a means of selecting a subset of units from a target population for the purpose of collecting information. This information is used to draw inferences about the population as a whole.
8. Plan of data analysis: Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.
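The measurement and scaling step in the list above can be made concrete with a small sketch: a pre-specified rule assigns numbers to verbal responses, and the numbers place each respondent on a continuum. The five-point agreement scale, the name LIKERT_RULE and the responses are assumptions chosen for illustration.

```python
# Pre-specified rule: each verbal response maps to a position on a 1-5 continuum.
LIKERT_RULE = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def measure(responses):
    """Assign a number to each response according to the rule above."""
    return [LIKERT_RULE[r.strip().lower()] for r in responses]

# Hypothetical answers to the statement "The product is good value for money".
responses = ["Agree", "Strongly agree", "Neutral", "Agree", "Disagree"]

scores = measure(responses)
average = sum(scores) / len(scores)
print(scores, average)
```

Averaging such scores treats the scale as if the gaps between categories were equal, which is a simplifying assumption.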
Management problem vs. Research problem
Management problem | Research problem
1. Develop the package for a new product. | 1. Evaluate the effectiveness of alternative package designs.
2. Select a medium for product advertising. | 2. Conduct an investigation to determine suitable media and evaluate the impact of the media through research.
3. Increase the amount of repurchase behavior of the customer. | 3. Assess the current amount of repeat purchase behavior.
4. Introduce a new product. | 4. Design a test market through which the likely acceptance of the new product can be gauged.
5. Should the price of the brand be increased? | 5. Determine the price elasticity of demand and the impact on sales and profits of various levels of price changes.
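Research problem 5 in the table above asks for the price elasticity of demand. A minimal sketch using the midpoint (arc) elasticity formula is given below; the prices and quantities are hypothetical, and the midpoint formula is only one of several ways to compute elasticity.

```python
def arc_price_elasticity(q_old, q_new, p_old, p_new):
    """Midpoint (arc) elasticity: % change in quantity / % change in price."""
    pct_change_q = (q_new - q_old) / ((q_new + q_old) / 2)
    pct_change_p = (p_new - p_old) / ((p_new + p_old) / 2)
    return pct_change_q / pct_change_p

# Hypothetical test-market result: price raised from 50 to 55,
# weekly sales fell from 1,000 to 900 units.
elasticity = arc_price_elasticity(q_old=1000, q_new=900, p_old=50, p_new=55)

revenue_before = 50 * 1000
revenue_after = 55 * 900

print(round(elasticity, 2), revenue_before, revenue_after)
```

An elasticity whose absolute value is greater than 1 means demand is price elastic, so in this made-up case the price increase lowers sales revenue.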
************
Module-2
Business Research Design
Research Design
A research design is a framework or blueprint for conducting the marketing research project. It details the procedures necessary for obtaining the information needed to structure or solve marketing research problems. It specifies the details, the nuts and bolts, of implementing that approach.
A Classification of Marketing Research Designs
1. Exploratory Research
Exploratory research is also known as formulative research. It is the first stage of a three-stage process of exploration, description and experimentation. Exploratory research is a preliminary study of an unfamiliar problem about which the researcher has little or no knowledge. It is similar to a doctor's initial investigation of a patient. The need for exploratory studies: exploratory research is necessary to gain initial insight into problems so that they can be formulated for more precise investigation, which is why it is called formulative research.
Purpose
– To gather preliminary information that will help define problems and suggest hypotheses.
– To gain familiarity with a phenomenon or acquire new insight into it in order to formulate a more precise problem or develop hypotheses.
Methods of Exploratory Research
1. Secondary resource analysis/Review/Survey of concerned literature: When the investigator sets out on the path of research, he or she should take advantage of the work of predecessors. This technique saves time, money, and effort. This kind of data can be obtained from professional research organizations, websites, newspapers, magazines, government journals, etc.
2. Expert opinion survey/Experience survey: It is better to interview those individuals who know about the subject. The objective of such a survey is to obtain insight into the relationships between variables and new ideas relating to the research problem. The selected respondents are interviewed by the researcher, who should prepare an interview schedule for the systematic questioning of informants. An experience survey may thus enable the researcher to define the problem more concisely and help in the formulation of hypotheses.
3. Focus group discussions: Most organizations that run focus groups first screen the candidates to decide who will make up a particular group. Group interaction is the key factor that differentiates focus group interviews from experience surveys, which are conducted with one respondent at a time. It is also the key advantage of the focus group over most other exploratory techniques. Because of their interactive nature, ideas sometimes drop "out of the blue" in a focus group discussion.
4. Comprehensive case methods (analysis of insight-stimulating cases): This involves the study of one or a few situations. It focuses on complex situations and problems, which occur when the interrelations of several individuals are important. In this method of exploratory research a few units are analyzed, and each unit is called a case.
Uses of Exploratory Research
1. Formulate a problem or define a problem more precisely
2. Identify alternative courses of action
3. Develop hypotheses
4. Isolate key variables and relationships for further examination
5. Gain insights for developing an approach to the problem
6. Establish priorities for further research
2. Conclusive Research Design
Conclusive research aims to verify insights and to aid decision makers in selecting a specific course of action. Conclusive research is sometimes called confirmatory research, as it is used to "confirm" a hypothesis.
There are two types of conclusive research:
1) Descriptive research
2) Causal research or experimental research
Descriptive Research
A descriptive study is a fact-finding investigation with adequate interpretation. It is the simplest type of research. It is more specific than an exploratory study. This study aims to identify the characteristics of a community. This study employs simple statistical techniques.
Uses of Descriptive Research
1. To describe something, usually market characteristics or functions.
2. To describe the characteristics of relevant groups, such as consumers, salespeople, organizations, or market areas.
3. To estimate the percentage of units in a specified population exhibiting a certain behavior.
4. To determine the perceptions of product characteristics.
5. To determine the degree to which marketing variables are associated.
6. To make specific predictions.
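Use 3 in the list above, estimating the percentage of units exhibiting a behavior, can be sketched as a sample proportion with an approximate 95% confidence interval. The survey counts are hypothetical, and the normal approximation used here assumes a reasonably large simple random sample.

```python
from math import sqrt

def proportion_with_ci(successes, n, z=1.96):
    """Sample proportion with a normal-approximation confidence interval."""
    p = successes / n
    standard_error = sqrt(p * (1 - p) / n)
    return p, p - z * standard_error, p + z * standard_error

# Hypothetical survey: 120 of 400 sampled households buy the brand every month.
p, low, high = proportion_with_ci(successes=120, n=400)

print(f"estimate = {p:.1%}, 95% CI = {low:.1%} to {high:.1%}")
```

The interval narrows as the sample grows, which is one reason descriptive studies usually need large, representative samples.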
Types of Descriptive Research
1. Cross-sectional studies
A cross-sectional study (also known as a cross-sectional analysis, transversal study or prevalence study) is a type of observational study that involves the analysis of data collected from a population, or a representative subset, at one specific point in time, that is, cross-sectional data. Cross-sectional studies are carried out at one time point or over a short period. They are usually conducted to estimate the prevalence of the outcome of interest for a given population, commonly for the purposes of public health planning. Data can also be collected on individual characteristics, including exposure to risk factors, alongside information about the outcome. In this way cross-sectional studies provide a 'snapshot' of the outcome and the characteristics associated with it, at a specific point in time.
Cross-sectional research studies all have the following characteristics:
– They take place at a single point in time.
– Variables are not manipulated by researchers.
– They provide information only; they do not answer why.
2. Longitudinal studies
A longitudinal survey is a correlational research study that involves repeated observations of the same variables over long periods of time, often many decades. It is a type of observational study. Longitudinal studies are often used in psychology to study developmental trends across the life span, and in sociology to study life events throughout lifetimes or generations. The reason for this is that, unlike cross-sectional studies, in which different individuals with the same characteristics are compared, longitudinal studies track the same people, and therefore the differences observed in those people are less likely to be the result of cultural differences across generations.
Difference between Exploratory research and Descriptive research
Exploratory research | Descriptive research
1. It is concerned with the “why” aspect of consumer behavior, i.e., it tries to understand the problem and not measure the result. | 1. It is concerned with the “what”, “when” or “how often” of consumer behavior.
2. It does not require large samples. | 2. It needs large samples of respondents.
3. The sample need not be representative of the population. | 3. The sample must be representative of the population.
4. Because the problem statement is imprecise, data collection is not easy. | 4. The problem statement is precise, so data collection is easy.
5. The characteristics of interest to be measured are not clear. | 5. The characteristics of interest to be measured are clear.
6. There is no need for a questionnaire for collecting the data. | 6. There should be a properly designed questionnaire for collecting the data.
7. Data collection methods are focus groups, literature search and case studies. | 7. Data collection uses panel data and longitudinal and cross-sectional studies.
Causal Research
Causal research (also referred to as explanatory or experimental research) is the investigation of cause-and-effect relationships. In order to determine causality, it is important to observe variation in the variable that is assumed to cause the change in the other variables, and then measure the changes in the other variables.
Uses of Causal Research
1. To understand which variables are the cause (independent variables) and which variables are the effect (dependent variables) of a phenomenon.
2. To determine the nature of the relationship between the causal variables and the effect to be predicted.
Classification of experimental designs
Experimental designs are classified as:
1. Pre-experimental design
2. Quasi-experimental design
3. True experimental design
4. Statistical experimental design
1. Pre-experimental design
Pre-experimental designs are so named because they follow basic experimental steps but fail to include a control group. In other words, a single group is often studied but no comparison with an equivalent non-treatment group is made.
Pre-experimental designs include:
– Case study design
– One-group pre-test/post-test design
– Static group comparison design (cross-sectional study)
2. Quasi-experimental design
A quasi-experiment is an empirical study used to estimate the causal impact of an intervention on its target population. Quasi-experimental research shares similarities with the traditional experimental design or randomized controlled trial, but it specifically lacks the element of random assignment to treatment or control. Instead, quasi-experimental designs typically allow the researcher to control the assignment to the treatment condition, but using some criterion other than random assignment (e.g., an eligibility cutoff mark).
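The eligibility cutoff mentioned above can be sketched as a simple assignment rule: units at or above the cutoff receive the treatment, the rest form the comparison group, and nothing is left to chance. The store names, scores and cutoff of 60 are hypothetical.

```python
def assign_by_cutoff(scores, cutoff):
    """Assign units to treatment by an eligibility cutoff, not at random."""
    treatment = {name for name, score in scores.items() if score >= cutoff}
    comparison = set(scores) - treatment
    return treatment, comparison

# Hypothetical eligibility scores for five retail stores.
scores = {"A": 72, "B": 55, "C": 64, "D": 48, "E": 60}

treatment, comparison = assign_by_cutoff(scores, cutoff=60)
print(sorted(treatment), sorted(comparison))
```

Because the groups differ systematically on the score, differences in outcome cannot be attributed to the treatment as confidently as in a randomized design.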
3. True experimental design
True experimental design is regarded as the most accurate form of experimental research, in that it tries to prove or disprove a hypothesis mathematically, with statistical analysis.
In some of the physical sciences, such as physics, chemistry and geology, true experiments are standard and commonly used. In the social sciences, psychology and biology, they can be a little more difficult to set up. For an experiment to be classed as a true experimental design, it must fit all of the following criteria:
1) The sample groups must be assigned randomly.
2) There must be a viable control group.
3) Only one variable can be manipulated and tested. It is possible to test more than one, but such experiments and their statistical analysis tend to be cumbersome and difficult.
4) The tested subjects must be randomly assigned to either control or experimental groups.
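The random-assignment criterion above can be sketched as follows: the subjects are shuffled and split in half, so that chance alone decides who lands in the experimental group and who in the control group. The function name randomly_assign, the subject labels and the seed are assumptions for illustration.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}

# Twenty hypothetical subjects labelled S1 to S20.
subjects = [f"S{i}" for i in range(1, 21)]
groups = randomly_assign(subjects, seed=7)

print(groups["experimental"])
print(groups["control"])
```

Shuffling before splitting gives every subject the same chance of receiving the treatment, which is what distinguishes this from a cutoff-based or convenience assignment.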
4. Statistical experimental design
The term statistical experimental design refers to a plan for assigning experimental units to treatment conditions. A good experimental design serves three purposes:
1) Causation: It allows the experimenter to make causal inferences about the relationship between independent variables and a dependent variable.
2) Control: It allows the experimenter to rule out alternative explanations due to the confounding effects of extraneous variables (i.e., variables other than the independent variables).
3) Variability: It reduces variability within treatment conditions, which makes it easier to detect differences in treatment outcomes.
Types of errors affecting research design
1. Total error is the variation between the true mean value in the population of the variable of interest and the observed mean value obtained in the marketing research project.
2. Random sampling error is the variation between the true mean value for the population and the true mean value for the original sample. It occurs because the sample selected is an imperfect representation of the population.
3. Non-sampling errors can be attributed to sources other than sampling, and they may be random or nonrandom. They include errors in problem definition, approach, scales, questionnaire design, interviewing methods, and data preparation and analysis. Non-sampling errors consist of non-response errors and response errors.
4. Non-response error arises when some of the respondents included in the sample do not respond. The primary causes of non-response are refusals and not-at-homes. This causes the net or resulting sample to differ in size or composition from the original sample.
5. Response error arises when respondents give inaccurate answers or their answers are misrecorded or misanalyzed. Response errors can be made by researchers, interviewers or respondents.
6. Errors made by the researcher include:
1) Surrogate information error: The variation between the information needed for the marketing research problem and the information sought by the researcher.
2) Measurement error: The variation between the information sought and the information generated by the measurement process employed by the researcher.
3) Population definition error: The variation between the actual population relevant to the problem and the population as defined by the researcher.
4) Data analysis error: Errors that occur while raw data are transformed into research findings.
7. Response errors made by the interviewer include:
1) Respondent selection error: Occurs when interviewers select respondents other than those specified by the sampling design.
2) Questioning error: Errors made in asking questions or in not probing when more information is needed.
3) Recording error: Errors in hearing, interpreting and recording the answers.
4) Cheating error: Occurs when the interviewer fabricates answers to part or all of the interview.
8. Errors made by the respondent include:
1) Inability error: Inability to provide accurate answers because of unfamiliarity, fatigue, boredom, faulty recall, question content, etc.
2) Unwillingness error: Respondents may intentionally misreport their answers.
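The relationship between total error, random sampling error and non-sampling error can be shown with a small worked example. All figures are hypothetical, and the sketch assumes the errors are measured as signed differences between means so that they add up.

```python
from statistics import mean

# True values of the variable of interest for a hypothetical population of ten.
population = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# True values of the five units chosen by the sampling design.
original_sample = [20, 40, 60, 80, 100]

# What the project actually recorded: the unit with value 100 did not respond
# (non-response error) and 60 was misrecorded as 65 (response error).
observed = [20, 40, 65, 80]

true_population_mean = mean(population)
true_sample_mean = mean(original_sample)
observed_mean = mean(observed)

random_sampling_error = true_sample_mean - true_population_mean
non_sampling_error = observed_mean - true_sample_mean
total_error = observed_mean - true_population_mean

print(random_sampling_error, non_sampling_error, total_error)
```

In this made-up case the two components partly cancel, which shows that a small total error does not by itself mean that each source of error is small.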
Exploratory & Conclusive Research Differences
A Comparison of Basic Research Designs
***********
Module-3
Data Collection
Data Collection
Data collection is the process of
gathering and measuring information on variables of interest, in an established
systematic fashion that enables one to answer stated research questions, test
hypotheses, and evaluate outcomes.
Primary and Secondary data
A common classification is based upon
who collected the data.
Primary data: Data collected by the investigator himself/herself for a specific purpose.
Examples: Data collected by a student for his/her thesis or research project. (In movies) The hero is directly told by the heroine that he is her “ideal man”.
Secondary data: Data collected by someone else for some other purpose (but being utilized by the investigator for another purpose).
Examples: Census data being used to analyze the impact of education on career choice and earnings.
Advantages of primary data
1. It is very accurate.
2. It is the most suitable method of data collection.
3. Results are very good.
Disadvantages of primary data
1. It takes much time to collect.
2. It is an expensive method.
3. It takes much labor.
Advantages of secondary data
Secondary data can help the researcher to:
1. Identify the problem
2. Better define the problem
3. Develop an approach to the problem
4. Formulate an appropriate research design (for example, by identifying the key variables)
5. Answer certain research questions and test some hypotheses
6. Interpret primary data more insightfully
Disadvantages of secondary data
1. Their relevance to and accuracy for the current problem may be limited.
2. The objectives, nature, and methods used may not be appropriate to the present situation.
3. The data may not be completely current.
Sources of data
Sources of secondary data
1. Internal secondary data sources: Data generated within the organization in the process of routine business activities are referred to as internal secondary data. Financial accounts, production, quality control, and sales records are examples of such data.
2. External secondary data sources: Data collected by the researcher from outside the company. These can be divided into four parts:
1) Census data: the most important among the sources of data.
2) Published individual project reports.
3) Data collected for sale on a commercial basis, called syndicated data.
4) Miscellaneous data.
Primary Data collection methods/
sources of primary data
Data collection methods for impact evaluation vary along a continuum. At one end of this continuum are quantitative methods and at the other end are qualitative methods for data collection.
1. Observations
Observation is a process of recording the behavior patterns of people, objects, and occurrences without questioning or communicating with them. Observation can take place in a laboratory setting or in a natural setting. Generally there are two ways to conduct observation, namely non-participative observation and participative observation. In non-participative observation, the researcher is not involved in the activities of the people being observed. He or she merely records whatever happens among the people, including their actions and behavior, and anything else worth recording. In participative observation, on the other hand, the researcher is fully involved with the people being observed, with the objective of trying to understand the values, motives and practices of those being researched.
2.
Survey
Survey research is often used to
assess thoughts, opinions, and feelings. Survey research can be specific and
limited, or it can have more global, widespread goals.
The span of time needed to complete
the survey brings us to the two different types of surveys: cross-sectional and
longitudinal.
1. Cross-Sectional Surveys
In a cross-sectional survey, information is collected from the
respondents at a single point in time.
Cross-sectional surveys usually utilize questionnaires to ask about a
particular topic at one point in time. For instance, a researcher conducted a
cross-sectional survey asking teenagers’ views on cigarette smoking as of May
2010. Sometimes, cross-sectional surveys are used to identify the relationship
between two variables, as in a comparative study.
2. Longitudinal Surveys
When the researcher attempts to
gather information over a period of time or from one point in time up to
another, he or she is conducting a longitudinal survey. The aim of longitudinal surveys is
to collect data and examine the changes in the data gathered. Longitudinal
surveys are used in cohort studies, panel studies and trend studies.
3.
Questionnaires
Typically, a questionnaire is a
paper-and-pencil instrument that is administered to the respondents. The usual
questions found in questionnaires are closed-ended questions, which are
followed by response options. However, there are questionnaires that ask
open-ended questions to explore the answers of the respondents.
Questionnaires have been developed
over the years. Today, questionnaires are utilized in various survey methods,
according to how they are given. These methods include the self-administered,
the group-administered, and the household drop-off. Among the three, the
self-administered survey method is often used by researchers nowadays. The
self-administered questionnaires are widely known as the mail survey method.
4.
Interviews
Between the two broad types of
surveys, interviews are more personal and probing. Questionnaires do not
provide the freedom to ask follow-up questions to explore the answers of the
respondents, but interviews do.
An interview includes two persons -
the researcher as the interviewer, and the respondent as the interviewee. There
are several survey methods that utilize interviews. These are the personal or
face-to-face interview, the phone interview, and more recently, the online
interview.
5.
Qualitative Techniques of data collection
Data collection approaches for qualitative research usually involve:
1. Direct interaction with individuals on a one-to-one basis
2. Direct interaction with individuals in a group setting
Qualitative data collection methods are time consuming, so data is usually collected from a smaller sample than would be the case for quantitative approaches. This also makes qualitative research more expensive. The benefit of the qualitative approach is that the information is richer and gives deeper insight into the phenomenon under study.
Measurement and Scaling
Technique
Measurement is a process of mapping
aspects of a domain onto other aspects of a range according to some rule of
correspondence.
Scaling is the assignment of objects
to numbers or semantics according to a rule. In scaling, the objects are text
statements, usually statements of attitude, opinion, or feeling.
Basic measurement scales techniques
1.
Nominal scale
Nominal Scale is the crudest among
all measurement scales but it is also the simplest scale. In this scale the
different scores on a measurement simply indicate different categories. The
nominal scale does not express any values or relationships between variables.
The nominal scale is often referred to as a categorical scale. The assigned
numbers have no arithmetic properties and act only as labels. The only
statistical operation that can be performed on nominal scales is a frequency
count.
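As a minimal sketch of the frequency count described above, the following Python snippet tallies nominal codes. The brand codes, the `labels` mapping and the function name `frequency_count` are hypothetical and are used only for illustration.

```python
from collections import Counter

# Hypothetical nominal data: the numbers are labels only
# (1 = Brand A, 2 = Brand B, 3 = Brand C); they carry no order or magnitude.
responses = [1, 3, 2, 1, 1, 3, 2, 1]
labels = {1: "Brand A", 2: "Brand B", 3: "Brand C"}


def frequency_count(values):
    """Count how often each category code occurs."""
    return Counter(values)


counts = frequency_count(responses)
for code, n in sorted(counts.items()):
    print(f"{labels[code]}: {n}")
```

Averaging these codes would be meaningless, which is why only the counts are reported.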
2.
Ordinal scale
Ordinal Scale involves the ranking of
items along the continuum of the characteristic being scaled. In this scale,
the items are classified according to whether they have more or less of a
characteristic. The main characteristic of the ordinal scale is that the
categories have a logical or ordered relationship. This type of scale permits
the measurement of degrees of difference, (i.e. ‘more’ or ‘less’) but not the
specific amount of differences (i.e. how much ‘more’ or ‘less’). This scale is
very common in marketing, satisfaction and attitudinal research.
3.
Interval scale
Interval Scale is a scale in which the numbers are used to rank attributes such that numerically equal distances on the scale represent equal distances in the characteristic being measured. An interval scale contains all the information of an ordinal scale, but it also allows one to compare the difference/distance between attributes. Interval scales may be in either numeric or semantic formats. Interval scales allow the calculation of averages like Mean, Median and Mode and measures of dispersion like Range and Standard Deviation.
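The statistics named above can be illustrated with a short Python sketch using the standard `statistics` module. The temperature readings and the helper name `describe` are assumptions made for the example; temperature in degrees Celsius is used here as a typical interval-scale variable.

```python
import statistics

# Hypothetical interval-scale data: daily temperatures in degrees Celsius.
temps = [21, 23, 23, 25, 28]


def describe(values):
    """Return the averages and dispersion measures that interval data permit."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


summary = describe(temps)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Differences between readings are meaningful here, but ratios are not, because zero degrees Celsius is not an absolute zero point.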
4.
Ratio scale
Ratio Scale is the highest level of
measurement scales. This has the properties of an interval scale together with
a fixed (absolute) zero point. The absolute zero point allows us to construct a
meaningful ratio. Ratio scales permit the researcher to compare both
differences in scores and relative magnitude of scores. Examples of ratio
scales include weights, lengths and times.
5.
Attitude measurement scale
Attitudes are composed of 1) beliefs about the subject, 2) emotional feeling (like-dislike), and 3) readiness to respond behaviorally, i.e. to buy. "Attitude is defined as the predisposition to respond to an idea or object, and in marketing it relates to the consumer's predisposition to respond to a particular product or service".
6.
Likert’s Scale
The Likert scale is extremely popular for
measuring attitudes because the method is simple to administer. With the
Likert scale, the respondents indicate their own attitudes by checking how
strongly they agree or disagree with carefully worded statements that range
from very positive to very negative towards the attitudinal object. Respondents
generally choose from five alternatives (say strongly agree, agree, neither
agree nor disagree, disagree, strongly disagree). A Likert scale may include a
number of items or statements.
7.
Semantic Differential Scale
This is a seven point rating scale
with end points associated with bipolar labels (such as good and bad, complex
and simple) that have semantic meaning. It can be used to find whether a
respondent has a positive or negative attitude towards an object. It has been
widely used in comparing brands, products and company images. It has also been
used to develop advertising and promotion strategies and in a new product
development study.
8.
Thurstone scale
Thurstone's method of paired comparisons can be considered a prototype of a normal distribution-based method for scaling dominance matrices. Even though the theory behind this method is quite complex, the algorithm itself is straightforward. A Thurstone scale has a number of statements with which the respondent is asked to agree or disagree.
There are three types of scale that Thurstone described:
1. Equal-appearing intervals method
2. Successive intervals method
3. Paired comparisons method
9.
Multi-Dimensional Scaling
Multidimensional scaling (MDS) is a
means of visualizing the level of similarity of individual cases of a dataset.
It refers to a set of related ordination techniques used in information visualization,
in particular to display the information contained in a distance matrix.
Process of designing
questionnaire
There are eight steps
involved in the development of a questionnaire:
1. Decide the
information required: It should be noted that one does not start by writing
questions. The first step is to decide 'what are the things one needs to know
from the respondent in order to meet the survey's objectives?' These, as has
been indicated in the opening chapter of this textbook, should appear in the
research brief and the research proposal.
2. Define the target respondents: At the outset, the researcher must
define the population about which he/she wishes to generalize from the sample
data to be collected. For example, in marketing research, researchers often
have to decide whether they should cover only existing users of the generic
product type or whether to also include non-users. Secondly, researchers have
to draw up a sampling frame. Thirdly, in designing the questionnaire we must
take into account factors such as the age, education, etc. of the target
respondents.
3. Choose the method(s) of reaching your target respondents: It may seem strange to
be suggesting that the method of reaching the intended respondents should
constitute part of the questionnaire design process. However, a moment's
reflection is sufficient to conclude that the method of contact will influence
not only the questions the researcher is able to ask but the phrasing of those
questions.
4. Decide on question content: Researchers must always be prepared
to ask, "Is this question really needed?" The temptation to include
questions without critically evaluating their contribution towards the
achievement of the research objectives, as they are specified in the research
proposal, is surprisingly strong. No question should be included unless the
data it gives rise to is directly of use in testing one or more of the
hypotheses established during the research design.
5. Develop the question wording: Survey questions can be classified
into three forms, i.e. closed, open-ended and open response-option questions.
So far only the first of these, i.e. closed questions has been discussed. It
provides the respondent with an easy method of indicating his answer - he does
not have to think about how to articulate his answer.
6. Put questions into a meaningful order and format:
1. Opening questions: These should be easy to answer and not in any way threatening to the respondents.
2. Question flow: Questions should flow in some kind of psychological order, so that one leads easily and naturally to the next.
3. Question variety: Respondents quickly become bored and restless when asked similar questions for half an hour or so.
7. Physical appearance
of the questionnaire: The physical appearance of a questionnaire can have a
significant effect upon both the quantity and quality of marketing data
obtained. The quantity of data is a function of the response rate.
8. Pre-test the questionnaire.
The purpose of pre-testing the questionnaire is to determine:
1. whether the questions as they are worded will achieve the desired results
2. whether the questions have been placed in the best order
3. whether the instructions to interviewers are adequate
4. whether the questions are understood by all classes of respondent
Type of Observation methods/ Conducting an Observation study and Data collection
1. Structured-Unstructured Observation:
For structured observation, the researcher specifies in detail what is
to be observed and how the measurements are to be recorded, e.g., an auditor
performing inventory analysis in a store. It reduces observation bias and
enhances the reliability of the data. It is suitable for use in conclusive research.
In unstructured observation, the observer monitors all aspects of the
phenomenon that seem relevant to the problem at hand, e.g., observing children
playing with new toys. This method is appropriate for exploratory research.
2. Disguised-Undisguised Observation:
In disguised observation, the respondents are unaware that they are
being observed. Disguise may be
accomplished by using one-way mirrors, hidden cameras, or inconspicuous
mechanical devices. Observers may be
disguised as shoppers or sales clerks.
In undisguised observation, the respondents are aware that they are
under observation.
3. Direct- Indirect observation:
In direct observation, the actual
behavior or phenomenon of interest is observed.
In indirect observation, the results
or consequences of the phenomenon are observed.
4. Human-Mechanical observation:
Most of the studies in marketing
research are based on human observation, wherein trained observers are required
to observe and record their observation. In some cases, mechanical devices such
as eye cameras are used for observation.
A Classification of Survey Methods
1. Telephonic methods: Telephonic interviews are the most popular, followed by personal interviews and mail surveys.
1) Traditional telephone interviews: These involve phoning a sample of respondents and asking them a series of questions.
2) Computer-assisted telephone interviewing (CATI): Interviewing from a central location is now more popular. CATI uses a computerized questionnaire administered to respondents over the telephone. The computer checks the responses for appropriateness and consistency. Interviewing time is reduced, data quality is enhanced, and laborious data collection steps are eliminated.
2.
Personal methods:
1) In-home interviews: Respondents are interviewed in person in their homes. This method is more costly and is used particularly by syndicated firms.
2) Mall-intercept interviews: Mall shoppers are intercepted and brought to test facilities in the mall. The advantage is that it is more efficient for the respondent to come to the interviewer than the other way round.
3) Computer-assisted personal interviewing (CAPI): The respondent sits in front of a computer terminal and answers a questionnaire. CAPI has been used to collect data at shopping malls, product clinics, conferences, and trade shows.
3.
Mail methods
1) Mail interview: In the traditional mail interview, questionnaires are mailed to preselected potential respondents. A typical mail interview package consists of an outgoing envelope, cover letter, questionnaire, return envelope, and possibly an incentive. There is no verbal interaction between the researcher and the respondent.
2) Mail panels: A mail panel is a large, nationally representative sample of households that have agreed to participate in periodic mail questionnaires, product tests, and telephone surveys. Mail panels can be used to obtain information repeatedly from the same respondents.
4.
Electronic methods:
1) E-mail interviews: A list of e-mail addresses is obtained. The survey is written within the body of the e-mail message and sent to respondents.
2) Internet interviews: Internet surveys use Hypertext Markup Language (HTML). Respondents may be recruited from databases of potential respondents and asked to go to a particular web location to complete the survey.
Qualitative Research
An unstructured, exploratory research
methodology based on small samples that provides insight & understanding of
the problem setting.
Quantitative Research
A research methodology that seeks to
quantify the data & typically, applies some form of statistical analysis.
Quantitative research must be
preceded by appropriate qualitative research
Qualitative vs. Quantitative Research
A Classification of Qualitative Research Procedures
1. Delphi technique: This is a process in which a group of experts in the field is brought together as a panel. In the Delphi approach, the group members are asked to make individual judgments about a particular subject, say ‘sales forecast’. These judgments are compiled and returned to the group members, so that they can compare their previous judgments with those of others.
2. Focus group: An interview conducted by a trained moderator in a non-structured and natural manner with a small group of respondents. The main purpose is to gain insights by listening to a group of people from the appropriate target market talk about issues of interest to the researcher.
3. Depth interviews: An unstructured, direct, personal interview in which a single respondent is probed by a highly skilled interviewer to uncover underlying motivations, beliefs, attitudes and feelings on a topic. The interviewer attempts to follow a rough outline, but the order and specific wording of the probing are influenced by the subject’s replies.
4.
Projective Techniques: An unstructured, indirect
form of questioning that encourages respondents to project their underlying
motivations, beliefs, attitudes or feelings regarding the issues of concern. In
projective techniques, respondents are asked to interpret the behavior of
others.
1)
Word Association: in word association, respondents are presented with a list of words,
one at a time and asked to respond to each with the first word that comes to
mind.
2)
Completion Techniques: In Sentence completion, respondents are given incomplete sentences
and asked to complete them. Generally,
they are asked to use the first word or phrase that comes to mind. In story completion, respondents are
given part of a story – enough to direct attention to a particular topic but
not to hint at the ending. They are
required to give the conclusion in their own words.
3) Cartoon Test: In cartoon tests, cartoon characters are shown in a specific situation related to the problem. Cartoon tests are simpler to administer and analyze than picture response techniques.
4) Thematic Apperception Test (TAT): TAT is a projective technique used to measure the attitudes and perceptions of the individual. Some picture cards are shown to respondents.
*************
Module-4
Sampling and Hypothesis
Sample
The objective of marketing research
is to obtain information about the characteristics or parameters of a population.
Population: the aggregate of all the elements that share some common set of characteristics and that comprise the universe.
Census: a complete enumeration of the elements of a population or study objects.
Sample: a subgroup of the population selected for participation in the study. Sample characteristics, called statistics, are then used to make inferences about the population parameters.
The Sampling Design Process
1.
Define the population: The collection of all
units of a specified type in a given region at a particular point or period of
time is termed as a population or universe. Thus, one may consider a population
of persons, families, farms, cattle in a region or a population of trees or
birds in a forest or a population of fish in a tank etc. depending on the
nature of data required. The target population should be
defined in terms of elements, sampling units, extent, and time.
a)
An element is the object about which or from which the
information is desired, e.g., the respondent.
b)
A sampling unit is an element, or a unit containing
the element, that is available for selection at some stage of the sampling
process.
c)
Extent refers to the geographical boundaries.
d)
Time is the time period under consideration.
2.
Determine the sampling
frame: A list of all the sampling
units belonging to the population to be studied with their identification
particulars or a map showing the boundaries of the sampling units is known as
sampling frame. Examples of a frame are a list of farms and a list of suitable
area segments like villages in India or counties in the United States. The
frame should be up to date and free from errors of omission and duplication of
sampling units.
3.
Specify the sampling unit: Elementary units or group of such
units which besides being clearly defined, identifiable and observable, are
convenient for purpose of sampling are called sampling units. For instance, in
a family budget enquiry, usually a family is considered as the sampling unit
since it is found to be convenient for sampling and for ascertaining the
required information. In a crop survey, a farm or a group of farms owned or
operated by a household may be considered as the sampling unit.
4.
Select a sampling technique: Decide whether to select the sample through non-probability or probability sampling. Non-probability sampling: sampling techniques that do not use chance selection procedures. Rather, they rely on the personal judgment of the researcher. Probability sampling: a sampling procedure in which each element of the population has a fixed probabilistic chance of being selected for the sample.
5.
Sample size: it refers to the number of
elements to be included in the study. Important qualitative factors in
determining the sample size are:
a)
the importance of the decision
b)
the nature of the research
c)
the number of variables
d)
the nature of the analysis
e)
sample sizes used in similar studies
f)
incidence rates
g)
completion rates
h)
resource constraints
6.
Execute the sampling
process: Execution
of sampling process requires a detailed specification of how the sampling
design decisions with respect to the population, sampling frame, sampling unit,
sampling technique, & sample size are to be implemented.
Types of Sampling
1.
Probability Sampling
A probability sampling method is any
method of sampling that utilizes some form of random selection. In order
to have a random selection method, one must set up some process or procedure
that assures that the different units in the selected population have equal
probabilities of being chosen.
Types of Probability Sampling include
Simple random sampling, systematic sampling, stratified random sampling,
cluster sampling
a)
Simple random sampling
A simple random sample is a subset of
individuals (a sample) chosen from a larger set (a population). Each individual
is chosen randomly and entirely by chance, such that each individual has the
same probability of being chosen at any stage during the sampling process, and
each subset of k individuals has the same probability of being chosen
for the sample as any other subset of k individuals.
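As a minimal sketch of simple random sampling, Python's standard `random.sample` draws without replacement so that every subset of the requested size is equally likely. The population of numbered units, the function name `simple_random_sample` and the seed are assumptions made for the illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed is optional, used only for repeatability
    return rng.sample(population, n)


# Hypothetical population: units numbered 1 to 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```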
b)
Systematic sampling
Systematic sampling is a random
sampling technique which is frequently chosen by researchers for its simplicity
and its periodic quality. Systematic sampling is a statistical method
involving the selection of elements from an ordered sampling frame. The most
common form of systematic sampling is an equal-probability method. In this
approach, progression through the list is treated circularly, with a return to
the top once the end of the list is passed. The sampling starts by selecting an
element from the list at random, and then every kth element in the frame is selected, where k is the
sampling interval (sometimes known as the skip). This is calculated as:
k = N / n
where n is the sample size
and N is the population size.
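The procedure can be sketched in Python as follows, assuming the usual sampling interval k = N / n (rounded down to a whole number here) and the circular treatment of the list described above. The function name `systematic_sample` and the numbered frame are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Pick a random start, then take every kth element, wrapping around."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the frame size")
    k = N // n  # sampling interval (the skip), integer part of N / n
    rng = random.Random(seed)
    start = rng.randrange(N)  # random starting position in the ordered frame
    # Treat the list as circular: return to the top once the end is passed.
    return [frame[(start + i * k) % N] for i in range(n)]


# Hypothetical ordered sampling frame of 100 units, sample of 10 (k = 10).
frame = list(range(100))
sample = systematic_sample(frame, 10, seed=7)
print(sample)
```

Because i * k is always smaller than N, the selected positions never repeat, so the sample contains n distinct units.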
c)
Stratified random sampling,
A method of sampling that involves
the division of a population into smaller groups known as strata. In stratified
random sampling, the strata are formed based on members' shared attributes or
characteristics. A random sample from each stratum is taken in a number
proportional to the stratum's size when compared to the population. These
subsets of the strata are then pooled to form a random sample.
The main advantage with stratified
sampling is how it captures key population characteristics in the sample.
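A minimal sketch of proportional stratified sampling is given below. The strata names, their sizes and the function name `stratified_sample` are assumptions for the example; the allocation simply rounds each stratum's share, so with awkward proportions the total may differ from n by a unit or two.

```python
import random


def stratified_sample(strata, n, seed=None):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = []
    for name, members in strata.items():
        # Proportional allocation; rounding may shift the total slightly.
        share = round(n * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample


# Hypothetical population: 600 urban and 400 rural households.
strata = {
    "urban": [("urban", i) for i in range(600)],
    "rural": [("rural", i) for i in range(400)],
}
sample = stratified_sample(strata, 50, seed=1)
urban_count = sum(1 for stratum, _ in sample if stratum == "urban")
rural_count = sum(1 for stratum, _ in sample if stratum == "rural")
print(urban_count, rural_count)
```

The sample keeps the 60:40 split of the population, which is the sense in which stratification captures key population characteristics.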
d)
Cluster sampling
Cluster sampling refers to a sampling
method in which the population is divided into groups, called clusters, and a
random selection of these clusters is included in the sample.
Two types of cluster sampling methods are:
1) One-stage sampling. All of the elements within selected clusters are included in the sample.
2) Two-stage sampling. A subset of elements within selected clusters is randomly selected for inclusion in the sample.
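The difference between the two methods can be sketched in Python. The village clusters, household labels and the function name `cluster_sample` are hypothetical; passing `per_cluster=None` gives one-stage sampling, while a number gives two-stage sampling.

```python
import random


def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Randomly select clusters, then take all or some of their elements."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: pick clusters
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample.extend(members)  # one-stage: keep every element
        else:
            # two-stage: random subset of elements within the chosen cluster
            sample.extend(rng.sample(members, min(per_cluster, len(members))))
    return sample


# Hypothetical frame: 5 villages (clusters) with 20 households each.
clusters = {
    f"village_{i}": [f"v{i}_h{j}" for j in range(20)] for i in range(5)
}
one_stage = cluster_sample(clusters, 2, seed=3)
two_stage = cluster_sample(clusters, 2, per_cluster=5, seed=3)
print(len(one_stage), len(two_stage))
```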
2.
Non Probability Sampling
Non-probability sampling is a
sampling technique where the samples are gathered in a process that does not
give all the individuals in the population equal chances of being selected.
a)
Convenience sampling
Convenience sampling is a
non-probability sampling technique where subjects are selected because of their
convenient accessibility and proximity to the researcher.
It is a method of drawing
data by selecting people because of the ease of their
volunteering or selecting units because of their availability or easy access.
The advantages of this type of sampling are the availability and the quickness
with which data can be gathered. The disadvantages are the risk that the sample
might not represent the population as a whole, and it might be biased by
volunteers.
b)
Judgmental sampling
Judgmental sampling is a
non-probability sampling technique where the researcher selects units to be
sampled based on their knowledge and professional judgment.
This type of sampling technique is
also known as purposive sampling and authoritative sampling. The process
involves nothing but purposely handpicking individuals from the population
based on the authorities or the researcher's knowledge and judgment.
c)
Quota sampling
A sampling method of gathering
representative data from a group. As opposed to random sampling, quota sampling
requires that representative individuals are chosen out of a specific subgroup.
For example, a researcher might ask for a sample of 100 females, or 100
individuals between the ages of 20-30.
Step-by-step Quota Sampling




d)
Snowball sampling
Snowball sampling is a
non-probability sampling technique that is used by researchers to identify
potential subjects in studies where subjects are hard to locate. To create a
snowball sample, there are two steps: (a) trying to identify one or more units
in the desired population; and (b)using these units to find further units and
so on until the sample size is met.
Errors in sampling
Sampling error is the deviation of the
selected sample from the true characteristics, traits, behaviors, qualities or
figures of the entire population.
Sample Size and Sampling
Error
Given two exactly the same studies,
same sampling methods, same population, the study with a larger sample size
will have less sampling process error compared to the study with smaller sample
size. Keep in mind that as the sample size increases, it approaches the size of
the entire population, therefore, it also approaches all the characteristics of
the population, thus, decreasing sampling process error.
Ways to
Eliminate Sampling Error :
There is only one way to eliminate
this error. This solution is to eliminate the concept of sample, and to test
the entire population.
In most cases this is not possible;
consequently, what a researcher must do is minimize sampling process
error. This can be achieved by proper and unbiased probability sampling and
by using a large sample size.
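The effect of sample size on sampling error can be illustrated with a small simulation. This is only a sketch: the artificial population, the number of trials, the seeds and the function name `mean_abs_sampling_error` are all assumptions, and the error is measured as the average absolute gap between the sample mean and the population mean.

```python
import random
import statistics


def mean_abs_sampling_error(population, n, trials=200, seed=0):
    """Average absolute gap between sample mean and population mean."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    errors = [
        abs(statistics.fmean(rng.sample(population, n)) - true_mean)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)


# Hypothetical population of 2000 values (e.g. monthly spend per household).
rng = random.Random(1)
population = [rng.gauss(50, 10) for _ in range(2000)]

small = mean_abs_sampling_error(population, 20)
large = mean_abs_sampling_error(population, 500)
census = mean_abs_sampling_error(population, len(population), trials=1)
print(f"n=20: {small:.3f}  n=500: {large:.3f}  census: {census:.3f}")
```

The larger sample gives the smaller average error, and testing the entire population removes sampling error altogether.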
Non-probability vs. Probability Sampling
Hypothesis
A hypothesis (H) is an
unproven statement or proposition about a factor or phenomenon that is of
interest to the researcher. It is a tentative statement about relationship
between two or more variables as stipulated by the theoretical framework or the
analytical model.
Often, a hypothesis is a possible
answer to the research question. Hypotheses are declarative and can be tested
empirically, whereas research questions (RQs) are interrogative. A hypothesis provides guidelines on what data are to be collected and
how they are to be analyzed. (It suggests variables to be included
in the research design.)
Types of hypothesis
1.
Simple hypothesis - this predicts the relationship
between a single independent variable (IV) and a single dependent variable (DV)
2.
Complex hypothesis - this predicts the relationship
between two or more independent variables and two or more dependent variables.
3.
Directional
hypotheses: These are usually derived from theory. They may imply that the
researcher is intellectually committed to a particular outcome. They specify
the expected direction of the relationship between variables i.e. the
researcher predicts not only the existence of a relationship but also its
nature.
4.
Non-directional hypotheses: Used when there is little or no
theory, or when findings of previous studies are contradictory. They may imply
impartiality. Do not stipulate the direction of the relationship.
5.
Associative hypotheses: Propose relationships
between variables - when one variable changes, the other changes. Do not
indicate cause and effect.
6.
Causal hypotheses: Propose a cause and
effect interaction between two or more variables. The independent variable is
manipulated to cause effect on the dependent variable. The dependent variable
is measured to examine the effect created by the independent variable.
7.
Null hypotheses: These are used when the researcher believes there is no relationship between two variables or when there is inadequate theoretical or empirical information to state a research hypothesis. Null hypotheses can be:
a) simple or complex;
b) associative or causal.
8.
Alternative
hypothesis: is one in which some
difference or effect is expected.
Accepting the alternative hypothesis will lead to changes in opinions or
actions. In marketing research, the null hypothesis is formulated in such a way
that its rejection leads to the acceptance of the desired conclusion. The alternative hypothesis represents the
conclusion for which evidence is sought.
9.
Analytical hypothesis: here the relationship between analytical
variables is examined. It is used when one would like to specify how a
change in one property leads to a change in another, e.g. income level related
to lifestyle, or literacy related to the number of children in the family.
10.
Statistical hypothesis: these are derived from samples that
are measurable. There are 2 types:
a.
Hypothesis which indicates difference
b.
Hypothesis which indicates association
11.
Common sense hypothesis: based on what is being observed. E.g.
junior students are more disciplined than seniors.
Characteristics of a hypothesis
1.
Clarity of the concepts: concepts should not be
abstract. If concepts are not clear, precise problem formulation will be
difficult. Different people hold different concepts about the same object, and the same
word may have several meanings. E.g. wearing sunglasses represents a lifestyle
for a student, whereas it is a protective device to a doctor.
2.
Ability to test: it should be possible to
verify the hypothesis. Therefore a good hypothesis is one in which there is
empirical evidence. General statement should be avoided. E.g. children of rich
parents do not do well in their studies
3.
Specific/clear: the hypothesis we form should be specific and clear, i.e., the relationship
between the variables should be clear.
4.
Statistical tools: hypothesis should be such
that, it is possible to use statistical techniques
5.
Logical: if 2 or more hypotheses
are derived from the same basic theory, they should not contradict each
other.
6.
Subjectivity: researcher subjectivity or
his biased judgment should be eliminated from the hypothesis.
7.
Simple: simple means that few constraints or assumptions are needed in
formulating it.
8.
Theory: backed up by theoretical
framework.
Sources of hypothesis
1.
General Culture: The
general pattern of culture helps not only to formulate a hypothesis, but also
to guide its trend. The culture has a great influence upon the thinking process
of people and hypothesis may be formed to test one or more of these ideas.
2.
Scientific Theory: The knowledge of theory leads to form
further generalizations from it. These generalizations form the part of
hypothesis.
3.
Analogies: Sometimes a hypothesis is formed from an analogy. A
similarity between two phenomena is observed and a hypothesis is formed to test
whether the two phenomena are similar in any other respect.
4.
Observation: people's behavior is observed, and the
observed behavior is used to infer their attitudes. This is an indirect method of
attitude measurement.
5.
Case studies: published case studies can be used as a source for
hypotheses. Normally this is done before the launch of a product to find customer
tastes and preferences.
6.
Past experience: here the researcher relies on past experience to formulate
the hypothesis.
Formulation of Hypothesis
Once the research question has been identified,
it is time to formulate the hypothesis. While the research question is
broad and includes all the variables one wants to consider, the hypothesis is a
statement of the specific relationship one expects to find from examination of
these variables. When formulating the hypothesis (or hypotheses), there are a few things
one needs to keep in mind. Good hypotheses meet the following criteria:
1) Identify
the independent and dependent variables to be studied.
2) Specify the
nature of the relationship that exists between these variables.
3) Simple
(often referred to as parsimonious). It is better to be concise than to be
long-winded. It is also better to have several simple hypotheses than one complicated
hypothesis.
4) Does not
include reference to specific measures.
5) Does not refer to specific
statistical procedures that will be used in analysis.
6) Implies the
population that one is going to study.
7) Is falsifiable and testable.
As indicated above, it is better to
have several simple hypotheses than one complex one. However, it is also a good
idea to limit the number of hypotheses used in a study to six or fewer.
Studies that address more than six hypotheses will often be too time consuming
to keep participants interested, and uninterested participants do not take
their responses as seriously. Another advantage of limiting the
number of formal hypotheses is that too many can make the
discussion section of one's paper very hard to write.
It is important to remember that one
does not need a formal hypothesis to justify every comparison and
statistical procedure one might use. For instance, it may be only when one starts
doing exploratory analysis of one's data that one realizes that gender is an
influencing factor. One does not have to back up and write a hypothesis that
addresses this finding. In fact, it is better in most cases not to do this. One
can report any statistical findings one feels are relevant, whether or not one
has a hypothesis that addresses them.
The final criterion listed above
warrants additional mention. A good hypothesis is not only testable, that is,
something one can actually test for in one's study, but it must also be
falsifiable. It is tempting to ignore this requirement, especially as a new
researcher. We want so badly to find great things, and for our study to turn
out exactly as we expect it to, that we tend to ignore the possibility that we
don't know everything and that no prediction is failsafe when it comes to
humans. Try to keep in mind that all research is relevant. Whether or not one's
findings are what one expects, one will find something. Believe it or not,
failing to find group differences can be just as important as finding expected
group differences. In fact, studies that return results in opposition to what
we were hoping for, or believed would logically occur, often lead to many more
great studies than we could have hoped for. After all, it could be valuable for
the findings of one's current research to act as a guiding principle for future
research. It is likely that this would require less work in terms of literature
review, as one would already be familiar with at least a portion of the
literature that is relevant to the latest study.
Errors in Hypothesis
Type I error: Rejecting the null hypothesis
when it is in fact true is called a Type I error. Before doing
a hypothesis test, researchers decide on a maximum p-value for which they will reject the null
hypothesis. This value is often denoted α (alpha) and is also called the significance
level; it is the probability of a Type I error that the researcher is willing to accept.
When a hypothesis test results in a p-value that is less than the
significance level, the result of the hypothesis test is called statistically
significant.
Type II error: Not rejecting the null hypothesis
when in fact the alternate hypothesis is true is called a Type II
error. (The second example below provides a situation where the concept of
Type II error is important.)
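The meaning of the significance level can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is N(0, 1) with known standard deviation, the test is a two-tailed z-test at α = 0.05 (critical value 1.96), and the function name type_i_error_rate and all parameter values are hypothetical. Because the null hypothesis is true in every trial, every rejection is a Type I error, and the observed rejection rate should be close to α.

```python
import math
import random


def type_i_error_rate(z_crit=1.96, n=25, trials=2000, seed=0):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from N(0, 1), so H0: mu = 0 really is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        # z statistic with known population standard deviation sigma = 1.
        z = sample_mean / (1.0 / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

With a few thousand trials the printed rate should fluctuate around 0.05, which is exactly what choosing α = 0.05 promises.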
Parametric and Non
Parametric Test
If the form of the population distribution is assumed to be
known and is described by its parameters (for example, a normal population with a
given mean and variance), the statistical test is called a parametric test.
E.g.: t-test, F-test, z-test, ANOVA.
If nothing is assumed about the population distribution or its
parameters, but a hypothesis about the population still has to be tested,
the test is called a non-parametric test. E.g.: Mann-Whitney U test, rank
sum test, Kruskal-Wallis test.
T-Test
A t-test
is any statistical hypothesis test in which the test statistic follows a
Student's t distribution under the
null hypothesis. It can be used to determine if two sets of data
are significantly different from each other, and is most commonly applied when
the test statistic would follow a normal distribution if the value of a scaling
term in the test statistic were known. When the scaling term is unknown and is
replaced by an estimate based on the data, the test statistic (under certain
conditions) follows a Student's t distribution.
A two-sample t-test examines whether
two samples are different and is commonly used when the variances of two normal
distributions are unknown and when an experiment uses a small sample size. For
example, a t-test could be used to compare the average floor routine score of
the U.S. women's Olympic gymnastics team to the average floor routine score of
China's women's team.
The t-test, and any
statistical test of this sort, consists of three steps.
1. Define the
null and alternate hypotheses,
2. Calculate
the t-statistic for the data,
3. Compare t_calc to the
tabulated t-value, t_tab, for the appropriate significance level and degrees of
freedom. If t_calc > t_tab, we reject the null hypothesis
and accept the alternate hypothesis.
Otherwise, we fail to reject the null
hypothesis.
The t-test can be used to
compare a sample mean to an accepted value (a population mean), or it can be
used to compare the means of two sample sets.
T-test to compare One Sample
Mean to an Accepted Value
In the example, the mean of the arsenic
concentration measurements was m = 4 ppm for n = 7 measurements, with a sample
standard deviation s = 0.9 ppm. We established suitable null and
alternative hypotheses:
Null Hypothesis H0: μ = μ0
Alternate Hypothesis HA: μ > μ0
where μ0 = 2 ppm is the
allowable limit and μ is the population mean of the measured soil.
We have already seen how to do the
first step, and have null and alternate hypotheses. The second step involves
the calculation of the t-statistic for one mean, using the formula:
t_{calc} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
where \bar{x} is the sample mean (m above) and s is the standard deviation of the sample,
not the population standard deviation. In our case,
t_{calc} = \frac{4 - 2}{0.9 / \sqrt{7}} = 5.88
For the third step, we need a table
of tabulated t-values for significance level and degrees of freedom,
such as the one found in your lab manual or most statistics textbooks.
Referring to a table for a 95% confidence limit for a 1-tailed test, we find t_{ν=6, 95%}
= 1.94. (The difference between 1- and 2-tailed distributions was covered in a
previous section.)
We are now ready to accept or reject
the null hypothesis. If t_calc > t_tab, we reject the null
hypothesis. In our case, t_calc = 5.88 > t_tab = 1.94, so we reject
the null hypothesis, and say that our sample mean is indeed larger than the
accepted limit, and not due to random chance, so we can say that the soil is
indeed contaminated.
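The arithmetic of this example can be checked with a few lines of code. The following is a minimal sketch that uses only the numbers quoted in the text (sample mean 4 ppm, limit 2 ppm, s = 0.9 ppm, n = 7, and the one-tailed 95% table value 1.94 for 6 degrees of freedom); the function name one_sample_t is an illustrative choice, not part of any library.

```python
import math


def one_sample_t(sample_mean, mu0, s, n):
    """t statistic for comparing one sample mean to an accepted value mu0."""
    return (sample_mean - mu0) / (s / math.sqrt(n))


# Values taken from the arsenic example in the text.
t_calc = one_sample_t(sample_mean=4.0, mu0=2.0, s=0.9, n=7)
t_tab = 1.94  # one-tailed, 95% confidence, n - 1 = 6 degrees of freedom

reject_null = t_calc > t_tab
print(f"t_calc = {t_calc:.2f}, reject H0: {reject_null}")
```

The computed statistic rounds to 5.88, far above the tabulated value, so the null hypothesis is rejected.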
T-test to compare Two Sample
Means
The method for comparing two sample
means is very similar. The only two differences are the equation used to
compute the t-statistic, and the degrees of freedom used for choosing the
tabulated t-value. The formula, in its standard form for samples whose variances are not assumed equal, is given by
t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}
In this case, we require two separate
sample means, standard deviations and sample sizes. The number of degrees of
freedom is computed using the formula
\nu = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}
and the result is rounded to the
nearest whole number. Once these quantities are determined, the same three
steps for determining the validity of a hypothesis are used for two sample
means.
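As an illustration, the two-sample calculation can be sketched as follows. This assumes the unequal-variance (Welch) form of the t statistic, whose degrees of freedom are rounded to the nearest whole number as the text describes; the function name welch_t and the two data sets are made up for the example.

```python
import math
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Two-sample t statistic and rounded degrees of freedom (Welch form)."""
    n1, n2 = len(sample1), len(sample2)
    # Squared standard errors s^2 / n of each sample mean.
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    t_calc = abs(mean(sample1) - mean(sample2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_calc, round(df)


# Hypothetical measurements from two groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_calc, df = welch_t(group_a, group_b)
print(f"t_calc = {t_calc:.3f}, degrees of freedom = {df}")
```

The resulting t_calc is then compared with the tabulated t-value for the computed degrees of freedom, exactly as in the one-sample case.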
Z-Test
A Z-test is any statistical
test for which the distribution of the test statistic under the null hypothesis
can be approximated by a normal distribution. Because of the central limit
theorem, many test statistics are approximately normally distributed for large
samples. For each significance level, the Z-test has a single critical
value (for example, 1.96 for 5% two tailed) which makes it more convenient than
the Student's t-test which has separate critical values for each sample
size.
Therefore, many statistical tests can
be conveniently performed as approximate Z-tests if the sample size is
large or the population variance known. If the population variance is unknown
(and therefore has to be estimated from the sample itself) and the sample size
is not large (n < 30), the Student's t-test may be more appropriate.
If T is a statistic that is
approximately normally distributed under the null hypothesis, the next step in
performing a Z-test is to estimate the expected value θ of T under
the null hypothesis, and then obtain an estimate s of the standard
deviation of T. After that the standard score Z = (T − θ) / s
is calculated, from which one-tailed and two-tailed p-values
can be calculated as Φ(−Z) (for upper-tailed tests), Φ(Z) (for
lower-tailed tests) and 2Φ(−|Z|) (for two-tailed tests), where Φ is the
standard normal cumulative distribution function.
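The standard score and the three p-values described above can be sketched in a few lines. This is a minimal illustration, assuming Φ is evaluated through the error function from the standard library; the function names phi and z_test and the input numbers are hypothetical.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_test(t_stat, theta, s):
    """Standard score Z = (T - theta) / s and the associated p-values."""
    z = (t_stat - theta) / s
    return {
        "z": z,
        "upper_tailed": phi(-z),
        "lower_tailed": phi(z),
        "two_tailed": 2.0 * phi(-abs(z)),
    }


# Hypothetical statistic T = 103.92 with expected value 100 and standard deviation 2.
result = z_test(t_stat=103.92, theta=100.0, s=2.0)
print(result)
```

Here Z is 1.96, so the two-tailed p-value is very close to 0.05, matching the critical value quoted for the 5% two-tailed test.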
F-Test
The f statistic, also known as
an f value, is a random variable that has an F distribution.
An F-test is any statistical
test in which the test statistic has an F-distribution under the null
hypothesis. It is most often used when comparing statistical models that have
been fitted to a data set, in order to identify the model that best fits the
population from which the data were sampled. Exact "F-tests"
mainly arise when the models have been fitted to the data using least squares.
The name was coined by George W. Snedecor, in honour of Sir Ronald A. Fisher.
Fisher initially developed the statistic as the variance ratio in the 1920s.
Steps required to compute an f statistic:
1. Select a
random sample of size n_1 from a normal population, having a standard
deviation equal to σ_1.
2. Select an independent
random sample of size n_2 from a normal population, having a standard
deviation equal to σ_2.
3. The f statistic is the ratio
of s_1^2/σ_1^2 and s_2^2/σ_2^2.
The following equivalent equations
are commonly used to compute an f statistic:
f = [ s_1^2 / σ_1^2 ] / [ s_2^2 / σ_2^2 ]
f = [ s_1^2 * σ_2^2 ] / [ s_2^2 * σ_1^2 ]
f = [ χ_1^2 / v_1 ] / [ χ_2^2 / v_2 ]
f = [ χ_1^2 * v_2 ] / [ χ_2^2 * v_1 ]
where σ_1 is the standard deviation of
population 1, s_1 is the standard deviation of the sample drawn from
population 1, σ_2 is the standard deviation of population 2, s_2 is the
standard deviation of the sample drawn from population 2, χ_1^2 is the chi-square
statistic for the sample drawn from population 1, v_1 is the degrees of
freedom for χ_1^2, χ_2^2 is the chi-square statistic for the sample drawn from
population 2, and v_2 is the degrees of freedom for χ_2^2. Note that
degrees of freedom v_1 = n_1 - 1, and degrees of freedom v_2
= n_2 - 1.
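The first of the equivalent equations can be sketched directly in code. The example below is illustrative only: the function name f_statistic and the two samples are invented, and the population standard deviations default to 1 so that f reduces to the ratio of the sample variances (the usual situation when the null hypothesis states that the population variances are equal).

```python
from statistics import variance


def f_statistic(sample1, sample2, sigma1=1.0, sigma2=1.0):
    """f = (s1^2 / sigma1^2) / (s2^2 / sigma2^2) with its degrees of freedom."""
    s1_sq = variance(sample1)  # sample variance s1^2 (n - 1 denominator)
    s2_sq = variance(sample2)  # sample variance s2^2
    f = (s1_sq / sigma1 ** 2) / (s2_sq / sigma2 ** 2)
    v1 = len(sample1) - 1
    v2 = len(sample2) - 1
    return f, v1, v2


# Hypothetical samples from two normal populations.
f, v1, v2 = f_statistic([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(f"f = {f}, v1 = {v1}, v2 = {v2}")
```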
U-Test
The Mann–Whitney U test (also
called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum test (WRS), or Wilcoxon–Mann–Whitney
test) is a nonparametric test of the null hypothesis that two populations are
the same against an alternative hypothesis, especially that a particular
population tends to have larger values than the other.
It has greater efficiency than the
t-test on non-normal distributions, such as a mixture of normal distributions,
and it is nearly as efficient as the t-test on normal distributions.
The Wilcoxon rank-sum test is not the
same as the Wilcoxon signed-rank test, although both are nonparametric and
involve summation of ranks.
A very general formulation is to
assume that:
1. All the
observations from both groups are independent of each other,
2. The
responses are ordinal (i.e. one can at least say, of any two observations,
which is the greater),
3. The
distributions of both groups are equal under the null hypothesis, so that the
probability of an observation from one population (X) exceeding an observation
from the second population (Y) equals the probability of an observation from Y
exceeding an observation from X. That is, there is a symmetry between
populations with respect to probability of random drawing of a larger
observation.
4. Under the alternative hypothesis,
the probability of an observation from one population (X) exceeding an observation
from the second population (Y) (after exclusion of ties) is not equal to 0.5.
The alternative may also be stated in terms of a one-sided test, for example:
P(X > Y) + 0.5 P(X = Y) > 0.5.
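The quantity P(X > Y) + 0.5 P(X = Y) has a direct sample counterpart, which is how the U statistic can be computed for small samples. The sketch below counts, over all pairs, how often an observation from X exceeds one from Y, counting ties as one half; the function name mann_whitney_u and the data are hypothetical, and no p-value is computed.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y): pairwise 'wins' for each sample, ties counted as 0.5."""
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    # Every pair is a win for one side or a tie, so the two counts sum to n_x * n_y.
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical ordinal scores for two independent groups.
x = [7, 8, 9, 5]
y = [1, 2, 6, 5]
u_x, u_y = mann_whitney_u(x, y)
estimate = u_x / (len(x) * len(y))  # estimates P(X > Y) + 0.5 P(X = Y)
print(u_x, u_y, estimate)
```

An estimate well above 0.5, as here, points towards X tending to have larger values than Y; whether the difference is significant still has to be judged against tables of U or a normal approximation.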
K-W Test/H- test/Rank sum test
The Kruskal–Wallis one-way analysis
of variance by ranks (named after William Kruskal and W. Allen Wallis) is a
non-parametric method for testing whether samples originate from the same
distribution. It is used for comparing two or more samples that are independent,
and that may have different sample sizes, and extends the Mann–Whitney U test
to more than two groups. The parametric equivalent of the Kruskal-Wallis test
is the one-way analysis of variance (ANOVA). When the null hypothesis
of the Kruskal-Wallis test is rejected, at least one sample stochastically dominates
at least one other sample.
1. Rank all
data from all groups together; i.e., rank the data from 1 to N ignoring
group membership. Assign any tied values the average of the ranks they would
have received had they not been tied.
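2. The test statistic is given by:
K = (N - 1) \frac{\sum_{i=1}^{g} n_i (\bar{r}_{i} - \bar{r})^2}{\sum_{i=1}^{g} \sum_{j=1}^{n_i} (r_{ij} - \bar{r})^2}
Where:
g is the number of groups,
n_i is the number of observations in group i,
r_{ij} is the rank (among all observations) of observation j from group i,
N is the total number of observations across all groups,
\bar{r}_{i} = \frac{1}{n_i} \sum_{j=1}^{n_i} r_{ij} is the average rank of the observations in group i,
\bar{r} = \frac{1}{2}(N + 1) is the average of all the r_{ij}.
3. If the data contain no ties, the denominator of the expression
for K is exactly (N - 1) N (N + 1) / 12 and \bar{r} = (N + 1) / 2. Thus
K = \frac{12}{N(N+1)} \sum_{i=1}^{g} n_i \left( \bar{r}_{i} - \frac{N+1}{2} \right)^2 = \frac{12}{N(N+1)} \sum_{i=1}^{g} n_i \bar{r}_{i}^2 - 3(N + 1)
The last formula only contains the
squares of the average ranks.
4. A correction for ties if using the
short-cut formula described in the previous point can be made by dividing K by
1 - \frac{\sum_{i=1}^{G} (t_i^3 - t_i)}{N^3 - N}
where G is the number of groupings of
different tied ranks, and t_i is the number of tied
values within group i that are tied at a particular value. This
correction usually makes little difference in the value of K unless there are a
large number of ties.
5. Finally, the p-value is
approximated by \Pr(\chi^2_{g-1} \ge K). If some n_i
values are small (i.e., less than
5) the probability distribution of K can be quite different from this
chi-squared distribution. If a table of the chi-squared probability
distribution is available, the critical value of chi-squared, \chi^2_{\alpha: g-1},
can be found by entering the table at g
− 1 degrees of freedom and looking under the desired significance or alpha
level.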
6. If the statistic is not
significant, then there is no evidence of stochastic dominance between the
samples. However, if the test is significant then at least one sample
stochastically dominates another sample. Therefore, a researcher might use
sample contrasts between individual sample pairs, or post hoc tests
using Dunn's test, which (1) properly employs the same rankings as the
Kruskal-Wallis test, and (2) properly employs the pooled variance implied by
the null hypothesis of the Kruskal-Wallis test in order to determine which of
the sample pairs are significantly different.[4] When performing multiple sample
contrasts or tests, the Type I error rate tends to become inflated, raising
concerns about multiple comparisons.
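The ranking step and the test statistic can be sketched as below. This is a minimal illustration, assuming the general form of the statistic, K = (N − 1) times the between-group sum of squares of ranks divided by the total sum of squares of ranks, which needs no separate tie correction because tied values already receive averaged ranks. The function names average_ranks and kruskal_wallis and the data are hypothetical; the critical value 5.991 is the tabulated chi-squared value for 2 degrees of freedom at the 5% level.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2.0 + 1.0  # mean of the 1-based positions in the block
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def kruskal_wallis(groups):
    """Kruskal-Wallis K statistic for a list of independent samples."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    mean_rank = (n_total + 1) / 2.0
    between = 0.0
    pos = 0
    for g in groups:
        group_ranks = ranks[pos:pos + len(g)]
        pos += len(g)
        group_mean = sum(group_ranks) / len(g)
        between += len(g) * (group_mean - mean_rank) ** 2
    total = sum((r - mean_rank) ** 2 for r in ranks)
    return (n_total - 1) * between / total


# Hypothetical data: three independent groups of three observations each.
k = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
critical = 5.991  # chi-squared table, g - 1 = 2 degrees of freedom, alpha = 0.05
print(f"K = {k:.2f}, significant: {k > critical}")
```

Note that the groups here are smaller than 5, so, as the text warns, the chi-squared approximation is rough for this toy data; the example is meant only to show the mechanics.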
Research report
The
word ‘report’ means a formal or official statement, or simply ‘a statement of facts’.
A report is a formal document written for a specific audience to meet a
specific need. It may contain the facts of a situation, project, or process; an analysis and interpretation of data, events
and records; or suggestions and recommendations. A report is a factual and
systematic account of a specific business or professional activity.
Types of reports
1. On the basis of function:
a. Analytical report: An analytical report presents data with
interpretation and analysis. The report writer analyses the facts of a
case, problem, condition, or situation.
b. Informational report: An informational report presents the facts of a
case, problem, condition, or situation without any analysis, interpretation, or
recommendations.
2. On the basis of periodicity:
a. Routine report: Routine reports are usually prepared
on a periodic basis, that is, daily, weekly, fortnightly, monthly, or
annually. They are also called periodic reports.
b. Special report: A special report is prepared and
presented to convey special conditions, situations, problems, or occasions.
3. On the basis of communicative form:
a. Oral report: Oral reports are informal, face-to-face
presentations of information. They are useful for presenting
brief information related to routine activities, projects, developments, and
so on.
b. Written report: Written reports are more conventional than
oral reports. Most business and technical reports use the written mode of
presentation because the organizations using these reports need to maintain
proper records for future use and reference.
4. On the basis of nature, scope and length:
a. Formal report: A formal report is usually the result
of a thorough investigation of a problem, condition, or situation. It is
comparatively long and needs elaborate description and discussion.
b. Non-formal report: A non-formal report is a
brief account of a specific business or professional activity. It is usually
written to provide introductory information about a routine affair. It is
usually short and does not need elaborate description and discussion.
Importance of report
writing
1. Report-writing is an indispensable part of any profession.
Almost every important decision in business, industry or government is taken on
the basis of information presented or recommendations made in reports.
2. Every member of the executive staff of an organization is
made to write a report at one time or another, because without a report no
analysis of their work is possible.
3. Reports keep records which are used if the same situation
recurs.
4. Reports also provide objective recommendations on any problem.
Hence the skill of report-writing is as important as good raw material and
equipment for running an industry or a business efficiently. An efficient
executive needs to possess this skill if he wants to rise up the corporate
ladder.
5. It helps him to perform his functions of planning and
evaluating men and material resources efficiently.
Parts or element of a
report
A formal report may include the following parts or elements:
1. Title page: A formal report usually begins with a
title page. It contains the title of the report, the name of the person or
organisation to whom the report is being submitted, the names of the report writers,
and the date.
2. Preface: The preface is an optional element in a formal report. It introduces the report by
mentioning its salient features and scope.
3. Letter of transmittal: The transmittal letter is a brief
covering letter from the report writer explaining the reasons for
writing the report. It may contain the objectives, scope, and other highlights
of the report.
4. Acknowledgement: The acknowledgement section contains the names of persons who contributed to the
production of the report and made the report possible. It is simply a thank-you
note.
5. Table of contents: The table of contents gives the
reader an overall view of the report and shows its organization.
6. List of illustrations: The list of
illustrations gives systematic information about the tables, graphs, figures, and
charts used in the report. It is usually included if the number of these
illustrations is more than ten.
7. Abstract or executive summary: An abstract or
executive summary summarizes the essential information in the report, focusing on
key facts, findings, observations, results, conclusions, and recommendations.
8. Introduction: This section introduces the readers to the report and prepares them for the
discussion that follows by providing background information, defining its aims
and objectives, and discussing the scope and limitations of the report.
9. Methodology: While writing a report, information may have to be gathered from the library and
the internet, or from interviews, surveys, and formal or informal discussions. The methodology
section summarizes the methods of data collection.
10. Discussion/Description/Analysis: This
is the main part of the report, as it presents the data that has been collected in an
organized form. It focuses on facts and findings.
11. Conclusion: This section conveys the significance and meaning of the report to the reader by
presenting a summary of the discussion, findings, results, and conclusions.
12. Recommendations: This section contains recommendations that are based on the results and conclusions.
13. Appendices: An appendix contains supporting material or data, which is kept separate from the
main body of the report to avoid interrupting the line of development of the
report.
14. References and bibliography: This section may
contain references to books, journals, reports, and other sources used in the
report. It may also consist of a list of materials for further reference.
Oral vs. Written report
Oral report | Written report
1. No rigid standard format | 1. Standard format can be applied
2. Remembering all that is said is difficult | 2. Can be read a number of times, and clarification can be sought whenever the reader chooses
3. Tone, voice modulation and comprehensibility play an important role | 3. Free from presentation problems
4. Correcting mistakes, if any, is difficult | 4. Mistakes, if any, can be pinpointed and corrected
5. The audience has no control over the speed of presentation | 5. The reader has control over the speed of reading
6. Less accurate | 6. Tends to be more accurate
7. The audience has no choice of picking and choosing from the presentation | 7. The reader can pick and choose what he thinks is relevant to him
****************
Module-5
Data Analysis
Data for Analysis
Analysis of data is a process of
inspecting, cleaning, transforming, and modeling data with the goal of
discovering useful information, suggesting conclusions, and supporting
decision-making.
Preparing the Data for
Analysis
1.
Editing
Editing is the process of checking
and adjusting the data for omissions, legibility, and consistency. Editing may
be differentiated from coding, which is the assignment of numerical scales or
classifying symbols to previously edited data.
The purpose of editing is to ensure
the completeness, consistency, and readability of the data to be transferred to
data storage. The editor's task is to check for errors and omissions on the
questionnaires or other data collection forms.
Types:
1. Field Editing: Preliminary editing by a
field supervisor on the same day as the interview to catch technical omissions,
check legibility of handwriting, and clarify responses that are logically or
conceptually inconsistent.
2. In-house Editing: Editing performed by a central
office staff; often done more rigorously than field editing.
• Pitfalls of Editing: Allowing subjectivity to enter into the editing process (data
editors should be intelligent, experienced, and objective), and failing to have a
systematic procedure for assessing the questionnaires developed by the research
analyst (an editor should have clearly defined decision rules to follow).
• Pretesting Edit: Editing during the pretest stage can prove very valuable for
improving questionnaire format and for identifying poor instructions or inappropriate
question wording.
2.
Coding
Coding is translating answers into
numerical values or assigning numbers to the various categories of a variable
to be used in data analysis. Coding is done by using a code book, code sheet,
and a computer card. Coding is done on the basis of the instructions given in
the codebook. The code book gives a numerical code for each variable.
Manual processing is employed when
qualitative methods are used or when in quantitative studies, a small sample is
used, or when the questionnaire/schedule has a large number of open-ended
questions, or when accessibility to computers is difficult or inappropriate.
However, coding is done in manual processing also. Ex: Male - Code 1, Female - Code 2.
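A codebook of this kind maps each answer category to a number. The sketch below is a minimal illustration using the Male = 1, Female = 2 example from the text; the names CODEBOOK and code_responses, the variable name "gender", and the sample answers are hypothetical. Answers that are not in the codebook are returned as None so that they can be reviewed during editing rather than silently coded.

```python
# Hypothetical codebook: one numerical code per category of each variable.
CODEBOOK = {
    "gender": {"Male": 1, "Female": 2},
}


def code_responses(responses, variable, codebook=CODEBOOK):
    """Translate raw answers into numerical codes; unknown answers become None."""
    codes = codebook[variable]
    return [codes.get(answer) for answer in responses]


raw_answers = ["Male", "Female", "Female", "Male", "No answer"]
coded = code_responses(raw_answers, "gender")
print(coded)
```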
3.
Classification
Classification presents the distribution of the data, i.e. the
scores obtained for the various categories of a particular
variable. There are four types of distributions:
a.
Frequency distribution: In social science
research, frequency distribution is very common. It presents the frequency of
occurrences of certain categories. This distribution appears in two forms:
Ungrouped: Here, the scores are not collapsed into categories, e.g., in the
distribution of ages of the students of a BJ (MC) class, each age value (e.g.,
18, 19, 20, and so on) will be presented separately in the distribution.
Grouped: Here, the scores are collapsed into categories, so that 2 or 3 scores
are presented together as a group. For example, in the above age distribution,
groups like 18-20, 21-22 etc. can be formed.
b.
Percentage distribution: It is also possible to
give frequencies not in absolute numbers but in percentages. For instance
instead of saying 200 respondents of total 2000 had a monthly income of less
than Rs. 500, we can say 10% of the respondents have a monthly income of less
than Rs. 500.
c.
Cumulative distribution: It tells how often the
value of the random variable is less than or equal to a particular reference
value
d.
Statistical distribution: In this type of data
distribution, some measure of average is computed from a sample of respondents.
Several kinds of averages are available (mean, median, mode) and the researcher
must decide which is most suitable for the purpose. Once the average has been
calculated, the question arises of how representative a figure it is, i.e., how
closely the answers are bunched around it.
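The four types of distribution can be illustrated on a small data set. The sketch below is only an example: the list of ages is invented, and the group boundaries 18-20 and 21-22 follow the grouping mentioned in the text.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical ages of eight students.
ages = [18, 19, 19, 20, 20, 20, 21, 22]
n = len(ages)

# a. Frequency distribution (ungrouped and grouped).
frequency = Counter(ages)
grouped = {
    "18-20": sum(1 for a in ages if 18 <= a <= 20),
    "21-22": sum(1 for a in ages if 21 <= a <= 22),
}

# b. Percentage distribution.
percentage = {age: 100.0 * count / n for age, count in sorted(frequency.items())}

# c. Cumulative distribution: number of values less than or equal to each age.
cumulative = {}
running = 0
for age, count in sorted(frequency.items()):
    running += count
    cumulative[age] = running

# d. Statistical distribution: measures of average.
averages = {"mean": mean(ages), "median": median(ages), "mode": mode(ages)}

print(dict(frequency), grouped, percentage, cumulative, averages, sep="\n")
```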
4.
Tabulation
After editing, which ensures that the
information on the schedule is accurate and categorized in a suitable form, the
data are put together in tables and may also undergo other
forms of statistical analysis. Tables can be prepared manually and/or by
computer. For a small study of 100 to 200 persons, there may be little point
in tabulating by computer, since this necessitates putting the data on punched
cards. But for a survey analysis involving a large number of respondents and
requiring cross tabulation involving more than two variables, hand tabulation
will be inappropriate and time consuming.
Tables are useful to the researchers
and the readers in three ways: 1. They present an overall view of the findings in a
simpler way. 2. They identify trends. 3. They display relationships in a
comparable way between parts of the findings. By convention, the dependent
variable is presented in the rows and the independent variable in the columns.
5.
Validation
Data validation ensures that the
survey questionnaires are completed and present consistent data. In this step,
the researcher should not include in the data analysis the questions that were not answered by most respondents,
as this would bias the results. However, in the
case of incomplete questionnaires, the researcher must count the actual number of respondents
who were able to answer a particular question. This should be done in the same way for the
rest of the questions.
6.
Analysis and Interpretation
The process by which sense and
meaning are made of the data gathered in qualitative research, and by which the
emergent knowledge is applied to problems.
Types: Descriptive and inferential
analysis
Statistical inference is the process
of deducing properties of an underlying distribution by analysis of data.
Inferential statistical analysis infers properties about a population: this
includes testing hypotheses and deriving estimates. The population is assumed
to be larger than the observed data set; in other words, the observed data is
assumed to be sampled from a larger population.
Inferential statistics can be
contrasted with descriptive statistics. Descriptive statistics is solely
concerned with properties of the observed data, and does not assume that the
data came from a larger population.
Statistical Analysis
Statistical analysis is a component
of data analytics. In the context of business intelligence (BI), statistical
analysis involves collecting and scrutinizing every single data sample in a set
of items from which samples can be drawn.
Bivariate Analysis (Chi-Square
only)
Bivariate analysis is one of the
simplest forms of quantitative (statistical) analysis. It involves the analysis
of two variables (often denoted as X, Y), for the purpose of
determining the empirical relationship between them. In order to see if the
variables are related to one another, it is common to measure how those two
variables simultaneously change together (see also covariance).
Bivariate analysis can be helpful in
testing simple hypotheses of association and causality, that is, checking to what extent
it becomes easier to know and predict a value for the dependent variable if we
know a case's value of the independent variable.
Chi-Square
A chi-square test, also referred to
as a χ² test (infrequently as the
chi-squared test), is any statistical hypothesis test in which the sampling
distribution of the test statistic is a chi-square distribution when the null
hypothesis is true.
To review, the chi-square method of
hypothesis testing has seven basic steps
1. State the null and research/alternative
hypotheses.
2. Specify the decision rule and the
level of statistical significance for the test, i.e., .05, .01, or .001. (A
significance level of .01 would mean that the probability of the chi-square
value must be .01 or less to reject the null hypothesis, a more stringent
criterion than .05.)
3. Compute the expected values.
4. Compute the chi-square statistic.
5. Determine the degrees of freedom
for the table. Then identify the critical value of chi-square at the specified
level of significance and appropriate degrees of freedom.
6. Compare the computed chi-square
statistic with the critical value of chi-square; reject the null hypothesis if
the chi-square is equal to or larger than the critical value; fail to reject the null
hypothesis if the chi-square is less than the critical value.
7. State a substantive conclusion,
i.e., describe the meaning and importance of the test results in terms of the
research problem under investigation.
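Steps 3 to 6 can be sketched for a contingency table as follows. This is a minimal illustration: the function name chi_square and the 2 x 2 table of observed counts are invented, the expected value of each cell is computed as row total times column total divided by the grand total, and the critical value 3.841 is the tabulated chi-square value for 1 degree of freedom at the .05 level.

```python
def chi_square(observed):
    """Chi-square statistic, degrees of freedom and expected counts for a table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Step 3: expected value of each cell under the null hypothesis of no association.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    # Step 4: sum of (observed - expected)^2 / expected over all cells.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Step 5: degrees of freedom = (rows - 1) * (columns - 1).
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical observed counts for two variables with two categories each.
observed = [[10, 20], [20, 10]]
statistic, df, expected = chi_square(observed)
critical_value = 3.841  # chi-square table, df = 1, significance level .05
reject_null = statistic >= critical_value  # Step 6
print(f"chi-square = {statistic:.3f}, df = {df}, reject H0: {reject_null}")
```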
Multivariate Analysis
(Theory Only)
Multivariate data analysis refers to
any statistical technique used to analyze data that arise from more than one
variable. This essentially models reality, where each situation, product, or
decision involves more than a single variable. The information age has resulted
in masses of data in every field. Despite the quantum of data available,
obtaining a clear picture of what is going on and making intelligent
decisions remains a challenge. When the available information is stored in database
tables containing rows and columns, multivariate analysis can be used to process it in a meaningful way.
Multivariate analysis methods are
typically used for:
Consumer and market research
Quality control and quality assurance across a range of industries such as food and
beverage, paint, pharmaceuticals, chemicals, energy, telecommunications, etc.
Process optimization and process control
Research and development
ANOVA: One- Way and Two Way
Classification. (Theory Only)
One- Way ANOVA
Analysis of variance (ANOVA) is a
collection of statistical models used in order to analyze the differences
between group means and their associated procedures (such as
"variation" among and between groups), developed by R. A. Fisher. In
the ANOVA setting, the observed variance in a particular variable is
partitioned into components attributable to different sources of variation. In
its simplest form, ANOVA provides a statistical test of whether or not the
means of several groups are equal, and therefore generalizes the t-test
to more than two groups.
Formula
SStotal = Σx² - (Σx)² / N
SSamong = Σ [ (Σx for each group)² / n ] - (Σx)² / N
SSwithin = SStotal - SSamong
x = individual observation
r = number of groups
N = total number of observations (all groups)
n = number of observations in group
Steps (assuming three groups)
Create six columns: "x1", "x1²", "x2", "x2²", "x3", and "x3²"
1. Put the raw data, according to group, in "x1", "x2", and "x3"
2. Calculate the sum (Σx) for group 1.
3. Calculate (Σx)² for group 1.
4. Calculate the mean for group 1.
5. Calculate Σx² for group 1.
6. Repeat steps 2-5 for groups 2 and 3
7. Set up SStotal and SSamong formulas and calculate
8. Calculate SSwithin
9. Enter sums of squares into the ANOVA table, and complete
the table by calculating: dfamong, dfwithin, MSamong, and MSwithin, and F
10. Check whether F is
statistically significant, using an F probability table with the appropriate degrees of
freedom and p < .05.
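The steps above can be sketched in Python as follows. The sketch assumes the usual computational formulas SStotal = Σx² - (Σx)²/N and SSamong = Σ[(Σx for each group)²/n] - (Σx)²/N; the three groups of scores are made-up data and the function name is hypothetical.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table quantities for a list of groups."""
    all_x = [x for g in groups for x in g]
    N = len(all_x)   # total number of observations (all groups)
    r = len(groups)  # number of groups

    # (Σx)² / N, the term shared by SStotal and SSamong
    correction = sum(all_x) ** 2 / N

    ss_total = sum(x ** 2 for x in all_x) - correction
    ss_among = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_among

    df_among = r - 1
    df_within = N - r
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within

    return {
        "SStotal": ss_total, "SSamong": ss_among, "SSwithin": ss_within,
        "dfamong": df_among, "dfwithin": df_within,
        "MSamong": ms_among, "MSwithin": ms_within,
        "F": ms_among / ms_within,
    }


# Hypothetical raw data for three groups (illustrative only).
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(name, value)
```

The resulting F is then compared with the tabled F value for (dfamong, dfwithin) degrees of freedom, as in step 10.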
Two-Way ANOVA
The two-way ANOVA is an extension of
the one-way ANOVA. It is called "two-way" because each item is
classified in two ways, as opposed to one way.
For example, one-way classifications
might be: gender, political party, religion, or race. Two-way classifications
might be by gender and political party, gender and race, or religion and race.
Each classification
variable is called a factor, so there are two factors, each having several
levels. The factors are called the "row factor"
and the "column factor" because the data are usually arranged in
table format. Each combination of a row level and a column level is called a
treatment.
Assumptions
1. The populations from which the samples were obtained must be
normally or approximately normally distributed.
2. The samples must be independent.
3. The variances of the populations must be equal.
4. The groups must have the same sample size.
Hypotheses
There are three sets of
hypotheses with the two-way ANOVA.
The null hypotheses for
each of the sets are given below.
1. The population means of the first factor are equal. This
is like the one-way ANOVA for the row factor.
2. The population means of the second factor are equal. This
is like the one-way ANOVA for the column factor.
3. There is no interaction
between the two factors. This is similar to performing a test for independence
with contingency tables.
Two-Way ANOVA Table: It is assumed that main
effect A has a levels (and A = a-1 df), main effect B has b levels (and B = b-1
df), n is the sample size of each treatment, and N = abn is the total sample
size. Notice the overall degrees of freedom are once again one less than the
total sample size.
Source | SS | df | MS | F
Main Effect A | given | A, a-1 | SS / df | MS(A) / MS(W)
Main Effect B | given | B, b-1 | SS / df | MS(B) / MS(W)
Interaction Effect | given | A*B, (a-1)(b-1) | SS / df | MS(A*B) / MS(W)
Within | given | N - ab, ab(n-1) | SS / df |
Total | sum of others | N - 1, abn - 1 | |
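The table can be filled in with a short Python sketch. The table treats the sums of squares as given, so the way they are computed here (from row, column and cell means, for a balanced design with n observations per treatment) is an assumption of this sketch, and the 2 x 2 data set and function name are made up for illustration.

```python
def two_way_anova(cells):
    """cells[i][j] holds the n observations for row level i and column level j."""
    a = len(cells)        # levels of the row factor A
    b = len(cells[0])     # levels of the column factor B
    n = len(cells[0][0])  # sample size of each treatment

    def mean(values):
        return sum(values) / len(values)

    all_x = [x for row in cells for cell in row for x in cell]
    grand = mean(all_x)
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(b)]

    ss_a = b * n * sum((m - grand) ** 2 for m in row_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in col_means)
    ss_cells = n * sum((mean(cell) - grand) ** 2 for row in cells for cell in row)
    ss_ab = ss_cells - ss_a - ss_b  # interaction = between-cells minus main effects
    ss_w = sum((x - mean(cell)) ** 2 for row in cells for cell in row for x in cell)

    ss = {"A": ss_a, "B": ss_b, "A*B": ss_ab, "Within": ss_w}
    df = {"A": a - 1, "B": b - 1, "A*B": (a - 1) * (b - 1), "Within": a * b * (n - 1)}
    ms = {source: ss[source] / df[source] for source in ss}        # MS = SS / df
    f = {s: ms[s] / ms["Within"] for s in ("A", "B", "A*B")}       # F = MS / MS(W)
    return ss, df, ms, f


# Hypothetical balanced data: 2 row levels x 2 column levels, n = 2 per treatment.
cells = [[[1, 3], [5, 7]],
         [[2, 4], [10, 12]]]
ss, df, ms, f = two_way_anova(cells)
print(ss, df, ms, f)
```

The four df values add up to N - 1, matching the Total row of the table.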
Discriminant analysis
1. Discriminant Analysis (DA), a multivariate statistical technique, is commonly used to build a
predictive / descriptive model of group discrimination based on observed
predictor variables and to classify each observation into one of the groups.
2. The common objectives of DA are
i) To investigate differences between groups
ii) To discriminate groups effectively
iii) To identify important discriminating variables
3. The discriminant function is expressed as D = b0 + b1X1 + b2X2 + ... + bkXk,
where D = discriminant score
b's = discriminant coefficients or weights
X's = independent variables
4. E.g. To predict whether patients recovered from a coma or not based on
combinations of demographic and treatment variables.
5. The predictor variables might include age, sex, general health, time
between incident and arrival at hospital, various interventions, etc.
6. In this case the creation of the prediction model would allow a medical
practitioner to assess the chance of recovery based on observed variables. The
prediction model might also give insight into how the variables interact in
predicting recovery.
7. This analysis helps in
a) Determining which independent variables contribute most to the
intergroup differences.
b) Classifying cases into one of the groups based on the values of the independent
variables.
c) Evaluating the accuracy of classification.
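A small Python sketch shows how the discriminant score in point 3 is used to classify a case. The coefficients, the observation and the cutoff of 0 are hypothetical values chosen only for illustration; in practice the coefficients are estimated from data, which this sketch does not attempt.

```python
def discriminant_score(b0, weights, x):
    """D = b0 + b1*X1 + b2*X2 + ... + bk*Xk"""
    return b0 + sum(b * xi for b, xi in zip(weights, x))


def classify(score, cutoff=0.0):
    """Assign the case to one of two groups by comparing D with a cutoff (assumed here)."""
    return "group 1" if score >= cutoff else "group 2"


# Hypothetical coefficients and one observation with two predictor variables.
b0 = -1.0
weights = [0.5, 0.25]
observation = [2.0, 4.0]

D = discriminant_score(b0, weights, observation)
print(D, classify(D))
```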
Factor analysis
Factor analysis
is a procedure used primarily for data reduction and summarization. Variables
are not classified as either dependent or independent. Instead, the whole set
of interdependent relationships among variables is examined in order to define
a set of common dimensions called Factors.
Purpose of Factor
Analysis
1. To identify underlying dimensions called Factors that explain the
correlations among a set of variables.
-- Lifestyle statements may be used to
measure the psychographic profile of consumers.
2. To identify a new, smaller set of uncorrelated variables to replace the
original set of correlated variables for subsequent analysis such as Regression
or Discriminant Analysis.
-- Psychographic factors may be used as
independent variables to explain the difference between loyal and non-loyal
customers.
Assumptions
1. Models are usually based on
linear relationships
2. Models assume that the data
collected are interval scaled
3. Multicollinearity in the data is
desirable because the objective is to identify interrelated sets of variables.
4. The data should be amenable to
factor analysis. Each variable should be correlated with at least some of the
other variables. If every variable is correlated only with itself, the correlation
matrix is an identity matrix, and factor analysis cannot be done on such data.
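The last assumption can be checked informally by looking at the correlation matrix. The Python sketch below builds the matrix from raw scores and reports the largest correlation between two different variables; a value near zero means the matrix is close to an identity matrix. The data and function names are hypothetical, and this rough check is not a formal test.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(variables):
    return [[pearson(u, v) for v in variables] for u in variables]


def largest_off_diagonal(matrix):
    """Largest absolute correlation between two different variables."""
    k = len(matrix)
    return max(abs(matrix[i][j]) for i in range(k) for j in range(k) if i != j)


# Hypothetical scores on three variables (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]   # moves together with x
z = [1, -1, 0, -1, 1]  # unrelated to x and y

print(largest_off_diagonal(correlation_matrix([x, y, z])))  # some variables are related
print(largest_off_diagonal(correlation_matrix([x, z])))     # identity-like matrix
```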
Cluster analysis
Cluster analysis aims at grouping observations into clusters.
Clusters should ideally be characterized by:
• High within-cluster homogeneity: observations in the same cluster should be similar (not dissimilar)
• High between-cluster heterogeneity: observations placed in different clusters should be quite
distinct (dissimilar)
This means that we are interested in determining groups internally
characterized by a high level of cohesion. Also, different clusters should
describe different characteristics of the observations.
We consider here two different approaches to cluster analysis:
Hierarchical (agglomerative) algorithms: Sequential procedures.
At the first step, each observation constitutes a cluster. At each step,
the two closest clusters are joined to form a new cluster. Thus, the groups at
each step are nested with respect to the groups obtained at the previous step.
Once an object has been assigned to a group it is never removed from the
group later on in the clustering process.
The hierarchical method produces a complete sequence of cluster solutions
beginning with n clusters and ending with one cluster containing all the
n observations.
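The agglomerative procedure can be sketched in Python as below. The text does not say how the distance between two clusters is measured, so this sketch assumes single linkage (the distance between their closest members) on one-dimensional data; the data points and function names are hypothetical.

```python
def single_linkage_distance(c1, c2):
    """Distance between two clusters = distance between their closest members."""
    return min(abs(p - q) for p in c1 for q in c2)


def agglomerate(points):
    """Return the full sequence of cluster solutions, from n clusters down to 1."""
    clusters = [[p] for p in points]  # first step: each observation is a cluster
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        # Find the two closest clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Join them; members are never split apart again, so solutions are nested.
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        history.append([list(c) for c in clusters])
    return history


# Hypothetical one-dimensional observations.
history = agglomerate([1.0, 2.0, 6.0, 7.0, 15.0])
for solution in history:
    print(len(solution), "clusters:", solution)
```

The researcher then chooses one solution from the sequence, for example the three-cluster solution.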
Partitioning algorithms: Iterative procedures.
In these methods, the aim is to partition cases into a given number of
clusters, say G. The algorithm usually begins with an initial solution
(partition). Observations are then reallocated to clusters so as to maximize a
pre-specified objective function. Objects may change cluster membership
throughout the cluster formation process.
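One common partitioning method is k-means, sketched below in Python for one-dimensional data. Choosing k-means is an assumption of this sketch, since the text does not name a specific algorithm. Its objective is to minimize the squared distance of observations from their cluster centre, which is the same as maximizing the negative of that quantity. The data, the initial centres and the function name are hypothetical.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Partition points into G = len(centers) clusters, starting from initial centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        # Reallocate each observation to its nearest cluster centre.
        new_labels = [
            min(range(len(centers)), key=lambda g: abs(p - centers[g]))
            for p in points
        ]
        if new_labels == labels:  # no observation changed cluster: stop
            break
        labels = new_labels
        # Recompute each centre as the mean of its current members.
        for g in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == g]
            if members:  # keep the old centre if a cluster becomes empty
                centers[g] = sum(members) / len(members)
    return labels, centers


# Hypothetical observations and a deliberately poor initial solution with G = 2.
points = [1.0, 2.0, 6.0, 7.0, 15.0]
labels, centers = kmeans_1d(points, centers=[1.0, 2.0])
print(labels, centers)
```

In this run the observation 2.0 starts in the second cluster and moves to the first one after the centres are updated, which illustrates how membership can change during the process.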
Conjoint analysis
Conjoint
analysis is concerned with the measurement of the joint effect of two or more
attributes that are important from the customer's point of view. In a situation
where the company would like to know the most desirable attributes or their
combination for a new product or service, the use of conjoint analysis is most
appropriate.
Ex: an airline
would like to know which is the most desirable combination of attributes to a
frequent traveler: 1. Punctuality 2. Air fare 3. Quality of food served on the
flight 4. Hospitality and empathy shown.
Conjoint
analysis is a multivariate technique that captures the exact levels of utility
that an individual customer places on various attributes of the product
offering. It enables a direct comparison.
There are 3 steps in conjoint analysis:
1. Identification of relevant products or service attributes.
2. Collection of data.
3. Estimation of worth for the attribute chosen.
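The third step can be illustrated with a simple Python sketch. The text does not specify an estimation method, so this sketch assumes a balanced design in which a respondent rates every combination of attribute levels, and takes the worth of a level as the mean rating of the profiles containing it minus the overall mean rating. The airline profiles, ratings and function names are made up for illustration.

```python
def part_worths(profiles, ratings):
    """Worth of each attribute level = mean rating with that level - overall mean."""
    grand_mean = sum(ratings) / len(ratings)
    worths = {}
    for attribute in profiles[0]:
        by_level = {}
        for profile, rating in zip(profiles, ratings):
            by_level.setdefault(profile[attribute], []).append(rating)
        worths[attribute] = {
            level: sum(vals) / len(vals) - grand_mean
            for level, vals in by_level.items()
        }
    return worths


def relative_importance(worths):
    """Share of each attribute in the total range of worths."""
    ranges = {a: max(w.values()) - min(w.values()) for a, w in worths.items()}
    total = sum(ranges.values())
    return {a: r / total for a, r in ranges.items()}


# Hypothetical ratings (1-10) given by one traveler to four airline profiles.
profiles = [
    {"punctuality": "on time", "fare": "low"},
    {"punctuality": "on time", "fare": "high"},
    {"punctuality": "late", "fare": "low"},
    {"punctuality": "late", "fare": "high"},
]
ratings = [9, 6, 5, 2]

worths = part_worths(profiles, ratings)
importance = relative_importance(worths)
print(worths)
print(importance)
```

For this respondent punctuality carries more weight than fare, because its worths span a wider range.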
Application:
1. Conjoint analysis is extremely versatile and its range of applications
includes virtually any industry.
2. New product or service design, including concepts in the
pre-prototyping stage, can specifically benefit from the conjoint application.
************
BY
Chandana