MBA SEMESTER 1
MB0040 – STATISTICS FOR MANAGEMENT- 4 Credits
(Book ID: B1129)
Assignment Set- 1 (60 Marks)
Note: Each question carries 10 Marks. Answer all the questions
1. Why is it necessary to summarise data? Explain the approaches available to
summarise data distributions.
Graphical representation is a
good way to represent summarised data. However, graphs provide us only an
overview and thus may not be used for further analysis. Hence, we use summary
statistics, such as averages, to analyse the data. Mass data, which is
collected, classified, tabulated and presented systematically, is analysed
further to bring its size to a single representative figure. This single figure
is the measure which can be found at central part of the range of all values.
It is the one which represents the entire data set. Hence, this is called the
measure of central tendency.
In other words, the tendency
of data to cluster around a figure which is in central location is known as
central tendency. Measure of central tendency or average of first order
describes the concentration of large numbers around a particular value. It is a
single value which represents all units.
Statistical Averages: The commonly used statistical averages are the arithmetic mean,
geometric mean and harmonic mean.
Arithmetic mean is defined as the sum of all values
divided by the number of values and is represented by X̄.
Before we study how to compute
arithmetic mean, we have to be familiar with the terms such as discrete data,
frequency and frequency distribution, which are used in this unit.
If the number of values is
finite, then the data is said to be discrete data. The number of occurrences of
each value of the data set is called frequency of that value. A systematic
presentation of the values taken by variable together with corresponding
frequencies is called a frequency distribution of the variable.
Median: Median of a set of values is the value which is the middle most value when
they are arranged in the ascending order of magnitude. Median is denoted by
‘M’.
Mode: Mode is the value which has the highest frequency and is denoted by Z.
Modal value is most useful for
business people. For example, shoe and readymade garment manufacturers will
like to know the modal size of the people to plan their operations. For
discrete data with or without frequency, it is that value corresponding to
highest frequency.
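The three averages defined above can be computed directly with Python's standard `statistics` module. The sketch below is only an illustration: the list `sizes` is hypothetical data (shoe sizes sold in a day), not figures from the text.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical shoe sizes sold in one day (illustrative data only)
sizes = [6, 7, 7, 7, 8, 9, 10, 10, 11]

# Frequency distribution: each value together with its number of occurrences
frequency = Counter(sizes)

arithmetic_mean = mean(sizes)   # sum of all values / number of values
middle_value = median(sizes)    # middle-most value after sorting
modal_size = mode(sizes)        # value with the highest frequency

print("Frequency distribution:", dict(frequency))
print("Mean:", round(arithmetic_mean, 2))
print("Median:", middle_value)
print("Mode:", modal_size)
```

With this data the three averages differ, which is why a manufacturer planning stock would look at the modal size rather than the mean.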
Appropriate Situations for the use of Various Averages
1. Arithmetic mean is used when:
a. In-depth study of the variable is needed
b. The variable is continuous and additive in nature
c. The data are in the interval or ratio scale
d. The distribution is symmetrical
2. Median is used when:
a. The variable is discrete
b. There exist abnormal values
c. The distribution is skewed
d. The extreme values are missing
e. The characteristics studied are qualitative
f. The data are on the ordinal scale
3. Mode is used when:
a. The variable is discrete
b. There exist abnormal values
c. The distribution is skewed
d. The extreme values are missing
e. The characteristics studied are qualitative
4. Geometric mean is used when:
a. The rate of growth, ratios and percentages are to be studied
b. The variable is of multiplicative nature
5. Harmonic mean is used when:
a. The study is related to speed, time
b. Average of rates which produce equal effects has to be found
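The cases for the geometric and harmonic means listed above can be sketched with Python's `statistics` module (Python 3.8 or later). The growth factors and speeds below are hypothetical values chosen for illustration, not data from the text.

```python
from statistics import geometric_mean, harmonic_mean

# Hypothetical yearly growth factors: +10%, +20%, +30%
growth_factors = [1.10, 1.20, 1.30]

# Geometric mean suits multiplicative quantities such as growth rates
average_growth = geometric_mean(growth_factors)

# Hypothetical speeds (km/h) over two legs of equal distance
speeds = [40, 60]

# Harmonic mean suits rates such as speed over equal distances
average_speed = harmonic_mean(speeds)

print("Average growth factor:", round(average_growth, 4))
print("Average speed (km/h):", average_speed)
```

The harmonic mean of 40 and 60 is 48, which is lower than the arithmetic mean of 50, because more time is spent travelling at the slower speed.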
Positional Averages
Median is the mid-value of a series of data. It divides the distribution into
two equal portions. Similarly, we can divide a given distribution into four, ten
or hundred or any other number of equal portions. The dividing values are called
quartiles, deciles and percentiles respectively.
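As a rough sketch, the division of a distribution into four, ten or a hundred equal portions can be done with `statistics.quantiles` (Python 3.8 or later). The list `data` is hypothetical, and the function's default "exclusive" interpolation method is assumed; other conventions give slightly different cut points.

```python
from statistics import median, quantiles

# Hypothetical ordered observations (illustrative data only)
data = [12, 15, 18, 20, 22, 25, 28, 30, 35, 40, 45]

quartiles = quantiles(data, n=4)      # 3 cut points -> 4 equal portions
deciles = quantiles(data, n=10)       # 9 cut points -> 10 equal portions
percentiles = quantiles(data, n=100)  # 99 cut points -> 100 equal portions

print("Quartiles:", quartiles)
print("Median:", median(data))
print("Number of deciles:", len(deciles))
print("Number of percentiles:", len(percentiles))
```

The second quartile coincides with the median, since both divide the distribution into two equal halves.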
2. Explain the purpose of tabular presentation of statistical data. Draft a
form of tabulation to show the distribution of population according to i)
community by age, ii) literacy, iii) sex, and iv) marital status.
The objectives of tabulation are to:
i. Simplify complex data
ii. Highlight important characteristics
iii. Present data in minimum space
iv. Facilitate comparison
v. Bring out trends and tendencies
vi. Facilitate further analysis
| Marital Status | Sex    | Educated                            | Non-Educated                        |
|                |        | Below 20 yrs | 20-40 | Above 40     | Below 20 yrs | 20-40 | Above 40     |
| Married        | Male   |              |       |              |              |       |              |
|                | Female |              |       |              |              |       |              |
| Unmarried      | Male   |              |       |              |              |       |              |
|                | Female |              |       |              |              |       |              |
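A blank form like the one above is filled by tallying individual records into its cells. The following is a minimal sketch of that tally, assuming a hypothetical list `records` and a hypothetical helper `age_group`; neither comes from the text.

```python
from collections import Counter

# Hypothetical records: (marital status, sex, education, age in years)
records = [
    ("Married", "Male", "Educated", 35),
    ("Married", "Female", "Non-Educated", 45),
    ("Unmarried", "Male", "Educated", 18),
    ("Unmarried", "Female", "Educated", 25),
    ("Married", "Male", "Educated", 28),
]

def age_group(age):
    """Map an age to one of the three age columns of the table."""
    if age < 20:
        return "Below 20 yrs"
    if age <= 40:
        return "20-40"
    return "Above 40"

# Each table cell is identified by (marital status, sex, education, age group)
counts = Counter(
    (marital, sex, education, age_group(age))
    for marital, sex, education, age in records
)

for cell, count in sorted(counts.items()):
    print(cell, count)
```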
3. Give a brief note on the measures of central tendency together with their
merits and demerits. Which is the best measure of central tendency and why?
Graphical representation is a
good way to represent summarised data. However, graphs provide us only an
overview and thus may not be used for further analysis. Hence, we use summary
statistics, such as averages, to analyse the data. Mass data, which is
collected, classified, tabulated and presented systematically, is analysed
further to bring its size to a single representative figure. This single figure
is the measure which can be found at central part of the range of all values.
It is the one which represents the entire data set. Hence, this is called the
measure of central tendency.
In other words, the tendency
of data to cluster around a figure which is in central location is known as
central tendency. Measure of central tendency or average of first order
describes the concentration of large numbers around a particular value. It is a
single value which represents all units.
Arithmetic mean: Arithmetic mean is defined as the sum of all
values divided by the number of values and is represented by X̄.
Merits and demerits of arithmetic mean
| Merits | Demerits |
| It is simple to calculate and easy to understand. | It is affected by extreme values. |
| It is based on all values. | It cannot be determined for distributions with open-end class intervals. |
| It is rigidly defined. | It cannot be graphically located. |
| It is more stable. | Sometimes it is a value which is not in the series. |
| It is capable of further algebraic treatment. | |
Median: Median of a set of values is the value which is the middle most value when
they are arranged in the ascending order of magnitude. Median is denoted by ‘M’.
Merits and demerits of median
| Merits | Demerits |
| It can be easily understood and computed. | It is not based on all values. |
| It is not affected by extreme values. | It is not capable of further algebraic treatment. |
| It can be determined graphically (Ogives). | |
| It can be used for qualitative data. | |
| It can be calculated for distributions with open-end classes. | |
Mode: Mode is the value which has the highest frequency and is denoted by Z.
Modal value is most useful for
business people. For example, shoe and readymade garment manufacturers will
like to know the modal size of the people to plan their operations. For
discrete data with or without frequency, it is that value corresponding to
highest frequency.
Merits and demerits of mode
| Merits | Demerits |
| In many cases it can be found by inspection. | It is not based on all values. |
| It is not affected by extreme values. | It is not capable of further mathematical treatment. |
| It can be calculated for distributions with open-end classes. | It is much affected by sampling fluctuations. |
| It can be located graphically. | |
| It can be used for qualitative data. | |
The best measure of central tendency is the arithmetic mean. It is defined as
the value obtained by dividing the sum of all the observations by their number,
that is, mean = [sum of all the observations] / [number of observations]. The
arithmetic mean is preferred because it is simple to understand and easy to
interpret, it is quickly and easily calculated, it is amenable to mathematical
treatment, and it is relatively stable in repeated sampling experiments.
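One demerit listed earlier, that the arithmetic mean is affected by extreme values while the median is not, can be seen in a small sketch. The figures in `wages` are hypothetical and chosen only to make the effect visible.

```python
from statistics import mean, median

# Hypothetical daily wages (illustrative data only)
wages = [20, 22, 24, 26, 28]

# The same data with one extreme value added
wages_with_outlier = wages + [200]

mean_before = mean(wages)
median_before = median(wages)
mean_after = mean(wages_with_outlier)
median_after = median(wages_with_outlier)

print("Mean before/after:", mean_before, round(mean_after, 2))
print("Median before/after:", median_before, median_after)
```

Here a single extreme value more than doubles the mean, while the median moves by only one unit. This is why the median is listed as preferable when abnormal values exist.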
4. Machines are used to pack sugar into packets supposedly containing 1.20 kg
each. On testing a large number of packets over a long period of time, it was
found that the mean weight of the packets was 1.24 kg and the standard deviation
was 0.04 kg. A particular machine is selected to check the total weight of each
of the 25 packets filled consecutively by the machine. Calculate the limits
within which the weight of the packets should lie, assuming that the machine has
not been classified as faulty.
Since the sample size is 25, which is less than 30, this is a small-sample case,
so the t-distribution is used to calculate the confidence limits.
Given:
Sample size, n = 25
Standard deviation, S = 0.04 kg
Degrees of freedom, df = n - 1 = 25 - 1 = 24
Mean weight, x̄ = 1.24 kg
Population mean weight = µ
α = 5% = 0.05
tα/2 = t0.025 = 2.064 at 95% confidence and df = 24
The limits are:
x̄ ± tα/2 (S / √n)
= 1.24 ± 2.064 (0.04 / √25)
= 1.24 ± 2.064 (0.04 / 5)
= 1.24 ± 0.016512
That is,
x̄ - tα/2 (S / √n) ≤ µ ≤ x̄ + tα/2 (S / √n)
1.24 - 0.016512 ≤ µ ≤ 1.24 + 0.016512
1.223488 ≤ µ ≤ 1.256512
==========
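The arithmetic of these limits can be checked with a few lines of Python. This is only a sketch of the calculation as set up in the answer: the critical value 2.064 is taken from the t-table as quoted there (the standard library has no t-distribution), and the variable names are illustrative.

```python
from math import sqrt

n = 25          # sample size
x_bar = 1.24    # mean weight in kg
s = 0.04        # standard deviation in kg
t_crit = 2.064  # table value of t at alpha/2 = 0.025 with df = 24

# Margin of error: t * S / sqrt(n)
margin = t_crit * s / sqrt(n)

lower = x_bar - margin
upper = x_bar + margin

print("Margin:", round(margin, 6))
print("Limits:", round(lower, 6), "to", round(upper, 6))
```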
5. A packaging device is set to fill detergent powder packets with a mean
weight of 5 Kg. The standard deviation is known to be 0.01 Kg. These are known
to drift upwards over a period of time due to machine fault, which is not
tolerable. A random sample of 100 packets is taken and weighed. This sample has
a mean weight of 5.03 Kg and a standard deviation of 0.21 Kg. Can we conclude
that the mean weight produced by the machine has increased? Use 5% level of
significance.
Since the sample size is 100, this is a large-sample case, so the Z-test
statistic is used for hypothesis testing.
Let H0 denote the null hypothesis and H1 (or HA) the alternative hypothesis that
the mean weight has increased.
H0 : µ = 5
H1 : µ > 5 (right-tailed test)
Given:
Sample size, n = 100
Sample mean weight, x̄ = 5.03 kg
Standard deviation, S = 0.21 kg
Level of significance, α = 5%
Z = (x̄ - µ) / (S / √n)
= (5.03 - 5) / (0.21 / √100)
= 0.03 / 0.021
Z calculated ≈ 1.429
From the standard normal table at the 5% level, for a one-tailed test:
Z critical = Zα = Z0.05 = 1.645
Since Z calculated ≈ 1.429 is less than Z critical = 1.645, H0 is accepted
(not rejected).
Hence, at the 5% level of significance, we cannot conclude that the mean weight
produced by the machine has increased.
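The test can be reproduced with Python's `statistics.NormalDist`, which supplies the critical value instead of a printed table. This is a minimal sketch using the sample figures from the question; the variable names and the extra p-value are illustrative additions.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 5.0     # mean weight under H0, in kg
x_bar = 5.03  # sample mean, in kg
s = 0.21      # sample standard deviation, in kg
n = 100       # sample size
alpha = 0.05  # level of significance

# Test statistic for a large-sample test of the mean
z = (x_bar - mu0) / (s / sqrt(n))

# Right-tailed critical value and p-value from the standard normal distribution
z_crit = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)

# H0 is rejected only when z exceeds the critical value
reject_h0 = z > z_crit

print("Z calculated:", round(z, 3))
print("Z critical:", round(z_crit, 3))
print("p-value:", round(p_value, 4))
print("Reject H0:", reject_h0)
```

Because the calculated Z is below the critical value, H0 is not rejected, so the sample does not give sufficient evidence of an increase in mean weight at the 5% level.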
6. Find the probability that at most 5 defective bolts will be found in a box
of 200 bolts if it is known that 2 per cent of such bolts are expected to be
defective. (You may take the distribution to be Poisson; e^-4 = 0.0183.)
Given, total number of bolts, n = 200
P(defective bolt), p = 2% = 0.02
Therefore, m = np = 200 × 0.02 = 4
P(X = 0) = P(zero defective bolts)
= (e^-m m^0) / 0!
= (e^-4 × 4^0) / 1
= (0.0183)(1) / 1
= 0.0183
=========
P(at most 5 defective bolts)
= P(X ≤ 5)
= P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) + P(X=5)
= (e^-m m^0)/0! + (e^-m m^1)/1! + (e^-m m^2)/2! + (e^-m m^3)/3! + (e^-m m^4)/4! + (e^-m m^5)/5!
= e^-m [1 + m/1! + m^2/2! + m^3/3! + m^4/4! + m^5/5!]
= e^-4 [1 + 4/1 + 16/2 + 64/6 + 256/24 + 1024/120]
= 0.0183 [1 + 4 + 8 + 10.67 + 10.67 + 8.53]
= 0.0183 × 42.87
= 0.784521 ≈ 0.7845
=======
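The same sum can be evaluated without rounding by a short Python sketch. The helper `poisson_pmf` is a hypothetical name introduced here for illustration; it implements the formula e^-m m^k / k! used above.

```python
from math import exp, factorial

n = 200   # bolts in a box
p = 0.02  # probability that a bolt is defective
m = n * p # Poisson mean, m = np = 4

def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson variable with the given mean."""
    return exp(-mean) * mean ** k / factorial(k)

# P(X <= 5) = P(X=0) + P(X=1) + ... + P(X=5)
p_at_most_5 = sum(poisson_pmf(k, m) for k in range(6))

print("P(X = 0):", round(poisson_pmf(0, m), 4))
print("P(X <= 5):", round(p_at_most_5, 4))
```

The unrounded result is about 0.785, which differs slightly in the third decimal place from the hand calculation because that calculation rounds e^-4 to 0.0183 and each term to two decimal places.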