Monday, 12 December 2011

Statistics in IX Grade

Hello friends, in today's class we are going to discuss one of the most interesting, and somewhat complex, topics in mathematics: probability and statistics. I am going to show you a good way of understanding probability problems and also discuss its applications, not just in mathematics but in other areas of study as well, such as physics, chemistry and social science. In ninth grade, students are supposed to do the following with probability and statistics: They use appropriate language to express their findings and conclusions. Teachers show them how to formulate questions that help them find the differences among several samples in a population. They extend their study of situations to include experiments and random surveys. Students learn to use and explain univariate and bivariate measurement data and categorical data. This information is then used to develop scatter plots, regression coefficients and regression equations using technological tools.

Additionally, students study how sample statistics are applied to build explanations through data analysis. This analysis is used to understand basic patterns of randomness, including the probability that certain events may be independent of other events. Students also need to understand how simulations can be used to explain the randomness of events.
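To give a feel for what such a simulation can look like, here is a minimal sketch in Python. The function name proportion_of_heads and the fixed seed are my own illustrative choices, not part of any curriculum. It tosses a simulated fair coin many times and reports the fraction of heads.

```python
import random

def proportion_of_heads(num_tosses, seed=0):
    """Simulate tosses of a fair coin and return the fraction that land heads."""
    # The seed is an arbitrary choice so that repeated runs give the same tosses.
    rng = random.Random(seed)
    heads = sum(1 for _ in range(num_tosses) if rng.random() < 0.5)
    return heads / num_tosses

# Try the experiment with more and more tosses.
for n in (10, 100, 10_000, 100_000):
    print(n, proportion_of_heads(n))
```

Each single toss is unpredictable, but with a large number of tosses the fraction of heads should typically settle close to 0.5.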

Now let's start with the basic terminology behind probability and statistics :
Probability and statistics are used to understand randomness and to collect, organize, describe and analyze numerical data. The two are closely tied to each other. From weather reports to studies of genetics, from election results to product preference surveys, the language and concepts of probability and statistics are increasingly present in the media and in everyday conversation. Students need this area of study to help them judge the correctness of an argument supported by seemingly persuasive data.

First, let's discuss probability : It is the study of random events. It is a way of expressing how likely it is that an event will occur or has occurred. Probability theory plays an important role in various human activities that involve the analysis of large sets of data.

The probability of an event occurring given that another event has already occurred is called a conditional probability. Mathematically, the probability that an event A occurs, given that an event B has already occurred, is:
P(A|B) = P(A and B) / P(B)
and, in the same way,
P(B|A) = P(A and B) / P(A)
Here P(A|B) = probability of occurrence of event A when event B has already occurred, and P(B|A) = probability of occurrence of event B when event A has already occurred. The first formula requires P(B) > 0 and the second requires P(A) > 0.
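As a small illustration of conditional probability, here is a sketch in Python using one roll of a fair die. The die, the two events and the helper names prob and conditional are assumptions made up for this example. It divides the probability of "A and B" by the probability of the event that is already known to have occurred.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3, 4}   # event A: the roll is at most 4
B = {4, 5, 6}      # event B: the roll is at least 4

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)

p_a_given_b = conditional(A, B)
p_b_given_a = conditional(B, A)
print(p_a_given_b)  # 1/3
print(p_b_given_a)  # 1/4
```

Notice that the two conditional probabilities are different, so it matters which event is taken as already having occurred.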

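Another formula used together with conditional probability is often introduced alongside Bayes' formula, but it is properly called the law of total probability. It states that the probability of event B is the sum of the conditional probabilities of B given that A has occurred and given that A has not occurred, each weighted by the probability of its condition. The mathematical expression is : P(B) = P(B|A)P(A) + P(B|Ac)P(Ac), where Ac is the complement of A (the event that A does not occur).
And for two independent events A and B we have P(B|A) = P(B|Ac) = P(B), so the expression reduces to:
P(B)P(A) + P(B)P(Ac) = P(B)(P(A) + P(Ac)) = P(B)(1) = P(B)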
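The formula P(B) = P(B|A)P(A) + P(B|Ac)P(Ac) can be checked numerically. The sketch below again assumes one roll of a fair die. The events A, B and C and the helper names are hypothetical choices for illustration only. B depends on A, while C is independent of A.

```python
from fractions import Fraction

sample_space = frozenset(range(1, 7))   # one roll of a fair die (assumed example)

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)

def total_probability(b, a):
    """P(b) rebuilt as P(b|a)P(a) + P(b|a_c)P(a_c)."""
    a_c = sample_space - a              # complement of a
    return conditional(b, a) * prob(a) + conditional(b, a_c) * prob(a_c)

A = frozenset({2, 4, 6})      # the roll is even
B = frozenset({4, 5, 6})      # the roll is at least 4 (depends on A)
C = frozenset({3, 4, 5, 6})   # the roll is at least 3 (independent of A)

print(total_probability(B, A), prob(B))   # 1/2 1/2
print(conditional(C, A), prob(C))         # 2/3 2/3
```

For the independent pair A and C, conditioning on A does not change the probability of C, which is exactly why the expression collapses to P(B) in the independent case.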

Now let's take up the basics of statistics : Statistics is the practice or science of collecting and analyzing numerical data in large quantities. It is the study of the collection, organization, analysis and interpretation of data. It is used to describe and analyze sets of test scores, election results, etc. Probability and statistics together play an important role in finding measures of central value and measures of spread, and they help in comparing two sets of data. The two are closely related: statistical data are regularly analyzed to judge whether conclusions about a particular event can be drawn accurately, and also to make predictions about future events. For example, at election time the early results are analyzed to see whether the predicted results are correct, and analysts also try to predict the final outcome of the election.

So we can say that statistics is the science that deals with variation, randomness and chance. It differs from other sciences, which mostly work with exact, deterministic mathematical laws. In many statistical analyses and experiments the result depends on probability distributions, so probability plays an important part in statistical analysis. Statistical analysis uses probabilities, and probability calculations use statistical analysis. For example, in a social science experiment we often assume a normal distribution for the sample and the population. The normal distribution is one kind of probability distribution.

Let's take a basic example of probability theory to understand the basic concept:
Problem : Two coins are tossed and the outcome is recorded as an ordered pair:
Sample Space : (H, H), (T, H), (H, T), (T, T)
Field of subsets of the sample space : ∅, {(H, H)}, {(T, H)}, {(H, T)}, {(T, T)}, the sample space itself, {(H, H), (T, H)}, ….
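Solution : Let
A = {(H, H), (T, H)}
B = {(H, T), (T, T)}
Then, since the four outcomes are equally likely:
P(A) = ½ , P(B) = ½
A and B have no outcome in common and together they cover the whole sample space, so P(A intersection B) = 0 and P(A union B) = 1.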
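The two-coin example can also be checked by listing the sample space in Python. This is only a sketch: the helper name prob is my own, and A and B are the two sets of outcomes exactly as defined in the solution, with all four outcomes assumed equally likely.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs for two coins: (H, H), (H, T), (T, H), (T, T).
sample_space = set(product("HT", repeat=2))

A = {("H", "H"), ("T", "H")}
B = {("H", "T"), ("T", "T")}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

p_a = prob(A)
p_b = prob(B)
p_a_and_b = prob(A & B)   # intersection: outcomes in both A and B
p_a_or_b = prob(A | B)    # union: outcomes in A or B or both

print(p_a, p_b)             # 1/2 1/2
print(p_a_and_b, p_a_or_b)  # 0 1
```

Since these two sets share no outcome, the intersection is empty and the union is the whole sample space.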


Mean and Mode :
The sample mean is simply an average: the sum of all the observed values in the sample divided by the number of observations.
Let's take an Example to understand it better :
Problem : Suppose we randomly sampled six plots in the badlands for a non-indigenous weed and came up with the following counts of this weed in the region :
34, 43, 81, 106, 106 and 115
What we need to do is compute the sample mean by adding the counts and dividing by the number of samples, that is 6.
Solution:
(34 + 43 + 81 + 106 + 106 + 115) / 6 = 485 / 6 = 80.83
So the sample mean count of the non-indigenous weed is 80.83.
Mode : Put simply, the mode of a set of data is the value with the highest frequency. In the example above, 106 is the mode, because 106 occurs twice and each of the other values occurs only once.
The population mean is the average over the entire population; in real life it is usually impractical to compute, which is why we estimate it with the sample mean.
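The weed-count example can be reproduced with Python's standard statistics module. This is a minimal sketch, and the variable names are just illustrative.

```python
import statistics

# Weed counts from the six sampled plots.
counts = [34, 43, 81, 106, 106, 115]

sample_mean = statistics.mean(counts)   # sum of the counts divided by 6
sample_mode = statistics.mode(counts)   # most frequent value

print(round(sample_mean, 2))  # 80.83
print(sample_mode)            # 106
```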
Median
The problem with the mean is that it often does not represent a typical outcome. If even one value lies very far from the rest of the data, the mean will be pulled toward it. Such a value is known as an outlier. To avoid this problem we can use the median. The median is the middle value of the sorted data. If we have an even number of values, we take the average of the two middle values. When the data contain outliers it is a better way of describing the typical value, and it is commonly used for incomes and home prices.
Let's take an example to understand it better:
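Problem: A person randomly selected 10 house prices in the city area. In units of Rs. 100,000 the prices were:
2.7, 2.9, 3.1, 3.4, 3.7, 4.1, 4.3, 4.7, 4.7, 40.8
If we compute the mean, we would say that the average house price is Rs. 744,000:
(2.7 + 2.9 + 3.1 + 3.4 + 3.7 + 4.1 + 4.3 + 4.7 + 4.7 + 40.8) / 10 = 74.4 / 10 = 7.44
But nine of the ten houses cost less than Rs. 500,000; the mean is pulled up by the single house priced at Rs. 4,080,000. The data are already sorted, so the median is the average of the two middle values, the 5th and the 6th:
(3.7 + 4.1) / 2 = 3.9
The median house price is Rs. 390,000. This better reflects what house shoppers should expect to spend.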
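The same comparison can be sketched in Python with the standard statistics module. The variable names are illustrative, and the prices are kept in units of Rs. 100,000 as in the problem.

```python
import statistics

# House prices from the problem, in units of Rs. 100,000.
prices = [2.7, 2.9, 3.1, 3.4, 3.7, 4.1, 4.3, 4.7, 4.7, 40.8]

mean_price = statistics.mean(prices)      # pulled up by the 40.8 outlier
median_price = statistics.median(prices)  # average of the two middle values

# Convert back to rupees for printing.
print(f"Mean:   Rs. {mean_price * 100_000:,.0f}")
print(f"Median: Rs. {median_price * 100_000:,.0f}")
```

The printed values should come out at about Rs. 744,000 for the mean and Rs. 390,000 for the median.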
Another example to understand it better : For the values given below, we need to find both the mean and the median :
44, 50, 38, 96, 42, 47, 40, 39, 46, 50.

 
To find the sample mean, add them and divide by 10:
(44 + 50 + 38 + 96 + 42 + 47 + 40 + 39 + 46 + 50) / 10 = 492 / 10 = 49.2
Notice that the mean is not itself one of the values in the sample.
Now, on the same set of data, we need to find the median:
To find the median, first sort the data:
38, 39, 40, 42, 44, 46, 47, 50, 50, 96
Notice that there are two middle values 44 and 46. To find the median we take the average of the two.
Median : (44 + 46) / 2 = 45.
Here the mean (49.2) is larger than seven of the ten values, because it is pulled up by the single large value 96. The mean is affected by outliers, while the median is robust.
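One way to see this robustness is to recompute both measures after dropping the value 96. The sketch below uses Python's statistics module. Removing the value is purely an illustration of the effect, not a recommendation to discard data.

```python
import statistics

values = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]
without_outlier = [v for v in values if v != 96]   # drop the single large value

mean_all = statistics.mean(values)
median_all = statistics.median(values)
mean_trimmed = statistics.mean(without_outlier)
median_trimmed = statistics.median(without_outlier)

print(mean_all, median_all)          # 49.2 45.0
print(mean_trimmed, median_trimmed)  # 44 44
```

Dropping one value moves the mean by 5.2 but moves the median by only 1.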
Law of averages:
Median : The median is the middle value in the given data list. When the number of entries in the list is odd, the median is the middle entry after sorting the list into increasing order. When the number of entries is even, the median is the sum of the two middle entries (after sorting the list into increasing order) divided by two.
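The odd and even rule for the median can be written out as a short Python function. This is a hand-written sketch for learning purposes. In practice the standard library's statistics.median does the same job.

```python
def median(values):
    """Return the median using the odd/even rule described above."""
    ordered = sorted(values)          # sort into increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd number of entries: take the single middle entry.
        return ordered[middle]
    # Even number of entries: average the two middle entries.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([44, 50, 38, 96, 42, 47, 40, 39, 46, 50]))  # 45.0
print(median([3, 1, 2]))                                  # 2
```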

 
This is all for today. In the next class I am going to discuss other topics such as variation, standard deviation, etc.
