TOPIC 7: APPLICATION OF SIMPLE STATISTICS | GEOGRAPHY FORM 3

Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population).

STATISTICAL DATA
Data refers to the actual pieces of information collected through your study.

For example, if you ask five of your friends how many pets they own, they might give you the following data: 0, 2, 1, 4, 18.

Data can also be defined as facts or figures from which conclusions may be drawn. Datum is the singular form of the noun data.

VARIABLE

A variable is any characteristic that data may have, or an attribute which changes in value under given conditions.

Variables include population size, age, sex, altitude, temperature and time.

For example, the higher the altitude, the lower the temperature, and vice versa. For that reason, an increase or decrease in temperature depends on altitude.

DATA PRESENTATION

Data presentation refers to the process of organizing data and presenting them in different forms such as line graphs, pie charts, bar graphs, proportional diagrams, polygons and others.

GRAPHICAL DATA

After data have been collected, the next step is to present the data in different ways and forms.

Some of the forms in which the data may be presented include charts, graphs, lists, diagrams, tables, essays, histograms, and even sketches.

LINE (LINEAR) GRAPHS

Line graphs have unique properties that distinguish them from other graphs.

The properties of line graphs are as follows:

General procedure to present data using line graphs

a. Get the data needed for plotting the graph.

b. Identify the independent and dependent variable. Statistically, the independent variables are placed on the x-axis while the dependent variables are placed on the y-axis.

c. Decide on the vertical scale depending on the graph space and values of the dependent variable available.

d. Decide on the horizontal spacing of the graph according to graph space available.

e. Draw and divide the vertical and horizontal axes depending on the respective scales.

f. Plot and join the points to get the graph.

g. Write the title of the graph you have drawn.

h. Indicate the scale of the graph.

i. Show the key for the graph where necessary.

Line graphs can be sub-divided into the following categories.

a. Simple line graphs

b. Group (comparative) line graphs

c. Compound line graphs

d. Divergent line graphs

Simple line graph

Construction procedure:

Use the following table which shows the average monthly temperature recorded in a certain weather station:

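Average monthly temperature for station X

| Month | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sept | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Temp (°C) | 23 | 24 | 26 | 28 | 29 | 28 | 26 | 26 | 26 | 27 | 26 | 25 |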

Average monthly temperature for Station X

Source: Hypothetical data

Scale:

• Vertical – 1 cm : 3 °C

• Horizontal – 1 cm : 1 month
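The scale quoted for this graph can be turned into plotting positions with a little arithmetic. The following is a minimal sketch, assuming both axes start at zero (the notes do not say so); the function name `plot_positions` and the constant names are made up for illustration.

```python
# Average monthly temperature for station X, in °C (values from the table).
temperatures = {
    "Jan": 23, "Feb": 24, "Mar": 26, "Apr": 28, "May": 29, "Jun": 28,
    "Jul": 26, "Aug": 26, "Sept": 26, "Oct": 27, "Nov": 26, "Dec": 25,
}

DEGREES_PER_CM = 3  # vertical scale: 1 cm represents 3 °C
MONTHS_PER_CM = 1   # horizontal scale: 1 cm represents 1 month


def plot_positions(data):
    """Return (month, x_cm, y_cm) for every point to be plotted.

    Assumption: both axes start at zero, so January sits 1 cm along the
    horizontal axis and a temperature of 3 °C sits 1 cm up the vertical axis.
    """
    positions = []
    for index, (month, temp) in enumerate(data.items(), start=1):
        x_cm = index / MONTHS_PER_CM
        y_cm = round(temp / DEGREES_PER_CM, 2)
        positions.append((month, x_cm, y_cm))
    return positions


positions = plot_positions(temperatures)
for month, x_cm, y_cm in positions:
    print(f"{month}: {x_cm} cm along, {y_cm} cm up")
```

Joining the twelve points in month order gives the simple line graph. A different starting value on the vertical axis would shift every height by the same amount.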

Advantages of a simple line graph

1. They are easy to draw, read and interpret.

2. They show specific values of data.

3. They show patterns in data clearly.

4. They enable the viewer to make predictions about the results of data.

5. It is easy to read the exact values against plotted points on straight line graphs.

Disadvantages of a simple line graph

1. They are limited to presenting only one item over time.

2. The appearance of the data can be distorted if the scales on the axes are not consistent.

3. They can give a wrong impression on the continuity of data even when there are periods when data is not available.

4. They do not give a clear visual impression of the actual quantities.

Group (comparative) line graph

A group line graph is the graph that involves drawing more than one line on the same graph.

It shows the relationship between sets of similar statistics for two or more items.

A group line graph is also known as a comparative line graph, composite line graph, multiple line graph or polygraph.

Usefulness of a group line graph

• Comparing different values or trends in two or more data variables.

• Examining the possibility of a relationship existing between the distributions of a number of variables over time.

• Comparing the distribution of the same variable at different places.

Construction:

The method of drawing a group line graph is the same as for a simple line graph.

Therefore, to draw each single line in a group line graph, follow similar steps used for construction of the simple line graph.

These data have been used to plot the group (comparative) line graph as shown below:

| Year/Village | Geisangora | Itiryo | Bungurere | Nyansincha |
|---|---|---|---|---|
| 2000 | 10 | 15 | 25 | 25 |
| 2001 | 20 | 10 | 15 | 20 |

Source: Hypothetical data

Maize production by three villages between 2000 and 2002

Scale:

• Vertical scale: 1 cm to 5 tonnes

• Horizontal scale: 2 cm to 1 year

Advantages of a group line graph

1. The quantity of each component is shown clearly by different line shadings.

2. It gives a comparative analysis of data.

3. It saves time and space since all the line graphs are drawn as a group.

Disadvantages of a group line graph

1. The lines can be overcrowded and hence become difficult to read and interpret if many data are involved.

2. It does not give a clear visual impression of actual quantities.

Compound line graph

A compound line graph is used to analyse the total and the individual inputs of the specific commodities or economic sectors.

The graph involves drawing two or more lines, each line corresponding to one item in a different year or region.

The items are differentiated from one another by different shading.

Construction: The table below is used for construction of the graph.

The table contains hypothetical figures for mineral exports between 2010 and 2012.

| Year/Mineral | Diamond | Gold | Tanzanite |
|---|---|---|---|
| 2010 | 10,000 | 16,000 | 20,000 |
| 2011 | 20,000 | 25,000 | 32,000 |
| 2012 | 25,000 | 35,000 | 40,000 |

Procedure:

• Simplify the data to make the presentation work easy by dividing each value by 1000.

| Year/Mineral | Diamond | Gold | Tanzanite |
|---|---|---|---|
| 2010 | 10 | 16 | 20 |
| 2011 | 20 | 25 | 32 |
| 2012 | 25 | 35 | 40 |

• Add the values for each year to get the cumulative export.

• Plot the values for mineral exports against years on a graph. Usually the line graph for the item with the highest values (here tanzanite) is drawn first.

• Draw the second line graph above the first one to show the next component (gold). To get the values for plotting the second line graph, add the values of the first item to those of the second.

• Draw the line graph for the last item (diamond) above that of the second item, again adding its values to the running total.

• Shade the component parts between the line graphs using different shadings.

• Label the axes, show the key and indicate the scale used to construct the graph.
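The running totals needed for each line can be worked out as follows. This is a small sketch using the simplified export figures from the table; the variable names are illustrative, and the stacking order (highest values first) follows the procedure above.

```python
# Simplified mineral exports (original figures divided by 1000).
exports = {
    "Diamond": {2010: 10, 2011: 20, 2012: 25},
    "Gold": {2010: 16, 2011: 25, 2012: 35},
    "Tanzanite": {2010: 20, 2011: 32, 2012: 40},
}
years = [2010, 2011, 2012]

# Draw the item with the highest values first, then stack the others on top.
order = sorted(exports, key=lambda m: sum(exports[m].values()), reverse=True)

running_total = {year: 0 for year in years}
line_values = {}  # values to plot for each successive line
for mineral in order:
    for year in years:
        running_total[year] += exports[mineral][year]
    line_values[mineral] = dict(running_total)

for mineral in order:
    print(mineral, line_values[mineral])
```

The last line plotted (diamond) carries the cumulative export for each year, so its values are the yearly totals: 46, 77 and 100.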

Advantages of a compound line graph

1. Total values are shown clearly.

2. It gives a good visual impression, which aids understanding and interpretation.

3. Combining all graphs in one saves space.

Disadvantages of a compound line graph

1. Graph construction is difficult and time-consuming.

2. It involves a lot of calculations which are difficult and time-consuming.

3. Reading and interpreting the values is difficult.

Divergent line graph

Divergent line graphs are graphs which represent negative (minus) and positive (plus) values around a mean.

They are loss and gain graphs which show divergence or variation between export and import or profit and loss etc.

The mean is represented by zero axis drawn horizontally across the graph paper.

| Year | Yield (tonnes) |
|---|---|
| 2012 | 1000 |
| 2013 | 1500 |
| 2014 | 500 |
| 2015 | 3000 |

Construction

• Sum up the values of all items or commodities: 1000 + 1500 + 500 + 3000 = 6000.

• Calculate the arithmetic mean (average) of the values: x̄ = 6000 ÷ 4 = 1500.

• Calculate the deviation from the mean of each value as shown in the table below.

Deviation from the mean value

| Year | X | X – x̄ |
|---|---|---|
| 2012 | 1000 | -500 |
| 2013 | 1500 | 0 |
| 2014 | 500 | -1000 |
| 2015 | 3000 | +1500 |

• Plot the graph using the values of deviation from the mean; and remember to include the title and scale of the graph.
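The calculation steps above can be checked with a short sketch. It uses the yield figures from the table; the variable names are illustrative only.

```python
# Yield in tonnes for each year (values from the table).
yields = {2012: 1000, 2013: 1500, 2014: 500, 2015: 3000}

total = sum(yields.values())   # 1000 + 1500 + 500 + 3000
mean = total / len(yields)     # arithmetic mean of the four values

# Deviation of each year's yield from the mean (X minus the mean).
deviations = {year: value - mean for year, value in yields.items()}

for year, deviation in deviations.items():
    print(f"{year}: {deviation:+.0f}")

# Useful check: deviations from the mean always add up to zero.
print("Sum of deviations:", sum(deviations.values()))
```

Values above the zero line are plotted upwards and values below it downwards. A sum of deviations that is not zero signals an arithmetic slip.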

Advantages of a divergent line graph

1. It clearly shows how items fluctuate from the mean.

2. It compares the values of the items and hence facilitates a sound conclusion.

3. It shows both the positive (profit) and negative (loss) phenomena.

4. It is easy to construct, read and interpret.

Disadvantages of a divergent line graph

1. It involves many calculations and is hence time-consuming.

2. It might be difficult to interpret if one lacks statistical skills.

3. It is applicable for only one item per graph.

BAR GRAPHS

Bar graphs are graphs drawn to show the variation in the distribution of items by means of bars.

The bars should be separated from one another by a space.

A bar graph is also called a bar chart or columnar graph.

Types of bar graphs:

a. Simple bar graphs

b. Group or comparative bar graphs

c. Compound bar graphs

d. Divergent bar graphs

Simple bar graph
A simple bar graph is drawn to show a single item per bar and represents simple data.

Consider the data in the table below which shows the value of sisal exported by Tanzania between 1990 and 1994:

| Year | Sisal export (Tsh '000) |
|---|---|
| 1990 | 106126 |
| 1991 | 107430 |
| 1992 | 142601 |
| 1993 | 161180 |
| 1994 | 202425 |

Construction:-

1. Choose the appropriate scale.

2. Draw the axes and insert the bars. All the bars must have the same width and spacing.

4. Insert vertical and horizontal scales and the title.

Tanzania sisal export

Scale: 1 cm to 50,000 (Tsh '000)

Advantages of a simple bar graph

1. It is simple to construct, read and interpret.

2. It has a good visual impression.

3. It can be used to compare how the amount of an item varies from time to time.

Disadvantages of a simple bar graph

1. It is limited to only one item or commodity and hence not suitable for massive data.

2. Not suitable for continuous data such as temperature.

Group (comparative) bar graph
A comparative bar graph consists of several bars drawn side by side on the same chart for the purpose of comparison.

The technique involves grouping of bars in a chart.

The graph can be used to show how production of certain commodities varies each year.

Construction:
The procedure for construction of the comparative bar graph is similar to that of drawing the simple bar graph, except that the simple bar graph contains a single bar for each item while the comparative bar graph comprises multiple bars.

Consider the data in the table below, showing agricultural production in metric tonnes.

Group (comparative) bar graph showing crop yields in ‘000 kg (1986-1988)

Advantages of a group bar graph

1. The total values are expressed well for illustration of points.

2. It is easy to construct, read and interpret.

3. The importance of each component is shown clearly.

Disadvantages of a group bar graph

1. It is difficult to compare the totals of each item/component.

2. Trends such as fall and rise cannot be shown easily.

Compound (divided) bar graph
This is a method of data presentation that involves construction of bars which are divided into segments to show both the individual and cumulative values of items.

The length of each segment represents the contribution of an individual item in the total length while that of the whole bar represents the total (cumulative) value of the different items in each group.

Construction

• Get the data needed for presentation. For example, consider the table below, which shows the number of tourists who visited the named Tanzania National Parks from 1998 to 2002.

| Park/Year | 1998 | 1999 | 2000 | 2002 | 2003 |
|---|---|---|---|---|---|
| Manyara | 120,000 | 160,000 | 172,000 | 170,000 | 203,000 |
| Serengeti | 175,000 | 160,000 | 148,000 | 185,010 | 201,000 |
| Tarangire | 29,000 | 30,000 | 54,100 | 79,000 | 102,000 |
| Mikumi | 100,000 | 110,000 | 111,000 | 150,000 | 183,400 |

• Simplify the data (to make the presentation work easy) by dividing each value by 10,000.

Then add the values to get the total for each year. The simplified data are as shown in the table below.

• Determine the scale of the bar length based on the highest total value.

In this case, the highest total value is 68 (20 + 20 + 10 + 18).

• Decide on the bar spacing, for example, 1 cm apart.

• Draw the axes and label them.

• Start by drawing the bars that represent the highest values. On top of these, draw the segments with the second highest values. The last segments to be drawn are those with the lowest values in general.

• To make it easy to follow the rise and fall of individual values, a soft line could be drawn across bars to separate individual segments.

• Colour or shade the segments to improve the appearance and simplify interpretation.

• Insert the scales, key and title.

Compound (divided) bar graphs showing tourist visits in tens of thousands (1998-2002)
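The simplification and totalling steps in this procedure can be sketched as below. The rounding rule is an assumption: the notes only say to divide by 10,000, and rounding to the nearest whole number reproduces the quoted highest total of 68. The columns are kept in the order printed in the table, and the variable names are illustrative.

```python
# Tourist numbers per park, one value per column of the table (in printed order).
visitors = {
    "Manyara": [120_000, 160_000, 172_000, 170_000, 203_000],
    "Serengeti": [175_000, 160_000, 148_000, 185_010, 201_000],
    "Tarangire": [29_000, 30_000, 54_100, 79_000, 102_000],
    "Mikumi": [100_000, 110_000, 111_000, 150_000, 183_400],
}

# Assumption: divide by 10,000 and round to the nearest whole number.
simplified = {
    park: [round(value / 10_000) for value in values]
    for park, values in visitors.items()
}

# Total for each column (year): the full length of each compound bar.
totals = [sum(column) for column in zip(*simplified.values())]
highest_total = max(totals)

for park, values in simplified.items():
    print(park, values)
print("Totals:", totals)
print("Highest total:", highest_total)
```

The highest total sets the length of the longest bar and therefore the vertical scale. Each park's simplified value gives the length of its segment within a bar.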

Advantages of compound (divided) bar graph

1. It is easy to read and interpret as the totals are clearly shown.

2. It gives a clear visual impression of the total values.

3. It clearly shows the rise and fall in the grand total values.

Disadvantages of compound (divided) bar graph

1. The values of individual segments above the first set are difficult to establish because they don’t start at zero.

To get the correct values of the top segments, you have to add the figures, which is difficult for someone not well equipped with statistical skills.

2. The graph is very difficult to construct and interpret.

3. It is not easy to represent a large number of components as this would involve very long bars with many segments.

Divergent bar graph

A divergent bar graph is a graph which shows the fluctuation of individual items from the mean.

Construction:

1. Calculate the arithmetic mean (average) of the items.

2. Subtract the mean from each item.

3. Draw the graph using the resulting values.

4. Insert the scale and title of the graph.

The data below show the enrolment of Form One students at Mara Secondary School from 1980–1985. Study the table and present the data by a divergent bar graph.

| Year | Number of students |
|---|---|
| 1980 | 100 |
| 1981 | 150 |
| 1982 | 175 |
| 1983 | 200 |
| 1984 | 225 |
| 1985 | 300 |

Procedure:

• Find the arithmetic mean: x̄ = (100 + 150 + 175 + 200 + 225 + 300) ÷ 6 = 1150 ÷ 6 ≈ 192

• Subtract the mean from each item:

| Year | Number of students (X) | X – x̄ |
|---|---|---|
| 1980 | 100 | -92 |
| 1981 | 150 | -42 |
| 1982 | 175 | -17 |
| 1983 | 200 | 8 |
| 1984 | 225 | 33 |
| 1985 | 300 | 108 |

• Choose a suitable scale and construct the graph using the obtained values (X – x̄).

A divergent bar graph showing student enrolment (1980-1985)
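As a rough illustration of what the finished graph conveys, the sketch below recomputes the deviations and prints a crude text version of the bars. The scale of one symbol per 10 students and the helper `text_bar` are hypothetical choices made only for this example.

```python
# Form One enrolment at Mara Secondary School (values from the table).
enrolment = {1980: 100, 1981: 150, 1982: 175, 1983: 200, 1984: 225, 1985: 300}

# Mean rounded to a whole student, as in the worked table (1150 / 6 is about 192).
mean = round(sum(enrolment.values()) / len(enrolment))
deviations = {year: number - mean for year, number in enrolment.items()}

STUDENTS_PER_SYMBOL = 10  # hypothetical scale for the text bars


def text_bar(deviation):
    """Return a bar of '+' (above the mean) or '-' (below the mean) symbols."""
    length = round(abs(deviation) / STUDENTS_PER_SYMBOL)
    symbol = "+" if deviation >= 0 else "-"
    return symbol * length


for year, deviation in deviations.items():
    print(f"{year} {deviation:+5d} {text_bar(deviation)}")
```

On graph paper the minus bars are drawn below the zero (mean) line and the plus bars above it. The text bars only mimic that by using different symbols.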

Advantages of a divergent bar graph

1. It shows fluctuation in values, which helps to detect problems in general terms.

2. It is important for comparison of positives and negatives.

3. Profit (success) or loss (failure) can easily be deduced.

4. They are simple to construct, read and interpret.

Disadvantages of a divergent bar graph

1. Graph construction is time-consuming since it involves many steps.

2. The calculations involved may be difficult for someone who is weak in mathematics.

3. It is limited to analysis of only one variable.

Divided circles (pie charts)
A divided circle is also known as a pie chart, circle chart or pie graph.

The chart involves dividing the circle into “pie slices” to represent and show relative sizes of data.

The size of each slice or segment is always proportional to the value it represents.

Divided circles can appear in two forms:

a. Simple divided circles.

b. Proportional divided circles.

A simple divided circle involves a single set of data whereas the proportional divided circle involves more than one set of data such that the circles will be proportional to the total quantity that each circle represents.

Simple divided circle

Construction:

• Obtain the data to work on. Study this hypothetical record showing enrolment of Form One students in selected Secondary Schools in Tarime District:

A table showing student enrolment in selected schools in Tarime District

| Name of school | Number of students |
|---|---|
| Nyansincha | 85 |
| Bungurere | 80 |
| Nyanungu | 78 |
| Magoto | 78 |
| Tarime | 65 |
| Nyamongo | 70 |
| Total | 456 |

• Calculate the total number of students as shown in the table.

• Calculate the angle in a circle that would represent the number of students enrolled in each school.

For example, 85 out of 456 students enrolled in Nyansincha Secondary School will be represented in the circle by a segment with an angle of 85/456 × 360 ≈ 67 degrees.

This will give the following results.

| Name of school | Number of students | Degrees |
|---|---|---|
| Nyansincha | 85 | 67° |
| Bungurere | 80 | 63° |
| Nyanungu | 78 | 62° |
| Magoto | 78 | 62° |
| Tarime | 65 | 51° |
| Nyamongo | 70 | 55° |
| Total | 456 | 360° |

• Draw a circle of a reasonable size.

• Using a protractor, draw a radius from the 6 o’clock mark to the centre of the circle.

• Starting with the largest segment representing a specific component, measure and draw its angle from the centre of the circle.

• Do the same for other components in descending order.

• Divide a circle into segments according to the sizes of the angles.

• Shade the segments and write the title and key of the drawn graph.

Student enrolment in selected Secondary Schools in Tarime District
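The angle for each segment follows the rule value ÷ total × 360. A minimal sketch using the enrolment figures from the table is given below; the variable names are illustrative.

```python
# Form One enrolment in selected schools in Tarime District (from the table).
enrolment = {
    "Nyansincha": 85,
    "Bungurere": 80,
    "Nyanungu": 78,
    "Magoto": 78,
    "Tarime": 65,
    "Nyamongo": 70,
}

total = sum(enrolment.values())

# Angle of each segment, rounded to the nearest whole degree.
angles = {
    school: round(students / total * 360)
    for school, students in enrolment.items()
}

for school, angle in angles.items():
    print(f"{school}: {angle} degrees")
print("Sum of angles:", sum(angles.values()))
```

For these data the rounded angles add up to exactly 360 degrees. With other data, rounding can leave the sum a degree or two off, in which case one segment is usually adjusted.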

Advantages of a simple divided circle

1. It is easy to compare components as they are represented by angles.

2. Analysis and interpretation of data is easy.

3. It is easy to assess the proportion of individual components against the total.

4. Construction of this graphical representation is relatively simple.

5. It is easy to determine the value of each component since it is indicated on each segment.

6. Visual impression of the individual components is clear and facilitates the understanding of the information in the data.

Disadvantages of a simple divided circle

1. It is time-consuming because it involves a lot of calculations.

2. The represented actual values remain hidden as the values shown on the faces of the segments may be in percentages.

3. Where the range of data is large and involves small and big values, accurate construction of the chart is difficult.

4. When the values of data set vary slightly, it is difficult to visualize the proportional differences between values (as it is the case in the pie chart above).

The Importance of Statistics to the User

Statistics is important in geography because of the following reasons:

1. It enables the geographers to handle large sets of data and summarize them in a way that can be easily understood.

2. Statistics is very useful for planning at local and national levels. For example, statistics on census can be used to plan for social services.

3. It can also enable the geographers to make comparisons between geographical phenomena, e.g. to compare the amount of rainfall and agriculture production or population distribution in different regions, etc.

4. Statistics translates data into mathematical ways which make the application of quantitative techniques possible.

5. It enables the geographers to store the information in forms of numbers, graphs, tables, charts, etc.

6. Statistics gives precise rather than generalized information. This offers a lot of satisfaction to the user.

SUMMARIZATION OF MASSIVE DATA
The massive data collected from the field have to be summarized so as to make it easy to read, interpret and apply.

The massive data can be summarized by the following ways:

1. Frequency distribution

A frequency distribution shows a summarized grouping of data divided into mutually exclusive classes and the number of occurrences in each class.

It is a way of showing unorganized data e.g. to show results of an election, income of people for a certain region, sales of a product within a certain period, student loan amounts, etc.
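A frequency distribution can be built by tallying each value into its class. The sketch below is an illustration only: the ten scores, the class width of 10 and the lowest class limit of 10 are hypothetical choices, and the function name `frequency_distribution` is made up for this example.

```python
# Hypothetical set of ten scores to be grouped into classes.
scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]

CLASS_WIDTH = 10   # a class such as 10-19 covers 19 - 10 + 1 = 10 values
LOWEST_LIMIT = 10  # lower limit of the first class


def frequency_distribution(values, lowest, width):
    """Count how many values fall in each class (lower limit, upper limit).

    Assumption: every value is a whole number not smaller than `lowest`.
    """
    classes = {}
    lower = lowest
    while lower <= max(values):
        classes[(lower, lower + width - 1)] = 0
        lower += width
    for value in values:
        start = lowest + ((value - lowest) // width) * width
        classes[(start, start + width - 1)] += 1
    return classes


table = frequency_distribution(scores, LOWEST_LIMIT, CLASS_WIDTH)
for (lower, upper), frequency in table.items():
    print(f"{lower}-{upper}: {frequency}")
```

Because the classes do not overlap, every score is counted exactly once, so the frequencies add up to the number of scores.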

Such measures include the measures of central tendency, measures of dispersion (variability), measures of relationship (correlation) and measures of relative position.

METHODS OF PRESENTING SIMPLE AND MIXED DATA

Measures of central tendency (averages)

A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data.

As such, measures of central tendency are sometimes called measures of central location.

3. It is not appropriate in some distributions.

Median

The median is the middle score for a set of data that has been arranged in order of magnitude.

So, if we look at the example below:

65, 55, 89, 56, 35, 14, 56, 55, 87, 45

We again rearrange the data in order of magnitude (smallest first):

14, 35, 45, 55, 55, 56, 56, 65, 87, 89

Only now we have to take the 5th and 6th scores in our data set and average them to get a median of 55.5.
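The same steps can be written as a short sketch: sort the scores, then take the middle score, or the average of the two middle scores when the count is even. The result is compared with `statistics.median` from the Python standard library; the other names are illustrative.

```python
import statistics

scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]

# Step 1: arrange the data in order of magnitude (smallest first).
ordered = sorted(scores)

# Step 2: pick the middle score, or average the two middle scores
# when the number of scores is even.
count = len(ordered)
middle = count // 2
if count % 2 == 1:
    median = ordered[middle]
else:
    median = (ordered[middle - 1] + ordered[middle]) / 2

print("Ordered data:", ordered)
print("Median:", median)
print("Library result:", statistics.median(scores))
```

With ten scores, `ordered[4]` and `ordered[5]` are the 5th and 6th scores (55 and 56), which average to 55.5.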

Advantages of the median

1. It is easy to calculate and understand.

2. It can also be calculated for qualitative data.

3. It is appropriate for skewed distributions.

4. It is not affected by extreme observations. Hence, it is a better average than the arithmetic mean when extreme observations are present.

5. The values of a median can be obtained graphically.

Disadvantages of the median

1. It is not suitable for further mathematical treatment.

2. It is not rigidly defined.

3. It is not based on all values or observations.

4. Compared to the mean, the median is more affected by fluctuation of sampling.

5. In case of ungrouped data, rearrangement of values in order of magnitude becomes necessary.

Mode
The mode is the most frequent score in a data set. It represents the highest bar in a bar chart or histogram.

You can, therefore, sometimes consider the mode as being the most popular option. An example of a mode is presented below:

Advantages of the mode

1. It is simple to compute.

2. It is easy to understand and calculate. In some cases it can be located merely by inspection. The value of the mode can be obtained graphically from the histogram.

3. It gives a rough idea of the differences of the data set.

4. It is the only average that can be used when the data is not numerical.

Disadvantages of the mode

1. It is not rigidly defined; hence it is unstable for large samples.

2. It is independent of sample size except under special circumstances.

3. It is not based on all the values of the data.

4. Mode is not suitable for further mathematical treatment.

5. As compared to mean, mode is affected to a great extent by the fluctuation of sampling.

6. There may be more than one mode (as is the case in the previous graph).

7. There may be no mode at all if none of the data are the same.

8. It may not accurately represent the data.
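Three of the points above (more than one mode, no mode at all, and use with non-numerical data) can be illustrated with a small sketch. The helper `modes` is written for this example and, following the notes, treats a data set in which no value repeats as having no mode. The list of crops is hypothetical; the two numerical data sets are taken from earlier examples in these notes.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or an empty list if nothing repeats.

    Assumption: `values` is not empty.
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once, so there is no mode
    return [value for value, count in counts.items() if count == highest]


scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]   # two values tie
pets = [0, 2, 1, 4, 18]                             # no value repeats
crops = ["maize", "sisal", "maize", "cotton"]       # hypothetical, non-numerical

print("Scores:", modes(scores))
print("Pets:", modes(pets))
print("Crops:", modes(crops))
```

The scores have two modes (55 and 56), the pets data have none, and the crop list shows that a mode can be found even when the data are not numbers.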

The Significance of Mean, Mode and Median
Measures of central tendency are very useful in statistics.

Their importance is because of the following reasons:

Note: we find the class interval by using the class limits as follows:

i = upper class limit – lower class limit + 1

Importance of statistics

• Helps in the comparison of different geographical phenomena, for example climate, population, commodity and production.

• Used to summarize raw and bulk data for easy interpretation and visual explanation.

• It facilitates land use planning.

• Helps resource allocation and provision of social services, for example food, health, water and education.

• Makes it easy to compare data.

• Its knowledge simplifies research activities.

