From Raw Data to a Grouped Frequency Table

Lesson introduction / Recap

  • "Good morning, class!"
  • "Today we're going to do a quick recap on frequency tables which we learnt about last week, and then we're going to extend that knowledge to what are known as Grouped Frequency Tables."
  • "We use simple frequency tables wherever the data can only take a few values -- for example, counting up the number of people ill on each day of the week -- but in most cases in the real world, the data we measure can take a continuum of values -- for instance, measuring people's height -- and that is where the Grouped Frequency Table comes into its own."
  • "But first, let's remind ourselves about simple frequency tables by looking at this example. Here I asked all the pupils in my last class to tell me their shoe size:"

RAW DATA

9, 8, 7, 8, 6, 7, 8, 6, 8, 8, 6, 9, 6, 7, 7, 9, 7, 7, 9, 9
  • What is raw data?
    • A: It is the data written down in the order in which it arises.
  • Can anyone tell me how many pupils there are in my last class?
    • A: 20.
      • Good answer, but actually it's 21, because Chardonnay Thomas was off sick today and unable to tell me her shoe size.

So we use the raw data to start constructing our frequency table. But as you can see, I haven't finished it. Can you help me fill in the gaps?

INCOMPLETE FREQUENCY TABLE

Shoe sizes

Size (x) | Tally          | Frequency (f) | Size x Frequency (fx)
6        | IIII           | 4             | ? (ans = 24)
7        | †††† I         | ? (ans = 6)   | ? (ans = 42)
8        | ? (ans = ††††) | 5             | ? (ans = 40)
9        | ? (ans = ††††) | ? (ans = 5)   | ? (ans = 45)
TOTAL    | -              | ? (ans = 20)  | ? (ans = 151)

First, is everyone happy with the term 'Tally'?

  • Who can tell me what we mean by the term? Hands up, please.
    • A: It is the physical record of an amount as it is being counted. The marks are usually made in groups of five. They are there to help you in your calculation.
  • So for shoe size 7, what is the frequency?
    • A: 6.
      • Does everyone see how we got that?
  • And for shoe size 8, can someone come up to the board and write the tally?
  • For shoe size 9, we have neither the tally, nor the frequency, so we have to go back to the raw data.
    • (Do this myself, crossing them off at the top as I count them on the tally on the bottom.)
  • Can anyone tell me what the Mode is?
    • It's 7.
  • What do we mean by mode?
    • It's the piece of data found most often.
  • Out of interest, can anyone tell me the RANGE of this set of data?
    • A: 9 - 6 = 3
  • What do we mean by range?
    • It's the difference between the smallest and largest value in a set of data.
    • So does everyone see how we got to 3?
    • We will be returning to the range in the main section of today's lesson.
  • And finally, we come on to the mean.
    • Who can give me another word for mean?
      • A: Average.
    • And what do we mean by mean?
      • A: It's the value found by adding together all the separate values of the data and dividing by how many pieces of data there are.
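
The tally, mode and range from the recap can be checked with a few lines of Python. This is only a sketch for the teacher's own checking, not part of the lesson script; the variable names are illustrative, and `Counter` simply does the tallying for us.

```python
from collections import Counter

# Raw shoe sizes exactly as listed under RAW DATA.
raw_data = [9, 8, 7, 8, 6, 7, 8, 6, 8, 8, 6, 9, 6, 7, 7, 9, 7, 7, 9, 9]

# Counter plays the role of the tally: it maps each size to how often it occurs.
frequency = Counter(raw_data)

# Mode: the size with the highest frequency.
mode = max(frequency, key=frequency.get)

# Range: largest value minus smallest value.
data_range = max(raw_data) - min(raw_data)

for size in sorted(frequency):
    print(size, frequency[size])
print("mode:", mode)
print("range:", data_range)
```

The printed frequencies should match the completed Frequency column of the table.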

So in our example, we could return to the RAW DATA, and add up all the numbers on our calculators, and then divide by the number of pieces of data which is, Rabia ...

But now we have the frequency table, we have a slicker and more reliable way to perform the calculation.

  • If we look at the 6s in our Raw data, for example, we could add them up: 6, 12, 18, 24.
  • But instead, we could get the same result by multiplying it out in our Frequency Table: 6 x 4 = 24, and writing the answer in a new column.
  • If we call the frequency f and the thing we are measuring (shoe size) x, then we could make the heading for our new column fx.
  • Multiply out all four rows.
  • So the penultimate step in the process is to add up the numbers in this column.
  • And then we divide it by the total number of pieces of data, which is .... 20. That gives a mean of 151 ÷ 20 = 7.55.
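
The steps above (multiply out each row, add up the fx column, divide by the total frequency) can be sketched in Python as follows. The dictionary holds the completed shoe-size table; the names `frequency_table`, `fx`, `total_f` and `total_fx` are illustrative only.

```python
# Frequency table from the shoe-size example: size (x) -> frequency (f).
frequency_table = {6: 4, 7: 6, 8: 5, 9: 5}

# The fx column: each size multiplied by its frequency.
fx = {x: f * x for x, f in frequency_table.items()}

total_f = sum(frequency_table.values())   # total number of pieces of data
total_fx = sum(fx.values())               # 24 + 42 + 40 + 45

# Mean = sum of fx divided by sum of f.
mean = total_fx / total_f
print(total_fx, total_f, mean)
```

This gives the same answer as adding up all 20 raw values on a calculator, but with far fewer keystrokes to get wrong.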

BACKUP QUESTION IF GOING TOO FAST

  • Suppose we're working with a slightly different set of raw data. We don't know the frequency of size 9 shoes, but we do know the mean is now exactly 8. What is the frequency of size 9 shoes?
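
One possible way to work the backup question is sketched below. It rests on an assumption the question does not state: that sizes 6, 7 and 8 keep the frequencies from the table above and only the size 9 frequency is unknown. The names are illustrative.

```python
# ASSUMPTION: sizes 6, 7 and 8 keep the frequencies from the shoe-size table;
# only the frequency n of size 9 is unknown.
known = {6: 4, 7: 6, 8: 5}
target_mean = 8
unknown_size = 9

known_f = sum(known.values())                     # 4 + 6 + 5 = 15
known_fx = sum(x * f for x, f in known.items())   # 24 + 42 + 40 = 106

# mean = (known_fx + 9n) / (known_f + n) = 8
# => known_fx + 9n = 8 * known_f + 8n
# => n = (8 * known_f - known_fx) / (9 - 8)
n = (target_mean * known_f - known_fx) // (unknown_size - target_mean)
print(n)
```

Under that assumption the check is that the new total of fx divided by the new total frequency comes out at exactly 8.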

Introduction to main topic

The sub-title of the problem which makes up the main feature of this session is "What price should I charge for my newsletter?"

In the last job I had before I joined this school, I worked in IBM's market intelligence department.

One of my roles at IBM was to monitor the computer-related news and articles written by external consultants about the IT industry and produce a one-page summary each week.

The strange thing is that, although I left IBM on the 6th June, I haven't stopped producing this weekly newsletter, and distributing it to about 600 IBM employees for free. No-one replaced me at IBM, as far as I know, and no-one else has the knowledge or the inclination to take over the task of writing the newsletter.

This is it -- it's nothing special -- but its key strengths are that no-one else produces this somewhat IBM-biased summary of the week's news, and it always guarantees to be no longer than one page.

Finding it difficult to drop the habit of producing this newsletter, I wondered whether I could make any money out of the newsletter to fund my little writing addiction. The question was: "What price should I charge?". So I sent out a very short questionnaire to my 600 readers, and this is an extract from the replies I received.

  • 500 recipients did not reply.

Key questions

  • What sort of objective might I set myself for setting the price?
    • i.e. What figure might I want to maximise as someone running a business?
  • Bear in mind that the cost of sending out 10,000 copies on the Internet is the same as sending out 10. Every additional subscription received is pure profit.
  • Some may say: maximise the number of readers.
    • But I feel I have to charge everyone the same price, so here I would have to charge them £0 and I would be back to square one.
  • Others may say: maximise the revenue received.
    • That's my view too.
    • I need to get some sort of picture about what the most popular price is, and then do the sums to work out how much subscription revenue I will get each year.

Deciding to group

  • But if we look at the raw data, it's a lot messier than our shoe size example.
    • For a start, the pound sign makes it harder to read. That's a lesson for all of us: leave the currency symbol off each number, but make a note at the top that all the values are in pounds.
    • And second, the data takes lots of different values -- even numbers with decimal places.
  • First, can I ask you: what is the range of this set of data?
    • A: £65 - £0 = £65
  • Our solution is to group the numbers of a similar size together -- so that, for example, £28 is in the same group as £26.
  • The guidance from the textbooks is that we should choose our group sizes so that there are between 5 and 9 groups.
    • If you choose too many groups, you get too fragmented a bar chart. (Draw lots of bars of different, ragged heights.)
    • If you choose too few, you lose all insight. (e.g. a single £0-£70 bar)

So we have to choose the number of groups to be between 5 and 9, so that our range of £65 can be split evenly between them.

  • My suggestion is that we go for groups of £10. So the first group is £0 to £9.99, the second group is £10 to £19.99, and the top group is £60 to £69.99.
    • How many groups will that give us?
      • A: 7.
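
The grouping rule can be sketched in Python as below. The function names `group_bounds` and `number_of_groups` are made up for illustration, and each group is described by its lower bound and the value it stops just short of (so £20 to £29.99 appears as 20 and 30). The example prices are the ones mentioned in the text.

```python
def group_bounds(price, width=10):
    """Return (lower, upper) for the group containing `price`.

    The group holds every value from `lower` up to, but not including, `upper`.
    """
    lower = int(price // width) * width
    return lower, lower + width


def number_of_groups(smallest, largest, width=10):
    """Count the groups needed to cover every value from smallest to largest."""
    first = int(smallest // width)
    last = int(largest // width)
    return last - first + 1


# £28 and £26 land in the same group, as the lesson requires.
print(group_bounds(28), group_bounds(26))

# The data runs from £0 to £65, so with £10 groups we need this many:
print(number_of_groups(0, 65))
```

Trying a different `width` (say 5 or 20) shows quickly whether the number of groups stays inside the 5 to 9 guideline.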

Building the frequency table (on the whiteboard)

Price they will pay

Maximum Price (x) | Tally | Frequency (f) | Price x Frequency (fx)
£0 - £9.99        |       | 8             |
£10 - £19.99      |       | 4             |
£20 - £29.99      |       | 5             |
£30 - £39.99      |       | 3             |
£40 - £49.99      |       | 2             |
£50 - £59.99      |       | 2             |
£60 - £69.99      |       | 1             |
TOTAL             | -     | 25            |

Follow-up Questions on the Grouped Frequency Table

  1. What is the mode?
    • A: £0-£9.99
  2. If we ignore that group, because its readers are going to pay little or nothing, what is the next most popular group?
    • A: £20-£29.99.
  • That gives us an indication of a price we might charge.
  • The key point is that if we charge, say, £20, then we keep all of these readers (5 + 3 + 2 + 2 + 1 = 13), and we lose all of these (8 + 4 = 12).
    • What would be our total revenue? (13 x £20 = £260)
  • What happens if we charge £30? How many readers do we hang on to? (3 + 2 + 2 + 1 = 8)
    • What would be my total revenue then? (8 x £30 = £240)
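
The same sum can be repeated for every group boundary with a short Python sketch. It makes two simplifying assumptions that go beyond the lesson: only prices equal to a group's lower bound are tried, and every reader in a group is counted as willing to pay that lower bound. The names `grouped`, `readers_kept` and `revenue` are illustrative.

```python
# Grouped frequency table: lower bound of each £10 group -> frequency.
grouped = {0: 8, 10: 4, 20: 5, 30: 3, 40: 2, 50: 2, 60: 1}


def readers_kept(price):
    """Readers whose group starts at or above the price charged."""
    return sum(f for lower, f in grouped.items() if lower >= price)


def revenue(price):
    """Revenue if every kept reader pays exactly `price`."""
    return price * readers_kept(price)


# Price, readers kept, revenue for each candidate price.
for price in grouped:
    print(price, readers_kept(price), revenue(price))

# ASSUMPTION: only the group lower bounds are considered as candidate prices.
best_price = max(grouped, key=revenue)
print("best price:", best_price)
```

On these figures the £20 and £30 rows reproduce the answers worked out with the class, and the comparison across all rows shows where revenue peaks.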

We are beginning to make some sense of the raw data, by using a grouped frequency table.

And in the next lesson, my very able colleague Lucy will show you another way in which you can exploit the grouped frequency table.

Post-exercise thoughts

  • What do you think I ought to do about the readers who didn't reply?
    • Chuck 'em?
    • Give them another chance?
  • What should I do about the readers who answered £0?
    • I can't very well continue to send them the newsletter for free if I'm charging others for it.

Summary of main topic

  • So in today's lesson, we've shown how you can use a grouped frequency table to handle data that arrives in messy quantities.
  • We've reminded ourselves that ranking the data and using a tally to add up the frequencies helps us to make sense of the data.
  • And we've learnt that we should choose our group size so that there are between 5 and 9 groups across the range.

Notes to myself in preparation

  • Draw out the episode flow chart, and tick the episodes off as you make progress