Why three large data sets for the MEI specification?
29 August 2024
Keith Proffitt, MEI Curriculum Developer
A blog with this title was originally published on 1 August 2017 to answer general questions raised during the reform about the approach that the H630/H640 OCR AS/A Level Mathematics B (MEI) specifications take to the new large data set requirement. Each year the blog is updated to coincide with the release of the new large data set and updated support.
Large data sets
The large data sets (LDS) associated with AS and A Levels in mathematics should serve two purposes: they are a teaching resource and they provide a context for setting examination questions.
There are three distinct data sets published by OCR for use with this specification. One has data from individual people collected in health surveys, one has information about different countries of the world, and one has data about London boroughs and England regions. The aim is that teachers will use all three for teaching, but for each cohort of students just one will be the focus of some of the questions in the exam. Each data set is clearly labelled as to when it is used.
| Data set | 7 (Countries data) | 8 (Boroughs and regions) | 9 (Health data) |
|---|---|---|---|
| Publish | Sept 2022 | 2023 | 2024 (new) |
| Start teaching | Sept 2023 | 2024 | 2025 |
| AS Level exam (if sat) | June 2024 | 2025 | 2026 |
| A Level | June 2025 | 2026 | 2027 |
So if you teach A Level Maths over two years, the class you start teaching in September 2024 will have LDS 7, LDS 8 and LDS 9 for use in lessons, but only LDS 8 will be the focus of some questions in their H630/02 AS exam in 2025 (if they sit AS) and their H640/02 A Level exam in 2026.
The data sets are refreshed versions of ones used in previous years, so that:
LDS 8 ≈ LDS 5 (used in H640/02 in 2023) ≈ LDS 2 (used in H640/02 in 2020).
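The rotation described above can be written down as a small piece of arithmetic. The following Python sketch is only an illustration: the function names are hypothetical, and the year offset (A Level exam year = 2018 + LDS number) is inferred from the timetable for LDS 7 to LDS 9 and from the exam years quoted for LDS 2 and LDS 5, not taken from any official OCR or MEI rule. It also assumes each data set is refreshed rather than replaced, which may not hold from LDS 10 onwards.

```python
# Themes in rotation order, as listed for LDS 1, LDS 2 and LDS 3.
THEMES = ("Countries data", "Boroughs and regions", "Health data")

# Assumption: inferred from the published timetable (e.g. LDS 8 -> A Level 2026).
A_LEVEL_YEAR_OFFSET = 2018


def lds_theme(n: int) -> str:
    """Return the theme of LDS n, assuming the three-way rotation continues."""
    if n < 1:
        raise ValueError("LDS numbers start at 1")
    return THEMES[(n - 1) % 3]


def lds_schedule(n: int) -> dict:
    """Return the key years for LDS n, following the pattern in the timetable."""
    a_level = A_LEVEL_YEAR_OFFSET + n
    return {
        "publish": a_level - 3,         # autumn, three years before the A Level exam
        "start_teaching": a_level - 2,  # September, start of a two-year course
        "as_exam": a_level - 1,         # June, if AS is sat
        "a_level_exam": a_level,        # June
    }


def predecessors(n: int) -> list:
    """Return the earlier data sets that LDS n refreshes, most recent first."""
    return list(range(n - 3, 0, -3))
```

For example, `lds_theme(8)` returns `"Boroughs and regions"`, `lds_schedule(8)["a_level_exam"]` returns `2026`, and `predecessors(8)` returns `[5, 2]`, matching the chain LDS 8 ≈ LDS 5 ≈ LDS 2.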
You can find the latest data sets on our Maths B (MEI) A Level qualification webpage and all are available on Teach Cambridge.
Support for teachers
You can find notes on each individual large data set on Teach Cambridge.
Each year I run free online professional development in the MEI Staffroom about using the current large data set. In 2024/5 there will be sessions on teaching ideas for using LDS 8 and other sessions focusing on the assessment of LDS 7. These sessions will also include access to extra exam-style questions based on LDS 7 for your students to practise on in the run-up to the 2025 exams.
History of the development of the large data set
MEI and OCR have some experience of pre-release data from our Core Maths B qualification. The CIA World Factbook data set that forms the current pre-release for that qualification became the basis for our thinking and development for AS and A Level.
We tried to write different types of questions using that data set, based on A Level content. When doing this, we realised that in some countries things have changed quite a lot during the lifetime of the legacy mathematics specifications, so the data set would need to be updated from time to time. We didn’t want students learning about how things used to be in the world 15 years ago if that no longer reflected the current position.
We were aware that some students (and maybe teachers) did not enjoy the statistics in the legacy Mathematics A Levels. We think that may be because in mathematics the focus has been on learning statistical techniques without much idea of why you might want to use them.
The large data sets provide a place to use the techniques. As part of the development, MEI worked on a project with students and teachers working with different large data sets; the students were really enthusiastic about working with real data and the way this helped them to extend their understanding.
We thought that working with more than one data set could encourage students to understand that the techniques they are learning are applicable to a wide variety of data.
This needs a three-year cycle – two years for using the data set in teaching and a year to review and update if necessary. LDS 7 was a refreshment of the data from LDS 4 (which was itself a refreshment of LDS 1), and LDS 8 is a refreshment of the data from LDS 5 (which was a refreshment of LDS 2). The data in LDS 3 is not as focused on time series as the other two, and no issues have been raised in the post-series reviews, so the decision was taken to keep this data set in its original form: LDS 6 and LDS 9 are simply LDS 3 republished.
The decision to replace or refresh the LDS is dependent upon the post-series review of the questions set in the live assessment in 2025, so LDS 10 may be a refreshment or replacement of LDS 7.
Data sources
We wanted to make the process of working with data manageable for teachers, educationally valuable for students and workable for examining. We decided that three data sets – one per cohort – updated on a rotating cycle would do the trick. In the first year of teaching the specification, teachers might choose to work with one data set. The next year, they could reuse the lessons that had gone well while introducing the next data set, and so on.
Our hope is that teachers will use all the data sets for teaching, concentrating more on the examination data set nearer the end of the course. For students, working with more than one data set will help them see that statistics is about working with a variety of data sets.
The data in the CIA World Factbook is grouped by country; we realised that data based on individuals would allow better teaching of distributions. There aren’t many publicly available data sets which contain ungrouped data on individuals. The NHANES data set, from American health surveys, is often used in statistics courses and it contains a wealth of data so we decided to use that as one data set.
Having got data about countries and data about (American) individuals, we thought it would be good to have some England-based data. The London Datastore is a good place to find suitable data and so we ended up with the following three initial data sets, which we hope will appeal to students with different interests in terms of other subjects they are taking.
- LDS_1 – data about countries
- LDS_2 – comparative data about the boroughs of London and the regions of England
- LDS_3 – health-based data about individuals.
The second cycle was a process of refreshment:
- LDS_4 – data about countries (refreshed LDS 1)
- LDS_5 – comparative data about the boroughs of London and the regions of England (refreshed LDS 2)
- LDS_6 – health-based data about individuals (refreshed LDS 3)
We have now completed the third cycle of refreshment:
- LDS_7 – data about countries (refreshed LDS 1/LDS 4)
- LDS_8 – comparative data about the boroughs of London and the regions of England (refreshed LDS 2/LDS 5)
- LDS_9 – health-based data about individuals (refreshed LDS 3/LDS 6)
Stay connected
How have you used the large data sets in class? We’d love to hear your thoughts in the comments below.
If you have any questions, get in touch with us by email at maths@ocr.org.uk or message us at @OCR_Maths. You can also sign up to receive email updates and receive information about resources and support.
If you are considering teaching any of our qualifications, use the expression of interest form to let us know, so that we can help you with more information.
About the author
Keith Proffitt is a Curriculum Developer for MEI. Keith has a BA in Mathematics and a PGCE in Secondary Mathematics. He taught in secondary schools for 25 years, including 13 as Head of Mathematics. He worked for OCR for over 5 years, which included being Qualifications Manager for the MEI A Level specifications in mathematics and further mathematics. Since April 2014 he has worked for MEI. He was involved in developing the current OCR B (MEI) qualifications in mathematics and further mathematics. A large part of his job is supporting teachers of these qualifications with resources, professional development and advice. He also does a range of other things supporting teachers of Core Maths and GCSE.