1 Background

STEM Learning aimed to analyse the impact that engaging with CPD has on STEM A level uptake. The Department for Education publish statistics on the uptake and attainment of A level subjects amongst 16-19 year olds. STEM Learning have access to their CPD attendance data and can identify schools and colleges who have engaged with our CPD.

This analysis explores the relationship between engagement with STEM Learning’s suite of science CPD at the National STEM Learning Centre in York and across the network of Science Learning Partnerships. We propose that engagement with KS3 and KS4 science CPD has a long-term impact on student attainment, aspirations, motivation and ultimately progression within STEM.

We predict that engagement will lead to a higher proportion of students progressing to study STEM A levels compared to non-engaged schools.

2 Engagement Data

library(tidyverse)
library(plotly)
setwd("C:/Users/B.Dunn/OneDrive - STEM Learning Limited/GitHub/KS5 Uptake/KS5_uptake2019")
engagement_uptake <- read.csv("Engagement_KS5_Uptake.csv")

Our dataset shows schools who have engaged with STEM Learning CPD in the period 01 Aug 2014 to 31 July 2016 and the number of students from their Year 11 GCSE cohort who have progressed to complete at least 1 STEM A level in the periods of summer 2016, 2017, 2018 and 2019.

We can see that 2,107 secondary schools engaged with STEM Learning science CPD over this period, representing approximately 57% of secondary schools.

library(kableExtra)
engaged <- engagement_uptake %>%
  group_by(engagement_all) %>%
  summarise(Schools = n()) %>%
  mutate(Proportion = Schools / sum(Schools)) %>%
  mutate(Proportion = round(Proportion * 100, 1))

kable(engaged, col.names = c("Engaged?", "Number of Schools", "% of schools")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) 
Engaged? Number of Schools % of schools
0 1593 43.1
1 2107 56.9

3 Uptake Data

We can see the number of students who were on roll at the end of GCSE (and therefore potential could have taken STEM A levels) and also the number who actually did take at least one STEM A level. This data is shown in the exam periods of June 2016, 2017, 2018 and 2019.

uptake <- engagement_uptake %>% summarise(STEM_2019 = sum(All_1_Alevel_2019),
                                All_2019 = sum(All_count_2019),
                                STEM_2018 = sum(All_1_Alevel_2018),
                                All_2018 = sum(All_count_2018),
                                STEM_2017 = sum(All_1_Alevel_2017),
                                All_2017 = sum(All_count_2017),
                                STEM_2016 = sum(All_1_Alevel_2016),
                                All_2016 = sum(All_count_2016)) %>%
  pivot_longer(cols = 1:8, names_to = "Who", values_to = "Students") %>%
  separate(col = Who, into = c("Who", "Year"), sep = "_") %>%
  pivot_wider(names_from = "Who", values_from = "Students") %>%
  arrange(Year) %>%
  mutate(Proportion = STEM / All) %>%
  mutate(Proportion = round(Proportion * 100, 1)) %>%
  ungroup()

kable(uptake, col.names = c("Year", "Number of Students entering 1+ STEM A level", "Total GCSE Students", "% of GCSE students progressing to enter 1+ STEM A level")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) 
Year Number of Students entering 1+ STEM A level Total GCSE Students % of GCSE students progressing to enter 1+ STEM A level
2016 81347 557001 14.6
2017 86657 549396 15.8
2018 91791 534404 17.2
2019 93821 511059 18.4
uptake_plot <- ggplot(uptake, aes(x = Year, y = STEM)) +
  geom_point() +
  geom_line(group = 1) +
  theme_bw() + 
  ylab("Number of Students entering at least 1 STEM A level")

uptake_plotly <- ggplotly(uptake_plot)
uptake_plotly

Overall, we can see that the number of students taking at least one STEM A level has increased over the past 4 years. Similarly, the proportion of GCSE students who progress to enter at least one STEM A level has also increased from 14.6% to 18.4%.

4 How does engagement with STEM Learning CPD affect uptake?

In the period 01 Aug 2014 to 31 July 2016 2,107 schools engaged with relevant STEM Learning CPD. This compares to 1,306 schools who did not engage with relevant STEM Learning CPD (although, they may have engaged in other ways). These two groups form our intervention and control groups. We also excluded 287 schools from the sample, who had engaged with relevant STEM Learning CPD outside the period of interest, and therefore may have affected STEM A level uptake.

For both schools engaged with STEM Learning CPD and those non-engaged, we see a year on year increase in the numbers of students progressing to STEM A levels, as shown in the graph below.

## Calculate the number of entries
engagement_summary <- engagement_uptake %>% 
  filter(engagement_all == 1 | engagement_all == 0 & engaged.pre.either == 0) %>%
  group_by(engagement_all) %>% 
  summarise(
    STEM_Entries_2016 = sum(All_1_Alevel_2016),
    STEM_Entries_2017 = sum(All_1_Alevel_2017),
    STEM_Entries_2018 = sum(All_1_Alevel_2018),
    STEM_Entries_2019 = sum(All_1_Alevel_2019)) %>%
  pivot_longer(!engagement_all, names_to = "Year", values_to = "Entries") 

engagement_summary$Year <- gsub("STEM_Entries_", "", engagement_summary$Year)

## Calculate the total cohort
engagement_total <- engagement_uptake %>% 
  filter(engagement_all == 1 | engagement_all == 0 & engaged.pre.either == 0) %>%
  group_by(engagement_all) %>% 
  summarise(
    Total_2016 = sum(All_count_2016),
    Total_2017 = sum(All_count_2017),
    Total_2018 = sum(All_count_2018),
    Total_2019 = sum(All_count_2019)) %>%
  pivot_longer(!engagement_all, names_to = "Year", values_to = "Total") 

engagement_total$Year <- gsub("Total_", "", engagement_total$Year)

## Join entries and totals and calculate percentages
engagement_summary <- left_join(engagement_summary, engagement_total) %>%
  mutate(Pct_Entries = Entries / Total)

engaged_uptake_plot <- ggplot(engagement_summary, aes(x = Year, y = Entries)) +
  geom_point() +
  geom_line(aes(group = engagement_all, colour = as.factor(engagement_all))) +
  theme_bw() + 
  ylab("Number of Students entering at least 1 STEM A level")

ggplotly(engaged_uptake_plot)

However, we can also present this data as the proportion of total students.

engaged_pct_plot <- ggplot(engagement_summary, aes(x = Year, y = Pct_Entries)) +
  geom_point() +
  geom_line(aes(group = engagement_all, colour = as.factor(engagement_all))) +
  theme_bw() + 
  ylab("Percentage of Students entering at least 1 STEM A level") +
  geom_text(aes(label = (round(Pct_Entries*100, 1))), position = position_nudge(y = -0.005))

ggplotly(engaged_pct_plot)

The graph suggests that progression to STEM A levels has increased at a similar rate for both engaged and non-engaged schools, however engaged schools start at a much higher baseline in 2016. Further analysis of the data shows that the year-on-year increase for engaged schools is higher than for non-engaged schools, as shown in the table below.

## Calculate %s
pct_diffs <- engagement_summary %>%
  select(engagement_all, Year, Pct_Entries)

pct_diffs$engagement_all <- gsub("1", "Engaged", pct_diffs$engagement_all)
pct_diffs$engagement_all <- gsub("0", "Non_Engaged", pct_diffs$engagement_all)

pct_diffs <- pivot_wider(pct_diffs, names_from = engagement_all, values_from = Pct_Entries)

## Then calculate diffs vs. 2016 baseline
# CDefine 2016 baseline
Engaged_baseline <- as.double(pct_diffs[1,3])
NE_baseline <- as.double(pct_diffs[1,2])

# calculate diffs vs baselines
YonY_diffs <- pct_diffs %>%
  mutate(Engaged_difference = Engaged - Engaged_baseline,
         Non_engaged_difference = Non_Engaged - NE_baseline) %>%
  select(1,2,5,3,4)

YonY_diffs[1,3] <- NA
YonY_diffs[1,5] <- NA

# Present as a Kable

kable(YonY_diffs,
      col.names = c("Year",
                    "% of Students entering 1+ STEM A level, from non-engaged schools", 
                    "Non-engaged % difference vs. 2016", 
                    "% of Students entering 1+ STEM A level, from engaged schools",
                    "Engaged % difference vs. 2016"
)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) 
Year % of Students entering 1+ STEM A level, from non-engaged schools Non-engaged % difference vs. 2016 % of Students entering 1+ STEM A level, from engaged schools Engaged % difference vs. 2016
2016 0.1202236 NA 0.1530160 NA
2017 0.1294557 0.0092321 0.1654322 0.0124162
2018 0.1428854 0.0226618 0.1795929 0.0265769
2019 0.1539138 0.0336902 0.1919553 0.0389393

We can compare the differences for engaged and non-engaged schools to create a difference-in-difference score, with postive numbers showing the extent to which engaged schools progressed versus non-engaged schools.

Diff_in_diff <- YonY_diffs %>%
  select(1, 3, 5) %>%
  filter(Year != 2016) %>%
  mutate(DinD = Engaged_difference - Non_engaged_difference)

kable(Diff_in_diff,
      col.names = c("Year",
                    "Non-engaged % difference vs. 2016", 
                    "Engaged % difference vs. 2016",
                    "Difference in difference (engaged - non-engaged)"
)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) 
Year Non-engaged % difference vs. 2016 Engaged % difference vs. 2016 Difference in difference (engaged - non-engaged)
2017 0.0092321 0.0124162 0.0031841
2018 0.0226618 0.0265769 0.0039150
2019 0.0336902 0.0389393 0.0052491

We can use these difference in difference scores to estimate the number of additional students progressing to STEM A levels as a result of the school’s engagement with STEM Learning CPD. This is done by multiplying the difference-in-difference by the total GCSE cohort from engaged schools. In other words, we estimate how many additional GCSE students progressed to take STEM A levels, who may not have done so if the school did not engage with STEM Learning CPD, adn thus increased at the same rate as non-engaged schools.

## Draw in the total GCSE cohort (from above) and then multiply it by diff in diff
Diff_in_diff2 <- Diff_in_diff %>%
  select(1, 4) %>%
  left_join(., filter(engagement_total, engagement_all == 1)) %>%
  select(-3) %>%
  mutate(Additional = round(DinD * Total, 0))
## Joining, by = "Year"
kable(Diff_in_diff2,
      col.names = c("Year",
                    "Difference in difference (engaged - non-engaged)",
                    "Total GCSE Students from engaged schools",
                    "Additional STEM A level students from engaged schools"
)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) 
Year Difference in difference (engaged - non-engaged) Total GCSE Students from engaged schools Additional STEM A level students from engaged schools
2017 0.0031841 383305 1220
2018 0.0039150 373194 1461
2019 0.0052491 357802 1878

5 Conclusion

By calculating the difference in progress made by engaged and non-engaged schools, we can estimate the number of additional students from engaged schools who have progressed to enter at least one STEM A level. The above analysis shows that over 4,500 additional students have progressed to enter STEM A levels as a result of STEM Learning support.