Acknowledgements: I’d like to thank Sasha Dietrichson & Bryan O’Haver for their work on the data extraction, transformation and analysis required for this study.

Our data science team has been focused lately on how people use Blackboard Learn. We’re conducting analysis to tune the learning analytics algorithms in Blackboard Learn with the Ultra experience and ensure that the trigger-based notifications are both accurate and at the right frequency to be acted upon. Too many notifications, and they’ll be ignored; not enough notifications, and it’ll be too late to make a difference.

I’m writing this post to share some interesting findings. It is less a finished product than a report on a few findings that speak to long-standing questions about the value and effectiveness of LMS data. The guiding question behind the analysis: “Is total time spent in a Blackboard Learn course a good predictor of final grade?”

Multiple research studies on individual courses have found a significant relationship between frequency of use of the LMS and student grade (Rafaeli and Ravid 1997, Morris, Finnegan et al. 2005, Dawson, McWilliam et al. 2008, Macfadyen and Dawson 2010, Fritz 2011, Ryabov 2012, Whitmer, Fernandes et al. 2012). In those studies, LMS data explained far more of the variation in course grades than conventional demographic or academic experience variables. However, when analysis is expanded to all courses at an institution, several studies have found no relationship or an extremely small relationship (Campbell 2007, Lauria 2015). Does learning analytics apply to only a small number of courses, or is it broadly applicable to most courses? And what is the size of this relationship? How useful is this data, and what does it tell us about the value people are getting from their LMS?

Before diving into the findings, I want to clarify my assumptions underlying this analysis and LMS data altogether. Use of the LMS does not make a difference in student learning in itself; it’s the instructional practices, pedagogical approaches, and relationships underlying use of the LMS that make a difference (classic “No Significant Difference” theory, for those familiar with that literature). It’s not that the application doesn’t matter; but how it’s used can’t be ignored. In a similar fashion, I interpret frequency of LMS use as a proxy for student effort. With time, we’ll be able to generalize more sophisticated constructs, but this seems like the best place to begin for a variety of reasons.

Sampling Approach and Data Anonymization

We extracted 1,599M Blackboard Learn activity log records and transformed them into normalized data. Because we were interested in the relationship between activity and grade, we sampled records that included a graded item for that week (admittedly, this biases the sample toward courses with more active use of the LMS). Each row contained information for one user in one course for one week. We calculated the final grade and final duration for the course and included both in each record for ease of analysis.

We filtered the data using criteria intended to keep only course shells that represent real instructional courses:

  • standard grade (between 0% and 120%)
  • reasonable duration (an average of at least 60 minutes in the course, and less than 5,040 minutes per week)
  • enrollment indicating likely academic course with potential for range in activity (between 10 and 500 students)

Filtering reduced the data set by 13%. The resulting sample included:

  • 1.2M students
  • 34,519 courses
  • 788 institutions
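
To make the filter criteria concrete, here is a minimal sketch in Python. It is not the code we ran: the `StudentCourse` record, its field names, and the choice to apply the enrollment and average-duration checks per course and the grade and weekly-duration checks per student are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Thresholds taken from the criteria listed above.
MIN_GRADE_PCT, MAX_GRADE_PCT = 0.0, 120.0
MIN_AVG_MINUTES = 60.0
MAX_WEEKLY_MINUTES = 5040.0  # 12 hours a day, 7 days a week
MIN_ENROLLMENT, MAX_ENROLLMENT = 10, 500


@dataclass
class StudentCourse:
    """Hypothetical summary of one student in one course."""
    course_id: str
    final_grade_pct: float
    total_minutes: float
    max_weekly_minutes: float  # busiest single week for this student


def filter_sample(rows):
    """Return the rows that pass the course-level and student-level checks."""
    by_course = {}
    for row in rows:
        by_course.setdefault(row.course_id, []).append(row)

    kept = []
    for course_rows in by_course.values():
        enrollment = len(course_rows)
        avg_minutes = sum(r.total_minutes for r in course_rows) / enrollment
        # Course-level checks (assumed): enrollment range and average duration.
        if not (MIN_ENROLLMENT <= enrollment <= MAX_ENROLLMENT):
            continue
        if avg_minutes < MIN_AVG_MINUTES:
            continue
        # Student-level checks (assumed): grade range and weekly duration cap.
        kept.extend(
            r for r in course_rows
            if MIN_GRADE_PCT <= r.final_grade_pct <= MAX_GRADE_PCT
            and r.max_weekly_minutes < MAX_WEEKLY_MINUTES
        )
    return kept
```

Calling `filter_sample` on a list of `StudentCourse` records returns the subset that would move on to the regression step under these assumptions.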

All data in the sample was not only anonymized but also de-personalized, with no information that could be used to identify individuals or institutions.

Finding 1: Small relationship between duration and grade by student across courses

Figure: Regression model showing the relationship between total time in Blackboard Learn and final grade

We ran a linear regression of final course grade on student time in the Blackboard Learn course. We ran an additional similar regression using “relative” time compared to other students in the same course. The results were statistically significant, which in itself isn’t very meaningful given the size of the sample. The effect of time on final grade explained a mere 1.55% (R² = 0.0155) of the variation in final grade. The effect was slightly higher for relative effort within the same course; it explained 1.7% of the variation in final grade.
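
For readers who want to see what these two models look like in code, here is a minimal sketch using only the Python standard library. The function names are hypothetical, and the definition of “relative” time as a within-course z-score is an assumption; the post does not specify how relative time was computed.

```python
from statistics import mean, pstdev


def r_squared(x, y):
    """R^2 of an ordinary least-squares fit of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so nothing to explain
    # With one predictor, R^2 equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)


def relative_minutes(course_ids, minutes):
    """Z-score each student's minutes within their own course (assumed definition)."""
    groups = {}
    for cid, m in zip(course_ids, minutes):
        groups.setdefault(cid, []).append(m)
    stats = {cid: (mean(v), pstdev(v)) for cid, v in groups.items()}
    out = []
    for cid, m in zip(course_ids, minutes):
        mu, sd = stats[cid]
        out.append((m - mu) / sd if sd > 0 else 0.0)
    return out


# Toy data: two courses with the same grade pattern but very different time scales.
course_ids = ["A", "A", "A", "B", "B", "B"]
minutes = [100, 200, 300, 1000, 2000, 3000]
grades = [60, 70, 80, 60, 70, 80]

raw_r2 = r_squared(minutes, grades)
relative_r2 = r_squared(relative_minutes(course_ids, minutes), grades)
```

In this toy example the pooled raw-time model explains less variation than the relative-time model, because raw minutes mix differences between courses with differences between students in the same course.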

These findings are disappointing in terms of potential for moving the needle on student learning; but the numbers are the numbers. Reflecting on the results, perhaps this is not surprising. Looking at student effort in itself, without considering the course context in which that effort is expended, doesn’t lead to meaningful learning outcomes. Our next analysis began to explore the relationship within the course context.

Finding 2: Substantial number of courses with strong relationship between duration and grade

Next, we ran the same model for each course in the sample. Yes, 34,519 regression models; good thing for nested loops and multi-core servers. We created a data table with key statistics for each of the models, and ran descriptive analyses against these statistics. We found interesting and promising results.

Figure: Adjusted R squared for significant courses

Out of the courses analyzed, 7,648 (22%) had statistically significant results, affecting almost 390,000 students. In these courses, time explained on average 20% of the variation in final grade. This was quite an improvement! The effect size is roughly normally distributed, as illustrated in the figure of adjusted R squared for significant courses, with a slight skew toward lower values.
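
The per-course approach can be sketched as a loop that fits one model per course and collects key statistics in a table. This is an illustration under stated assumptions, not our production code: the function names and input shape are hypothetical, and significance is flagged with a large-sample normal approximation at the 5% level, where a real analysis would use the t distribution (for example via `scipy.stats.linregress`).

```python
from math import inf, sqrt
from statistics import NormalDist, mean

# Two-sided 5% critical value from the normal distribution (about 1.96).
# Assumption: a large-sample stand-in for the exact t critical value.
Z_CRIT = NormalDist().inv_cdf(0.975)


def fit_course(minutes, grades):
    """Fit grade on minutes for one course and return key statistics."""
    n = len(minutes)
    if n < 3:
        return None
    mx, my = mean(minutes), mean(grades)
    sxx = sum((x - mx) ** 2 for x in minutes)
    syy = sum((y - my) ** 2 for y in grades)
    sxy = sum((x - mx) * (y - my) for x, y in zip(minutes, grades))
    if sxx == 0 or syy == 0:
        return None  # no variation in time or in grade
    r2 = min((sxy * sxy) / (sxx * syy), 1.0)
    # Adjusted R^2 for a model with one predictor.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    # Absolute t statistic for the slope.
    t_abs = inf if r2 == 1.0 else sqrt(r2 * (n - 2) / (1 - r2))
    return {
        "n": n,
        "slope": sxy / sxx,
        "r2": r2,
        "adj_r2": adj_r2,
        "significant": t_abs > Z_CRIT,
    }


def summarize(courses):
    """courses maps a course id to a (minutes, grades) pair of equal-length lists."""
    table = {}
    for course_id, (minutes, grades) in courses.items():
        stats = fit_course(minutes, grades)
        if stats is not None:
            table[course_id] = stats
    significant = [s for s in table.values() if s["significant"]]
    summary = {
        "n_courses": len(table),
        "n_significant": len(significant),
        "mean_adj_r2_significant": (
            mean(s["adj_r2"] for s in significant) if significant else None
        ),
    }
    return table, summary
```

The `summary` dictionary mirrors the descriptive statistics reported above: how many courses had a significant relationship, and the mean adjusted R squared among them.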

We analyzed factors within the data set that might explain the variation in this effect size – namely, the average time spent in a course, enrollment size, and other context criteria – and did not find a clear indicator that could help us to understand and predict stronger from weaker effects.

Implications & next steps

We are using the results from this research to revise the triggers and notifications that are built into Blackboard Learn with the Ultra experience, and make the results more accurate for instructors and learners. The results also validate the approaches we are taking in X-Ray Learning Analytics and Blue Canary solutions by building custom predictive models at the course level.

Our next step in this research is to examine course design: how is Blackboard Learn being used? Which tools are being used, and how often, and what common constellations of tools are being used in courses? We hope that these results will provide deeper insight into effective uses of Blackboard Learn that we can share and recommend to faculty and people charged with professional development. We’ll also use these results to inform our analytics and other components of Blackboard Learn.

If you’d like to further discuss these results or have ideas for research, please drop me a line and I’d be happy to connect with you.



Campbell, J. P. (2007). Utilizing student data within the course management system to determine undergraduate student academic success: An exploratory study. A. G. Rud. United States, Indiana, Educational Studies.

Dawson, S., et al. (2008). Teaching smarter: How mining ICT data can inform and improve learning and teaching practice. ascilite 2008, Melbourne.

Fritz, J. (2011). “Classroom walls that talk: Using online course activity data of successful students to raise self-awareness of underperforming peers.” The Internet and Higher Education 14(2): 89-97.

Lauria, E. J. M. and J. Baron (2015). Mining Sakai to Measure Student Performance: Opportunities and Challenges in Academic Analytics. European Conference on e-Learning, Hatfield, UK, Academic Conferences and Publishing International Limited.

Macfadyen, L. P. and S. Dawson (2010). “Mining LMS data to develop an ‘early warning system’ for educators: A proof of concept.” Computers & Education 54(2): 588-599.

Morris, L. V., et al. (2005). “Tracking student behavior, persistence, and achievement in online courses.” The Internet and Higher Education 8(3): 221-231.

Rafaeli, S. and G. Ravid (1997). OnLine, Web Based Learning Environment for an Information Systems course: Access logs, Linearity and Performance. ISECON 1997, Orlando, FL.

Ryabov, I. (2012). “The Effect of Time Online on Grades in Online Sociology Courses.” MERLOT Journal of Online Learning and Teaching 8(1).

Whitmer, J., et al. (2012). “Analytics in Progress: Technology Use, Student Characteristics, and Student Achievement.” EDUCAUSE Review Online (July 2012).

