learning with insight

Anthropomorphic Pedagogical Agents (APA)

APAs embody human traits such as analytical reasoning and insightfulness. They can interact with humans by detecting and exhibiting emotion, communicating through natural language, and enacting a pedagogical role.


Researcher – Ireti Fakinlede

❚  Introduction

Language acquisition, the process by which a human acquires the ability to recognize and produce language that other speakers can understand, remains a hotly debated topic in linguistics. Whether genetically, cognitively, or socially inspired, generic or specialized, we take as given the existence of a language acquisition device, i.e. the mix of cognitive tools, knowledge, and learning mechanisms by which human children acquire their first language. This project aims to artificially produce such a device, named GIOIA. The goal is to use this device to effectively develop pragmatic competence in conversational software agents.

❚  Further description

Language acquisition, the process by which a human acquires the ability to recognize and produce language that other speakers can understand, remains a hotly debated topic in linguistics. Nativist theorists such as Chomsky, Pinker, and Gold offer their observations of the human language acquisition process as evidence that human beings possess specialized, innate neurological support for linguistic competence (phonology, lexicon, and grammar): a prewiring to adapt to language, which Pinker calls ‘the language instinct’ and Chomsky refers to as ‘the language acquisition device’. Their views are opposed by behaviorist, constructivist, and social interaction theorists, who emphasize the role of social interaction in shaping human language acquisition. Constructivists focus on the formative years during which human learners gain linguistic competence through interaction with and mimicry of mature speakers of the language. I take the stance that the two positions are not necessarily mutually exclusive. Both favor the existence of cognitive support for language acquisition; the difference lies in whether those tools have been adapted specially for language or are the same general-purpose tools that support other human learning abilities. The literature shows that some aspects of language acquisition may be explained by innate abilities and others as a consequence of socialization. Whether genetically, cognitively, or socially inspired, generic or specialized, I take as given the existence of a language acquisition device, i.e. the mix of cognitive tools, knowledge, and learning mechanisms by which human children acquire their first language. In this work I propose to artificially produce such a device, named GIOIA. My goal is to use this device to effectively develop pragmatic competence in conversational software agents.

Big Data Analytics – Technologies, Standards

Scaling learning analytics systems to the big data level requires the ingestion of multimodal data, standardization of the data of interest, and the integration of large-scale storage and data processing technologies.

PROJECT | Gravité – Analytics Platform (2017-present)

Researcher – Jeremie Seanosky

The Gravité system is a generic learning analytics architecture and infrastructure that currently works with our coding and writing analytics tools. The idea behind Gravité is a central REST API-based hub (currently implemented in Node.js) that handles all data traffic to and from sensors (i.e. the coding and writing analytics tools). As the system's central component, the Hub receives all JSON-based data packets from the sensors and stores them in a NoSQL database (MongoDB). The Hub also serves all analytics results displayed in the dashboard.

In a typical workflow, client-side sensors such as CODEX capture learning events from the user. These events are sent through the Hub as HTTP requests and on to the NoSQL database, where all learning events are stored as JSON documents. From there, the analytics engines query the Hub for ‘unprocessed’ learning events and process them. The output of the processing engines is sent back through the Hub into the database, from which the dashboard queries the Hub for processed results and metrics.
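The workflow above can be sketched in a few lines of Node.js. This is an illustrative sketch, not the actual Gravité schema: the field names and the in-memory store stand in for the real JSON packets and MongoDB collection.

```javascript
// Illustrative sketch (not the actual Gravité schema) of a sensor event
// packet and the "unprocessed" handshake described above. All field names
// are assumptions for illustration.

// A packet as a sensor such as CODEX might emit it.
function makeEvent(sensor, learnerId, payload) {
  return {
    sensor,                               // e.g. "CODEX"
    learnerId,
    payload,                              // raw capture, e.g. a code snapshot
    capturedAt: new Date().toISOString(),
    processed: false                      // analytics engines flip this flag
  };
}

// Engine side: fetch unprocessed events, analyze them, mark them processed.
function processPending(store, analyze) {
  const pending = store.filter(e => !e.processed);
  const results = pending.map(e => ({
    learnerId: e.learnerId,
    metric: analyze(e.payload)
  }));
  pending.forEach(e => { e.processed = true; });
  return results;
}
```

Once an event is marked processed, a second engine pass skips it, which is what keeps sensors, engines, and dashboard decoupled behind the Hub.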

Gravité also includes custom encryption to secure data in transit. The encryption layer is generic, so it works with any sensor we attach. We designed Gravité to be modular: adding new sensors is seamless, and so is extending the processing side.

PROJECT | OpenACRE – Caliper and Open Analytics Collaborative Research Environment (2015-2016)

Researchers – David Boulanger, Jeremie Seanosky, Rahim Virani

❚  Introduction

The Caliper project aims to develop solutions that capture learning events in a standardized and interoperable manner. Caliper events arising from a variety of sensors attached to numerous learning tools used by learners can be stored in OpenACRE. Analytics solutions can then be applied on sanitized data to extract, analyze, or generate insights.

❚  Impact and Opportunity

Extend the current implementation from a traditional structured data warehouse to one that can also support semi-structured, big-data-scale inputs.

Extend the framework so that history and datasets are retained, paving the way for continuous improvement for researchers and students alike.

Allow for the integration of third-party data sets where applicable, such as those from textbook vendors and MOOCs.

Support integrated learning tools such as MI-Writer by providing storage of learner data and engine support for real-time or near-real-time feedback.

❚  Why Caliper?

The research team evaluated several other learning analytics frameworks based on the available documentation and elected to use Caliper, primarily because its modularity and ontology-based approach make it the most suitable framework to extend. Please read on for further information.


Researchers – David Boulanger, Jeremie Seanosky, Colin Pinnell, Jason Bell

❚  Introduction

LAMBDA is the learning analytics operating system that provides generic and customizable functionalities to sense data, shape the data into experiences, analyze learning experiences, measure the impact of learning activities, and offer reflection/regulation opportunities to improve learning experiences.

LAMBDA is the name of the overall learning analytics system developed by Athabasca University’s research group under Professor Vive Kumar.

Much has been written on big data and learning analytics individually, but an important gap lies between the two areas. Our research proposes a generic, competence-based learning analytics framework called LAMBDA as a potential candidate for bringing learning analytics to big data scale. Precursor systems to LAMBDA have already been trialed in programming education and in the energy industry, and LAMBDA is currently being applied in a Java programming course at Anna University, Chennai, India. For more information about some of LAMBDA's tools, please view: Codex, SCALE, MI-DASH, and SCRL. The universal applicability of the LAMBDA framework to any learning domain, together with its ability to recognize learning artifacts that evidence higher-level, problem-solving-oriented competences, makes it an ideal vehicle for reaching big data. LAMBDA's researchers and developers are currently expanding its volume, velocity, and variety dimensions to consolidate its standing in the big data realm.

Learning analytics can be readily exemplified using the analogy of the gold-mining process, where raw gold-laden minerals are extracted, transported to the refinery, and then processed into fine, highly-priced jewels and decorations and/or precious gold bars.

LAMBDA includes three general parts that are implemented in different ways.

❚  Part 1 – Sensing Technologies

We develop client-side technologies that sense learning activities of interest in different areas of learning.

Examples of those client-side sensor-based technologies are CODEX, MI-Writer, ART, SCRL, etc.

These sensors provide the raw data essential to our overall learning analytics endeavor. Raw data are meaningless until they are processed and analyzed.

The raw learning data can be compared to the raw gold nuggets or gold-laden minerals extracted from different mines. The gold can be extracted in many different ways and may be in many different forms, but without this input of raw gold, no market-ready gold is available.

The sensors are also responsible for transporting the raw data to the refinery (the processing engine), and they must always ensure a uniform data format that the processor can handle.

❚  Part 2 – Processing & Analysis Engine

The processor is responsible for making sense out of the raw data it received from the sensors. It takes each data packet and subjects it to several types of analyses depending on the desired outcome or what we want to understand from it. Then the results of those analyses are stored and made available for use.

Using our gold-mining analogy, different refining processes are used for different types of gold in different forms to achieve different end products. The process used to refine gold into 24-carat jewels is different from the one used in making gold bars used in banks.

Depending on the desired end product, we carefully choose and customize the refining (analysis) process in order to achieve exactly what’s needed.

❚  Part 3 – Visualization & Reporting

The last part of the LAMBDA system is about providing feedback (reporting) to the student and/or teachers about the student’s performance, as assessed by LAMBDA.

The analysis results are meaningful but raw. We need a medium to convey that valuable information to students so they can benefit from the analysis performed on their work.

The most important goal of learning analytics is, and should be, to analyze how learners are learning and to tell them how they are faring in various areas. Knowing their strengths and weaknesses, learners can work on them throughout their learning sessions and improve, resulting in better grades and a deeper knowledge level.

Some students believe they are weak in every subject and understand nothing, while others overestimate their performance. When the final examination comes, both groups may be surprised or disappointed by the result. That is where LAMBDA comes into play: it tells students the truth about their performance in real time so they can work on the proper facets as they surface and, as a result, become more confident and proficient in the subjects they study.

Likewise, after the refining process, gold is in its purest form, but it is not of much use as a crude block. It is then up to the goldsmith to fashion the gold into fine jewelry, bank gold bars, electronic components, plating, and so on. In brief, LAMBDA is a learning analytics system still in its early stages, and we are working actively to make it fully usable and beneficial to learners in general.

PROJECT | P-PSO – Parallel Particle Swarm Optimization

Researcher: Kannan Govindarajan

P-PSO is a parallel particle swarm optimization algorithm for clustering and classifying the large volumes of data associated with learning analytics.
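For context, the core update rule that P-PSO parallelizes can be sketched sequentially. This is the textbook particle swarm step, not the project's actual implementation; the inertia and acceleration coefficients are illustrative defaults.

```javascript
// Sequential sketch of the canonical particle swarm update step.
// Each particle holds a position, a velocity, and its personal best;
// gBest is the best position found by the whole swarm.
function psoStep(particles, gBest, opts = {}) {
  const { w = 0.7, c1 = 1.5, c2 = 1.5 } = opts;  // illustrative coefficients
  for (const p of particles) {
    for (let d = 0; d < p.pos.length; d++) {
      p.vel[d] = w * p.vel[d]
        + c1 * Math.random() * (p.best[d] - p.pos[d])   // pull toward personal best
        + c2 * Math.random() * (gBest[d] - p.pos[d]);   // pull toward global best
      p.pos[d] += p.vel[d];
    }
  }
  return particles;
}
```

A parallel variant would evaluate fitness and apply this update to partitions of the swarm concurrently, synchronizing only on the shared global best.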

Healthcare Analytics

Healthcare analytics uses education data to automate the detection of mental health disorders such as ADHD and to support their remediation through educational rehabilitation.

PROJECT | MHADS – Mental Health Analysis and Diagnostic Service

Researcher: Diane Mitchnick

The Mental Health Analysis and Diagnostic Service (MHADS) is a web-based tool that helps patients and medical practitioners understand the process of diagnosing adult ADHD. It measures signs and symptoms that indicate the disorder. MHADS takes information from various sources and, through a series of questions, exam results, and performance test scores, offers a more reliable diagnosis of the disorder. It also aims to detect the potential for ADHD based on learners' writing competences.

PROJECT | Synthetic Biology Analytics

Researcher: Bertrand Sodjahin

This research aims at developing educational and training tools for synthetic biology in genome sequence assembly, biochemical pathways, and gene expression analysis.

Coding Analytics

Coding analytics captures, assesses, and provides timely feedback to learners on their coding processes, testing, and debugging habits with the goal of fostering high-quality, professional programming skills demanded in the industry.

PROJECT | CODEX 2.0 – Coding Experiences

Researchers: Jeremie Seanosky, David Boulanger, Rebecca Guillot

Description coming soon…

PROJECT | CODEX 1.0 – Coding Experiences

Researchers: Jeremie Seanosky, David Boulanger, Rebecca Guillot

CODEX, which stands for CODing EXperiences, encompasses sensors that capture coding experiences of students, measures coding competences and confidence of students, and offers analytics-based solutions to improve them.

PROJECT | MI-DASH – Mixed-Initiative Dashboard

Researchers: Jeremie Seanosky, David Boulanger, Rebecca Guillot

MI-DASH is a web-based reflection dashboard for students, teachers, administrators, the general public, politicians, parents, and loved ones. One can visit MI-DASH to view the system's depiction of individual or group learner performance. MI-DASH also enables users to interact with the information and seek additional insights. Interestingly, the dashboard itself may propose study initiatives, with reasoning, that might be of interest.


MI-DASH displays the results of the analyses performed by SCALE on data from CODEX and other learning activity sensors.

Students visit MI-DASH to view the system’s evaluation of their performance with information about the different areas that they need to improve.

MI-DASH is intended to be a generic dashboard comprising sub-dashboards for each of the different learning areas of interest that we evaluate in the LAMBDA system.

In other words, we currently use MI-DASH in conjunction with CODEX, but as our learning analytics software expands toward other learning areas such as math, science, writing, reading, music, and industry operations training, MI-DASH will provide a centralized application where every learner can go to see their progress in any particular area of learning.

Our vision for MI-DASH is to become an interactive, dynamic tool in which learners customize the views they get of their learning progress. MI-DASH will also feature an embedded chat where students can communicate with teachers and/or peers for guidance or answers to questions. As we develop new learning analytics software, MI-DASH will become the learner's companion, available wherever they go: on mobile devices as well as laptops and desktop computers, so students can access it easily from anywhere.


Researchers: Jason Bell, Colin Pinnell, Moushir El-Bishouty

JFlapEx is a system that analyzes learner experiences in understanding formal languages and solving problems in formal languages.

Competence Analytics

Competence analytics measures the factual and procedural knowledge of learners, as well as their skills, independently of the instructional materials at their disposal during the learning process. It also estimates the transfer effect of learning one concept on another, which helps fill knowledge gaps in a new domain.

PROJECT | SCALE – Smart Competence Analytics on Learning

Researchers: David Boulanger, Jeremie Seanosky

❚  Introduction

SCALE is a smart analytics technology that transforms learning traces into standardized measurements of competences.

❚  Further description

SCALE is a smart competence analytics technology that analyzes your learning experiences in different learning areas. SCALE basically transforms your learning traces into measurements that will help you assess how proficient you are in the concepts introduced in your course. SCALE will also allow you to evaluate how confident you are at solving a particular exercise and how confident you are in the overall learning domain. SCALE’s mission is to provide you with a scale that will help you measure and optimize your learning as it occurs.

SCALE has been redesigned to be completely independent of the client-side CODEX, and vice versa. Previously, the client-side sensor communicated directly with the server-side SCALE processor to transfer data instances, but this exposed the system to denial-of-service attacks and loss of service for other clients.

It was also impossible for the SCALE engine to handle the tremendous volume of data packets flowing continuously toward the server.

We, therefore, added one more layer between CODEX (client-side) and SCALE (server-side). This new layer is a NoSQL OrientDB database called the Transit Database.

In sequence, CODEX continuously sends the data instances it captures at fixed intervals (e.g. every 30 seconds) to the TransitDB via a Socket server. The Socket server handles large quantities of data far more easily than the HTTP request approach used previously.

Data received by the Socket server from CODEX are relayed to the TransitDB to be stored and accumulated. This is the ONLY connection between CODEX and SCALE, removing any direct dependency between the two.

In simpler terms, CODEX does not have any direct link to SCALE, and vice versa.

SCALE, on its side, operates autonomously and independently. Based on its internal design, SCALE continuously looks into the TransitDB to see if there are data available from CODEX. If so, SCALE takes one CODEX data packet at a time, processes and analyzes it, and then marks that data packet as “processed” so the SCALE engine won’t process it again.

Upon completing the processing and analysis on a given data packet, SCALE stores the analysis results in a MySQL database, ready to be used by the visualization and reporting tools available, such as MI-DASH.
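The poll-process-mark cycle described above can be sketched in a few lines. This is an in-memory stand-in: `transitDb` and `resultsDb` here are plain arrays representing the actual OrientDB and MySQL stores, and the `status` field name is an assumption.

```javascript
// Sketch of SCALE's cycle over the Transit Database: take one pending
// packet, analyze it, store the result, and mark the packet processed
// so it is never re-analyzed.
function scaleCycle(transitDb, resultsDb, analyze) {
  const packet = transitDb.find(p => p.status !== 'processed');
  if (!packet) return false;          // nothing pending from CODEX
  resultsDb.push(analyze(packet));    // store the analysis result
  packet.status = 'processed';        // ensure it is never re-analyzed
  return true;
}
```

Because SCALE only ever reads from the TransitDB, CODEX can keep streaming packets at full speed without waiting on, or even knowing about, the analysis engine.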

Industry Training Analytics

Industry training analytics assesses operators' skill development during learning-by-doing activities within immersive environments (virtual and augmented reality), in both standard and emergency settings.

PROJECT | STAGE – Smart Training with Analytics and Graphics Environment

Researchers: Jeremie Seanosky, Rebecca Guillot, Isabelle Guillot, Claudia Guillot

Description coming soon…

PROJECT | ART – Augmented Reality Training/Testing

Researcher: Rebecca Guillot

ART stands for Augmented Reality Training/Testing. The main goal of this project is to create a dynamic environment for reality-oriented training and testing. ART aims to help teachers and supervisors understand the weaknesses of their students or trainees in competence and confidence development, and provides practical solutions to improve both.

PROJECT | PeT – Procedure e-training Tool

Researchers: David Boulanger, Jeremie Seanosky, Rebecca Guillot

PeT is a Procedure e-Training tool empowering organizations to train and re-certify their operators in their standard and emergency operating procedures.

PeT is a Procedure e-Training tool empowering organizations in the energy industry to train and re-certify their operators in standard and emergency operating procedures. PeT tracks the knowledge and behavior of operators through closely monitored multiple-choice questionnaires. Managers, trainers, and operators can visualize performance in a learning analytics dashboard and see how proficient an operator is at every step of every operating procedure. Additionally, PeT tracks the behavior of an operator in an emergency setting to ensure the operator not only has the knowledge to perform the right actions in the proper sequence but can also perform those critical actions within the required timeline. In emergency situations, operators often have only a few minutes to intervene, with no possibility of consulting any resource. The time constraint, together with the high risk of serious consequences, can hinder the operator's ability to intervene properly. The rapidity and correctness of the operator's actions can make the difference between a safe outcome and expensive material damage, productivity loss, injury, or even death within the organization.

PeT is built upon a six-factor confidence model. The model tracks the operator's behavior by recording: the time the operator takes to answer a question; the number of times the answer was revised; the number of distinct answers selected; the reaction time, i.e. the time before giving a first answer; the total number of selections made in answering the question; and the correctness of the final answer. Together, these factors describe the operator's degree of hesitation and pinpoint any area in which further training is needed.
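The six factors can be computed from a per-question selection log. The log format below (one entry per selection, with a timestamp and the chosen answer) is an assumption for illustration; the text only specifies which factors PeT records, not how they are stored.

```javascript
// Sketch of PeT's six behavioural factors, derived from a hypothetical
// per-question selection log. log: [{ t: 1200, answer: 'A' }, ...],
// where t is milliseconds since the question was shown.
function questionFactors(log, finalCorrect) {
  const answers = log.map(e => e.answer);
  return {
    totalTime: log[log.length - 1].t,   // time to the final answer
    reactionTime: log[0].t,             // time before the first answer
    revisions: answers.filter((a, i) => i > 0 && a !== answers[i - 1]).length,
    distinctAnswers: new Set(answers).size,
    selections: log.length,             // total number of selections made
    finalCorrect                        // correctness of the final answer
  };
}
```

For example, an operator who answers ‘A’ at 1.2 s, switches to ‘B’ at 3 s, and confirms ‘B’ at 4 s shows one revision, two distinct answers, and three selections: a mild hesitation signature even if the final answer is correct.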

The data collected from these training and evaluation sessions will give organizations the ability to manage and optimize their knowledge assets and human capital.

Instructional Design Analytics

Instructional design analytics evaluates the quality of learning resources and course designs, measures their effect on student performance, and provides recommendations on how to improve the learning experience.

PROJECT | MI-IDEM – Mixed-Initiative Instructional Design Evaluation Model

Researcher: Lino Forner

❚  Introduction

The goal of the MI-IDEM project is to design a model to evaluate the effectiveness of instructional designs of courses. Using Bayesian Belief Networks, MI-IDEM assesses the quality of course design based on learning analytics data collected from the student’s learning environment, through surveys, from the instructors, and other means.

❚  Further description

Overall, the goal of the MI-IDEM project is to design a model to evaluate the effectiveness of instructional designs, such as courses. We are researching methods to assess the quality of course design based on learning analytics data collected from the student's learning environment through surveys and/or learning data collection mechanisms (sensors). Using Bayesian Belief Networks (BBN), we apply probabilistic analysis to estimate the quality of an instructional design, especially in the setting of online education.
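As a toy illustration of the kind of inference a BBN performs (this is not MI-IDEM's actual network, and all probabilities are invented), consider a two-node network relating design quality to observed student scores, updated by Bayes' rule:

```javascript
// Toy two-node Bayesian update: belief that a course design is good,
// after observing high student scores. Inputs: prior P(good design),
// P(high scores | good design), P(high scores | poor design).
// All probability values used with this function are invented examples.
function posteriorGoodDesign(prior, pHighGivenGood, pHighGivenPoor) {
  const pHigh = pHighGivenGood * prior + pHighGivenPoor * (1 - prior);
  return (pHighGivenGood * prior) / pHigh;   // Bayes' rule
}
```

With an even prior of 0.5 and assumed likelihoods of 0.8 and 0.3, observing high scores raises the belief in a good design to about 0.73; a full network chains many such conditional tables over survey, sensor, and instructor evidence.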

PROJECT | Curricular Analytics

Researcher: Geetha Paulmani

❚  Introduction

Curricular learning analytics explores relations between the efficiency of student learning and curricular elements such as topic coverage, prerequisite relations among topics, learning outcomes, instructor effectiveness, student workload, students' capacity, cultural constraints, socio-economic-political influences, and so on. We look for insights and evidence in learner interactions with respect to these curricular elements that allow us to measure the degree of achievement of curricular outcomes.

❚  Further description

While we have demos for individual pieces, should we also build an overarching demo that exemplifies the vision of learning analytics at the curricular level? Can we identify learning analytics opportunities in each course of the BSc CIS program? Having done that, can we imagine the flow of data from a variety of sensors as students take these courses? How much data would that be? What types of data? What kinds of support could we offer students and teachers, and how would we measure their effectiveness? Can we imagine and simulate, with synthetic data, a curriculum-encompassing interactive dashboard used by a) a student, b) a teacher, c) a parent, d) a university administrator, and e) a politician?

Math Analytics

Knowledge Space Theory captures the dependencies among a set of related math concepts, forming a knowledge space within which the student's learning state is continuously recorded and his/her overall learning path tracked. Through continuous adaptive formative assessment, gaps in the student's comprehension are readily detected and remedied. This is crucial to help students build their confidence.
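The core idea can be sketched directly: prerequisite relations among concepts determine, for any learning state, which concepts the student is ready to learn next (the "outer fringe"). The concept names and dependency chain below are invented for illustration.

```javascript
// Sketch of Knowledge Space Theory's prerequisite structure over a few
// invented math concepts. A learner's state is the set of concepts
// already mastered; readyToLearn returns the concepts whose
// prerequisites are all satisfied but which are not yet mastered.
const prereqs = {
  counting: [],
  addition: ['counting'],
  multiplication: ['addition'],
  fractions: ['multiplication']   // hypothetical dependency chain
};

function readyToLearn(state) {
  return Object.keys(prereqs).filter(c =>
    !state.includes(c) && prereqs[c].every(p => state.includes(p))
  );
}
```

An adaptive assessment then probes exactly these fringe concepts, so each question either extends the state or reveals a gap, instead of testing material the student cannot yet reach.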

PROJECT | MATHEX – Mathematics Experiences

Researcher: Rebecca Guillot

❚  Introduction

The goal of this project is to study students’ physical, affective, problem-solving, and metacognitive behavior as they work on math exercises. MATHEX also focuses on finding ways to strengthen the weaker points of the students, create opportunities for motivation, and continuously assess their progress.

❚  Further description

Mathematics is a key subject in every student's education. Yet how many students are failing in math, and why? How can we prevent math failure? How can we captivate the curiosity of reluctant students, spark interest in those who have an aversion to the subject, and build self-confidence in students caught in negative thinking? These questions drive this research project. A central part of the work lies in identifying the specific needs of each student as an individual. To do so, MATHEX is building software that captures every aspect necessary to understand the student's behavior. The results give tutors and teachers a clearer view of their students' particular weaknesses. Detecting the weaknesses is only the beginning: every weakness has a root somewhere, and it is crucial that these roots be clearly identified so the work targets the root rather than the symptom. MATHEX also focuses on finding ways and technologies to strengthen students' weakest points, create motivation by revealing their strong points, and constantly evaluate their progress through appropriate adaptive steps, with the constant goal of leading every child toward success. MATHEX is a system under development for K-12 students. Some of the goals of MATHEX research are:

  1. Formally capture subject knowledge competencies in targeted K-12 and higher education mathematics.
  2. Standardize the recording of math experiences in a continuous manner.
  3. Offer initiatives-based guidance for students to commit to mathematics at levels ‘comfortable’ to them.
  4. Explore the effectiveness of modes of expression (handwritten, typed, mouse-based) and means of content interaction (formally guided, social, and collaborative).
  5. Test the efficiency of competency development patterns and math study skills development patterns in contexts of self-regulated and co-regulated learning.

Music Analytics

Assessing music skills, whether singing, playing instruments, or understanding the theoretical concepts of music, is a challenging but exciting task. Technologies such as computer vision and digital audio processing now make it possible to capture valuable data during the learning process, measure learners' engagement and music skills, and provide timely feedback to musicians in training and their teachers.

PROJECT | MUSIX – Music Experiences

Researchers: Claudia Guillot, Rebecca Guillot

Description coming soon…

Observational Study Approach

The Internet of Things improves the precision with which learners can be non-intrusively observed, increasing the potential of observational studies to approximate randomized block designs and estimate causal effects without the ethical concerns and recruitment limitations of randomized experiments.

Regulation Analytics

Self-regulated learning analytics assesses the student's ability to set goals, craft a plan to reach them, stick to the plan while working toward each goal, and monitor progress so as to adapt the goal and its learning path as needed. It also provides students with feedback to guide them toward desirable self-regulatory traits.

PROJECT | Regulate/Manage

Researcher: Rebecca Guillot

Description coming soon…

PROJECT | SCRL – Self-/Co-Regulated Learning

Researchers: Colin B. Pinnell, Jason Bell, Moushir M. El-Bishouty, Lanqin Zheng

❚  Introduction

SCRL aims at measuring and improving students’ self-regulation and co-regulation abilities.

❚  Introductory Note

SCRL proactively engages learners in self-regulated and co-regulated activities based on Winne and Hadwin's information-processing perspective, in which regulation is experienced as an iterative process of defining tasks, setting goals, planning to achieve them, enacting tactics/strategies, and adapting. SCRL can be customized to observe, promote, and measure other regulation models, such as Zimmerman's.
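The iterative cycle can be encoded minimally as a phase sequence that loops. The phase names below paraphrase the model as described here, and the helper is purely illustrative of how a tool might track where a learner sits in the loop.

```javascript
// Minimal encoding of the iterative regulation cycle: after adaptation,
// the learner returns to task definition with an updated understanding.
const PHASES = ['taskDefinition', 'goalSettingAndPlanning', 'enactment', 'adaptation'];

function nextPhase(current) {
  const i = PHASES.indexOf(current);
  return PHASES[(i + 1) % PHASES.length];   // adaptation loops back to task definition
}
```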

❚  Overview

by Moushir M. El-Bishouty

SCRL aims at supporting students' self-regulation and co-regulation in order to foster their computational competency. The SCRL tool implements Winne and Hadwin's model of self-regulated learning, which is influenced by information-processing theory (IPT). In this model, learning occurs in four basic phases: task definition; goal setting and planning; studying tactics; and adaptations to metacognition. Moreover, the tool monitors learners' competency by analyzing performance and interaction data captured by a set of software sensors that measure competency development. The current phase of the project focuses on two application domains: programming and writing competencies. The sensors capture students' interactions with learning platforms such as a learning management system (LMS) and an integrated development environment (IDE). The captured data form the raw input that SCRL uses for monitoring, measuring, and evaluating students' regulated learning and competency development.

❚  SCRL’s goals are:

  1. to engage learners in creating SRL/CRL-specific activities,
  2. to trace learners’ interactions related to those SRL/CRL-specific activities,
  3. to analyse/measure the degree of regulation exhibited in these activities,
  4. to infer the relation between enacted SRL/CRL activities and learning performance,
  5. to model SRL/CRL traits in a distributed causal network to observe the evolution of SRL/CRL traits in each learner.

❚  Development Journal

SCRL is a large, ambitious project, and as such we are developing it in several stages. This gives us two major benefits: firstly, it allows us to develop in chunks. Given that we’re a smaller team, we need to be able to break things apart into smaller pieces in order to get meaningful work done.

More importantly, though, many of our ideas are untested and need verification. There's no sense implementing a large system if parts of it don't hold up in the field! Developing in stages lets us test individual parts of our hypotheses before joining them into the full system.

❚  Specifically, SCRL can engage and measure the following based entirely on students' interactions with the software/software agents:

  1. Task Perception: SCRL engages the self-regulation model at the Task Perception level by providing an easily understood view of participant competencies and activities. It actively engages the participant by calling attention to deficient areas, suggesting initiatives and alerting students to focus areas through the activation of triggers.
  2. Goal Setting / Planning: SCRL engages Goal Setting by suggesting default initiatives for all participants within a field, as well as suggesting initiatives to address problem areas or areas of interest. SCRL aids in planning paths to those goals through the initiative design process, which makes explicit the steps that the participant may take towards their goal.
  3. Enacting: SCRL engages the enactment of initiatives by monitoring participant behaviour in relation to the competencies addressed by their initiatives. Through this monitoring, SCRL can identify issues in enactment and keep a record of all enactment details available to it.
  4. Adaptation: SCRL engages the adaptation of initiatives through enactment monitoring over time. By keeping a record of all changes within the domain of a monitored competency, SCRL can identify enactment problems and suggest changes to some or all steps of an initiative plan. Such changes can then be effected in the initiative design interface.

❚  SCRL 1.0 – Deployed, Tested, In Service

SCRL 1.0 was designed to serve two goals – to prove the concept that a self-regulation tool could present an analytics team with useful data, and to provide us with a repository of practical data regarding student regulatory behaviours. It was developed over the course of the summer and saw a great deal of change over that time. We developed the database and UI, going through three major versions of the UI and two major versions of the database before settling on the final package – a Java client for students and a corresponding server and MySQL database. Study development occurred in late fall, with deployment near Christmas.

As of this writing, SCRL 1.0 is encountering a few hiccups as the students in our study begin using the tool, and we’re working to smooth them out so that further studies don’t have the same issues. We’ve got a database that’s growing quickly and are looking forward to expanding to further groups. Planning is going well for this, and we hope to have a solid plan within a week.

❚  SCRL 2.0 – Conceptual Design

We plan on making a thorough examination of the data collected from SCRL 1.0 before deploying 2.0, but our design and development team still has a lot of work to do to fulfill our wish-list of features.

The major difference between SCRL 1.0 and 2.0 will be the inclusion of embedded analytics – we intend to examine student behaviours both within SCRL itself and while doing online reading and lecture-viewing. This will involve a completely new information pipeline that will capture and translate micro-scale system events, package them up as incomplete learning events, and then match them with competency assessment from sensor tools. Sensor tool data will also be polled at this time, bundling them along with the matching system events where possible.
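The event-matching idea above might be sketched as follows: micro-scale system events are held as incomplete learning events until a competency assessment for the same learner arrives within a time window, at which point they are bundled together. The event structure and the 60-second window are illustrative assumptions, not the actual SCRL 2.0 pipeline.

```python
# Illustrative sketch of matching micro-scale system events with sensor-tool
# competency assessments; all names and the window size are assumptions.

WINDOW = 60.0  # seconds: events this far before an assessment are bundled with it

def match_events(system_events, assessments, window=WINDOW):
    """Pair each assessment with the system events that precede it within the window."""
    bundles = []
    for a in assessments:
        matched = [e for e in system_events
                   if e["learner"] == a["learner"]
                   and 0 <= a["time"] - e["time"] <= window]
        bundles.append({"assessment": a, "events": matched})
    return bundles

events = [{"learner": "s1", "time": 10.0, "kind": "page_view"},
          {"learner": "s1", "time": 95.0, "kind": "video_pause"}]
assessments = [{"learner": "s1", "time": 40.0, "competency": "loops", "score": 0.7}]
bundles = match_events(events, assessments)
# Only the page_view at t=10 falls inside the 60 s window before the assessment.
```

In a production pipeline, unmatched events would stay buffered rather than be discarded, but the window-based join captures the core of the idea.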

Another feature we’re working towards is embedded learning resources within SCRL. Educators will be able to embed learning resources in the tool for students, and students will be able to promote external content into the tool’s libraries for others to see. A voting system will allow learners to grade the usefulness and enjoyability of these resources to create a hierarchy of topical courses. These resources will of course be covered by the behaviour-capture methods described above, providing another domain for SCRL to understand student self-reflection and engagement.

We have other hopes as well, but we consider these two to be sufficient for launch of SCRL 2.0.

❚  SCRL 3.0 and Beyond

Once we have a usable platform that can capture student work and assess self-regulation, we should have a library of sensor tools available – Java programming, English-language writing, finite state machines, and some mathematics are a few examples we expect to have completed by then. SCRL 3.0 and beyond will focus on creating a stable API for the development of new sensors, as well as extending the Caliper and Tin Can/Experience (xAPI) APIs to carry regulation and motivation state information. We’ll also be working to bind SCRL with solid sentiment analytics, allowing deeper inspection of learners’ motivational states.

Of course, we don’t want to plan too deeply into the future – we’ve already got enough on our plates, and we need to see the results from SCRL 1.0 and 2.0 before we can extend our grasp!

❚  Conclusion

SCRL is right now a fancy data-collection and monitoring system – it’s sort of like a diary for students, to help them organize their learning and understand their motivations a little more explicitly. According to the Self-Regulation model, that on its own improves learning effectiveness. We plan on going much, much further than that, however. This is only the first few steps.

PROJECT | SDLeX – Self-Directed Learning environments

Researcher: Stella Lee

SDLeX aims to analyze learner interactions in Self-Directed Learning environments and capture holistic learner experiences to provide opportunities for reflection and regulation. It captures learners’ perceptions, physical responses, motivations, emotions, and social reactions that emerge from interacting with a learning environment, in order to offer a positive learner experience and to measure that experience.

Research Analytics

Research analytics aims at promoting and facilitating the sharing and integration of scientific study results, with the purpose of discovering new and finer-grained insights in education to optimize the learning and teaching processes and the environments in which they occur. The ultimate goal is to prevent actionable insights from lying dormant for decades, and instead to maximize the research benefits to the education community.

PROJECT | RPA – Research Publications Analytics

Researcher: Jeremie Seanosky

Research Publications Analytics provides an interface for researchers to add their publications to a database, including a Google Scholar input mechanism. RPA analyses each publication entry for missing elements. It also provides a report generation mechanism that includes indexed values for the quality of the publication avenue.

PROJECT | xDesign – Experiment Design simulated environment

Researcher: Moushir El-Bishouty

This research offers a simulation environment for learners to design an experiment and analyze the effect of a study using simulated data and research processes. The simulation environment includes interactions on human research ethics, data validation, and statistical analysis.

Sentiment Analytics

Sentiment analytics is the detection of the learner’s emotion during the learning process and the identification of its cause through computer vision, audio, neurological, physiological, and text data.


Researcher: Steven Harris

In addition to textual analysis, we also look to develop a multiple-media sentiment analysis engine, called MeMoo, where other media information such as facial expression and physiology trackers can be used to augment sentiment observations. The current sentiment analysis work focuses on developing and testing a variety of natural language classifiers that recognize and identify students who are potentially frustrated or confused in the online learning environment, based on their interactions with the online learning system, discussion forums, and course materials.
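To make the flagging idea concrete, here is a deliberately simple keyword-based scorer for forum posts that may signal frustration. The lexicon and threshold are invented for illustration; the classifiers under development are statistical NLP models rather than keyword lists.

```python
# Toy frustration flagger: scores a post by the fraction of cue phrases it
# contains. Cues and threshold are illustrative assumptions only.

FRUSTRATION_CUES = {"stuck", "confused", "frustrated", "give up", "don't understand"}

def frustration_score(post: str) -> float:
    """Fraction of cue phrases present in the post (0.0 to 1.0)."""
    text = post.lower()
    hits = sum(1 for cue in FRUSTRATION_CUES if cue in text)
    return hits / len(FRUSTRATION_CUES)

def flag(post: str, threshold: float = 0.2) -> bool:
    """True if the post crosses the (assumed) frustration threshold."""
    return frustration_score(post) >= threshold

flagged = flag("I'm stuck on question 3 and totally confused.")
```

A statistical classifier would replace the fixed lexicon with features learned from labelled posts, but the input/output contract – text in, frustration signal out – stays the same.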

Secondary areas of related research include using similar algorithms to identify areas of weak learning content, based on identifying areas where students regularly seem to have problems; and ultimately even identifying student learning styles, and suggesting additional materials that may aid in their success with the course.

Sport Analytics

Sport analytics consists of designing sportswear that tracks players’ movements as well as developing sensors in sports equipment that capture players’ skills and velocity. The analyzed data provide real-time, concise feedback that helps teams optimize their performance and assists referees/umpires in decision-making, with careful attention not to interfere with the sport itself.

PROJECT | BOOTT – Badminton Officials Online Testing and Training

Researchers: Jeremie Seanosky, Rebecca Guillot

Description coming soon…

PROJECT | Slapshot

Researchers: Colin Pinnell

This research aims at inferring play skills, tactics, and strategies from observational data from sensors, video, SME, and reflection/regulation activities of players.

Traffic Analytics

Traffic analytics leverages techniques developed in educational data mining and learning analytics, innovatively applying them to better assess and improve traffic conditions through smart technologies.

PROJECT | TADA – Traffic Analytics with Data Accretion

Researcher: Liao Ming

❚  Introduction

TADA stands for Traffic Analytics with Data Accretion, a brand-new tool/technique that allows contextualization of sensor data from physical sensors (e.g., GPS, vehicle sensors, traffic sensors) and from personal observations using physiology sensors.

❚  Further description

  • Contemporary traffic lights cause uneven flow of traffic in urban centers (e.g., the city of Edmonton). In the year 2000, traffic congestion cost the USA economy 3.6 billion vehicle-hours of delay and US$67.5 billion in lost productivity. In addition to the human impact, this is a major impact on the environment: 21.6 billion liters of wasted fuel.
  • Contemporary mathematical and computational models are good at predicting the flow density of traffic; however, they are impractical in the context of automating the flow of traffic.
  • We propose a novel technology that integrates mathematical and causal traffic models with the goal of optimizing the flow of traffic in real time. The technology will be able to use historical data as well as real-time data.
  • The proposed technology can be embedded in each traffic light controller along a major artery in an urban center. The traffic lights will communicate and cooperate with each other, in a semi-automatic fashion, with the goal of smoother flow of traffic, hence saving the liquid gold.
  • Requirements
    • A pilot city partner (e.g., The City of Edmonton)
    • A technology manufacturing partner firm to reconfigure the controllers of a few traffic lights along a major roadway (e.g., Gateway Boulevard in Edmonton); the firm will develop embedded controllers that dynamically synchronise/regulate the timing of the lights depending on traffic data
    • A technology video analysis partner firm to estimate the number of vehicles crossing a traffic light
    • A technology data analysis firm to compute wireless data
    • Funding to engage graduate researchers in this research and to develop software & hardware technologies by the industry

Traffic is a major factor in all walks of life, consuming space, time, and energy on planetary scales. Literature reports a number of studies, systems, and policies that identify the impact of traffic on the quality of life and the environment in many dimensions, such as urban traffic control, traffic of hazardous material, traffic pollution, and traffic and the human psyche. The GDP typically devoted to transportation by developed countries is between 5 and 12 percent [Hazelton, 2010].

Traffic modelling is a key research area in urban planning and environmental sciences. In 2000, road traffic congestion in the USA alone caused 3.6 billion vehicle-hours of delay, 21.6 billion liters of wasted fuel, and US$67.5 billion in lost productivity. Yearly estimates of the economic, health, and environmental cost of traffic congestion in New Zealand are in excess of NZ$1 billion [Hazelton, 2010]. Understandably, these statistics were derived almost exclusively from urban traffic data.

Traditionally, traffic modelling has concentrated on simulating traffic behaviour. The science of traffic analysis, modelling, and optimization aims to estimate traffic load, to detect and prevent traffic congestion, and to optimize the flow of traffic. Optimization of traffic flow not only reduces drivers’ stress levels, but also reduces air pollution [Angleno, 1999] and controls fuel consumption with respect to the environment and the economy. This proposal directly addresses the latter – to optimize vehicular gasoline consumption in urban centers by regulating the flow of traffic using smart traffic lights.

Classical traffic models are mostly based on the treatment of vehicles on the road, their statistical distribution, or their density and average velocity as functions of space and time. Most models, employing techniques ranging from cellular automata, particle-hopping, car-following, and gas-kinetics through to fluid dynamics, present a passive approach to traffic optimization. That is, traffic data is collated a priori and the models are validated post hoc. In a compelling argument for the need to move beyond manual adjustments to traffic signals, Thorpe [1997] showed, using limited simulation models, that the best traffic signal performance could be achieved using Reinforcement Learning.

Modern traffic control is carried out in three incrementally informed methods. The first method, the least informed of all, simply employs humans at traffic junctions to manually regulate traffic signals. The second method employs traffic signal lights with static states1, where the states are fine-tuned manually based on information obtained from abstract traffic models [Huang et al., 2005; Wei et al., 2005]. For example, Thorpe [1997] reports that the re-timing of a major artery in Denver, CO, USA, from 90 seconds to 100 seconds in the heavy-flow direction yielded an 87% reduction in the number of times vehicles stopped at lights. The third method employs traffic lights that respond to real-time data obtained from devices such as road loops, video cameras, and other traffic detectors [Olsson, 1996].

In contemporary models, traffic situations are represented by statistical or mathematical abstractions, and traffic control is exerted by methods that utilize information gleaned from these abstractions. These methods employ sparse traffic data and/or abstract models of traffic dynamics. Approaching from a different angle, this proposal focuses primarily on loosely modelling the causality of traffic. The causal model then drives the state changes in traffic control. Such a causal model approaches a fully-informed solution; that is, the more we know in real time about vehicles on the road, the smoother the flow of traffic and the better the gasoline usage. Data is obtained from every vehicle that contributes to traffic, and this data is used to contextualise traffic situations. Hence, the proposed method will be accurate enough to capture the exact nature of undesirable traffic outcomes (e.g., traffic jams, longer wait periods, higher numbers of stops) as well as to model the causality of these undesirable outcomes. It is also possible to direct the traffic to enact a desirable outcome, such as clearing a pathway for an ambulance or a VIP’s convoy. Further, the causal model is updated in real time, and hence the state changes are real-time responses to the dynamics of the causal model. In essence, we propose to develop a probabilistic model for traffic signal control to optimize traffic flow in real time.

We propose to use the Dynamic Bayesian Belief Network (DBBN) technique to model and simulate urban ground traffic behaviour and to show how the DBBN optimizes traffic flow in real time by controlling the states of traffic signals. GPS and GIS technologies enable real-time access to traffic data such as a vehicle’s location, speed, and direction [Haupt, 1999]. This real-time data, in combination with data specific to road segments, is sufficient to model the entities affecting the flow of traffic in a DBBN.

Each traffic signal in the road network is associated with its own instance of the DBBN. Such islands of traffic signals, connected through road segments, form the topology of the DBBN. An island accepts traffic flow data as its observation model (effect) and the current states of the traffic signal controller as its transition model (cause), and computes the probabilities of traffic signal states for optimal flow of traffic.

Adjacent islands coordinate to optimize the flow of traffic across road segments; thus one would be able to dynamically route traffic to minimize the waiting time for a designated vehicle (such as an ambulance) or for a group of vehicles between two traffic signals.
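As a toy illustration of the kind of Bayesian update a single island would perform, the sketch below maintains a belief that an approach is congested and revises it from queue-length observations. The likelihood values are illustrative assumptions, not calibrated traffic statistics, and a full DBBN would add transition dynamics and inter-island coordination on top of this single inference step.

```python
# Toy Bayesian update for one signal "island": belief that the approach is
# congested, revised from a binary long-queue observation. The likelihoods
# below are illustrative assumptions, not calibrated traffic statistics.

P_LONGQ_GIVEN_CONGESTED = 0.9   # P(long queue | congested)
P_LONGQ_GIVEN_FREE = 0.2        # P(long queue | free-flowing)

def update_belief(prior_congested: float, long_queue_observed: bool) -> float:
    """One Bayes step: posterior P(congested | observation)."""
    if long_queue_observed:
        l_c, l_f = P_LONGQ_GIVEN_CONGESTED, P_LONGQ_GIVEN_FREE
    else:
        l_c, l_f = 1 - P_LONGQ_GIVEN_CONGESTED, 1 - P_LONGQ_GIVEN_FREE
    num = l_c * prior_congested
    return num / (num + l_f * (1 - prior_congested))

belief = 0.5  # uninformed prior
for obs in (True, True, False):
    belief = update_belief(belief, obs)
# After two long-queue observations and one short one, the belief remains
# high; a controller could use it to extend the green phase on that approach.
```

In the proposed system, such posteriors over signal states (rather than over a single congestion flag) would be computed continuously from vehicle and road-segment data.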

The proposed Bayesian technique has no more intrinsic ability to represent causation than a traditional mathematical model, except that it is more explicit and is directly manipulable by end users (e.g., the traffic control section of the City of Edmonton). Also, Bayesian models use the same amount of information as is available to contemporary mathematical and statistical models; hence model updates and validation metrics are quite comparable across all models, if not the same.

In line with the goals of contemporary traffic control models, the proposed Bayesian traffic signal control has the:

  1. ability to model the causal elements of traffic situations
  2. ability to control traffic signals using probabilistic inferences
  3. ability to regulate traffic for vehicle-specific situations (e.g., ambulances, fire engines, VIP vehicle)
  4. ability to predict undesirable traffic situations and propose timely alternatives

❚  References

M.Hazelton, “Statistical Methods in Transportation Research”, PowerPoint Presentation to Statistical Society of Australia Inc, WA Branch, Perth, 2005.

E.Angleino, “Traffic Induced Air Pollution in Milan City: a Modelling Study”, Urban Transport and the Environment for the 21st Century, Rhodos, September 1999.

T.Haupt, “Planning and Analyzing Transit Networks: An Integrated Approach Regarding Requirements of Passengers and Operators”, the 2nd GIS in Transit Conference, Tampa, Florida, 1999.

T.L.Thorpe, “Vehicle Traffic Light Control Using SARSA”, Masters Thesis, Department of Computer Science, Colorado State University, April 1997.

Huang YS, Chung TH, Chen CT, “Modeling traffic signal control systems using timed colour Petri nets”, International conference on Systems, Man, and Cybernetics, Vol 1-4, pp: 1759-1764, 2005.

Wei J, Wang A, Du N, “Study of self-organizing control of traffic signals in an urban network based on cellular automata”, IEEE Transactions on Vehicular Technology, 54 (2), pp 744-748, 2005.

1 Traffic states refer to all possible combinations of light changes (red, green, amber), pedestrian controls (walk or no walk), turn signals (left, right, and so on), and other control mechanisms.

Writing Analytics

Assisting a student during the writing process is a colossal task that requires combining the forces of state-of-the-art natural language processing and deep learning techniques. Predicting essay final scores and rubric scores is just one of many ways of providing formative feedback to students to help them reinforce their writing skills.
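As a minimal, non-deep-learning illustration of score prediction from surface features, the sketch below combines three hand-picked features with invented weights, as if they had been fit on scored essays. The actual work relies on NLP and deep-learning models rather than a fixed linear formula.

```python
# Toy essay-score predictor: a linear combination of surface features.
# Features and weights are invented for illustration only.

def essay_features(text: str) -> dict:
    """Compute simple surface features of an essay."""
    words = text.split()
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    return {
        "word_count": len(words),
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "vocab_richness": len(set(w.lower() for w in words)) / max(len(words), 1),
    }

# Hypothetical weights, standing in for coefficients fit on scored essays.
WEIGHTS = {"word_count": 0.002, "avg_sentence_len": 0.05, "vocab_richness": 2.0}

def predict_score(text: str) -> float:
    """Weighted sum of features, a stand-in for a trained scoring model."""
    feats = essay_features(text)
    return sum(WEIGHTS[k] * v for k, v in feats.items())
```

Rubric-level scoring would run one such model per rubric dimension over much richer (syntactic, semantic, discourse) features, but the pipeline shape is the same: text in, feature vector, predicted score out.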


Researchers: David Boulanger, Jeremie Seanosky, Rebecca Guillot

Description coming soon…


Researcher: Clayton Clemens

❚  Introduction

MI-Writer aims at learning analytics for English composition. Any client application can send text data to MI-Writer for natural language processing using a variety of APIs. This data, gathered at highly granular levels, provides intricate and detailed information on how students learn to compose and how their writing competencies evolve.

❚  Further description

MI-Writer is a project aimed at learning analytics for English composition. It is designed using a software-as-a-service architecture that handles processing and user authentication on the server side and allows nearly any application to plug into it. Any client application can send text data to MI-Writer for natural language processing using a variety of established techniques. This data, when gathered at very granular levels can provide intricate and detailed information on how students learn composition. It is hypothesized that, by gathering this writing data and quantifying several aspects of it, we can begin an empirical investigation into how writing competencies develop.
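A client interaction with such a service might look like the following sketch, which assembles the JSON body a client would POST for processing. The endpoint and field names are placeholders, not the actual MI-Writer API.

```python
import json

# Placeholder endpoint; the real MI-Writer service URL and schema may differ.
MIWRITER_ENDPOINT = "https://example.org/mi-writer/api/analyze"

def build_request(user_token: str, text: str, snapshot_id: int) -> dict:
    """Assemble a hypothetical JSON body for one text-analysis request."""
    return {
        "auth": user_token,       # user authentication is handled server-side
        "snapshot": snapshot_id,  # fine-grained revision of the draft
        "text": text,
    }

body = json.dumps(build_request("token-123", "My first draft.", 1))
```

Because the server handles processing and authentication, a plug-in for an editor or LMS only needs to build and send such requests at each granular revision of a student’s text.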

Future Potential Analytics Projects

Interview Mastery Analytics

Skills needed to succeed in an interview include ‘mastery of the subject’, ‘comprehension of the question’, ‘quick thinking’, ‘body language’, ‘pacing of answers’, and ‘dress code’. This system simulates an online interview environment, conducts different types of simulated interviews, observes the candidate’s responses, offers feedback, and provides reflection/regulation opportunities to improve interviewee skills.

Listening Analytics (ListenEx)

ListenEx stands for Listening Experiences. ListenEx offers an environment where learners can listen to utterances and respond to follow-up questions to measure their level of listening comprehension in terms of response time and response quality.

Reading Analytics (ReadEx)

The reading analytics tool ReadEx tracks the ability of the learner to read and understand the content, the speed with which reading comprehension happens, and the relation between reading comprehension and working memory.

Speaking Analytics (SpeakEx)

SpeakEx is a speaking analytics tool that parses translated text material arising from speech utterances and presents a scaffolded dashboard for feedback, reflection, and regulation opportunities to learners. The types of feedback include grammar-based feedback, feedback on the pacing of the speech, feedback on the breaks in speech, and identification of misspoken words.