Architecture
"Let(')s audit Learning Analytics" (LaLA) consists mainly of the following classes:
LaLA class diagram exported from PHPStorm
These components, their relationships, and how they interact with the Moodle Learning Analytics (LA) system are described below.
The original Moodle model (see the Moodle LA API diagram) is re-interpreted by LaLA in two parts: the model configuration (class `model_configuration`) and the model versions that can be produced with this configuration (class `model_version`).
Upon first access to the plugin page, a LaLA model configuration is automatically created and stored in the database for each existing Moodle model. The logic of creating model configurations is currently handled by the `model_configurations` class.
The model configuration saves a loose reference to the Moodle model and copies its properties and settings: target, predictions processor, analysis interval type and indicators. If some properties are not set for the Moodle model, meaningful defaults are chosen. The currently set context IDs that limit the scope of the Moodle model are stored as the default context IDs in the model configuration.
🛠️ In a future version of LaLA, one will be able to set the scope by context ID for a model version when creating it. This will help select which data from the Moodle instance to use as training and testing data.
🛠️ LaLA currently always trains a Logistic Regression model, no matter which predictions processor is configured. In a future version, LaLA will be able to use different predictions processors.
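The copying of Moodle model settings into a model configuration, with fallbacks for unset properties, can be illustrated with a minimal sketch. It is written in Python for brevity and is not the plugin's PHP code. The dictionary keys, the class name `ModelConfiguration`, the function `snapshot` and both default values are assumptions made for illustration; the defaults LaLA actually chooses are not listed on this page.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical fallback values, not the defaults LaLA really uses.
DEFAULT_PROCESSOR = "default_processor"
DEFAULT_INTERVAL = "default_interval"


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ModelConfiguration:
    moodle_model_id: int  # loose reference to the Moodle model
    target: str
    predictions_processor: str
    analysis_interval: str
    indicators: tuple
    default_context_ids: Optional[tuple]


def snapshot(moodle_model: dict) -> ModelConfiguration:
    """Copy a Moodle model's settings, filling unset properties with defaults."""
    context_ids = moodle_model.get("contextids")
    return ModelConfiguration(
        moodle_model_id=moodle_model["id"],
        target=moodle_model["target"],
        predictions_processor=moodle_model.get("predictionsprocessor") or DEFAULT_PROCESSOR,
        analysis_interval=moodle_model.get("timesplitting") or DEFAULT_INTERVAL,
        indicators=tuple(moodle_model.get("indicators") or ()),
        # None stands for "no scope limit set on the Moodle model".
        default_context_ids=tuple(context_ids) if context_ids else None,
    )
```

Because the snapshot holds copies rather than live references, later changes to the Moodle model do not alter it.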
The model version is one possible model that is created from a model configuration. One can create multiple model versions of the same configuration, and each time the trained model will be slightly different, due to the random selection of training data from the overall data and the nature of machine learning. Among other things, the model version stores the relative test set size (by default `0.2`, as for the Moodle models), the included context IDs (or null if all contexts are in scope), and whether an error occurred when creating the model version.
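Why two versions of the same configuration differ can be seen in a small sketch of a shuffled split with a relative test set size of 0.2. This is a Python illustration, not the plugin's PHP implementation; the function name `split_training_test` and the `seed` parameter are hypothetical and exist only to make the example reproducible.

```python
import random


def split_training_test(rows, test_size=0.2, seed=None):
    """Shuffle a copy of the rows, then use the last `test_size` share as test data."""
    shuffled = list(rows)  # copy, so the caller's data keeps its order
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


training, test = split_training_test(range(10), seed=1)
```

Without a fixed seed, repeated calls will usually select different rows for training, so each trained model sees slightly different data.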
The model version creation is triggered through the secured endpoint `/admin/tool/laaudit/modelversion.php?configid=<configid>`. After accessing the endpoint, one is redirected to the new version on the index page.
The model version creation is split into multiple steps, each of which adds a piece of evidence for the model version concerned. The process makes use of object-oriented programming: the evidence types inherit directly or indirectly from an abstract class `evidence` and each implement the methods `collect(array $options)` and `store()`. So, for each step in the model version creation, an options array is constructed, then the `collect(array $options)` method of the evidence is triggered, then there might be some post-processing for anonymization, before the evidence's `store()` method is finally called. The `collect($options)` and `store()` methods are described below under "evidence". In the model version object, the evidence is stored in a multi-dimensional indexed array (`evidence[$evidencetype][$evidenceID]`).
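The collect, post-process, store sequence described above can be sketched as follows. The sketch is in Python rather than the plugin's PHP, and everything beyond the method names `collect` and `store` is an assumption: the class names, the `rows` option, the `anonymize` helper and the consecutive pseudo IDs are hypothetical stand-ins, and the real `store()` writes a file instead of setting a flag.

```python
from abc import ABC, abstractmethod


class Evidence(ABC):
    """Stand-in for the abstract `evidence` class."""

    def __init__(self):
        self.data = None
        self.stored = False

    @abstractmethod
    def collect(self, options: dict) -> None:
        """Validate the options and fill self.data."""

    def store(self) -> None:
        # The plugin writes a file on the server; this sketch only sets a flag.
        self.stored = True


class Dataset(Evidence):
    def collect(self, options: dict) -> None:
        if "rows" not in options:
            raise ValueError("missing option: rows")
        self.data = [dict(row) for row in options["rows"]]


def anonymize(rows, id_column="id"):
    """Replace original IDs in place with consecutive pseudo IDs; return the ID-map."""
    id_map = {}
    for row in rows:
        row[id_column] = id_map.setdefault(row[id_column], len(id_map) + 1)
    return id_map


def add_evidence(registry, evidence_type, evidence_id, evidence, options, anonymous=True):
    """Run one step: collect, optionally anonymize, store, then register."""
    evidence.collect(options)
    if anonymous:
        anonymize(evidence.data)
    evidence.store()
    # Mirrors evidence[$evidencetype][$evidenceID] in the model version object.
    registry.setdefault(evidence_type, {})[evidence_id] = evidence
    return evidence
```

Keeping `store()` in the base class and `collect()` in the subclasses means a new evidence type only has to define how its data is gathered.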
The model version creation proceeds through the following steps; the evidence is stored after each of them.
- First, `gather_dataset(bool $anonymous = true)` triggers the collection of the data from the Moodle platform that will be used for the model version. If necessary, an ID-map is created, and the collected data is anonymized with it.
- Then, `split_training_test_data()` triggers the splitting of the previously collected data into training and testing data sets. Note that the data is shuffled first, in order to create a random split.
- In the third step, `train()` triggers the training of a Logistic Regression model using the training data gathered before.
- Fourth, `predict()` triggers the generation of predictions of the trained Logistic Regression model for the test dataset.
- The final step is the collection of data related to the dataset gathered in step #1, using the method `gather_related_data(bool $anonymous = true)`. First, LaLA analyzes recursively which tables relate to the main table (see a more detailed explanation). Then, for each of these tables, the relevant data is collected. If necessary, ID-maps are created for each table and each table is anonymized.
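The recursive analysis of related tables in the final step can be pictured as a traversal over table references. The following Python sketch only illustrates that idea under assumptions: the function `related_tables`, the shape of the `references` map and the table names in the usage example are hypothetical, and the plugin's actual analysis is the one covered by the more detailed explanation linked above.

```python
def related_tables(main_table, references):
    """Return all tables reachable from main_table by following references recursively."""
    found = []

    def visit(table):
        for target in references.get(table, []):
            # Skip the starting table and tables already seen, so cycles terminate.
            if target != main_table and target not in found:
                found.append(target)
                visit(target)

    visit(main_table)
    return found


# Hypothetical reference map: table -> tables it points to.
references = {
    "user_enrolments": ["user", "enrol"],
    "enrol": ["course"],
    "course": ["course_categories", "enrol"],
}
tables = related_tables("user_enrolments", references)
```

Each table in the result would then be collected and, if necessary, anonymized with its own ID-map.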
Upon finishing the model version, an event `model_version_created` (in the `event` folder) logs who created a new version of which model configuration.
The model configuration is immutable, and so is the model version once it is trained. If a Moodle model is updated, a new model configuration is added with properties copied from the updated Moodle model. If a Moodle model is deleted, the model configuration continues to exist. Model versions are not affected by Moodle model updates or deletions. This is to ensure reproducible and trustworthy audits.
As described previously, model versions follow a process where, after each step, one or multiple pieces of evidence are stored. LaLA currently implements two types of evidence (`dataset` and `model`), with four sub-classes of `dataset` (`training_dataset`, `test_dataset`, `predictions_dataset`, and `related_data`) and two anonymized variations of evidence (`dataset_anonymized` and `related_data_anonymized`). The evidence collection (`collect(array $options)`) and the validation of the `$options` array are implemented differently for almost every type of evidence. Storing (`store()`) is implemented in the abstract class `evidence` and stores the collected evidence on the server. The `store()` method first serializes the collected data into a string and then creates a file from the string at the location `/evidence/modelversion-evidence.`. The location differs for related data: here, `-TABLENAME` is inserted after the original file name and before the `.`, so that it is clear what kind of related data the file contains (e.g. `user`, `course`). How the evidence's raw data is serialized is implemented by both direct children of `evidence`, as well as by `related_data`.
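The file naming rule for related data, inserting the table name before the final `.`, can be shown with a short Python sketch. The function `related_file_name` and the example file name `evidence.csv` are hypothetical; this is not the plugin's PHP code, and the real file names and extension are whatever `store()` produces.

```python
def related_file_name(file_name, table_name):
    """Insert `-<table_name>` before the final "." of a file name."""
    stem, dot, extension = file_name.rpartition(".")
    if not dot:  # no "." in the name: simply append the suffix
        return f"{file_name}-{table_name}"
    return f"{stem}-{table_name}.{extension}"


name = related_file_name("evidence.csv", "user")
```

With the hypothetical input above, `name` is `evidence-user.csv`, so related data files for different tables do not overwrite each other.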
Three new capabilities, `tool/laaudit:viewpagecontent`, `tool/laaudit:downloadevidence` and `tool/laaudit:createmodelversion`, as well as a new role `auditor` (see Security), are added in the `db` folder. The capability to serve files is defined in `lib.php`.
Additionally, the following features are implemented and can be found in LaLA's source code:
- Mustache templates (in the `templates` folder) and output renderers (in the `output` folder).
- A plugin page (`index.php`) that sets some page properties and loads the root renderer. All model configurations, versions and evidence are available on this page.
- The addition of the link to the plugin page to the admin menu under `analytics` (in `settings.php`) and for auditors on the front page (in `lib.php`).
- Translatable strings (in `/lang/en/`).
- Development branches of the plugin on GitHub additionally contain a `test` directory for PHPUnit tests. This folder is removed in release branches.
This work is funded by the Federal Ministry of Education and Research of Germany as part of the project Fair Enough? (project nr: 16DHB4002) at the University of Applied Sciences (HTW) Berlin.