Dataviews for Analytics
What are dataviews?
Dataviews allow us to maintain an Elentra "datamart" useful for analysts, data pipelines, and connecting third-party business intelligence tools.
Two sets of SQL views: vs_* and vc_*.
Simple views (v_s_*) are straight representations of the underlying tables, with soft deletes factored out.
Complex views (v_c_*) join together two or more tables to create a useful analytical table, for example a table representing individual A&E evaluation form responses.
We can also, optionally, use dataviews to provide a governance layer of security and privacy as well as to maintain institutionally-developed data dictionaries by enforcing column naming conventions.
Example: We can give one group of users or set of models access to evaluation data without also giving them access to assessment data, even though the data points are read out of the same underlying tables.
Example: We can choose to refer to "Curriculum Tracks" as "Regional Campuses" or some other descriptor that better aligns with our institutional semantics.
Recommended: Setup separate "analytics" database
Create a new
elentra_analytics
database on your primary and staging database servers.
You will need to log into your database servers with a root
account that has privileges to create databases to perform this action.
Add the following configuration to config.inc.php for each environment you wish to support (e.g., config-staging.inc.php, config-production.inc.php).
If you do not specify an analytics database, the dataviews will be housed in your elentra_me
database. Best practice is to go head and setup a new database schema, as it will leave ad-hoc backups in a more portable state.
Merge the new line from core settings.inc.php into your environment(s) settings files (e.g., settings-production.inc.php):
Execute dataviews
You can add (or remove) dataviews your environment manually by running:
You can also add this to your deployment (directly after the migrations task is a good place).
SQL views need to be refreshed when the underlying tables change, so you will want to add this if you move beyond the evaluation stage.
Optional: Extend the "scrub.sh" script
The utils
directory (see below) contains a replicate.sh
script which is an extension of the Elentra scrub.sh
script used to move production data to staging or other target environments.
You will want to configure replicate.sh
and use it to replace your existing scrub.sh
.
Support for replacing DEFINER permissions to make the .sql backups portable across servers with different credentials
Support for targeting a third "reporting" environment (production>staging PLUS either production>reporting OR staging>reporting)
NOTE: a reference line is included to exclude the statistics table from the reporting run.
It is commented out, but useful if you want to speed up mysqldump/restore for testing purposes.
Working with dataviews
The
dataviews
directory lives inwww-root/core/library/Dataviews
The
local
directory is not maintained in the upstream Consortium Core repository, and is available for institutions to optionally wrap "core" views with their own column naming conventions or to develop institutional-specific views.The
utils
directory contains a number of useful scripts for generating views, grant statements, etc.
Note to query authors: The string AUTH_DATABASE
in the .sql files is substituted with the Elentra AUTH_DATABASE constant before the queries are executed.
So you can write a query like
SELECT firstname FROM AUTH_DATABASE.user_data
and it will do the right thing.
When targeting tables in the elentra_me
database no schema prefix is necessary.
So you can write a query like
SELECT event_id FROM events
and it will also do the right thing, regardless of which schema you are using to house your views.
However, CREATE VIEW
should use the target schema, and this is done for you if you use the views.sql
utility
For example:
CREATE OR REPLACE SQL SECURITY DEFINER VIEW
ANALYTICS_DATABASE.v_s_courses
Last updated