Validation
This will validate the raw data using the standard library.
It will only pay attention to errors of severity
== failure
. All other errors of lesser severity are discarded.
It will save the results back in the database in theraw_data
table.
validation_done
- Boolean; has this data been validated?validation_results
- JSON - contains details of the errors, if there were any.validation_passed
- Boolean; did validation pass?Although this can be calculated by checking
validation_done
and the JSON invalidation_results
, storing this this makes it very easy to calculate statistics.
Every time a piece of raw data is updated or deleted by a publisher's RPDE feed, these variables are reset. In this way the validation results in those columns:
Are always for the latest data.
Will be recalculated every time a piece of data changes.
To run this:
$ node ./src/bin/validate-raw-data.js
It can be stopped at any time and it will not leave the database in a bad state. Only records being processed at that particular moment in time would be lost.
When restarted it will pick up where it left off.
Clear out work already done (Database storage)
To clear out all work already done, you can run the SQL:
This will leave all raw data still in the database, but will clear all data around validating that data.
To only validate a limited set of data
Stop the process early
The process can be forcibly stopped at any point and the database will not be in a bad state. It will contain most work done up to the point you stop it.
Last updated