Project Name: Project Differentials

Author

Project manager / owner

Submitted


Project group members:

Approval status: APPROVED

Project office supporter:

Approval date


Status


Start date:
End date:

 



Brief description of project, scope and aimed deliverables

Currently, only full datasets can be uploaded or downloaded. Because only a small number of donors/CBUs are actually changed, added or deleted between uploads, exchanging the full dataset wastes validation capacity, data-transfer capacity and time. The Differentials project aims to set up a method for partial data uploads that contain only the differentials between the old and the new dataset.

Please note that Differential data exchange means sending or receiving only the records that have been added, deleted or changed. This is in contrast to the term Differences, which means the (visual) indication of whether a record has been changed.
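As an illustration only, the idea of a differential can be sketched as a comparison of two snapshots. The function name `compute_differential` and the in-memory representation (a dictionary mapping a record ID to its field values) are assumptions made for this sketch; they are not part of GCD2.

```python
def compute_differential(old, new):
    """Return (added, changed, deleted) between two dataset snapshots.

    Both arguments map a record ID to a dict of field values.
    This representation is hypothetical and only serves the example.
    """
    added = {rid: rec for rid, rec in new.items() if rid not in old}
    changed = {rid: rec for rid, rec in new.items()
               if rid in old and old[rid] != rec}
    deleted = sorted(rid for rid in old if rid not in new)
    return added, changed, deleted


old = {"TD-1": {"STATUS": "AV"}, "TD-2": {"STATUS": "AV"}, "TD-3": {"STATUS": "TU"}}
new = {"TD-1": {"STATUS": "AV"}, "TD-2": {"STATUS": "RS"}, "TD-4": {"STATUS": "AV"}}

added, changed, deleted = compute_differential(old, new)
print(sorted(added), sorted(changed), deleted)
```

Only the three records in `added`, `changed` and `deleted` would need to be exchanged; the unchanged record `TD-1` is left out.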

Scope includes both UPLOAD (GCD2 ingress) and DOWNLOAD/EXPORT (GCD2 egress) of datasets.

1st stage: provide the Differential upload by uploading a Differential File. The whole process is similar to the current FULL upload.

2nd stage: a real-time API for differential updates may be implemented.

Deliverables are

  • Description of, and pros and cons for, the various methods that allow for differential UPLOADS.
  • Technical design of a possible solution.
  • Implemented solution.


Survey  info for PublicLinks and files
Final survey: https://www.mysurveygizmo.com/s3/4980562/Differential-update-survey.
Survey chart report till 2019-05-15: https://data.surveygizmo.com/r/665609_5cc3357072aa95.39138997


In the 1st stage, a differential upload consists of uploading a .gpg file that adds, updates or deletes records. The UPDATE_MODE and STATUS fields carry the information needed to perform the differential upload.

1. Organizations allowed for DIFF upload

  • Registries that upload donors with GRID. GRID is compulsory for DIFF.
  • All registries and CBBs that upload CBUs.

2. How to create the .gpg file

  • Follow the same process as in the FULL upload user guide, Search & Match Service Data Submission Information, to prepare the differential upload .gpg file.
  • The DIFF upload must also include all the fields for a DONOR/CBU, exactly as in the FULL upload.
  • Changes compared to the current FULL upload (these will later be added to the FULL upload user guide):

    Field Identifier | Required | Description | Type | Length | Comment
    UPDATE_MODE | Yes | Update mode of the inventory, i.e. FULL or DIFF | updateModeType | 4 | "FULL" means full upload; "DIFF" means differential upload
    STATUS | Yes | Status of the donor/CBU | statusType | 2 | statusType: "AV", "TU", "RS", "DE" *

    STATUS values:
    AV = Available for transplantation purposes
    TU = Temporarily unavailable
    RS = Reserved
    DE = Deleted, permanently unavailable *

    * DE is only supported in DIFF upload; it will be rejected in FULL upload.
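The allowed values of the two fields can be summarized in a small check. This is a minimal sketch; the function name `check_update_fields` and the wording of the returned problem texts are assumptions made for illustration and are not the messages produced by the actual validation.

```python
ALLOWED_UPDATE_MODES = {"FULL", "DIFF"}
ALLOWED_STATUSES = {"AV", "TU", "RS", "DE"}


def check_update_fields(update_mode, status):
    """Return a list of problems; an empty list means both fields are acceptable.

    The problem texts are illustrative, not the real validation messages.
    """
    problems = []
    if update_mode not in ALLOWED_UPDATE_MODES:
        problems.append('UPDATE_MODE must be "FULL" or "DIFF"')
    if status not in ALLOWED_STATUSES:
        problems.append("STATUS must be one of AV, TU, RS, DE")
    elif status == "DE" and update_mode == "FULL":
        # DE is only supported in DIFF upload.
        problems.append("STATUS DE is only supported in DIFF upload")
    return problems


print(check_update_fields("DIFF", "DE"))
print(check_update_fields("FULL", "DE"))
```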







2. Expected behavior


Differential update cases (file level, DONOR behavior, CBU behavior)

Note on DONOR behavior: the validation of ID will be improved after GRID becomes compulsory.

Upload frequency (file level)
  • Successive DIFF uploads must be more than 15 minutes apart.

Add new records when STATUS is "AV", "TU" or "RS"
  DONOR behavior:
  • Valid records with a non-existing ID and a non-existing GRID will be added.
  • Valid records with a non-existing ID but a duplicate GRID will be rejected.
  • Records without GRID will be rejected.
    No rejection message is reported yet; this will be implemented in the next release.
  • Invalid records will be rejected.
  CBU behavior:
  • Valid records with a non-existing ID will be added.
  • Invalid records will be rejected.

Update existing records when STATUS is "AV", "TU" or "RS"
  DONOR behavior:
  • Valid records without GRID duplication will be updated.
  • Valid records with a duplicated GRID will be rejected.
  • Invalid records will be rejected, and the existing record with the same ID and GRID will be deleted from the database.
  CBU behavior:
  • Valid records with an existing ID will be updated.
  • Invalid records will be rejected, and the existing record with the same ID and GRID will be deleted from the database.

Delete records when STATUS is "DE"
  DONOR behavior:
  • A record that exists in the database will be deleted.
  • A record that does not exist in the database will generate a warning.
  CBU behavior:
  • Records that exist in the database will be deleted.
  • Records that do not exist in the database will generate a warning.

Upload records threshold limitation (file level)
  • Fewer than 10K records for each DIFF upload.
  • A DIFF upload of more than 10K records can be requested if needed.
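The DONOR cases above can be read as a decision procedure. The following sketch is one possible reading, under stated assumptions: the database is stood in for by a dictionary mapping donor ID to GRID, `is_valid` represents the outcome of record validation, and the order in which the checks are applied is a guess. The function name and the returned labels are hypothetical.

```python
def decide_donor_action(db, rec_id, grid, status, is_valid):
    """Decide what happens to one DONOR record in a DIFF upload.

    db maps an existing donor ID to its GRID (hypothetical stand-in for
    the database). The order of the checks is an assumption.
    """
    exists = rec_id in db
    if status == "DE":
        return "delete" if exists else "warn: record with DE status does not exist"
    if not grid:
        return "reject: GRID missing"
    if not is_valid:
        # An invalid update also removes the record already in the database.
        return "reject and delete existing record" if exists else "reject: invalid record"
    grid_used_by_other = any(g == grid and i != rec_id for i, g in db.items())
    if grid_used_by_other:
        return "reject: duplicate GRID"
    return "update" if exists else "add"


db = {"TD-1": "G1", "TD-2": "G2"}
print(decide_donor_action(db, "TD-3", "G3", "AV", True))
print(decide_donor_action(db, "TD-3", "G1", "AV", True))
print(decide_donor_action(db, "TD-1", "G1", "TU", False))
```

The CBU cases follow the same shape without the GRID checks.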




3. Business validation rules

Reference number | Validation level | Validation Source | Date rule is valid | Effective since XSD version number | Field Name | Error in field | Reported Validation message | Action
9 | File | | | 2.1 | UPDATE_MODE | Invalid update mode | Your file has been rejected as the UPDATE_MODE must be equal to "FULL" or "DIFF". | Reject file
222 | File | WMDA | | 2.1 | UPDATE_MODE | Invalid update mode for multiple inventories | Your file has been rejected as we have identified mixed update modes in your XML inventories. | Reject file
 | File | WMDA | | 2.1 | | Too many records provided for DIFF upload | File could not be processed due to DIFF upload exceeded record count threshold: 10000 | Reject file
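The three file-level rules can be sketched as a single check. This is an illustration under assumptions: the function name `validate_file` and its input (one `(update_mode, record_count)` pair per inventory) are hypothetical, the record count is assumed to be summed over the whole file, and a file is assumed to be rejected only when the count is above 10000.

```python
DIFF_RECORD_LIMIT = 10000


def validate_file(inventories):
    """Return the rejection message for a file, or None if it passes.

    inventories is a list of (update_mode, record_count) tuples, one per
    inventory in the file (a hypothetical summary of the XML content).
    """
    modes = {mode for mode, _ in inventories}
    if not modes <= {"FULL", "DIFF"}:
        return ('Your file has been rejected as the UPDATE_MODE must be '
                'equal to "FULL" or "DIFF".')
    if len(modes) > 1:
        return ("Your file has been rejected as we have identified mixed "
                "update modes in your XML inventories.")
    total = sum(count for _, count in inventories)
    if modes == {"DIFF"} and total > DIFF_RECORD_LIMIT:
        return ("File could not be processed due to DIFF upload exceeded "
                f"record count threshold: {DIFF_RECORD_LIMIT}")
    return None


print(validate_file([("DIFF", 14)]))
print(validate_file([("DIFF", 6000), ("DIFF", 5000)]))
```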

4. Errors and warnings in the processing report

Report

File: ION-0999-D.gpg 2019-09-09 11:39:22
Pool(s): 0999
Content Type: D
Update Mode: DIFF
Start processing: 2019-09-09 11:50:00
Schema version: 2.1
Total records processed: 14
Total records with warnings: 6
Total records rejected: 4
Total valid records: 7
Total updated records: 4
Total new records: 4
Total deleted records: 2

List of Records with duplicated ID or GRID:

POOL: 0999
IDs:
TD-000004
GRIDs: 
1234000000000203420


W | 0999 | TD-000002 | N/A | (Warning) GRID 774800006001853603 must be 19 characters.
W | 0999 | TD-000003 | N/A | (Warning) GRID 7748000060018E53612 checksum is not correct.
W | 0999 | TD-000010 | 1234000000000001031 | STAT_END_DATE (Warning) Status end date cannot be > 5yrs in the future.
W | 0999 | TD-000010 | 1234000000000001031 | STAT_REASON (Warning) Status reason cannot be provided with status AV or RS.
R | 0999 | TD-000013 | 999900000TD00001129 | BIRTH_DATE (Record Rejected) BIRTH_DATE is a mandatory field.
W | 0999 | TD-000014 | N/A | (Warning) GRID must be 19 characters.
W | 0999 | 9999000000020000629 | (Warning) Record with DE status does not exist.


List of deleted records
D | 0999 | 9999000000000000511 
D | 0999 | 9999000000000001115

Processing finished at: 2019-09-09 11:53:55
Total processing time: 3 minutes.


The processing report for a differential upload has 4 parts.

  • The 1st part is the summary of the file upload.
  • The 2nd part lists the duplicates in the uploaded file; the data uploader should clean up these duplicates.
  • The 3rd part lists the rejections and warnings; the data uploader should resolve all of them.
  • The 4th part is the list of deleted records. It is for information only, and no action is needed.
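For uploaders who want to post-process a report, the record lines in the 3rd and 4th parts can be split on the `|` separator. The sketch below assumes the line layout seen in the sample report (a one-letter kind W, R or D, then the pool, then the remaining fields); the function name `parse_report_line` is hypothetical and the real report format may differ.

```python
from collections import Counter


def parse_report_line(line):
    """Split one record line of the report; return None for any other line."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) < 3 or parts[0] not in {"W", "R", "D"}:
        return None
    return {"kind": parts[0], "pool": parts[1], "fields": parts[2:]}


sample = [
    "Total records processed: 14",
    "W | 0999 | TD-000014 | N/A | (Warning) GRID must be 19 characters.",
    "R | 0999 | TD-000013 | 999900000TD00001129 | BIRTH_DATE (Record Rejected) BIRTH_DATE is a mandatory field.",
    "W | 0999 | 9999000000020000629 | (Warning) Record with DE status does not exist.",
    "D | 0999 | 9999000000000000511 ",
]

parsed = [rec for rec in map(parse_report_line, sample) if rec is not None]
counts = Counter(rec["kind"] for rec in parsed)
print(dict(counts))
```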

Summary numbers:

We define the following cases:

DE non exist: (Warning) Record with DE status does not exist.

    Example: W | 0999 | 9999000000020000629 | (Warning) Record with DE status does not exist.

    These are records with "DE" status that do not exist in the database. Only the GRID is displayed in the report.

GRID Missing: (Warning) GRID must be 19 characters.

    Example: W | 0999 | TD-000014 | N/A | (Warning) GRID must be 19 characters.

    GRID is compulsory for DIFF, but not yet for FULL upload. The message is therefore of Warning (W) type and not yet of Rejection (R) type. This will be improved after GRID becomes compulsory for FULL donor upload in November 2019.


Total records = Total records rejected + Total valid records + Total deleted records + DE non exist

Total valid records = Total updated records + Total new records

Total records rejected = Records with duplicated ID or GRID in part 2 + R records in part 3 + GRID Missing
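The first identity can be checked against the summary numbers of the sample report above. The values are copied from that report; the "DE non exist" count of 1 is read off its single "Record with DE status does not exist" warning. The variable names are chosen for this sketch only.

```python
# Summary numbers copied from the sample report.
summary = {"processed": 14, "rejected": 4, "valid": 7, "deleted": 2}
de_non_exist = 1  # one "(Warning) Record with DE status does not exist." line

expected_total = (summary["rejected"] + summary["valid"]
                  + summary["deleted"] + de_non_exist)
print(expected_total == summary["processed"])
```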


Record Errors/warnings:

Type | Message | Description
Record | (Warning) Record with DE status does not exist. | Reported when a record has STATUS "DE" in the differential upload file but does not exist in the database.