Preparation, Just started, In progress, In review/In staging, In Production, Closed

 

Brief description of project, scope and aimed deliverables

Currently only full datasets can be uploaded or downloaded. Since only a small number of donors/CBUs actually change or get added or deleted, this is wasteful in validation capacity, data transfer and time. The Differentials project aims to set up a method for partial data uploads that contain only the differentials between the old and the new dataset.

Please note that Differential data exchange means sending or receiving only the records that have been added, deleted or changed. This is in contrast to the term Differences, which means the (visual) indication of whether a record has been changed.

The scope includes both UPLOAD (GCD2 ingress) and DOWNLOAD/EXPORT of datasets (GCD2 egress).

1st stage: provide the Differential upload by uploading a Differential File. The whole process is similar to the current FULL upload.

2nd stage: a real-time differentials update via API may be implemented.

Deliverables are

  • Description and pros and cons of the various methods that allow for differential UPLOADS.
  • Technical design of possible solution
  • Implemented solution.
Project brief - Project Differentials

Project Name

Project Differentials

Author

Project manager / owner

Submitted

Project group members:

Approval status

APPROVED

Project office supporter:

Approval date

Status
Current step: 5

Start date:
End date:

Differential Survey

Survey info for Public | Links and files
Final survey | https://www.mysurveygizmo.com/s3/4980562/Differential-update-survey.
Survey chart report till 2019-05-15 | https://data.surveygizmo.com/r/665609_5cc3357072aa95.39138997
Differential upload user guide

In the 1st stage, a differential upload consists of uploading a .gpg file that adds, updates or deletes records. The existing UPDATE_MODE and STATUS fields carry the information needed for the differential upload.


1. Organizations allowed for DIFF upload

  • Registries that upload donors with GRID. GRID is compulsory for DIFF.
  • All registries and CBBs that upload CBUs.

      WMDA suggests that all organizations with a dataset of more than 100K records consider implementing DIFF upload to save effort and time.

2. How to create the .gpg file 

When there is more than one record 

  • Follow the same process as in the FULL upload user guide (Search & Match Service Data Submission Information) to prepare the differential upload .gpg file.
  • The DIFF upload must also include all the fields for a DONOR/CBU, exactly like the FULL upload. For a DIFF file with no records, see the guidance under "When there is no record".
  • Changes compared to the current FULL upload (these will later be added to the FULL upload user guide):

    Field Identifier | Required | Description | Type | Length | Comment
    UPDATE_MODE | Yes | Update mode of the inventory, i.e. FULL or DIFF | updateModeType | 4 | "FULL" means full upload; "DIFF" means differential upload
    STATUS | Yes | Status of the donor/CBU | statusType | 2 | statusType: "AV", "TU", "RS", "DE"*

    AV = Available for transplantation purposes
    TU = Temporarily unavailable
    RS = Reserved
    DE = Deleted, permanently unavailable*

    *DE is only supported in DIFF upload; it will be rejected in FULL upload.
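The interaction between UPDATE_MODE and STATUS described above can be checked before a file is submitted. The following is a minimal sketch, not WMDA's implementation; the function name `check_mode_and_status` and the wording of the returned problem strings are assumptions made for illustration.

```python
# Hypothetical pre-submission check; names and messages are illustrative only.
VALID_UPDATE_MODES = {"FULL", "DIFF"}
VALID_STATUSES = {"AV", "TU", "RS", "DE"}


def check_mode_and_status(update_mode: str, status: str) -> list:
    """Return a list of problems; an empty list means the pair is acceptable."""
    problems = []
    if update_mode not in VALID_UPDATE_MODES:
        problems.append(f"unknown UPDATE_MODE {update_mode!r}")
    if status not in VALID_STATUSES:
        problems.append(f"unknown STATUS {status!r}")
    elif status == "DE" and update_mode == "FULL":
        # "DE" is only supported in DIFF upload and is rejected in FULL upload.
        problems.append("STATUS 'DE' is only supported in DIFF upload")
    return problems
```

For example, `check_mode_and_status("DIFF", "DE")` returns an empty list, while the same status with "FULL" returns one problem.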







When there is no record

Accepted structure
DIFF file generation normally runs automatically, so a run may produce no records. We suggest that registries still send the file in this case. The empty file must use the structure below, which does not include the <INVENTORY> element. Otherwise data processing will generate an error and block new files from your registry from being processed.
XML example:
        

XML example for file with empty record:
<?xml version="1.0" encoding="UTF-8"?>
<INVENTORIES CREATION_TIME="2022-03-07T10:00:03Z">
</INVENTORIES>
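Since DIFF files are usually generated automatically, the empty case can be produced in code. The sketch below builds the accepted structure with Python's standard library; the function name `build_empty_diff_xml` is an assumption, and encrypting the result into a .gpg file is outside its scope.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_empty_diff_xml(creation_time: datetime) -> bytes:
    """Build an INVENTORIES document that contains no INVENTORY element."""
    utc_time = creation_time.astimezone(timezone.utc)
    root = ET.Element("INVENTORIES")
    root.set("CREATION_TIME", utc_time.strftime("%Y-%m-%dT%H:%M:%SZ"))
    # Deliberately add no <INVENTORY> child: an empty upload must omit it.
    root.text = "\n"
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)
```

Passing a timezone-aware datetime avoids any ambiguity about how CREATION_TIME is converted to UTC.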

Not accepted structure

The structure below causes an issue in WMDA data processing. We have added a ticket to support it in a future release, but it is not supported yet.

XML issue structure with empty record:
<?xml version="1.0" encoding="UTF-8"?>
<INVENTORIES CREATION_TIME="2022-07-15T07:49:26Z">
  <INVENTORY LISTING_ORGANIZATION="0999" POOL="0999" CONTENT_TYPE="C" UPDATE_MODE="DIFF" SNAPSHOT_TIME="2022-07-15T07:49:26Z" SCHEMA_VERSION="2.3"/>
</INVENTORIES>

3. Expected behavior

Upload frequency

File level: There must be more than 15 minutes between DIFF uploads. Otherwise, only one file in each 15-minute period will be processed.

Add new records when STATUS is "AV", "TU" or "RS"

DONOR Behavior (GRID is compulsory)

  • Valid records whose ID and GRID do not yet exist will be added.
  • Records with a duplicated ID in the upload file will be rejected.
  • Records with a duplicated GRID in the upload file will be rejected.
  • Records without GRID will be rejected. There is no rejection message yet; this is planned for the next release.
  • Invalid records will be rejected.

CBU Behavior

  • Valid records whose ID does not yet exist will be added.

Update existing records when STATUS is "AV", "TU" or "RS"

DONOR Behavior (GRID is compulsory)

  • Valid records with an existing GRID and no GRID duplication will be updated.
  • Records with a duplicated ID in the upload file will be rejected.
  • Records with a duplicated GRID in the upload file will be rejected.
  • Invalid records will be rejected, and the existing record with the same ID and GRID will be deleted in the database.

CBU Behavior

  • Valid records with an ID that exists in the database will be updated.

Delete records when STATUS is "DE"

DONOR Behavior (GRID is compulsory)

  • A record that exists in the database will be deleted.
  • A record that does not exist in the database will be ignored, with no warning message in the report.

CBU Behavior

  • A record that exists in the database will be deleted.
  • A record that does not exist in the database will be ignored, with no warning message in the report.

Upload records threshold limitation

Less than 200K records for each DIFF upload.

(A DIFF upload of more than 200K records can be requested if needed; please contact support@wmda.info)
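The donor rules above can be read as a small decision procedure. The following is a simplified sketch under stated assumptions, not WMDA's processing logic: records are plain dictionaries with "ID", "GRID" and "STATUS" keys, existing records are looked up by GRID only, every occurrence of a duplicated ID or GRID is rejected, and field-level validation is left out. The function name `classify_diff_records` is hypothetical.

```python
from collections import Counter


def classify_diff_records(records, existing_grids):
    """Sort donor records of one DIFF upload into outcome buckets (by ID)."""
    id_counts = Counter(r["ID"] for r in records)
    grid_counts = Counter(r["GRID"] for r in records if r.get("GRID"))
    outcome = {"added": [], "updated": [], "deleted": [], "ignored": [], "rejected": []}
    for r in records:
        grid = r.get("GRID")
        # Missing GRID, or an ID/GRID that appears more than once in the file.
        if not grid or id_counts[r["ID"]] > 1 or grid_counts[grid] > 1:
            outcome["rejected"].append(r["ID"])
        elif r["STATUS"] == "DE":
            # Deleting an unknown record is silently ignored.
            key = "deleted" if grid in existing_grids else "ignored"
            outcome[key].append(r["ID"])
        else:
            # STATUS is "AV", "TU" or "RS": update if known, otherwise add.
            key = "updated" if grid in existing_grids else "added"
            outcome[key].append(r["ID"])
    return outcome
```

Keying the lookup on GRID reflects that GRID is compulsory for donors; a CBU variant would presumably key on ID instead.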



4. Business validation rules

Reference number | Validation level | Validation Source | Date rule is valid | Effective since XSD version number | Field Name | Error in field | Reported Validation message | Action
9 | File | | | 2.1 | UPDATE_MODE | Invalid update mode | Your file has been rejected as the UPDATE_MODE must be equal to "FULL" or "DIFF". | Reject file
222 | File | WMDA | | 2.1 | UPDATE_MODE | Invalid update mode for multiple inventories | Your file has been rejected as we have identified mixed update modes in your XML inventories. | Reject file
 | File | WMDA | | 2.1 | N/A | Too many records provided for DIFF upload | File could not be processed due to DIFF upload exceeded record count threshold: 600K | Reject file
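The three file-level rules can be applied in sequence before any record is looked at. This is a minimal sketch, assuming the order of the checks, the strict "greater than" comparison and the function name `validate_file_level`; the threshold is passed in as a parameter rather than fixed, and the messages are taken from the table.

```python
from typing import Iterable, Optional


def validate_file_level(update_modes: Iterable[str], record_count: int,
                        diff_threshold: int) -> Optional[str]:
    """Return the rejection message for the file, or None if it passes."""
    modes = set(update_modes)  # one UPDATE_MODE per inventory in the file
    if not modes <= {"FULL", "DIFF"}:
        return ('Your file has been rejected as the UPDATE_MODE must be '
                'equal to "FULL" or "DIFF".')
    if len(modes) > 1:
        return ("Your file has been rejected as we have identified mixed "
                "update modes in your XML inventories.")
    if modes == {"DIFF"} and record_count > diff_threshold:
        return ("File could not be processed due to DIFF upload exceeded "
                f"record count threshold: {diff_threshold}")
    return None
```

A file with no inventories at all passes these checks, which matches the accepted empty-file structure.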

5. Errors and warnings in the processing report

Report

Example report of Differential upload:
File: ION-0999-D.gpg 2019-09-09 11:39:22
Pool(s): 0999
Content Type: D
Update Mode: DIFF
Start processing: 2019-09-09 11:50:00
Schema version: 2.1
Total records processed: 14
Total records with warnings: 6
Total records rejected: 4
Total valid records: 7
Total updated records: 4
Total new records: 4
Total deleted records: 2

List of Records with duplicated ID or GRID:

POOL: 0999
IDs:
TD-000004
GRIDs: 
1234000000000203420


W | 0999 | TD-000002 | N/A | (Warning) GRID 774800006001853603 must be 19 characters.
R | 0999 | TD-000002 | N/A | GRID (Record Rejected) GRID is a mandatory field.
W | 0999 | TD-000003 | N/A | (Warning) GRID 7748000060018E53612 checksum is not correct.
R | 0999 | TD-000003 | N/A | GRID (Record Rejected) GRID is a mandatory field.
W | 0999 | TD-000010 | 1234000000000001031 | STAT_END_DATE (Warning) Status end date cannot be > 5yrs in the future.
W | 0999 | TD-000010 | 1234000000000001031 | STAT_REASON (Warning) Status reason cannot be provided with status AV or RS.
R | 0999 | TD-000013 | 999900000TD00001129 | BIRTH_DATE (Record Rejected) BIRTH_DATE is a mandatory field.
W | 0999 | TD-000014 | N/A | (Warning) GRID must be 19 characters.
R | 0999 | TD-000014 | N/A | GRID (Record Rejected) GRID is a mandatory field.


List of deleted records
D | 0999 | 9999000000000000511 
D | 0999 | 9999000000000001115

Processing finished at: 2019-09-09 11:53:55
Total processing time: 30 minutes.


The processing report for differential upload has 4 parts.

  • The 1st part is the summary of the file upload, with more details on updated, new and deleted records.
  • The 2nd part lists the duplicates found in the uploaded file. The data uploader should clean up these duplicates.
  • The 3rd part lists the rejections and warnings. The data uploader should clean up all of them.
  • The 4th part is the list of deleted records. It is for information only, and no action is needed.
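The detail lines in the 3rd and 4th parts are pipe-separated, so they can be read by a script when cleaning up data. The sketch below is based only on the example report shown here; the `ReportLine` type and the `parse_report_line` function are hypothetical names, and real reports may contain line shapes this parser does not cover.

```python
from typing import NamedTuple, Optional


class ReportLine(NamedTuple):
    kind: str                  # "W" warning, "R" rejection, "D" deleted
    pool: str
    record_id: Optional[str]   # not present on "D" lines
    grid: Optional[str]        # None when the report shows "N/A"
    message: Optional[str]     # not present on "D" lines


def parse_report_line(line: str) -> Optional[ReportLine]:
    """Parse one W/R/D detail line; return None for any other line."""
    parts = [p.strip() for p in line.split("|", 4)]
    if parts[0] in ("W", "R") and len(parts) == 5:
        kind, pool, record_id, grid, message = parts
        return ReportLine(kind, pool, record_id,
                          None if grid == "N/A" else grid, message)
    if parts[0] == "D" and len(parts) == 3:
        return ReportLine("D", parts[1], None, parts[2], None)
    return None
```

Limiting the split to four separators keeps any "|" inside the message text intact.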

Summary numbers:

We define some cases:

DE non exist, ignored

These are records with "DE" status that do not yet exist in the WMDA database. They are ignored. Because calculating this number is resource consuming based on the test implementation, WMDA decided not to calculate it.

GRID Missing, rejected

GRID is compulsory for DIFF and FULL from Dec 17, 2019. The messages include both the Warning (W) type and the Rejection (R) type.

Example: W | 0999 | TD-000014 | N/A | (Warning) GRID must be 19 characters.

Example: R | 0999 | TD-000014 | N/A | GRID (Record Rejected) GRID is a mandatory field.

Calculation

Total records = Total records rejected + Total valid records + Total deleted records + DE non exist

Total valid records = Total updated records + Total new records + existing records with no change

Rejected records = Records with duplicated ID or GRID in part 2 + R in part 3 + GRID Missing

Total rejected = Rejected records + DE records missing required fields (not listed in the R/W details)


New record errors/warnings:

There are no new errors or warnings for DIFF upload yet.

Type | Message | Description




6. Organizations using DIFF upload

ION | Date
ION-3553 | 2020-10-30 CBU, 2021-02-17 Donor
ION-4596, which includes the following IONs: ION-1574, 5525, 6738, 7414, 9935, 9968 | 2020-10-08 and later for the rest of the IONs that 4596 is in charge of
ION-5103 | 2022-08-08
ION-8139 | 2019-11-01
ION-8486 | 2023-10-16
ION-8766 | 2023-02-15