XML file format
1. Introduction
The overall scope of BMDW development phase two is to receive more data from our listing organisations and to make these data available through our Search & Match Service. However, the old format (DOT20) is not appropriate for a large number of different fields, so we had to move to another file format. The new file format is XML (Extensible Markup Language), which is considered an industry standard that is extendable, robust and easy to use.
Several people from the community formed a working group to create the required XML Schema Definition (XSD) files. These files define the elements that are allowed in the XML file, the order of the elements and the values that will be accepted. The names of the elements are based upon EMDIS specifications and align with the EMDIS Data Dictionary when appropriate. Several elements are basic elements that should be included in all files, but there are also elements that are specific to donors only or to cord blood units (CBUs) only.
We will now explain the composition of the XML file and how you should use the XSD reference files.
2. XSD schema files
We provide two XSD schema files that define the structure of your XML file: basicTypes.xsd and Inventories.xsd.
The Inventories.xsd file describes the structure of the XML file and the order of the elements. It also shows whether a field is mandatory (minOccurs="0" means not mandatory). This file includes many "complexType" definitions: a complexType is an XML element that contains other elements and/or attributes. The allowed values of an element are either defined directly in this file, as for the elements GRID and ID, or through a "type" given after the name of the field. For example, the element with name BIRTH_DATE has type="bareDateType". The definition of "bareDateType" is given in the basicTypes.xsd file.
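To illustrate how minOccurs and type can be read from a schema file, here is a minimal Python sketch. The XSD fragment and the type name exampleItemType are hypothetical and written only for this example; they are not copied from Inventories.xsd, so the real file may be organised differently.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical fragment in the style of an XSD file (assumption, not the real schema).
XSD_FRAGMENT = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="exampleItemType">
    <xs:sequence>
      <xs:element name="ID" type="xs:string"/>
      <xs:element name="GRID" type="xs:string" minOccurs="0"/>
      <xs:element name="BIRTH_DATE" type="bareDateType"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def list_elements(xsd_text):
    """Return (name, type, mandatory) for every xs:element in the schema text."""
    root = ET.fromstring(xsd_text)
    rows = []
    for element in root.iter(f"{{{XS}}}element"):
        # In XML Schema minOccurs defaults to 1, so a missing attribute means mandatory.
        mandatory = element.get("minOccurs", "1") != "0"
        rows.append((element.get("name"), element.get("type"), mandatory))
    return rows


rows = list_elements(XSD_FRAGMENT)
for name, type_name, mandatory in rows:
    print(name, type_name, "mandatory" if mandatory else "optional")
```

Running the same function over the real Inventories.xsd text would give a quick overview of which fields are optional, provided the file declares its elements in this style.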
We will now describe the global structure of the XML file and the elements.
2.1 InventoryType elements
Field Identifier | Required | Description | Type | Length | Comment |
---|---|---|---|---|---|
CREATION_TIME | Yes | Creation time stamp of the inventories (in UTC) | dateTime | minimal 20 | Without fractional seconds the length is 20, for example: 2016-08-23T13:16:48Z. Additional notes: CREATION_TIME is defined as "Creation time stamp of the <INVENTORIES>" that means the time in UTC when the complete and valid file was finally created at the registry. This can be the same as SNAPSHOT_TIME. |
LISTING_ORGANIZATION | Yes | Organisation that lists the donor/CBU provided as ION | ionType: number between 1000 and 9999 | 4 | Issuing Organisation Number (ION) allocated by ICBBA. This can be different from the POOL when another organisation is sending the data to BMDW. |
POOL | Yes | Physical location of the donors/CBUs of the inventory provided as ION | ionType: number between 1000 and 9999 | 4 | Physical location of the donors/CBUs of the inventory provided as ION. |
CONTENT_TYPE | Yes | Type of the inventory items, i.e. donor ("D") or CBU ("C") | contentTypeType | 1 | The content-type is also shown in the fileName. When CONTENT_TYPE is "D", the INVENTORY must contain <DONOR>-blocks. When CONTENT_TYPE is "C", the INVENTORY must contain <CBU>-blocks. |
UPDATE_MODE | Yes | Update mode of the inventory, i.e. FULL or DIFF | updateModeType | 4 | Only UPDATE_MODE "FULL" is currently supported. The complete inventory should always be sent. |
SNAPSHOT_TIME | No | Timestamp of the 'data snapshot' (in UTC) | dateTime | minimal 20 | Without fractional seconds the length is 20, for example: 2016-08-23T13:16:48Z. Additional notes: SNAPSHOT_TIME in the element <INVENTORY> is defined as "timestamp of the data snapshot in UTC", which means the timestamp of the creation of this part of the complete file. This can be the timestamp of the XML export, and in most cases it will be identical to the CREATION_TIME. |
SCHEMA_VERSION | Yes | Version of the applied XML Schema Definition (XSD) | schemaVersionType | | The schema version is very important as this determines the validation rules that should be applied during the processing of your file. |
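The timestamp and ION rules in the table above can be checked before a file is written. The following Python sketch shows one possible way; the function names utc_timestamp and is_valid_ion are hypothetical helpers introduced for this example and are not part of any BMDW tooling.

```python
import re
from datetime import datetime, timezone


def utc_timestamp(moment=None):
    """Format a moment as a UTC dateTime without fractional seconds."""
    moment = moment or datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def is_valid_ion(value):
    """ionType: a number between 1000 and 9999, written as 4 digits."""
    return re.fullmatch(r"[1-9][0-9]{3}", value) is not None


# Reproduce the example value from the table.
stamp = utc_timestamp(datetime(2016, 8, 23, 13, 16, 48, tzinfo=timezone.utc))
print(stamp, len(stamp))
print(is_valid_ion("1234"), is_valid_ion("0999"), is_valid_ion("10000"))
```

Calling utc_timestamp() without an argument returns the current time in the same format, which could be used for CREATION_TIME or SNAPSHOT_TIME.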
2.2 ItemBaseType elements (for Donors and CBUs)
Field Identifier | Required | Description | Type | Length | Comment |
---|---|---|---|---|---|
ID | Yes | Unique identifier of the donor/CBU | String | 17 | Unique identifier of the donor/CBU: The value comprises the EMDIS hub code + donor identification allocated by the associated donor registry, where the sending organisation is an EMDIS member, otherwise the two digit ISO country code of the associated donor registry + donor identification allocated by the associated donor registry. For example: AU600196166, DEGOE-35487, US087013165, SB45 |
GRID | No | Global registration identifier of the donor/CBU | String | 19 | |
ATTR | No | Describing attribute of the donor/CBU according to house rules of the sending organization. | String | 3 | |
BIRTH_DATE | Yes | Date of birth of the donor/CBU | bareDateType | 10 | Date without timezone information, example 1968-06-28, Date Delimiter = "-" |
SEX | No | Biological gender of the donor/CBU | sexType | 1 | sexType: "F","M" NOTE: Mandatory for donors, optional for CBUs |
ABO | No | Blood group (ABO) of the donor/CBU | aboType | 2 | aboType: "A","B","O","AB" |
RHESUS | No | Rhesus (Rh) factor of the donor/CBU | rhesusType | 1 | rhesusType: "P","N" NOTE: "+" and "-" are not supported |
ETHN | No | Ethnic group of the donor/CBU | ethnType | 4 | ethnType: "AFNA","AFSS", "ASSW", "ASSO", "ASCE", "ASSE", "ASNE", "ASOC", "CAEU", |
CCR5 | No | CCR5 status of the donor/CBU | ccr5Type | 2 | ccr5Type: "DD","WW","DW" |
HLA | Yes | HLA of the donor/CBU | hlaType | | Explained separately at hlaType 2.3 |
KIR | No | KIR genotype of the donor/CBU | kirType | | Explained separately at kirType 2.4 |
IDM | No | Infectious disease markers (IDM) and other relevant tests of the donor/CBU | idmType | | Explained separately at idmType 2.5 |
RSV_PAT | No | Unique identifier of the patient the donor/CBU is reserved for (STATUS=RS). | String | 17 | The value comprises the EMDIS patient identification, where the patient search centre is an EMDIS member, otherwise the value is empty. For example: AU9654021, DE275342, US2277450. NOTE: This field is not required for status "RS" and can be transmitted as empty if privacy concerns exist. |
STATUS | Yes | Status of the donor/CBU | statusType | 2 | statusType: "AV","TU","RS" ("DE" is not supported yet, "RE" not valid for CBUs) |
STAT_END_DATE | No | Date until which the current status will be applicable | bareDateType | 10 | Date without timezone information, example 1968-06-28, Date Delimiter = "-" |
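As an illustration of the value lists in the table above, the Python sketch below checks a few enumerated fields and the two date fields of a single item. It is a partial check under stated assumptions: the sample <DONOR> block is shortened and invented for this example, the helper names are hypothetical, and the XSD files remain the authoritative source for validation.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

# Allowed values taken from the table above.
ALLOWED = {
    "SEX": {"F", "M"},
    "ABO": {"A", "B", "O", "AB"},
    "RHESUS": {"P", "N"},
    "CCR5": {"DD", "WW", "DW"},
    "STATUS": {"AV", "TU", "RS"},
}
DATE_FIELDS = {"BIRTH_DATE", "STAT_END_DATE"}


def is_bare_date(text):
    """True for a real calendar date written as YYYY-MM-DD without timezone."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def check_item(item):
    """Return a list of problems found in one <DONOR> or <CBU> element."""
    problems = []
    for child in item:
        value = (child.text or "").strip()
        if child.tag in ALLOWED and value not in ALLOWED[child.tag]:
            problems.append(f"{child.tag}: value {value!r} is not allowed")
        elif child.tag in DATE_FIELDS and not is_bare_date(value):
            problems.append(f"{child.tag}: {value!r} is not a valid date")
    return problems


# Shortened, invented sample (assumption): HLA and other fields are left out.
DONOR_XML = """<DONOR>
  <ID>DEGOE-35487</ID>
  <BIRTH_DATE>1968-06-28</BIRTH_DATE>
  <SEX>F</SEX>
  <RHESUS>+</RHESUS>
  <STATUS>AV</STATUS>
  <STAT_END_DATE>2017-13-01</STAT_END_DATE>
</DONOR>"""

problems = check_item(ET.fromstring(DONOR_XML))
for problem in problems:
    print(problem)
```

The sample deliberately uses "+" for RHESUS and an impossible month, so both are reported.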
2.3 hlaType elements
hlaType fields can be divided into hlaSerFieldsType and hlaDnaFieldsType.
hlaSerFieldsType: HLA values obtained by serological typing methods
hlaSerFieldsType = “<FIELD1>” string of max length 5 “</FIELD1>”, “<FIELD2>” string of max length 5 “</FIELD2>”;
Example: <SER><FIELD1>1</FIELD1><FIELD2>5</FIELD2></SER>
Serological typing results can be given for loci that are defined as hlaLocusType. These loci include HLA-A, -B, -C, -DRB1, -DQB1.
hlaDnaFieldsType: HLA values obtained by DNA based typing methods
hlaDnaFieldsType = “<FIELD1>” string of max length 20 “</FIELD1>”, “<FIELD2>” string of max length 20 “</FIELD2>”;
Example: <DNA><FIELD1>01:01</FIELD1><FIELD2>05:01</FIELD2></DNA>
DNA typing results can be given for loci that are defined as hlaLocusType and hlaLocusDnaOnlyType. These loci include HLA-A, -B, -C, -DRB1, -DQB1, -DRB3, -DRB4, -DRB5, -DQA1, -DPA1, -DPB1.
Finally, '01:XX' is equivalent to '01'. Both codes '01:XX' and '01' are allowed.
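If you compare values in your own tooling, the equivalence of '01:XX' and '01' can be handled with a small normalisation step. This is only a sketch: the function name is hypothetical, and whether BMDW applies such a normalisation internally is not stated here.

```python
def normalise_xx(code):
    """Map a code ending in ':XX' to its first field; leave other codes unchanged."""
    return code[:-3] if code.endswith(":XX") else code


print(normalise_xx("01:XX"), normalise_xx("01"), normalise_xx("01:01"))
```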
Minimal required elements
Minimal typing values for Donor: A (either SER or DNA), B (either SER or DNA)
Minimal typing values for CBU: A (either SER or DNA), B (either SER or DNA), DRB1 (either SER or DNA)
NOTES: - It is no longer possible to submit string HLA values; only single values are allowed.
- When a donor or CBU has homozygous alleles/values, please use the following notation:
<HLA><A><SER><FIELD1>1</FIELD1><FIELD2 /></SER></A> ...
or
<DQB1><DNA><FIELD1>05:02:01G</FIELD1><FIELD2 /></DNA></DQB1>
Field Identifier | Required | Description | Type | Length | Comment |
---|---|---|---|---|---|
SER | depends on content type and DNA fields provided | HLA values obtained by serological typing methods | hlaSerFieldsType | 5 | Each SER element contains two other elements: FIELD1 and FIELD2 |
DNA | depends on content type and SER fields provided | HLA values obtained by DNA based typing methods | hlaDnaFieldsType | 20 | Each DNA element contains two other elements: FIELD1 and FIELD2 |
FIELD1 | | HLA value of allele 1 | | 5 or 20 | Element within the elements SER and DNA |
FIELD2 | | HLA value of allele 2 | | 5 or 20 | Element within the elements SER and DNA |
A | Yes | HLA-A values | hlaLocusType | | Both SER and DNA possible; either SER or DNA values required |
B | Yes | HLA-B values | hlaLocusType | | Both SER and DNA possible; either SER or DNA values required |
C | No | HLA-C values | hlaLocusType | | Both SER and DNA possible |
DRB1 | Yes (CBU) No (Donor) | HLA-DRB1 values | hlaLocusType | | Both SER and DNA possible; either SER or DNA values required for CBU |
DRB3 | No | HLA-DRB3 values | hlaLocusDnaOnlyType | | Only DNA possible |
DRB4 | No | HLA-DRB4 values | hlaLocusDnaOnlyType | | Only DNA possible |
DRB5 | No | HLA-DRB5 values | hlaLocusDnaOnlyType | | Only DNA possible |
DQA1 | No | HLA-DQA1 values | hlaLocusDnaOnlyType | | Only DNA possible |
DQB1 | No | HLA-DQB1 values | hlaLocusType | | Both SER and DNA possible |
DPA1 | No | HLA-DPA1 values | hlaLocusDnaOnlyType | | Only DNA possible |
DPB1 | No | HLA-DPB1 values | hlaLocusDnaOnlyType | | Only DNA possible |
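The minimal typing requirements (A and B for donors; A, B and DRB1 for CBUs) can be checked with a short script. The Python sketch below assumes that a locus counts as typed when FIELD1 under SER or DNA is non-empty; that interpretation, the helper names and the sample values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Required loci per CONTENT_TYPE: "D" = donor, "C" = CBU.
REQUIRED_LOCI = {"D": ("A", "B"), "C": ("A", "B", "DRB1")}


def has_typing(locus):
    """True if the locus has a non-empty FIELD1 under SER or DNA (assumption)."""
    for method in ("SER", "DNA"):
        field1 = locus.find(f"{method}/FIELD1")
        if field1 is not None and (field1.text or "").strip():
            return True
    return False


def missing_loci(hla, content_type):
    """Return the required loci that have no SER or DNA value."""
    missing = []
    for name in REQUIRED_LOCI[content_type]:
        locus = hla.find(name)
        if locus is None or not has_typing(locus):
            missing.append(name)
    return missing


# Invented sample: homozygous A (empty FIELD2) and a DNA-typed B.
HLA_XML = """<HLA>
  <A><SER><FIELD1>1</FIELD1><FIELD2 /></SER></A>
  <B><DNA><FIELD1>07:02</FIELD1><FIELD2>08:01</FIELD2></DNA></B>
</HLA>"""

hla = ET.fromstring(HLA_XML)
print(missing_loci(hla, "D"))
print(missing_loci(hla, "C"))
```

The same HLA block satisfies the donor requirement but would be incomplete for a CBU, because DRB1 is absent.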
2.4 kirType elements
The kirType field definitions consist of the type kirLocusType. This is defined as a string with 3 characters: "POS" or "NEG". "POS" means "Presence of KIR gene", "NEG" means "Absence of KIR gene".
The following elements are possible, in this specific order:
<KIR2DL1>,<KIR2DL2>,<KIR2DL3>,<KIR2DL4>,<KIR2DL5A>,<KIR2DL5B>,<KIR2DS1>,<KIR2DS2>,<KIR2DS3>,<KIR2DS4>,<KIR2DS5>,<KIR2DP1>,<KIR3DL1>,<KIR3DL2>,<KIR3DL3>,<KIR3DS1>,<KIR3DP1>.
There is another field called <KIR_GL> (URI that refers to a GL-string registered with a GL-service or direct GL-string for absence / presence). This field is not used at the moment and must be empty.
Field Identifier | Required | Description | Type | Length | Comment |
---|---|---|---|---|---|
KIR gene e.g. KIR2DL1 | No | KIR genotype e.g. KIR gene 2DL1 | kirLocusType | 3 | valid values: "POS" = presence of KIR gene; "NEG" = absence of KIR gene |
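The order and value rules for the KIR elements can be checked as in the following Python sketch. The function name and the sample block are hypothetical. Because the position of <KIR_GL> relative to the other elements is not given above, the sketch only checks that it is empty.

```python
import xml.etree.ElementTree as ET

# Element order as listed above.
KIR_ORDER = [
    "KIR2DL1", "KIR2DL2", "KIR2DL3", "KIR2DL4", "KIR2DL5A", "KIR2DL5B",
    "KIR2DS1", "KIR2DS2", "KIR2DS3", "KIR2DS4", "KIR2DS5", "KIR2DP1",
    "KIR3DL1", "KIR3DL2", "KIR3DL3", "KIR3DS1", "KIR3DP1",
]
KIR_VALUES = {"POS", "NEG"}


def check_kir(kir):
    """Return a list of problems found in one <KIR> element."""
    problems = []
    genes = [child for child in kir if child.tag in KIR_ORDER]
    tags = [child.tag for child in genes]
    if tags != sorted(tags, key=KIR_ORDER.index):
        problems.append("KIR elements are not in the documented order")
    for child in genes:
        if (child.text or "").strip() not in KIR_VALUES:
            problems.append(f"{child.tag}: expected POS or NEG")
    kir_gl = kir.find("KIR_GL")
    if kir_gl is not None and (kir_gl.text or "").strip():
        problems.append("KIR_GL must be empty")
    return problems


# Invented sample with two deliberate mistakes: wrong order and an invalid value.
KIR_XML = """<KIR>
  <KIR2DL2>POS</KIR2DL2>
  <KIR2DL1>NEG</KIR2DL1>
  <KIR3DL1>YES</KIR3DL1>
</KIR>"""

kir_problems = check_kir(ET.fromstring(KIR_XML))
for problem in kir_problems:
    print(problem)
```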
2.5 idmType elements
Many infectious disease markers (IDM) are possible in the element IDM. Most IDM elements take values of either idmValueType or idmValueExtType.
idmValueType includes the following values: "P","N"
idmValueExtType includes the following values: “P”,“G”,“M”,“B”,“H”,“O”,“N”
Field Identifier | Required | Description | Type | Length | Comment |
---|---|---|---|---|---|
CMV | No | CMV status | idmValueExtType | 1 | idmValueExtType: “P”,“G”,“M”,“B”,“H”,“O”,“N” EMDIS data dictionary also has a ‘Q’ (questionable / unclear) but that will not be applicable within the BMDW data submission file. |
CMV_NAT | No | CMV NAT status | idmValueType | 1 | idmValueType: "P","N" |
CMV_DATE | No | Date of CMV test | bareDateType | 10 | Date without timezone information, example 1968-06-28, Date Delimiter = "-" |
HBS_AG | No | Hepatitis B status (hepatitis B surface antigen) | idmValueType | 1 | idmValueType: "P","N" |
ANTI_HBC | No | Hepatitis B status (antibody to hepatitis B core antigen) | idmValueType | 1 | idmValueType: "P","N" |
ANTI_HBS | No | Hepatitis B status (antibody to hepatitis B surface antigen) | idmValueType | 1 | idmValueType: "P","N" |
ANTI_HCV | No | Hepatitis C status (antibody to hepatitis C virus) | idmValueType | 1 | idmValueType: "P","N" |
ANTI_HIV_12 | No | Anti-HIV 1/2 status | idmValueType | 1 | idmValueType: "P","N" |
HIV_1_NAT | No | HIV-1 NAT status | idmValueType | 1 | idmValueType: "P","N" |
HIV_P24 | No | HIV p24 status | idmValueType | 1 | idmValueType: "P","N" |
HCV_NAT | No | HCV NAT status | idmValueType | 1 | idmValueType: "P","N" |
ANTI_HTLV | No | Antibody to HTLV I/II | idmValueType | 1 | idmValueType: "P","N" |
SYPHILIS | No | Syphilis status | idmValueType | 1 | idmValueType: "P","N" |
WNV | No | WNV status | idmValueType | 1 | idmValueType: "P","N" |
CHAGAS | No | Chagas status | idmValueType | 1 | idmValueType: "P","N" |
EBV | No | EBV status | idmValueExtType | 1 | idmValueExtType: “P”,“G”,“M”,“B”,“H”,“O”,“N” EMDIS data dictionary also has a ‘Q’ (questionable / unclear) but that will not be applicable within the BMDW data submission file. Please leave blank for Q. |
TOXO | No | Toxoplasmosis status | idmValueExtType | 1 | idmValueExtType: “P”,“G”,“M”,“B”,“H”,“O”,“N” EMDIS data dictionary also has a ‘Q’ (questionable / unclear) but that will not be applicable within the BMDW data submission file. Please leave blank for Q. |
HBV_NAT | No | HBV NAT status | idmValueType | 1 | idmValueType: "P","N" |
PB19_NAT | No | ParvoB19 NAT status | idmValueType | 1 | idmValueType: "P","N" |
ALT | No | Alanine aminotransferase status in units per litre | Short | | Number, no decimals, minimal value is 1 |
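The two value lists and the ALT rule from the table above can be checked together, as in this Python sketch. The function name and the sample <IDM> block are invented for the example, and CMV_DATE is left out of the check for brevity.

```python
import re
import xml.etree.ElementTree as ET

IDM_VALUE = {"P", "N"}
IDM_VALUE_EXT = {"P", "G", "M", "B", "H", "O", "N"}

# Fields per type, taken from the table above.
EXT_FIELDS = {"CMV", "EBV", "TOXO"}
BASIC_FIELDS = {
    "CMV_NAT", "HBS_AG", "ANTI_HBC", "ANTI_HBS", "ANTI_HCV", "ANTI_HIV_12",
    "HIV_1_NAT", "HIV_P24", "HCV_NAT", "ANTI_HTLV", "SYPHILIS", "WNV",
    "CHAGAS", "HBV_NAT", "PB19_NAT",
}


def check_idm(idm):
    """Return a list of problems found in one <IDM> element."""
    problems = []
    for child in idm:
        value = (child.text or "").strip()
        if child.tag in EXT_FIELDS and value not in IDM_VALUE_EXT:
            problems.append(f"{child.tag}: value {value!r} is not allowed")
        elif child.tag in BASIC_FIELDS and value not in IDM_VALUE:
            problems.append(f"{child.tag}: value {value!r} is not allowed")
        elif child.tag == "ALT":
            # ALT is a whole number with a minimal value of 1.
            if not re.fullmatch(r"[0-9]+", value) or int(value) < 1:
                problems.append("ALT: expected a whole number of at least 1")
    return problems


# Invented sample: 'Q' is not accepted and ALT is below the minimum.
IDM_XML = """<IDM>
  <CMV>G</CMV>
  <CMV_NAT>Q</CMV_NAT>
  <HBS_AG>N</HBS_AG>
  <ALT>0</ALT>
</IDM>"""

idm_problems = check_idm(ET.fromstring(IDM_XML))
for problem in idm_problems:
    print(problem)
```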
2.6 XML example files
Below you can find two XML example files: one for donors and one for CBUs.
Both files contain only two records, but in those two records almost all possible elements contain a value. They can help you to check the order of the elements in your own XML file. Please be aware that values like GRID are fictitious and do not follow the rules for the check character.