Warning

This page is still under construction.

The mapping process can be made less manual by using .txt files. Please send the files to Hayley Mills to load into Archivist. Once they are loaded, you can check the mappings. The format specifications are given further down this page, along with examples.

Note: you must be logged into Archivist to view the below pages.

A question list can be saved as a .txt file from Archivist by navigating to the instrument view and then adding /cc_questions.txt to the web address, e.g. https://archivist.closer.ac.uk/instruments/alspac_01_rat/cc_questions.txt

Mappings (once applied) can be viewed in Archivist by navigating to either the instrument or dataset view using the following suffixes. Details of the formats for each can be found at the bottom of this page. 

Question to variable mappings: add /mapping.txt to the instrument web address, e.g. https://archivist.closer.ac.uk/instruments/alspac_01_rat/mapping.txt

Question to topic mappings: add /tq.txt to the instrument web address, e.g. https://archivist.closer.ac.uk/instruments/alspac_01_rat/tq.txt

Derived variable mappings: add /dv.txt to the dataset web address, e.g. https://closer-archivist.herokuapp.com/datasets/12/dv.txt

Variable to topic mappings: add /tv.txt to the dataset web address, e.g. https://closer-archivist.herokuapp.com/datasets/12/tv.txt
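The suffixes above can be appended programmatically. The following is only a minimal sketch: the helper name mapping_url and the short keys ("qv", "tq" and so on) are made up for illustration, and only the suffixes themselves come from this page.

```python
# Suffixes as listed on this page; the dictionary keys are hypothetical labels.
SUFFIXES = {
    "questions": "cc_questions.txt",  # instrument view
    "qv": "mapping.txt",              # instrument view
    "tq": "tq.txt",                   # instrument view
    "dv": "dv.txt",                   # dataset view
    "tv": "tv.txt",                   # dataset view
}


def mapping_url(view_url: str, kind: str) -> str:
    """Append the suffix for `kind` to an instrument or dataset web address."""
    return view_url.rstrip("/") + "/" + SUFFIXES[kind]


print(mapping_url("https://archivist.closer.ac.uk/instruments/alspac_01_rat", "tq"))
```

Remember that you must be logged into Archivist for the resulting address to open.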

The following steps can be used to upload mappings via the .txt files. Note: you must notify Hayley Mills or Will Poynter before carrying out mappings using this method, so that the appropriate server can be turned on first.

1)      Import the sledgehammer XML

Check whether the dataset has been uploaded by searching the main Datasets page. If it has not, upload the XML file: Admin > Import. Under Upload DDI Dataset files, browse and Import Dataset. The dataset might take a while to appear in the Datasets list.

Note: The prefix used when running Sledgehammer must be identical to the instrument prefix. This must be checked carefully, as otherwise the mappings will have to be redone.

Note: DDI-Flavour must be run after Sledgehammer has been run, and before importing.

Note: This step is currently run by the CLOSER team, as the dataset needs to be linked with the instrument, which is a feature yet to be added to Archivist.

2)      Import the variable mappings

Admin > Datasets > search for the loaded dataset (prefix or dataset name) > Import Mappings. Both the tvlinking.txt and the dv.txt can be imported. If uploading new mapping files, these will not replace those already mapped, but will add any additional mappings.

3)      Import the question mappings

Admin > Instruments > search for the questionnaire prefix > Import Mappings. Both the qvmapping.txt and tqlinking.txt can be imported. If uploading new mapping files, these will not replace those already mapped, but will add any additional mappings.

4)     Check the mappings in both the dataset and instrument view.

A list of the Archivist pages which can be used to check the mappings (and view a list of questions) can be found below, with examples.

It is recommended that you create only one topic mapping file first (preferably tv.txt) as the other will be inherited once the question and variable have been mapped. Any gaps in topics can then be filled afterwards. This should prevent topic conflicts.

Format specifications for Archivist

qv.txt

Mapping file which links questions and variables.

  • Tab delimited

  • 4 columns:

1. Questionnaire prefix with _ccs01 suffix

2. Question label (with optional suffix grid cell coordinates in the format $X;Y. Please refer to the Grid coordinates table on the condition page for how to reference grid cells.)

3. Dataset prefix (which usually matches the questionnaire prefix but without the suffix)

4. Variable name
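As an illustration of the four-column layout above, here is a minimal sketch that builds qv.txt content. The function name, question labels (qc_1, qg_2) and variable names (var_a, var_b) are invented for the example; only the column order, the tab delimiter, the _ccs01 suffix and the $X;Y grid suffix come from this page.

```python
def qv_rows_to_text(rows):
    """Join (instrument, question, dataset, variable) tuples into tab-delimited lines."""
    lines = []
    for instrument, question, dataset, variable in rows:
        lines.append("\t".join([instrument, question, dataset, variable]))
    return "\n".join(lines) + "\n"


# Hypothetical rows: the second maps a variable to grid cell column 1, row 2.
qv_text = qv_rows_to_text([
    ("alspac_01_rat_ccs01", "qc_1", "alspac_01_rat", "var_a"),
    ("alspac_01_rat_ccs01", "qg_2$1;2", "alspac_01_rat", "var_b"),
])
print(qv_text, end="")
```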

tv.txt

Mapping file which links topics and all variables.

  • Tab delimited

  • 3 columns:

1. Dataset prefix

2. Variable name

3. Topic ID (Uses Colectica topic IDs)
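A tv.txt file in this three-column layout can be read back into a lookup before it is sent for import. This is a sketch only: parse_tv is a hypothetical helper, and the topic ID shown in the usage line is a placeholder rather than a real Colectica ID.

```python
def parse_tv(tv_text):
    """Return {(dataset_prefix, variable_name): topic_id} from tab-delimited text."""
    mappings = {}
    for line_number, line in enumerate(tv_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        columns = line.split("\t")
        if len(columns) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 columns, got {len(columns)}"
            )
        dataset_prefix, variable_name, topic_id = columns
        mappings[(dataset_prefix, variable_name)] = topic_id
    return mappings


# Placeholder topic ID, for illustration only.
print(parse_tv("alspac_01_rat\tvar_a\t10100\n"))
```

Raising on the first malformed line is a design choice for the sketch; collecting all problems before reporting would suit larger files better.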

tq.txt 

Mapping file which links topics and questions (this can be inherited using the qv and tv mappings).

  • Tab delimited

  • 3 columns:

1. Questionnaire prefix with _ccs01 suffix

2. Question label

3. Topic ID (Uses Colectica topic IDs)

dv.txt

Mapping file which links derived variables to source variables.

  • Tab delimited

  • 4 columns:

1. Dataset prefix

2. Derived variable name

3. Dataset prefix

4. Source variable name

Question to Variable mappings

These are loaded by navigating to Admin > Instruments > search for the instrument prefix > IMPORT MAPPINGS. Choose files and select the qv.txt file you want to import. See the format specifications on this page for details of the qv.txt file. You can also select to import question to topic mappings (tq.txt) at the same time. From the dropdown, select whether the file is Q-V Mapping or T-Q Mapping.

Note: the Heroku in-out-worker server must be turned on for the files to be imported.

The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails or if even one record (row) in the file is invalid.

Common reasons for failures:

  • The incorrect file has been selected

  • The incorrect mapping type has been selected

  • The file format is not correct

  • The instrument and dataset are not linked in Archivist

  • The content is not valid

    • The names are not the correct case

    • The variable is not in the dataset

    • Typos or spaces in the names
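Several of the content problems listed above can be caught before a file is sent for import. The sketch below is not part of Archivist: the function name is hypothetical, the set of known variable names is assumed to come from a variables.txt export or similar, and it assumes names never legitimately contain spaces.

```python
def check_qv_lines(qv_text, known_variables):
    """Return (line_number, problem) pairs for likely import failures in qv.txt text."""
    problems = []
    by_lowercase = {name.lower(): name for name in known_variables}
    for line_number, line in enumerate(qv_text.splitlines(), start=1):
        if not line.strip():
            continue
        columns = line.split("\t")
        if len(columns) != 4:
            problems.append((line_number, "expected 4 tab-delimited columns"))
            continue
        if any(" " in value for value in columns):
            problems.append((line_number, "space in a name"))
            continue
        variable = columns[3]
        if variable in known_variables:
            continue  # exact, case-sensitive match
        if variable.lower() in by_lowercase:
            suggestion = by_lowercase[variable.lower()]
            problems.append((line_number, f"case mismatch, dataset has {suggestion}"))
        else:
            problems.append((line_number, "variable not in dataset"))
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that the import itself will succeed.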

What to do next depends on the number of invalid records. If there are only a couple of issues, you can fix these using the Archivist interface (see Map Questions, Variables and Topics). If there are many, systematic or file-related issues, update the qv.txt file and re-import.

Note: When re-importing, the current mappings will be replaced, so you must include all the question to variable mappings you want to import in the file, not only the records (rows) you want to update.

Note: All manual mappings will be replaced by the imported qv.txt mapping file. If you do not want this to happen, then export the qv.txt from Archivist first, then update this before importing. 
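Because a re-import replaces the current mappings, one way to prepare the file is to start from the exported qv.txt and append the new rows to it. The following is a sketch under that assumption; merge_qv is a hypothetical helper, not an Archivist feature.

```python
def merge_qv(exported_text, update_text):
    """Combine exported and new qv.txt rows, dropping blank lines and exact duplicates."""
    seen = set()
    merged = []
    for line in exported_text.splitlines() + update_text.splitlines():
        if not line.strip() or line in seen:
            continue
        seen.add(line)
        merged.append(line)  # keep the original order, exported rows first
    return "\n".join(merged) + "\n"
```

This only adds rows. If a mapping in the export is wrong, it still has to be edited or deleted by hand before importing.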

Note: you can map variables to individual question grid cells using the coordinates as below, or map to the whole grid without the grid cell reference. 

       1      2      3
a      1;1    2;1    3;1
b      1;2    2;2    3;2
c      1;3    2;3    3;3
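Reading the table as column number first and row number second, a grid cell reference can be built and split as in the sketch below. The function names and the question label qg_1 are hypothetical; the $X;Y form is the one given in the format specifications.

```python
def grid_cell_label(question_label, column, row):
    """Build a label such as 'qg_1$3;2' for column 3, row 2 (row b in the table)."""
    if column < 1 or row < 1:
        raise ValueError("grid coordinates start at 1")
    return f"{question_label}${column};{row}"


def split_grid_cell_label(label):
    """Return (question_label, (column, row)), or (label, None) for a whole grid."""
    question_label, separator, coordinates = label.partition("$")
    if not separator:
        return question_label, None
    column, row = coordinates.split(";")
    return question_label, (int(column), int(row))


print(grid_cell_label("qg_1", 3, 2))
```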

Variable to Topic mappings

Questions and variables which are mapped together must have the same topic. A question mapped to a variable that has been linked to a topic will automatically inherit that topic from the variable. After the question to variable mappings have been imported successfully, it therefore makes sense to add the variable to topic mappings next, so that the questions inherit those topics.

These are loaded by navigating to Admin > Datasets > search for the dataset prefix > IMPORT MAPPINGS. Choose files and select the tv.txt file you want to import. See the format specifications on this page for details of the tv.txt file. You can also select to import derived variable mappings (dv.txt) at the same time. From the dropdown, select whether the file is T-V Mapping or D-V Mapping.

Note: the Heroku in-out-worker server must be turned on for the files to be imported.

The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails or if even one record (row) in the file is invalid.

In addition to the common reasons for failures listed above, other reasons include:

  • Topic conflicts: questions and variables which are mapped together must have the same topic. If a different topic has already been assigned to a question which has been mapped to that variable, the record will be invalid

  • A variable has been mapped to 0 (no topic)

What to do next depends on the number of invalid records. If there are only a couple of issues, you can fix these using the Archivist interface (see Map Questions, Variables and Topics). If there are many, systematic or file-related issues, update the tv.txt file and re-import.

Note: When re-importing, the previously imported topic mappings will not be replaced unless they have been updated in the file. 

Note: All manual mappings will be replaced by the imported tv.txt mapping file. If you do not want this to happen, then export the tv.txt from Archivist first, then update this before importing.  

Note: when there are invalid errors related to topic conflicts, these mappings will not be loaded, and so will not show as a conflict on the datasets or instrument map view.

Note: Only one topic can be applied to a grid as a whole. This means that all variables mapped to a grid question must have the same topic.

Note: There is a difference between a topic not being assigned (0) and a topic mapped to None (000), which is considered a topic, and can cause conflicts. 
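Topic conflicts can be looked for before importing by comparing the topics on each mapped question and variable pair. This sketch assumes the mappings have already been read into plain Python structures; the function name and the topic ID 10100 are hypothetical, while the treatment of 0 as unassigned and 000 as a real topic follows the notes above.

```python
UNASSIGNED = "0"  # no topic assigned; "000" (None) is a real topic and can conflict


def find_topic_conflicts(qv_pairs, variable_topics, question_topics):
    """Return (question, variable, question_topic, variable_topic) for each conflict."""
    conflicts = []
    for question, variable in qv_pairs:
        question_topic = question_topics.get(question, UNASSIGNED)
        variable_topic = variable_topics.get(variable, UNASSIGNED)
        if UNASSIGNED in (question_topic, variable_topic):
            continue  # an unassigned side would simply inherit, so nothing to compare
        if question_topic != variable_topic:
            conflicts.append((question, variable, question_topic, variable_topic))
    return conflicts


pairs = [("qc_1", "var_a"), ("qc_2", "var_b"), ("qc_3", "var_c")]
print(find_topic_conflicts(
    pairs,
    variable_topics={"var_a": "10100", "var_b": "10100", "var_c": "10100"},
    question_topics={"qc_1": "10100", "qc_2": "000", "qc_3": "0"},
))
```

For a grid question, passing the whole-grid label for every mapped variable lets the same check apply the one-topic-per-grid rule.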

Question to Topic mappings

After the question to variable mappings have been imported successfully, you can import the tv.txt mapping file as above. Alternatively, if you only have question to topic mappings, or you have both tv.txt and tq.txt, you can import those next or at the same time. If only a tv.txt was loaded, it is common for many questions to be left without a topic. These gaps can be filled in using the interface, or by downloading the generated tq.txt from Archivist and filling in any gaps before importing. See below for more details.
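To see in advance which questions are likely to be left without a topic, the inheritance can be approximated offline. This is only a rough sketch of the idea, not Archivist's actual logic: the function name is hypothetical, and it assumes a question inherits a topic only when all of its mapped variables with a topic agree.

```python
def propose_question_topics(qv_pairs, variable_topics):
    """Return (proposed, gaps): topics questions could inherit, and questions left open."""
    candidates = {}
    for question, variable in qv_pairs:
        topics = candidates.setdefault(question, set())
        topic = variable_topics.get(variable, "0")
        if topic != "0":  # 0 means no topic assigned
            topics.add(topic)
    proposed = {}
    gaps = []
    for question, topics in candidates.items():
        if len(topics) == 1:
            proposed[question] = next(iter(topics))
        else:
            gaps.append(question)  # no topic at all, or variables disagree
    return proposed, gaps
```

The questions returned in gaps are the ones to review by hand in the downloaded tq.txt or in the interface.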

These are loaded by navigating to Admin > Instruments > search for the instrument prefix > IMPORT MAPPINGS. Choose files and select the tq.txt file you want to import. See the format specifications on this page for details of the tq.txt file. You can also select to import the question to variable mappings (qv.txt) at the same time. From the dropdown, select whether the file is T-Q Mapping or Q-V Mapping.

Note: the Heroku in-out-worker server must be turned on for the files to be imported.

The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails or if even one record (row) in the file is invalid; see the common reasons listed above.

What to do next depends on the number of invalid records. If there are only a couple of issues, you can fix these using the Archivist interface (see Map Questions, Variables and Topics). If there are many, systematic or file-related issues, update the tq.txt file and re-import.

Note: when there are invalid errors related to topic conflicts, these mappings will not be loaded, and so will not show as a conflict on the datasets or instrument map view.

Note: If you map to 0 (i.e. no topic), this will give an invalid error, but it will still map the question to 0. In this way you can reset a question back to having no topic mapping. This is not the same as the None topic.

Note: When re-importing, the current question to topic mappings will be replaced, so you must include all the question to topic mappings you want to import in the file, not only the records (rows) you want to update. 

Note: All manual mappings will be replaced by the imported tq.txt mapping file. If you do not want this to happen, then export the tq.txt from Archivist first, then update this before importing.

Note: Only one topic can be applied to a grid as a whole. This means that all variables mapped to a grid question must have the same topic.

Note: There is a difference between a topic not being assigned (0) and a topic mapped to None (000), which is considered a topic, and can cause conflicts. 

Derived variable mappings

Derived variables do not map directly to a question, but are created using other variables. A derived variable will have at least two source variables mapped to it.

These are loaded by navigating to Admin > Datasets > search for the dataset prefix > IMPORT MAPPINGS. Choose files and select the dv.txt file you want to import. See the format specifications on this page for details of the dv.txt file. You can also select to import variable to topic mappings (tv.txt) at the same time. From the dropdown, select whether the file is D-V Mapping or T-V Mapping.

Note: the Heroku in-out-worker server must be turned on for the files to be imported.

The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails or if even one record (row) in the file is invalid.

Note: Derived variables and their source variables do not have to have the same topic, although a suggested topic is inherited from the source variable. 
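Since a derived variable is expected to have at least two source variables, a dv.txt file in the four-column layout can be checked for derived variables that have fewer. This is a sketch only: the function name and the variable names in the usage example are invented, and whether Archivist itself enforces the minimum is not stated on this page.

```python
from collections import defaultdict


def derived_with_too_few_sources(dv_text, minimum=2):
    """Return (dataset_prefix, derived_name) keys that have fewer than `minimum` sources."""
    sources = defaultdict(set)
    for line in dv_text.splitlines():
        if not line.strip():
            continue
        derived_prefix, derived_name, source_prefix, source_name = line.split("\t")
        sources[(derived_prefix, derived_name)].add((source_prefix, source_name))
    return sorted(key for key, found in sources.items() if len(found) < minimum)


# Hypothetical content: dv_bmi has two sources, dv_age has only one.
dv_text = "p\tdv_bmi\tp\theight\np\tdv_bmi\tp\tweight\np\tdv_age\tp\tdob\n"
print(derived_with_too_few_sources(dv_text))
```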

Reviewing the mappings

Dataset and instrument views can be used to check for gaps and conflicts.

  • Dataset view allows you to check Question-Variable, Derived variable, and Variable-topic mappings. Datasets > search and select the Name of the dataset.  Any mapping conflicts will appear in red. 

  • Instrument Map view allows you to check Question-Variable and Question-topic mappings. On the instrument page search for the prefix then select MAP. Any mapping conflicts will appear in red.

  • Instrument view allows you to check Question-Variable mappings. Instruments > search and select the Prefix (or view) of the instrument. This gives the questionnaire view with the variable names listed against the questions.

Note: Variables can be mapped to grid references in the mapping.txt file, but currently these will only be displayed as mapped to the grid as a whole.

Format specifications:

cc_questions.txt

Question list produced by Archivist.

  • ID
  • Text (Literal)

mapping.txt

Mapping file which links questions and variables (listing all variables and derived variables).

  • Tab delimited
  • Must include all variables including derived variables
  • 2 columns:
    • qc
      • Question name
      • ‘0’ if mapped to nothing
      • ‘0’ if derived variable
      • [OPTIONAL] suffix grid cell coordinates in the format $X;Y. Please refer to the Grid coordinates table on the condition page for how to reference grid cells.
    • Variable name

tv.txt

Mapping file which links topics and variables.

  • Tab delimited
  • Used for applying level 1 and 2 topics to variables
  • Uses Colectica topic IDs
  • 2 columns:

 tq.txt

Mapping file which links topics and questions.

  • Tab delimited
  • Used for applying level 1 and 2 topics to variables
  • Uses Colectica topic IDs
  • 2 columns:

dv.txt

Mapping file which links derived variables to variables.

  • Tab delimited
  • 2 columns:
    • Derived variable name
    • Source variable name

  • All_mappings.txt view allows you to check and download Question-Variable, Variable-topic and Question-topic mappings. On the instrument page, search for the prefix, select MAP, then click Download File. This will list: Question name, Question text, Question Topic ID, Variable name, Variable Topic ID, and Variable Label. Note: it does not include derived variable mappings.

Viewing and downloading mappings

Most .txt files can be viewed from one Instrument Export page. Navigate to Admin > Instrument exports > search for the prefix then select VIEW e.g. https://closer-archivist.herokuapp.com/admin/instruments/ncds_65_eq/exports

  • QV Question Variables Download qv.txt  - To view the qv.txt file in the format described above. Note: this will only include questions which have variables mapped to them. 

  • TQ Topic Questions - Download tq.txt - To view the tq.txt file in the format described above. Note: this will only include questions which have topics mapped to them. 

  • Mapper Question and sequences Download mapper.txt - To view the mapper.txt file which contains the sequences, question name, sequence ID and question text. 

  • CC Questions Construct Questions Download cc_questions.txt - To view cc_questions.txt which contains the question name, and text for question items and question grid sub-questions. 

  • variables.txt - To view the variables for the dataset linked to the questionnaire, includes variable name, label and whether it is normal or a derived variable. 

  • tv.txt - To view the tv.txt file in the format described above. Note: this will include all variables whether they are mapped to a topic or not. 

  • dv.txt - To view the dv.txt file in the format described above. 

If there are no question mappings, variable mappings can also be viewed by navigating to Admin > Datasets > search for the prefix then select VIEW e.g. https://closer-archivist.herokuapp.com/admin/datasets/130/export