Using Txt Files
The mapping process can be made less manual by using .txt files. Please send the files to Hayley Mills to load into Archivist. Once loaded, you can check the mappings. The format specifications are given below along with examples. A list of the Archivist pages which can be used to check the mappings (and view a list of questions) can also be found below with examples.
It is recommended that you create only one topic mapping file first (preferably tv.txt), as the other (tq.txt) will be inherited once the questions and variables have been mapped. Any gaps in topics can then be filled afterwards. This should prevent topic conflicts.
Format specifications for Archivist
qv.txt
Mapping file which links questions and variables.
Tab delimited
4 columns:
1. Questionnaire prefix with _ccs01 suffix
2. Question label (with optional suffix grid cell coordinates in the format $X;Y. Please refer to the Grid coordinates table on the condition page for how to reference grid cells.)
3. Dataset prefix (which usually matches the questionnaire prefix but without the suffix)
4. Variable name
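The four-column layout can be illustrated with a minimal Python sketch that builds qv.txt records. The helper names `qv_row` and `to_tab_delimited` are not part of Archivist, and the question labels (`qi_1`, `qg_2`) and variable names (`n1234`, `n1235`) are hypothetical placeholders. The sketch assumes that the dataset prefix equals the questionnaire prefix unless one is given.

```python
import csv
import io


def qv_row(instrument_prefix, question_label, variable_name,
           grid_cell=None, dataset_prefix=None):
    """Build one qv.txt record as a list of four column values."""
    # Column 1: questionnaire prefix with the _ccs01 suffix.
    questionnaire_column = instrument_prefix + "_ccs01"
    # Column 2: question label, optionally followed by $X;Y for a grid cell.
    if grid_cell is not None:
        x, y = grid_cell
        question_label = f"{question_label}${x};{y}"
    # Column 3: dataset prefix (assumed to match the questionnaire prefix).
    if dataset_prefix is None:
        dataset_prefix = instrument_prefix
    # Column 4: variable name.
    return [questionnaire_column, question_label, dataset_prefix, variable_name]


def to_tab_delimited(rows):
    """Serialise records as tab-delimited text, one record per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    qv_row("ncds_65_eq", "qi_1", "n1234"),
    qv_row("ncds_65_eq", "qg_2", "n1235", grid_cell=(1, 2)),
]
qv_text = to_tab_delimited(rows)
print(qv_text)
```

The second record maps a variable to the grid cell in column 1, row 2 of the hypothetical grid question `qg_2`.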
tv.txt
Mapping file which links topics and all variables.
Tab delimited
3 columns:
1. Dataset prefix
2. Variable name
3. Topic ID (Uses Colectica topic IDs)
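Before sending a tv.txt file for loading, it may help to confirm that every record has exactly three tab-separated columns. The following is a minimal sketch, not Archivist's own parser; the function name `parse_tv` and the sample values (`n1234`, topic ID `10100`) are hypothetical. Topic IDs are kept as text so that values such as `0` and `000` stay distinct.

```python
def parse_tv(text):
    """Parse tv.txt content into (dataset_prefix, variable_name, topic_id) tuples."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        columns = line.split("\t")
        if len(columns) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 columns, found {len(columns)}"
            )
        dataset_prefix, variable_name, topic_id = columns
        # Keep the topic ID as a string so "0" and "000" are not merged.
        records.append((dataset_prefix, variable_name, topic_id))
    return records


tv_records = parse_tv("ncds_65_eq\tn1234\t10100\nncds_65_eq\tn1235\t000\n")
print(tv_records)
```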
tq.txt
Mapping file which links topics and questions (this can be inherited using the qv and tv mappings).
Tab delimited
3 columns:
1. Questionnaire prefix with _ccs01 suffix
2. Question label
3. Topic ID (Uses Colectica topic IDs)
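To see which question to topic records would follow from existing qv and tv mappings, the inheritance can be approximated in Python. This is a sketch of the idea only and does not reproduce Archivist's internal logic. The function name `inherit_question_topics` and all sample values are hypothetical, and it assumes that a topic ID of `0` means no topic has been assigned.

```python
def inherit_question_topics(qv_records, tv_records):
    """Return (questionnaire, question_label, topic_id) rows implied by qv and tv."""
    topic_by_variable = {
        (dataset, variable): topic for dataset, variable, topic in tv_records
    }
    inherited = []
    seen = set()
    for questionnaire, question, dataset, variable in qv_records:
        topic = topic_by_variable.get((dataset, variable))
        # Assumption: "0" means the variable has no topic, so nothing is inherited.
        if topic is None or topic == "0":
            continue
        # tq.txt uses the plain question label, so drop any $X;Y grid suffix.
        label = question.split("$", 1)[0]
        row = (questionnaire, label, topic)
        if row not in seen:
            seen.add(row)
            inherited.append(row)
    return inherited


qv_records = [
    ("x_ccs01", "qi_1", "x", "v1"),
    ("x_ccs01", "qg_2$1;1", "x", "v2"),
    ("x_ccs01", "qg_2$2;1", "x", "v3"),
    ("x_ccs01", "qi_3", "x", "v4"),
]
tv_records = [
    ("x", "v1", "10100"),
    ("x", "v2", "10200"),
    ("x", "v3", "10200"),
    ("x", "v4", "0"),
]
tq_rows = inherit_question_topics(qv_records, tv_records)
print(tq_rows)
```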
dv.txt
Mapping file which links derived variables to source variables.
Tab delimited
4 columns:
1. Dataset prefix
2. Derived variable name
3. Dataset prefix
4. Source variable name
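Because each dv.txt record pairs one derived variable with one source variable, a derived variable with several sources appears on several lines. A minimal sketch for grouping those lines is given here; the function name `sources_by_derived` and the variable names are hypothetical.

```python
from collections import defaultdict


def sources_by_derived(dv_text):
    """Group dv.txt records as {(dataset, derived): [(dataset, source), ...]}."""
    grouped = defaultdict(list)
    for line in dv_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Columns: dataset prefix, derived variable, dataset prefix, source variable.
        derived_dataset, derived_name, source_dataset, source_name = line.split("\t")
        grouped[(derived_dataset, derived_name)].append((source_dataset, source_name))
    return dict(grouped)


dv_text = (
    "x\tbmi\tx\theight\n"
    "x\tbmi\tx\tweight\n"
)
derived = sources_by_derived(dv_text)
print(derived)
```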
Question to Variable mappings
These are loaded by navigating to Admin > Instruments > search for the instrument prefix > IMPORT MAPPINGS. Choose files and select the qv.txt file you want to import. See the format specifications above for details of the qv.txt file. You can also import the question to topic mappings (tq.txt) at the same time. From the dropdown, select whether each file is a Q-V Mapping or a T-Q Mapping.
Note: the Heroku in-out-worker server must be turned on for the files to be imported. The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails, or if even one record (row) in the file is invalid.
Common reasons for failures:
The incorrect file has been selected
The incorrect mapping type has been selected
The file format is not correct
The instrument and dataset is not linked in Archivist
The content is not valid
The names are not the correct case
The variable is not in the dataset
Typos or spaces in the names
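Several of these causes can be caught before a file is imported. The sketch below checks qv.txt lines for the wrong number of columns, stray spaces, and variable names that are missing from the dataset or differ only in case. It is a hypothetical pre-check, not part of Archivist, and it assumes you already have the dataset's variable names (for example from variables.txt) as a Python set.

```python
def check_qv_lines(lines, dataset_variables):
    """Return a list of human-readable problems found in qv.txt lines."""
    problems = []
    by_lowercase = {name.lower(): name for name in dataset_variables}
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            problems.append(
                f"line {number}: expected 4 tab-separated columns, found {len(fields)}"
            )
            continue
        if any(not field or field != field.strip() for field in fields):
            problems.append(f"line {number}: empty field or stray spaces")
            continue
        variable = fields[3]
        if variable in dataset_variables:
            continue
        if variable.lower() in by_lowercase:
            expected = by_lowercase[variable.lower()]
            problems.append(
                f"line {number}: '{variable}' differs in case from '{expected}'"
            )
        else:
            problems.append(f"line {number}: '{variable}' is not in the dataset")
    return problems


sample_lines = [
    "x_ccs01\tqi_1\tx\tn1234",
    "x_ccs01\tqi_2\tx\tN1235",
    "x_ccs01\tqi_3\tx\tn9999",
    "x_ccs01\tqi_4\tn1234",
    "x_ccs01\tqi_5 \tx\tn1234",
]
problems = check_qv_lines(sample_lines, {"n1234", "n1235"})
for problem in problems:
    print(problem)
```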
If there are only a couple of invalid records, you can fix them using the Archivist interface (see Map Questions, Variables and Topics). If there are many, systematic or file-related issues, update the qv.txt file and re-import.
Note: When re-importing, the current mappings will be replaced, so you must include all the question to variables you want to import in the file, not only the records (rows) you want to update.
Note: All manual mappings will be replaced by the imported qv.txt mapping file. If you do not want this to happen, then export the qv.txt from Archivist first, then update this before importing.
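Since a re-import replaces the current question to variable mappings, one way to keep existing mappings is to merge the exported qv.txt records with your changes and import the combined file. A minimal sketch of that merge is given here; the function name `merged_qv_rows` and the sample records are hypothetical.

```python
def merged_qv_rows(exported, additions=(), removals=()):
    """Combine exported qv records with new ones, dropping removals and duplicates."""
    removed = {tuple(row) for row in removals}
    merged = []
    seen = set()
    for row in list(exported) + list(additions):
        row = tuple(row)
        if row in removed or row in seen:
            continue
        seen.add(row)
        merged.append(row)
    return merged


exported = [
    ("x_ccs01", "qi_1", "x", "v1"),
    ("x_ccs01", "qi_2", "x", "v2"),
]
additions = [("x_ccs01", "qi_3", "x", "v3")]
removals = [("x_ccs01", "qi_2", "x", "v2")]
full_file_rows = merged_qv_rows(exported, additions, removals)
print(full_file_rows)
```

The result contains every record that should exist after the import, not only the changed ones.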
Note: you can map variables to individual question grid cells using the coordinates as below, or map to the whole grid without the grid cell reference.
|   | 1 | 2 | 3 |
|---|---|---|---|
| a | 1;1 | 2;1 | 3;1 |
| b | 1;2 | 2;2 | 3;2 |
| c | 1;3 | 2;3 | 3;3 |
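In the table, the first number is the column position and the second is the row position, both counted from 1. A small sketch that derives the reference from a grid's column and row labels is shown here; the function name `grid_cell_reference` and the question label `qg_2` are hypothetical.

```python
def grid_cell_reference(column_labels, row_labels, column, row):
    """Return the X;Y reference for the cell at the given column and row labels."""
    x = column_labels.index(column) + 1  # column position, counted from 1
    y = row_labels.index(row) + 1        # row position, counted from 1
    return f"{x};{y}"


columns = ["1", "2", "3"]
rows = ["a", "b", "c"]

reference = grid_cell_reference(columns, rows, "2", "a")
print(reference)
# Column 2 of a qv.txt record for this cell of a grid question:
print(f"qg_2${reference}")
```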
Variable to Topic mappings
Questions and variables which are mapped together must have the same topic. A question mapped to a variable that has been linked to a topic will automatically inherit that topic from the variable. After the question to variable mappings have been imported successfully, it makes sense to add the variable to topic mappings, so that the questions inherit those topics.
These are loaded by navigating to Admin > Datasets > search for the dataset prefix > IMPORT MAPPINGS. Choose files and select the tv.txt file you want to import. See the format specifications above for details of the tv.txt file. You can also import the derived variable mappings (dv.txt) at the same time. From the dropdown, select whether each file is a T-V Mapping or a D-V Mapping.
Note: the Heroku in-out-worker server must be turned on for the files to be imported. The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails, or if even one record (row) in the file is invalid.
In addition to the common reasons for failures listed above, other reasons include:
Topic conflicts: questions and variables which are mapped together must have the same topic. If a different topic has already been assigned to a question which has been mapped to that variable, the record will be invalid.
If a variable has been mapped to 0 (no topic)
If there are only a couple of invalid records, you can fix them using the Archivist interface (see Map Questions, Variables and Topics). If there are many, systematic or file-related issues, update the tv.txt file and re-import.
Note: When re-importing, the previously imported topic mappings will not be replaced unless they have been updated in the file.
Note: All manual mappings will be replaced by the imported tv.txt mapping file. If you do not want this to happen, then export the tv.txt from Archivist first, then update this before importing.
Note: when there are invalid errors related to topic conflicts, these mappings will not be loaded, and so will not show as a conflict on the datasets or instrument map view.
Note: Only one topic can be applied to a grid as a whole. This means that all variables mapped to a grid question must have the same topic.
Note: There is a difference between a topic not being assigned (0) and a topic mapped to None (000), which is considered a topic, and can cause conflicts.
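The rules in these notes can be checked ahead of an import by looking for questions whose mapped variables carry more than one topic. The sketch below is an approximation and not Archivist's own validation. The function name `find_topic_conflicts` and the sample data are hypothetical; it treats `0` as unassigned and `000` (None) as a real topic, and it checks a grid as a whole by ignoring the `$X;Y` cell suffix.

```python
from collections import defaultdict


def find_topic_conflicts(qv_records, topic_by_variable):
    """Return {question_label: [topic_ids]} for questions with more than one topic."""
    topics_by_question = defaultdict(set)
    for _questionnaire, question, _dataset, variable in qv_records:
        topic = topic_by_variable.get(variable, "0")
        if topic == "0":
            continue  # no topic assigned, so it cannot conflict
        # A grid takes one topic as a whole, so drop the $X;Y cell suffix.
        label = question.split("$", 1)[0]
        topics_by_question[label].add(topic)
    return {
        label: sorted(topics)
        for label, topics in topics_by_question.items()
        if len(topics) > 1
    }


qv_records = [
    ("x_ccs01", "qi_1", "x", "v1"),
    ("x_ccs01", "qg_2$1;1", "x", "v2"),
    ("x_ccs01", "qg_2$2;1", "x", "v3"),
    ("x_ccs01", "qi_3", "x", "v4"),
]
topic_by_variable = {"v1": "10100", "v2": "10200", "v3": "000", "v4": "0"}
conflicts = find_topic_conflicts(qv_records, topic_by_variable)
print(conflicts)
```

Here the grid `qg_2` is reported because one of its variables is mapped to None (`000`) while another has a different topic.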
Question to Topic mappings
After the question to variable mappings have been imported successfully, you can import the tv.txt mapping file as above. Alternatively, if you only have question to topic mappings, or you have both tv.txt and tq.txt, you can import those next or at the same time. If only a tv.txt was loaded, it is common for many questions to be left without a topic. These gaps can be filled in using the interface, or by downloading the generated tq.txt from Archivist and filling in the gaps before importing. See below for more details.
These are loaded by navigating to Admin > Instruments > search for the instrument prefix > IMPORT MAPPINGS. Choose files and select the tq.txt file you want to import. See the format specifications above for details of the tq.txt file. You can also import the question to variable mappings (qv.txt) at the same time. From the dropdown, select whether each file is a T-Q Mapping or a Q-V Mapping.
Note: the Heroku in-out-worker server must be turned on for the files to be imported. The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails, or if even one record (row) in the file is invalid; see the common reasons listed above.
If there are only a couple of invalid records, you can fix them using the Archivist interface (see Map Questions, Variables and Topics). If there are many, systematic or file-related issues, update the tq.txt file and re-import.
Note: when there are invalid errors related to topic conflicts, these mappings will not be loaded, and so will not show as a conflict on the datasets or instrument map view.
Note: If you map a question to 0 (i.e. no topic), this will give an invalid error, but the question will still be mapped to 0. You can use this to reset a question back to having no topic. This is not the same as the None topic.
Note: When re-importing, the current question to topic mappings will be replaced, so you must include all the question to topic mappings you want to import in the file, not only the records (rows) you want to update.
Note: All manual mappings will be replaced by the imported tq.txt mapping file. If you do not want this to happen, then export the tq.txt from Archivist first, then update this before importing.
Note: Only one topic can be applied to a grid as a whole. This means that all variables mapped to a grid question must have the same topic.
Note: There is a difference between a topic not being assigned (0) and a topic mapped to None (000), which is considered a topic, and can cause conflicts.
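When filling gaps after a tv.txt import, it can help to list which questions still lack a topic before editing the exported tq.txt. The sketch below assumes you have the question labels (for example from cc_questions.txt) and the current tq.txt records as Python lists. The function names, labels and topic IDs are hypothetical, and `0` is treated as unassigned.

```python
def questions_missing_topics(question_labels, tq_records):
    """Return the question labels that have no topic in the tq records."""
    with_topic = {
        label for _questionnaire, label, topic in tq_records if topic != "0"
    }
    return [label for label in question_labels if label not in with_topic]


def gap_rows(questionnaire, chosen_topics):
    """Build tq.txt records for the topics chosen for the missing questions."""
    return [
        (questionnaire, label, topic) for label, topic in chosen_topics.items()
    ]


question_labels = ["qi_1", "qi_2", "qi_3"]
tq_records = [("x_ccs01", "qi_1", "10100"), ("x_ccs01", "qi_2", "0")]

missing = questions_missing_topics(question_labels, tq_records)
print(missing)

# Topics decided by hand for the gaps, then added to the file to import.
new_rows = gap_rows("x_ccs01", {"qi_2": "10200", "qi_3": "10300"})
print(new_rows)
```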
Derived variable mappings
Derived variables do not directly map to a question, but are created using other variables. A derived variable will have at least two source variables mapped to it.
These are loaded by navigating to Admin > Datasets > search for the dataset prefix > IMPORT MAPPINGS. Choose files and select the dv.txt file you want to import. See the format specifications above for details of the dv.txt file. You can also import the variable to topic mappings (tv.txt) at the same time. From the dropdown, select whether each file is a D-V Mapping or a T-V Mapping.
Note: the Heroku in-out-worker server must be turned on for the files to be imported. The state will change from pending, to running, then to success or failure once imported. If the state is failure, select VIEW LOGS under Actions. The state will be failure if the whole file fails, or if even one record (row) in the file is invalid.
Note: Derived variables and their source variables do not have to have the same topic, although a suggested topic is inherited from the source variable.
Reviewing the mappings
Dataset and instrument views can be used to check for gaps and conflicts.
Dataset view allows you to check Question-Variable, Derived variable, and Variable-topic mappings. Datasets > search and select the Name of the dataset. Any mapping conflicts will appear in red.
Instrument Map view allows you to check Question-Variable and Question-topic mappings. On the instrument page search for the prefix then select MAP. Any mapping conflicts will appear in red.
Instrument view allows you to check Question-Variable mappings. Instruments > search and select the Prefix (or view) of the instrument. This gives the questionnaire view with the variable names listed against the questions.
All_mappings.txt view allows you to check and download Question-Variable, Variable-topic and Question-topic mappings. On the instrument page, search for the prefix, select MAP, then click Download File. The file lists: Question name, Question text, Question Topic ID, Variable name, Variable Topic ID, and Variable Label. Note that it does not include derived variable mappings.
Viewing and downloading mappings
Most .txt files can be viewed from one Instrument Export page. Navigate to Admin > Instrument exports > search for the prefix then select VIEW e.g. https://closer-archivist.herokuapp.com/admin/instruments/ncds_65_eq/exports
QV Question Variables Download qv.txt - To view the qv.txt file in the format described above. Note: this will only include questions which have variables mapped to them.
TQ Topic Questions - Download tq.txt - To view the tq.txt file in the format described above. Note: this will only include questions which have topics mapped to them.
Mapper Question and sequences Download mapper.txt - To view the mapper.txt file which contains the sequences, question name, sequence ID and question text.
CC Questions Construct Questions Download cc_questions.txt - To view cc_questions.txt which contains the question name, and text for question items and question grid sub-questions.
variables.txt - To view the variables for the dataset linked to the questionnaire, includes variable name, label and whether it is normal or a derived variable.
tv.txt - To view the tv.txt file in the format described above. Note: this will include all variables whether they are mapped to a topic or not.
dv.txt - To view the dv.txt file in the format described above.
If there are no question mappings, variable mappings can also be viewed by navigating to Admin > Datasets > search for the prefix then select VIEW e.g. https://closer-archivist.herokuapp.com/admin/datasets/130/export