
Page in progress

This page is under construction.

GitLab parsers are used to import questionnaires directly into Archivist, rather than entering them from scratch. Underneath this page, you can find pages on:

  • Creating a GitLab account.
  • Running archivist_insert.

Note: Make sure you have the latest version of the parser every time you run it. If updates have been made, re-fork or pull the latest version. See Updating your GitLab pipeline for how to do this.

Creating a GitLab Account

a. If you don’t have a GitLab account, please register at https://gitlab.com/.

b. Once you have signed in to GitLab, go to https://gitlab.com/jli755/archivist_insert_workaround.

c. Fork the repo on GitLab.com: click “Fork” in the right-hand corner of the window (see screenshot below).

d. You now need credentials for the Heroku and Archivist logins. Please contact Hayley Mills to obtain the variables.

i. First, make sure you are in the right place on GitLab (i.e., in your own account). The URL should look like: https://gitlab.com/(your account name)/archivist_insert.



ii. To add the variables, go to Settings → CI/CD → Variables → Expand → Add Variable.

Make sure both boxes (Protect variable and Mask variable) are ticked.



Running archivist_insert


a. Tables: you can use either the "csv" or the "tsv" table format, but not a mixture of the two. (If you are using "tsv" tables, please see step b below.)

i. Copy your tables into "archivist_tables"

1. Open the “archivist_tables” folder in GitLab.

2. Click “Upload file” to copy a file to the folder. You can only upload one file at a time, so repeat the process until all of your files are uploaded.

3. By default, GitLab runs the pipeline automatically every time you add or update one of these files. You should only run the pipeline once all files have been added, so to stop it running before you are ready, add [skip ci] to the commit message.
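If you work from a local clone of your fork instead of the web upload (an assumption; this page only describes the web interface), the same [skip ci] marker goes into the commit message. The sketch below shows one way to do that. The function name commit_table_skip_ci and the file name sequence.csv are illustrative only.

```shell
# Commit one table file with a message that contains [skip ci],
# so that GitLab does not start the pipeline for this commit.
# Hypothetical helper: run it from the top folder of your cloned fork.
commit_table_skip_ci() {
  table_file="$1"
  git add "$table_file" &&
    git commit --quiet -m "Add $(basename "$table_file") [skip ci]"
}

# Usage:
#   commit_table_skip_ci archivist_tables/sequence.csv
#   git push origin main
```

When the last table is ready, commit it with an ordinary message (without [skip ci]) so that the pipeline runs once all files are in place.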




                  

b. Using tsv tables instead of csv tables

i. You need to change the delimiter in the db_temp.sql file.

ii. You may also need to specify the encoding of the file, as in the example below:

\COPY temp_sequence FROM 'archivist_tables/sequence.tsv' DELIMITER E'\t' CSV HEADER encoding 'windows-1251';

iii. If the pipeline passes, great. If not, check the job output to see what went wrong: click on the cross marks (1 & 2) in the Stages column (see the screenshot below).
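Because csv and tsv tables must not be mixed, it can help to check the delimiter of every table before uploading. The sketch below is one possible check, not part of the parser: it looks only at the header row of each file, and the function name report_delimiters is hypothetical.

```shell
# Print the delimiter found in the header row of each table file,
# followed by the file name: "tab", "comma" or "unknown".
# A header that contains both a tab and a comma is reported as "tab".
report_delimiters() {
  tab=$(printf '\t')
  for f in "$@"; do
    header=$(head -n 1 "$f")
    case "$header" in
      *"$tab"*) echo "tab $f" ;;
      *,*)      echo "comma $f" ;;
      *)        echo "unknown $f" ;;
    esac
  done
}

# Usage:
#   report_delimiters archivist_tables/*
```

If the output shows both "tab" and "comma", the tables are mixed and should be converted to one format before the pipeline is run.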


                                      

                                          

Correct the errors in the csv files

a. If the csv files have a formatting problem, the pipeline will not pass stage 1 (run_tests). Make sure all the tables comply with the correct format. See Tables structure.

b. Extra spaces in the uploaded tables cause errors. Delete the extra spaces and run the pipeline again.
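If you keep local copies of the tables, extra spaces at the end of lines can be removed in one go. The sketch below assumes the problem is trailing spaces only; it does not touch spaces inside values, and it leaves tabs alone so that empty final columns in tsv files are preserved. The function name trim_trailing_spaces is hypothetical.

```shell
# Remove spaces at the end of every line in the given files.
# Each file is rewritten in place via a temporary copy.
trim_trailing_spaces() {
  for f in "$@"; do
    sed 's/ *$//' "$f" > "$f.trimmed" && mv "$f.trimmed" "$f"
  done
}

# Usage:
#   trim_trailing_spaces archivist_tables/*.csv
```

After trimming, upload the corrected files again and re-run the pipeline.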

                              

Download the XML file and import it into Archivist

a. If the pipeline passes, the XML file is available as an "artifact" (a zip file containing the generated XML), which is kept for 10 days. The XML can be viewed temporarily in temp Archivist, and then needs to be loaded into Archivist via import. If you have permission to do so, import it yourself; if not, please ask Hayley Mills to import it.


Updating your GitLab pipeline

Once set up, you may need to re-fork or pull the latest version of the GitLab pipeline so that your fork stays up to date.

Before you start, make sure that you have Git installed.

Pulling the latest version is done on the command line.

1) Search for and open cmd (the Command Prompt).

2) Clone your fork. You can find the address on your GitLab fork: copy the "Clone with HTTPS" link. For example:


git clone https://gitlab.com/HayleyMills/archivist_react_export.git

This will create a folder in your current directory. Then go to that new folder:

cd archivist_react_export

3) Connect to the main/master version, i.e. the project you created your fork from. As above, find its HTTPS address under Clone. Follow the example below to pull the latest version and then push it back to your fork. You may need to enter your GitLab email and password. Note that the example below uses main as the branch name; if this does not work, try master instead.

git remote add upstream https://gitlab.com/jli755/archivist_react_export.git
git checkout main
git fetch upstream
git pull upstream main
git push origin main 

4) Your GitLab fork has now been updated to the latest version. Check that the main branch has been merged and that its history matches the main version.
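One way to check from the command line that your clone is up to date is to count the commits that the original project has and your local branch does not. This is a sketch only: it assumes the remote is called upstream, as in step 3, and the function name commits_behind_upstream is hypothetical.

```shell
# Print how many commits exist on upstream/<branch> that are missing
# from the local <branch>. The branch name defaults to main.
# Assumes a remote named "upstream" has already been added.
commits_behind_upstream() {
  branch="${1:-main}"
  git fetch --quiet upstream &&
    git rev-list --count "${branch}..upstream/${branch}"
}

# Usage, from inside your cloned fork:
#   commits_behind_upstream main
```

A result of 0 means your local branch already contains everything from the original project; any other number means there are still commits to pull.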

Troubleshooting:

  • The branch may be called main or master; use whichever one works for your set-up.
  • The first time you run this, Git may not know who you are and may ask you to add your email and username. Follow the instructions and example given on the command line.
  • If the files in your forked project differ from those in the main/master project, this may cause issues. Git will list the files that are a problem. Go to your forked project and update the files to match those in the main/master project, e.g. in the example above you may need to have the same Prefixes_to_export.txt file.
  • If you updated your files, then I recommend you delete your new directory (e.g. archivist_react_export) and start the process again to ensure there are no issues.
  • Git may report that you have already pulled main/master; that is not a problem.
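The first two points above can also be handled with a couple of commands. The sketch below is illustrative: the function names are hypothetical, the remote is assumed to be called upstream as in step 3, and the name and email shown in the usage are placeholders.

```shell
# Print the default branch name (for example main or master) of the
# remote called "upstream".
upstream_default_branch() {
  git ls-remote --symref upstream HEAD |
    sed -n 's|^ref: refs/heads/\([^[:space:]]*\)[[:space:]]*HEAD$|\1|p'
}

# Tell Git who you are. This is stored once per user on this computer.
set_git_identity() {
  git config --global user.name "$1" &&
    git config --global user.email "$2"
}

# Usage, from inside your cloned fork:
#   set_git_identity "Your Name" "you@example.com"
#   branch=$(upstream_default_branch)
#   git pull upstream "$branch"
```

Asking the remote for its default branch avoids guessing between main and master before pulling.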