Hoffman2:Software Tools: NDATools
NDATools is for downloading data from or uploading data to NDA. Here is how to use it on Hoffman2 to download data.
Start an interactive session
NDATools needs more memory than the Hoffman2 login nodes can offer, so you need to run the download on a compute node:
qrsh -l h_rt=20:00:00,h_data=4G
Load modules
module load python/3.6.1_shared
module load ndatools
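Before going further, it can help to confirm that the module load actually put the NDATools commands on your PATH. The following is a minimal sketch, not part of NDATools: check_commands is a hypothetical helper name, and it only tests whether each named command can be found.

```shell
# Hypothetical helper (not part of NDATools): report any of the named
# commands that cannot be found on PATH.
# Returns 0 if every command is found, 1 otherwise.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "not found: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the compute node, after the module load commands:
#   check_commands generate_token.sh downloadcmd
```

If either command is reported as not found, the modules were probably not loaded in the current session.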
Generate Temporary Tokens
To access NDA data in AWS S3, you need a temporary token generated from your NDA user credentials.
generate_token.sh 'USERID' 'PASSWORD' 'https://nda.nih.gov/DataManager/dataManager'
Replace USERID and PASSWORD with your NDA login ID and password.
The output looks something like the following. The keys and token are used in the next step.
Beginning token request...
Access Key: AAAAAAAAAAAAAAA
Secret Key: SSSSSSSSSSSSSSSSSSSSSSSS
Session Token: ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRST
Expiration: 2021-03-04T12:55:00Z
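Because the session token is long and easy to mis-copy, one option is to save the output of generate_token.sh to a file and pull out individual values when you need them. The sketch below assumes the script prints each field on its own line in the form "Label: value"; token_field and the file name nda_token.txt are hypothetical names used only for illustration.

```shell
# Hypothetical helper: print the value of one labelled field from a file
# holding the saved output of generate_token.sh.
# Assumes one "Label: value" pair per line.
#   $1 = field label, e.g. "Access Key" or "Session Token"
#   $2 = path to the saved output
token_field() {
    sed -n "s/^$1: //p" "$2"
}

# Usage (the file name is an assumption; keep it readable only by you,
# since it holds credentials):
#   generate_token.sh 'USERID' 'PASSWORD' 'https://nda.nih.gov/DataManager/dataManager' > nda_token.txt
#   chmod 600 nda_token.txt
#   token_field "Session Token" nda_token.txt
#   token_field "Expiration" nda_token.txt
```

Checking the Expiration field before a long download shows how much time is left on the token.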
Download Data
Use the downloadcmd command to download data.
To check the usage of downloadcmd:
downloadcmd -h
optional arguments:
  -h, --help            show this help message and exit
  -dp, --package        Flags to download all S3 files in package.
  -t, --txt             Flags that a text file has been entered from where to
                        download S3 files.
  -ds, --datastructure  Flags that a data structure text file has been entered
                        from where to download S3 files.
  -u <arg>, --username <arg>
                        NDA username
  -p <arg>, --password <arg>
                        NDA password
  -r <arg>, --resume <arg>
                        Flags to restart a download process. If you already
                        have some files downloaded, you must enter the
                        directory where they are saved.
  -d <arg>, --directory <arg>
                        Enter an alternate full directory path where you would
                        like your files to be saved.
  -wt <arg>, --workerThreads <arg>
                        Number of worker threads
  -v, --verbose         Option to print out more detailed messages as the
                        program runs.
To start downloading a package, you'll need its packageID. The packageID can be found on NDA's website after you log in and submit your request for data access.
downloadcmd <packageID> -dp -d /u/project/MYGROUP/MYNDADATA_FOLDER
If the download is interrupted and you want to resume it:
downloadcmd <packageID> -dp -d /u/project/MYGROUP/MYNDADATA_FOLDER -r /u/project/MYGROUP/MYNDADATA_FOLDER
Once downloadcmd starts, you'll be asked to enter the Access Key, Secret Key, and Session Token. Use the values generated in the previous step.
If there is no error message, the download starts right away. To see more detailed download logs, add the -v option.
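The fresh and resumed downloads differ only in the -r option, so a small wrapper can choose between them. The sketch below is an illustration, not part of NDATools: nda_download_cmd is a hypothetical helper that prints the command rather than running it, and it treats any non-empty target directory as a download to resume.

```shell
# Hypothetical helper: print the downloadcmd invocation for a package.
# If the target directory already contains files, add -r to resume;
# otherwise print the command for a fresh download.
#   $1 = packageID, $2 = full path of the download directory
nda_download_cmd() {
    pkg=$1
    dir=$2
    if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "downloadcmd $pkg -dp -d $dir -r $dir"
    else
        echo "downloadcmd $pkg -dp -d $dir"
    fi
}

# Usage (substitute your own packageID for <packageID>):
#   nda_download_cmd <packageID> /u/project/MYGROUP/MYNDADATA_FOLDER
```

Printing the command first lets you check it before running it, which matters because downloadcmd then prompts for the keys and token interactively.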