About
This site provides the Supplementary Information for the article "Self and nonself short constituent sequences of amino acids in the SARS-CoV-2 proteome for vaccine development" by Joji M. Otaki (1), Wataru Nakasone (2), and Morikazu Nakamura (2).
(1) The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology and Marine Science, University of the Ryukyus, Okinawa 903-0213, Japan
(2) Computer Science and Intelligent Systems Unit, Department of Engineering, Faculty of Engineering, University of the Ryukyus, Okinawa 903-0213, Japan
This site provides the following supplementary information:
- Source Codes for Human SCS Analysis
- Source Codes for SARS-CoV-2 SCS Analysis
- Additional Data
Source Codes for Human SCS Analysis
You can run the Python code either on your own computer or on Google Colab with Google Drive.
How to use the Human SCS Analysis Program on your computer
- Download the source code and the protein datasets.
- Place Human_SCS_Analysis.py in the current directory and the protein data at ./ncbi_dataset/protein.faa.
- Start Jupyter Notebook in the current directory.
- Start using the program by importing Human_SCS_Analysis:
import Human_SCS_Analysis as hscs
hscs.initializeFromProteinDataset()  # You can set all data to use the application
hscs.menu()  # You can see the command list
# For example, to show the basic information of the dataset
hscs.showBasicInformation()
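Before importing the module, it can help to confirm that the files sit where the steps above expect them. The following is a minimal sketch and not part of the distributed program: the helper names check_layout and count_fasta_records are hypothetical, and it assumes only that protein.faa is a standard FASTA file in which every record starts with a ">" header line.

```python
from pathlib import Path


def count_fasta_records(fasta_path):
    """Count FASTA records by counting header lines that start with '>'."""
    count = 0
    with open(fasta_path, encoding="utf-8") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


def check_layout(base_dir="."):
    """Report whether the files named in the steps above are in place.

    Hypothetical helper: it only checks the two paths given in the
    instructions, relative to base_dir.
    """
    base = Path(base_dir)
    module_path = base / "Human_SCS_Analysis.py"
    fasta_path = base / "ncbi_dataset" / "protein.faa"
    fasta_found = fasta_path.is_file()
    return {
        "module_found": module_path.is_file(),
        "fasta_found": fasta_found,
        "fasta_records": count_fasta_records(fasta_path) if fasta_found else 0,
    }
```

Calling check_layout(".") from the notebook's working directory returns a small dictionary. If module_found or fasta_found is False, the import or hscs.initializeFromProteinDataset() is likely to fail for the same reason.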
How to use the Human SCS Analysis Program with Google Colab
- Download the source code and the protein datasets.
- Place Human_SCS_Analysis.py in a Google Drive directory and the protein datasets at ./ncbi_dataset/protein.faa relative to that directory.
- Open a new notebook in Google Colab and mount the Google Drive directory.
- Start using the application by importing Human_SCS_Analysis:
import Human_SCS_Analysis as hscs
hscs.initializeFromProteinDataset()  # You can set all data to use the application
hscs.menu()  # You can see the command list
# For example, to show the basic information of the dataset
hscs.showBasicInformation()
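The mounting step above can be sketched as follows. This is an illustration under stated assumptions, not the authors' setup: the function name setup_colab and the folder /content/drive/MyDrive/Human_SCS are hypothetical, so replace the folder with the Drive directory that actually holds Human_SCS_Analysis.py. The call drive.mount("/content/drive") is the standard Colab way to attach Google Drive.

```python
import os
import sys


def setup_colab(project_dir="/content/drive/MyDrive/Human_SCS"):
    """Mount Google Drive in Colab and switch to the project directory.

    project_dir is a hypothetical location; use the Drive folder that
    holds Human_SCS_Analysis.py and ncbi_dataset/protein.faa.
    Returns True inside Colab, False when google.colab is unavailable.
    """
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount("/content/drive")  # prompts for authorization in Colab
    os.chdir(project_dir)  # so ./ncbi_dataset/protein.faa resolves
    if project_dir not in sys.path:
        sys.path.append(project_dir)  # so `import Human_SCS_Analysis` works
    return True
```

Running setup_colab() in the first notebook cell, before the import, keeps the relative path ./ncbi_dataset/protein.faa valid for the rest of the session.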
Source Codes for SARS-CoV-2 SCS Analysis
- Set up an environment on your computer for compiling C++ source code. Command-line tools are required. On Linux and macOS, the standard terminal is sufficient. For Windows 10 users, we recommend installing WSL or WSL2.
- Download SARS-CoV-2_SCS_Application and the protein datasets to a working directory.
- Run the make command in the working directory to build the software:
yourPC:~$ make
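- Instead of using the make command, you can compile the source code directly to build the software. Note that -stdlib=libc++ is a Clang option (g++ invokes Clang on macOS); with GCC on Linux or WSL you may need to omit it: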
yourPC:~$ g++ -std=c++17 -stdlib=libc++ -o covid-scs-analysis main.cpp calculation.cpp transform.cpp
- Run the application with the input data located in the directory ./ncbi_dataset by using the following command:
yourPC:~$ ./covid-scs-analysis
- The output files are written as CSV files to the output directory ./SARS-CoV-2_SCS_Analysis.
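
To take a quick look at the generated CSV files, a short script such as the one below may be useful. It is a minimal sketch that uses only the Python standard library. The function name summarize_csv_outputs is hypothetical, and the code assumes that the first row of each file is a header, which is not stated here and should be checked against the actual output.

```python
import csv
from pathlib import Path


def summarize_csv_outputs(output_dir="./SARS-CoV-2_SCS_Analysis"):
    """Return {file name: (first row, number of remaining rows)} per CSV file.

    Assumption: the first row of each file is a header row.
    """
    summary = {}
    for csv_path in sorted(Path(output_dir).glob("*.csv")):
        with open(csv_path, newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        header = rows[0] if rows else []
        summary[csv_path.name] = (header, max(len(rows) - 1, 0))
    return summary


if __name__ == "__main__":
    for name, (header, n_rows) in summarize_csv_outputs().items():
        print(f"{name}: {n_rows} rows, columns = {header}")
```

Run from the working directory, this prints one line per output file with its row count and first row.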
Additional Data
We provide the following additional data as Excel files:
- Self-nonself assignment and nonself extraction
- Nonself clusters
- SARS-CoV-2 variant proteomes
- Spike mutations