A  Deep Learning Approach

for

Therapeutic Effectiveness Estimation

of

FDA Approved Anti-Viral Drug Groups 

by

Virus Genome Sequence Analysis

 

 

By Stefano Sartori MSc May 2020

office@sartori-software.com

 

 

Scientific proofreading by Eugenio Filippi MBA MSc

 

 

 

 

 

Keywords

Artificial Intelligence, Deep Learning, Deep Artificial Neural Network, Virus Genome, FASTA file format, Therapeutic Investigation, Anti-Viral Drug Groups, Anti-Viral Therapy, SARS-CoV-2, HBV, HCV, HIV, HPV RSV, Human Influenza Virus MERS-CoV, Ebolavirus, Acyclic nucleoside phosphonate analogues, Entry inhibitors, HCV NS5A and NS5B inhibitors, Influenza virus inhibitors, Integrase inhibitors, Interferons, immunostimulators, oligonucleotides, and antimitotic inhibitors, NNRTIs, NRTIs, Nucleoside analogues, Protease inhibitors


 

Abstract

 

This work demonstrates that a deep artificial neural network can learn to map viral genome sequences onto correlated anti-viral drug groups. Once trained, such a deep artificial neural network is able to estimate the therapeutic effectiveness of a defined drug group for novel/emerging viruses such as SARS-CoV-2, responsible of the COVID-19 pandemic humanity is currently facing.

 

This was achieved by training and testing different artificial neural network configurations with viral genome sequences of the following viruses: HBV Hepatitis B Virus, HCV Hepatitis C Virus, HIV Human Immunodeficiency Virus, HPV Human Papilloma Virus, Human Influenza Virus, RSV Respiratory-Syncytial-Virus.

 

The best performing artificial neural network was then chosen to estimate the therapeutic effectiveness of approved drug groups against viruses unknown to the network, namely Ebolavirus, MERS-CoV Middle East Respiratory Syndrome-Related Coronavirus, SARS-CoV-2 Severe Acute Respiratory Syndrome Coronavirus 2.

 

For actual drugs in the identified drug groups, scientific studies demonstrating their possible anti-viral capabilities for defined viruses have been published.

By comparing the published results against estimated effectiveness of the different drugs for treatment of new, untested viruses, derived by using the artificial neural network, this work shows that an aid for therapeutic investigations for novel/emerging viruses like MERS-CoV and SARS-CoV-2 becomes available.

 


 

Table of Contents

 

1.      Introduction. 4

2.      Deep Learning and Artificial Neural Networks 5

3.      Identifying FDA Approved Anti-Viral Drug Groups 7

4.      Gathering the Virus Genome Sequences in Digital Format 8

5.      Artificial Neural Network Setup, Training and Test Datasets 9

5.1         Neural Network Input Setup. 9

5.2         Neural Network Output Setup. 10

5.3         Neural Network Architecture. 11

5.4         Training and Test Datasets 12

6.      Parallel Training of Multiple Networks 13

6.1         Methodology. 13

6.2         Hardware. 13

6.3         Software. 13

7.      Best Performing Network. 14

7.1         Architecture. 14

7.2         Error Function. 15

7.3         Training and Test Error Analysis by Output 16

7.      Ebolavirus Estimation. 27

8.      MERS-CoV Estimation. 28

9.      SARS-CoV-2 Estimation. 29

10.         Conclusions 30

11.         References 33

12.         About the Author 34


 




1.   Introduction

The constant improvement of computing power [1] and the ever-increasing availability of information in digital format [2], has led to a widespread application of deep learning algorithms/ artificial neural networks to a broad range of tasks, such as medical image processing [3], classification of medical images and illustrations [4] etc.

 

Since complete viral genome sequences are available in digital format [5][6] and given the knowledge of FDA (US Food and Drug Administration) approved anti-viral drug groups for treatment of defined viruses [7], it should be possible to train an artificial neural network which estimates the therapeutic effectiveness of approved anti-viral drug groups for novel/emerging viruses via genome sequence analysis.

 

In this publication, the following will be elaborated:

 

·         Gathering the necessary information for training, test and estimation

 

·         Set up an artificial neural network which takes virus genome sequences as input and correlated anti-viral drug group effectiveness as output.

 

·         Train the artificial neural network with defined virus genome sequences and correlated anti-viral drug group effectiveness.

 

·         Let the artificial neural network estimate the therapeutic effectiveness of FDA approved anti-viral drug groups for novel/emerging viruses outside the training and test data sets.

 

 


 




2.   Deep Learning and Artificial Neural Networks

Artificial neural networks [8] can be described in general terms as systems, with the task of mapping input information to output information by generalizing potentially existing underlying rules.

The architecture of an artificial neural network is defined by the task the network is required to perform.

In technical terms, artificial neural networks consist of an input layer, one to many hidden layers and an output layer.

Deep artificial neural networks have more than one hidden layer; the training process of such networks is called deep learning accordingly.

Each network layer contains nodes, called neurons N which are connected to neurons in the previous layer through synapses.

The computation flows from the network input I to its output O with each neuron value computed via the synapses as the weighted sum of the neuron values of the previous layer.

Deep learning with artificial neural networks is accomplished via supervised training [9]. During the training process, an artificial neural network is fed with actual information input-output pairs - for every input the network computes the output.

The difference between actual and computed output represents the network error, which in turn is used to adjust the network parameters.

Each training information input-output pair is fed through the network multiple times, the network parameters (synapse weights) are adjusted each time and over again in order to minimize the network error.

The parameter adjustments correspond to the learning process of the artificial neural network, one training information feeding cycle is called “training epoch”.

 

The artificial neural network error [10] (the difference between computed- and actual output), is measured over the training epochs, if it decreases, the artificial neural network is said to be learning.

 

To find the best performing artificial neural network for a given task, multiple artificial neural networks with different setup parameters are trained with the same data, the best performing artificial neural network is then chosen for the designed task.

The network with the lowest test and training error is said to be the best performing network.

 

Deep learning and artificial neural networks keywords for further reading:

Supervised Learning, Feature Scaling, Activation Functions, Backpropagation, Stochastic Gradient Descent (SGD), Weight Initialisation, Weight Decay, Sparsity/Density, Cross Validation.


 




3.   Identifying FDA Approved Anti-Viral Drug Groups

The publication available under https://cmr.asm.org/content/29/3/695 [7] gives an overview of the FDA approved antiviral drugs over the past 50 years, which can be summarized by virus and drug group as follows.

 

Drug Group / Virus

HBV

HCV

HIV

HPV

Human Influenza

RSV

Acyclic nucleoside phosphonate analogues

X

X

Entry inhibitors

X

X

HCV NS5A and NS5B inhibitors

X

 

 

 

Influenza virus inhibitors

 

X

 

 

X

X

Integrase inhibitors

 

X

 

 

 

Interferons, immunostimulators, oligonucleotides, and antimitotic inhibitors

X

X

X

 

 

NNRTIs

 

 

X

 

 

 

NRTIs

X

 

X

 

 

 

Nucleoside analogues

X

 

 

 

 

Protease inhibitors

X

X

 

 

 

 

HBV Hepatitis B Virus

HCV Hepatitis C Virus

HIV Human Immunodeficiency Virus

HPV Human Papilloma Virus

Human Influenza Virus

RSV Respiratory-Syncytial-Virus




4.   Gathering the Virus Genome Sequences in Digital Format

Complete viral genome sequences are available in the FASTA file format from the United States National Center for Biotechnology Information (NCBI) under the following link: https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/ [5]

Viral genome sequences are represented in the FASTA file format [6] as a sequence of the characters A,C,G,T which represents the nucleic acid bases:  adenine, cytosine, guanine and thymine respectively.

 

Viral genome sequences length in nucleic acid base count [5]:


Virus

Sequence Length [N]

Task

HBV

~ 3,300

Training/Test

HCV

9,000 - 10,000

Training/Test

HIV

9,000 - 10,000

Training/Test

HPV

~ 8,200

Training/Test

Human Influenza

~ 1,800

Training/Test

RSV

1,000 - 1,800

Training/Test

Ebolavirus

~ 19,000

Estimation

MERS-CoV

~ 31,000

Estimation

SARS-CoV-2

~ 31,000

Estimation

 

Eligible viruses for training and test have been chosen by their genome sequence size, since the genome length has a direct impact on the network size, which in turn heavily influences training time on the chosen hardware.

 


 




5.   Artificial Neural Network Setup, Training and Test Datasets

 

5.1         Neural Network Input Setup

The viral genome sequence [5] has to be transformed to numerical values which the artificial neural network is able to process.

For each nucleic acid base, represented by one of the four characters A,C,G,T in the viral genome sequence [6], a number of 4 neural network inputs must be setup [9], thus, the total number of inputs depends on the maximum viral genome sequence length the network is capable of processing.

Each nucleic base in the viral genome sequence is mapped to a group of four inputs as follows:


Genome Sequence - Nucleic Acid Base

Input

Value

A

Input 1 (A)

1

Input 2 (C)

0

Input 3 (G)

0

Input 4 (T)

0

C

Input 5 (A)

0

Input 6 (C)

1

Input 7 (G)

0

Input 8 (T)

0

G

Input 9 (A)

0

Input 10 (C)

0

Input 11 (G)

1

Input 12 (T)

0

T

Input 13 (A)

0

Input 14 (C)

0

Input 15 (G)

0

Input 16 (T)

1

 

Since the task is to estimate the therapeutic effectiveness of approved drug groups for Ebolavirus, MERS-CoV and SARS-CoV-2, the network is designed to accept a maximum viral genome sequence length of 32,000 nucleic acid bases.

This results in 4 x 32,000 = 128,000 inputs for the artificial neural network.

 

 

 

5.2         Neural Network Output Setup

The neural network output setup is given by the drug groups [7] which the network should map onto the input viral genome sequence [5].

 

Output

Drug Group

O1

Acyclic nucleoside phosphonate analogues

O2

Entry inhibitors

O3

HCV NS5A and NS5B inhibitors

O4

Influenza virus inhibitors

O5

Integrase inhibitors

O6

Interferons, immunostimulators, oligonucleotides, and antimitotic inhibitors

O7

NNRTIs

O8

NRTIs

O9

Nucleoside analogues

O10

Protease inhibitors

 

 


 

5.3         Neural Network Architecture

Given the conditions outlined in neural network input and output setup sections, the following artificial neural network architecture has been chosen:

 

Input layer with 128,000 neurons for the viral genome sequence, 3 hidden layers with 20,000 neurons for each layer, output layer with 10 neurons representing the drug groups.

 

Activation functions  have a random uniform distribution for each neuron in the network: Softplus, Logistic, Hyperbolic Tangent and ReLU. [8][9]

 

Input and output data feature rescaling [8][9] in interval [-1,1]

 

Since a fully connected network [8] with this architecture imply prohibitive computing costs, the following synapse density factors have been setup:

 

Layer 1 (Input) to Layer 2:            0.00008

Layer 2 to Layer3:                       0.0001

Layer 3 to Layer 4:                      0.0001

Layer 4 to Layer 5 (Output):          0.1

 

The density factors have been chosen in order to connect each neuron in a layer with at least 2 neurons from the previous layer.

 

 

Learning related hyper parameters [8]:

 

Weight Initialisation Method:           Xavier Normal

Weight Decay:                             0.1

Initial Learning Rate:                   0.0001

Initial Momentum:                        0.00001

Batch Size:                                 1 (full online)

 

The learning related hyper parameters have been chosen to avoid network overfitting on the training data set.


5.4         Training and Test Datasets

As training and test dataset, 100 complete genome sequences [5] for each virus species HBV, HCV, HIV, HPV, Human Influenza and RSV have been mapped onto the FDA approved drug group [7], resulting in a total number of 600 complete genome sequences.

80% and 20% [9] of the genome sequences were set up for training and test dataset respectively.

The dataset used to train the artificial neural network can be generalized as follows:


Dataset

I1-I128,000 Genome Sequence Input

O1

O2

O3

O4

O5

O6

O7

O8

O9

O10

Training

HBV

1

0

0

0

0

1

0

1

1

0

Training

HCV

0

0

1

1

0

1

0

0

0

1

Training

HIV

1

1

0

0

1

0

1

1

0

1

Training

HPV

0

0

0

0

0

1

0

0

0

0

Training

Human Influenza

0

0

0

1

0

0

0

0

0

0

Training

RSV

0

1

0

1

0

0

0

0

0

0

...

...

...

...

...

...

...

...

...

...

...

Test

HBV

1

0

0

0

0

1

0

1

1

0

Test

HCV

0

0

1

1

0

1

0

0

0

1

Test

HIV

1

1

0

0

1

0

1

1

0

1

Test

HPV

0

0

0

0

0

1

0

0

0

0

Test

Human Influenza

0

0

0

1

0

0

0

0

0

0

Test

RSV

0

1

0

1

0

0

0

0

0

0

 

Input I1 - I128,000 Virus genome sequence input from FASTA files, nucleic acid bases loaded into the artificial neural network input.

 

Output O1 - O10 Drug Groups/Effectiveness




6.   Parallel Training of Multiple Networks

 

6.1         Methodology

In order to find the best performing artificial neural network, 4 networks with uniform random variations in their hyper parameters have been trained in each training batch in parallel.

Uniform random variations of hyper parameters have been chosen over the permutation of hyper parameters in order to control the maximum number of networks to be trained in parallel.

To optimize the performance, the number of networks to be trained in parallel has been restricted to match the number of CPUs/CPU cores of the computer used.

The RAM utilisation per network was about 2.7 GB while holding the network setup, the complete training and the test dataset in memory.

 

The training process for each batch, containing 4 artificial neural networks, took between 6 and 12 hours computing time.

The fluctuation of the training time depended on the chosen network hyper parameters and learning behaviour over the training epochs.

 

20 training batches with a total number of 80 different neural networks have been executed over the period of a week resulting in the identification of the best performing network, namely the one with the lowest training and test error.

 

6.2         Hardware

Intel® Xeon® E3-1246v3 4 Cores 3.5GHz Hyper-Threading

16 GB RAM

1 TB HDD (RAID 1)

+ 250 GB SSD (RAID 1)

 

6.3         Software

SSWAI by sartori-software.com: neural network management, setup, training validation and production application, written in PHP 7.x


 




7.   Best Performing Network

The parallel training process described in the previous section, yielded the following best performing artificial neural network, identified as the one with the lowest training and test error, after 148 training epochs.

 

7.1         Architecture

 

https://sswai.com/PHP/NNManagement.php?SID=1b4acf0c592cb62b82a0ad062827f1c25a05484c091131cf148231e2f738342d&ID=15283&Action=NNRenderTopologySimplified

 

Layer L1       Input; Virus Genome Sequence; 128,000 Neurons

Layer L2       15,150 Neurons; 151,500 Synapses to L1

Layer L3       14,363 Neurons; 28,726 Synapses to L2

Layer L4       12,703 Neurons; 25,406 Synapses to L3

Layer L5       Output; Drug Group; 10 Neurons; 20,000 Synapses to L4

 


 

 

7.2         Error Function

 

 

X: Epochs N=148

Y: Network Error:  Sum of squared error over all outputs divided by training and test record count respectively

 

The training has been actively stopped at Epoch 148, since no significant decrease of the network error could be observed.

In this specific case, about half the number of epochs used, would have been sufficient to reduce both the training and test error to an acceptable level.


 

7.3         Training and Test Error Analysis by Output

 

Rank

Output

Drug Class

Σ Training

Squared Error

Σ Training

Euclidean Distance

Σ Test

Squared Error

Σ Test

Euclidean Distance

1

O3

HCV NS5A and NS5B inhibitors

0.246685

0.702403

1.728259

1.859172

2

O6

Interferons, immunostimulators, oligonucleotides, and antimitotic inhibitors

1.604566

1.791405

3.493144

2.643159

3

O8

NRTIs

1.020619

1.428719

3.958002

2.813539

4

O4

Influenza virus inhibitors

0.600835

1.096207

4.862661

3.118545

5

O1

Acyclic nucleoside phosphonate analogues

0.660366

1.149231

5.155664

3.211126

6

O2

Entry inhibitors

1.791308

1.892780

5.222156

3.231766

7

O5

Integrase inhibitors

2.657760

2.305541

6.483288

3.600913

8

O7

NNRTIs

2.347382

2.166740

8.210226

4.052216

9

O10

Protease inhibitors

1.243284

1.576886

9.893748

4.448314

10

O9

Nucleoside analogues

174.000003

18.654758

26.001142

7.211261

 

 

The network output of training and test data sets are plotted in the graphs on the following pages.

The graph XY domain [-1,1] is given from the applied feature rescaling before feeding the data to the network.

The closer the data points are to the top-right- and bottom-left- corner of the graph the better the network estimates the effectiveness of the drug group for a defined virus.

 


 

O1 Acyclic nucleoside phosphonate analogues


 

O2 Entry inhibitors

 


 

O3 HCV NS5A and NS5B inhibitors


 

O4 Influenza virus inhibitors


O5 Integrase inhibitors


O6 Interferons, immunostimulators, oligonucleotides, and antimitotic inhibitors

 


 

O7 NNRTIs


O8 NRTIs

 


 

O9 Nucleoside analogues

 

 


 

O10 Protease inhibitors





7.   Ebolavirus Estimation

443 complete Ebolavirus virus genome sequences have been gathered from the United States National Center for Biotechnology Information under the following link: https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/ [5]

 

The 443 complete virus genome sequences have then been fed into the best performing artificial neural network identified.

Since the network was trained to map a drug group effectiveness for a defined virus [7], onto its viral genome [5], with 0 deemed not effective and 1 deemed effective, the output values give therapeutic effectiveness estimation for the defined drug group.

The table below shows the median of the output values by drug group, computed by the network for the 443 complete Ebolavirus virus genome sequences [5], where greater values up to 1 represent greater effectiveness estimate, and smaller values close to 0 represent low or no effectiveness estimate.



Output

Drug Group

Ebolavirus Effectiveness Estimate

Minimum

Maximum

Median

 Ntotal=443

O1

Acyclic nucleoside phosphonate analogues

0.044617

0.964413

0.488051

O2

Entry inhibitors

0.003549

0.995335

0.160814

O3

HCV NS5A and NS5B inhibitors

0

0.619514

0.295130

O4

Influenza virus inhibitors

0.001260

0.978951

0.184570

O5

Integrase inhibitors

0.005243

0.952554

0.236775

O6

Interferons, immunostimulators, oligonucleotides, and antimitotic inhibitors

0.000103

0.982464

0.086133

O7

NNRTIs

0.009419

0.964271

0.146356

O8

NRTIs

0.031995

1

0.632101

O9

Nucleoside analogues

0

0.024077

0

O10

Protease inhibitors

0

1

0.351124




8.   MERS-CoV Estimation

257 complete MERS-CoV virus genome sequences have been gathered from the United States National Center for Biotechnology Information under the following link: https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/ [5]

The 257 complete virus genome sequences have then been fed into the best performing artificial neural network identified.

Since the network was trained to map a drug group effectiveness for a defined virus [7], onto its viral genome [5], with 0 deemed not effective and 1 deemed effective, the output values give therapeutic effectiveness estimation for the defined drug group.

The table below shows the median of the output values by drug group, computed by the network for the 257 complete MERS-CoV virus genome sequences [5], where greater values up to 1 represent greater effectiveness estimate, and smaller values close to 0 represent low or no effectiveness estimate.



Output

Drug Group

MERS-CoV Effectiveness Estimate

Minimum

Maximum

Median

 Ntotal=257

O1

Acyclic nucleoside phosphonate analogues

0.007407

0.412886

0.055218

O2

Entry inhibitors

0.000765

0.965112

0.131159

O3

HCV NS5A and NS5B inhibitors

0

0.459072

0.110305

O4

Influenza virus inhibitors

0.001292

0.939077

0.484627

O5

Integrase inhibitors

0.006321

0.626097

0.109600

O6

Interferons, immunostimulators, oligonucleotides, and antimitotic inhibitors

0.001200

0.969569

0.151942

O7

NNRTIs

0.000544

0.873862

0.023173

O8

NRTIs

0

0.697909

0.247677

O9

Nucleoside analogues

0

0.015237

0

O10

Protease inhibitors

0

1

0.446981




9.   SARS-CoV-2 Estimation

941 complete SARS-CoV-2 genome sequences have been gathered from the United States National Center for Biotechnology Information under the following link: https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/ [5]

The 941 complete virus genome sequences have then been fed into the best performing artificial neural network identified.

Since the network was trained to map a drug group effectiveness for a defined virus [7], onto its viral genome [5], with 0 deemed not effective and 1 deemed effective, the output values give therapeutic effectiveness estimation for the defined drug group.

The table below shows the median of the output values by drug group, computed by the network for the 941 complete SARS-CoV-2 genome sequences [5], where greater values up to 1 represent greater effectiveness estimate, and smaller values close to 0 represent low or no effectiveness estimate.



Output

Drug Group

SARS-CoV-2 Effectiveness Estimate

Minimum

Maximum

Median

 Ntotal=941

O1

Acyclic nucleoside phosphonate analogues

0.008703

0.765844

0.111573

O2

Entry inhibitors

0.000057

0.995527

0.330182

O3

HCV NS5A and NS5B inhibitors

0

0.543685

0

O4

Influenza virus inhibitors

0.001059

0.953277

0.567019

O5

Integrase inhibitors

0.005885

0.984187

0.343736

O6

Interferons, immunostimulators, oligonucleotides, and antimitotic inhibitors

0.000115

0.968116

0.049023

O7

NNRTIs

0.006103

0.944986

0.257039

O8

NRTIs

0

0.971661

0.186103

O9

Nucleoside analogues

0

0.079622

0

O10

Protease inhibitors

0

1

0.757960

 




10.       Conclusions

The deep learning approach to estimate therapeutic effectiveness of anti-viral drug groups described in this publication provided the following results to focus on: 

 

 

Virus [5]

Drug Group [7]

Therapeutic Effectiveness Estimate Median, Interval [0,1]

Ebolavirus

Acyclic nucleoside phosphonate analogues

0.488051

Ebolavirus

NRTIs

0.632101

MERS-CoV

Influenza virus inhibitors

0.484627

MERS-CoV

Protease inhibitors

0.446981

SARS-CoV-2

Influenza virus inhibitors

0.567019

SARS-CoV-2

Protease inhibitors

0.757960

 

 

Matching the identified drug groups by using the publication available under https://cmr.asm.org/content/29/3/695 to a selection of actual drugs and available publications yields the following table:

 

Drug Group [7]

Drug [7]

Virus/ Publication

Acyclic nucleoside phosphonate analogues

Cidofovir

Ebolavirus

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5017617/

 

https://www.virology.ws/2014/10/09/treatment-of-Ebolavirus-virus-infection-with-brincidofovir/

 

 

NRTIs

Zidovudine

Ebolavirus

https://www.ncbi.nlm.nih.gov/pubmed/27902714

 

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5127007/

Influenza virus inhibitors

Oseltamivir/

Zanamivir

 

MERS-CoV

https://www.frontiersin.org/articles/10.3389/fmicb.2019.01327/full

SARS-CoV-2

https://www.thelancet.com/journals/langlo/article/PIIS2214-109X(20)30114-5/fulltext

 

https://www.nature.com/articles/d41587-020-00003-1

 

 

 

Protease inhibitors

Ritonavir

MERS-CoV

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5733787/

 

SARS-CoV-2

https://www.ncbi.nlm.nih.gov/pubmed/32293875

 


 

 

The neural network was excluded from having access to any of the Ebolavirus, MERS-CoV and SARS-CoV-2 genome sequences during the training and test process.

 

The estimated results therefore indicate clearly that it is possible to train artificial neural networks on known viral genomes and correlated therapies to aid therapeutic investigations for novel/emerging viruses like MERS-CoV and SARS-CoV-2.

 

With time, computing power will still increase [1] and more complete genome sequences are expected to be available in digital format [2], making it possible to train more accurately artificial neural networks [8] for genomic sequence analysis tasks.

 

As next step it is planned to apply the methodology described to viral vaccines.

Considering the mechanism of action of vaccines [11], it seems unlikely that an artificial neural network with reasonable low error can be trained.

On the other hand, if an acceptable performing artificial neural network is found, it could be an indication of possible viral vaccine cross-protection.

 

The methodology described could also be used to investigate anti-bacterial therapies, though training an artificial neural network on bacteria genomes will require significantly more computing power or training time, as bacterial genomes are orders of magnitudes larger than viral genomes [12].




11.       References

[1] Moores’Law - Wikipedia Article

https://en.wikipedia.org/wiki/Moore%27s_law

 

[2] Data generated by humanity - Forbes Article

https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/

 

[3] Deep Learning for Medical Image Processing - Science Direct Article

https://www.sciencedirect.com/science/article/pii/S0939388918301181

 

[4] Artificial Neural Networks for Medical Image Classification - NCBI Publication

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4699299/

 

[5] Online Virus Genome Sequences

https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/

 

[6] Definition of the FASTA format - Wikipedia Article

https://en.wikipedia.org/wiki/FASTA_format

 

[7] FDA Approved Anti-Viral Drug Groups -  Clinical Microbiology Reviews Publication

https://cmr.asm.org/content/29/3/695

 

[8] Introduction to Artificial Neural Networks - Article

https://towardsdatascience.com/introduction-to-artificial-neural-networks-ann-1aea15775ef9

 

[9] Supervised Training -Medium Article

https://medium.com/@gowthamy/machine-learning-supervised-learning-vs-unsupervised-learning-f1658e12a780

 

[10] Network Error, Error Function - NCBI Publication https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009026/

 

[11] How Vaccines Work - World Health Organization Publication

http://www.euro.who.int/en/health-topics/disease-prevention/vaccines-and-immunization/vaccines-and-immunization/how-vaccines-work

 

[12] Genome Size - Science Direct Page

https://www.sciencedirect.com/topics/medicine-and-dentistry/genome-size




12.       About the Author

Stefano Sartori, born 1973 in Italy

 

1992 Technical high school degree in electronics and matriculation University of Bologna, Italy, faculty of physics.

 

1995-1996 - 12 months exchange Student at the Technical University Vienna, Austria - ERASMUS-SOCRATES Project. Specialization in Biophysics, radiation protection and dosimetry.

 

1998 Degree in Physics (MSc)

Thesis: Implementation of a computerized management system for the controlled disposal of radioactive waste deriving from the medical use of radionuclides of storage facility identified by: IAEA "TECDOC-775, HANDLING, TREATMENT, CONDITIONING AND STORAGE OF BIOLOGICAL RADIOACTIVE WASTES International Atomic Energy Agency (IAEA)"

 

1998-2020 Owner of sartori-software.com - Software Company with core business in application development for the pharmaceutical industry.

Continuous improvement of IT skills and software development technologies, specialization in Deep Learning and Artificial Neural Networks.