N-Glycan biosynthesis
Figure (N-Glycan1.JPG): The first version of the model, built in BioUML (first stage of the modeling).
Figure (Compartment1.jpg): Model of the first Golgi compartment (second stage of the modeling).
Figure (Patients1-3.JPG): Optimization results for the first three patients from the Korcula population (third stage of the modeling).
Latest revision as of 15:21, 6 June 2014
Introduction
N-linked glycans are attached in the endoplasmic reticulum to the nitrogen (N) in the side chain of an asparagine residue within a sequon. The sequon is an Asn-X-Ser or Asn-X-Thr sequence, where X is any amino acid except proline. The glycan may be composed of N-acetylgalactosamine, galactose, neuraminic acid, N-acetylglucosamine, mannose, fucose, and other monosaccharides.
N-linked glycans are important for proper protein folding in eukaryotic cells and for cell-cell interactions, and they are of interest in personalized medicine[1].
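The sequon rule above can be expressed as a short pattern match. The following Python sketch is only an illustration of the Asn-X-Ser/Thr definition; the function name and the test sequence are hypothetical, and a matching sequon does not by itself mean that the site is actually glycosylated.

```python
import re

# Asn (N), then any residue except Pro (P), then Ser (S) or Thr (T).
# The lookahead keeps matches zero-width after N, so overlapping
# sequons such as "NNSS" are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 0-based positions of the Asn residue of each N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical one-letter sequence, used only to exercise the pattern.
    print(find_sequons("MNKSANPTNGT"))  # [1, 8]; "NPT" is skipped (X is Pro)
```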
Mathematical model of N-Glycan biosynthesis
Model realization
Because N-glycans play an important role in personalized medicine, we set out to build a mathematical model of their biosynthesis in BioUML. The work began with an implementation of the reference pathway from the KEGG database[2].
Reaction rate equations were generated by KEGGtranslator[3]. A number of the necessary parameters, such as Michaelis-Menten constants, were taken from the BRENDA Enzyme Database[4].
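The rate laws generated for this model are not reproduced on this page. As a reminder of where a Michaelis-Menten constant enters, here is a minimal Python sketch that assumes the standard irreversible Michaelis-Menten form; the function name and all numeric values are hypothetical.

```python
def michaelis_menten_rate(substrate, enzyme, k_cat, k_m):
    """Rate v = k_cat * [E] * [S] / (K_m + [S]).

    k_cat is the turnover number and k_m the Michaelis constant.
    This is the textbook irreversible form, assumed for illustration.
    """
    return k_cat * enzyme * substrate / (k_m + substrate)


if __name__ == "__main__":
    # At [S] = K_m the rate is half of the maximum k_cat * [E].
    print(michaelis_menten_rate(substrate=5.0, enzyme=2.0, k_cat=10.0, k_m=5.0))
```

The turnover number and the Michaelis constant are the kinds of parameters that the later optimization stages adjust for each patient.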
To give a more intuitive representation, the Glycan structures property was used. To improve accuracy, we assumed that the distribution of N-glycans in each Golgi compartment can be modeled as a well-mixed reactor[5].
Despite the simplicity of the reaction mechanisms and the availability of reaction parameters, building the computer model of glycosylation took several stages.
The first stage of the modeling
In the first stage, we attempted to implement the N-glycan biosynthesis model from the KEGG database. Reaction equations were generated using KEGGtranslator, reaction parameters were taken from the BRENDA Enzyme Database, and the initial concentrations of all substances were set to 1 mol. The model included 25 enzymes, 62 glycans, and a number of substances that participate in intermediate reactions (formation and dissociation of enzyme-substrate complexes). It consisted of 68 reactions with 126 parameters in their equations. The localization of enzymes in different parts of the Golgi apparatus, according to the order in which they enter the reactions, was not taken into account.
The second stage of the modeling
In the second stage, the initial model was split into two parts. Mass spectrometry data were available for the glycans in the first part, so the simulation results could be checked against experiment. The second part of the model has not yet been verified experimentally. We also attempted to account for the localization of enzymes in different compartments of the Golgi apparatus, according to the order in which they enter the reactions. The number of diagram elements decreased: the experimentally confirmed part of the model contained 8 enzymes, 41 glycans, and 40 reactions with 39 parameters. Based on experimentally validated information, the initial concentrations of all glycans except 9-mannose (G00011) were set to 0 mol. The initial concentration of 9-mannose (G00011), the reactant of the first reaction in the first Golgi compartment, was set to 300 mol.
The third stage of the modeling
The third stage began with clarifying the mechanisms of the reactions catalyzed by the different enzymes. We then assumed that the distribution of N-glycans in each compartment can be modeled as a well-mixed reactor, that the residence time in each compartment is 5.6 min, and that enzymes are located in their corresponding parts of the Golgi apparatus, so the reactions they catalyze occur while the reactor passes through that area. The model built in the third stage consists of 8 enzymes and 45 glycans connected by 89 reactions (the number of reactions increased because the reaction mechanisms were changed). The model has 54 parameters for optimization.
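To make the well-mixed reactor assumption concrete, the Python sketch below integrates the mass balance of a single compartment in which one substrate is converted to one product. It is an illustration, not the BioUML model: only the residence time of 5.6 min comes from the text, while the single Michaelis-Menten reaction, the continuous-inflow form of the balance, the explicit Euler scheme, and the values of `s_in`, `v_max` and `k_m` are assumptions.

```python
def simulate_compartment(s_in, v_max, k_m, tau=5.6, t_end=200.0, dt=0.001):
    """Explicit Euler integration of one well-mixed compartment (S -> P).

    dS/dt = (s_in - S) / tau - v
    dP/dt = -P / tau + v,        v = v_max * S / (k_m + S)

    tau is the residence time in minutes; s_in is the substrate
    concentration in the inflow. All other values are hypothetical.
    """
    s, p = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        v = v_max * s / (k_m + s)
        # Both right-hand sides use the values from the start of the step.
        s, p = s + dt * ((s_in - s) / tau - v), p + dt * (-p / tau + v)
    return s, p


if __name__ == "__main__":
    s, p = simulate_compartment(s_in=300.0, v_max=20.0, k_m=50.0)
    print(s, p, s + p)  # at steady state S + P approaches s_in
```

Because the reaction term cancels when the two balances are added, the total S + P relaxes to the inflow concentration, which gives a quick sanity check on the integration.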
Model optimization
We used experimental data for the Korcula population[6] and carried out the optimization of the individual parameters for every patient after each stage of the modeling.
Optimizing the first model took a relatively long time (~15 min per patient), and the resulting accuracy of the parameter fit was very low (~10^-1).
For the second model, the optimization time fell to 4-5 minutes per patient, and the parameter fit reached an accuracy of about 10^-3 to 10^-4.
The third model has the shortest optimization time (~1 min per person) and the best match between experimental and simulated data (~10^-7 to 10^-8) of the three models.
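The page does not say which optimization method or which error measure produced the accuracy figures above. The Python sketch below only illustrates the general idea of fitting a parameter by minimizing the mismatch between simulated and observed values. It assumes a sum-of-squared-errors objective, uses a hypothetical one-parameter toy model in place of the full simulation, and swaps in a plain ternary search rather than any method available in BioUML.

```python
import math


def sum_squared_error(simulated, observed):
    """Assumed objective: sum of squared simulated-minus-observed differences."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def toy_model(k, times):
    """Hypothetical stand-in for a full simulation: exponential decay."""
    return [math.exp(-k * t) for t in times]


def fit_rate_constant(times, observed, lo=0.0, hi=2.0, iterations=100):
    """Ternary search for the k in [lo, hi] that minimizes the error.

    Valid here because the error of the toy model has a single minimum
    in k; a real multi-parameter model needs a proper optimizer.
    """
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        err1 = sum_squared_error(toy_model(m1, times), observed)
        err2 = sum_squared_error(toy_model(m2, times), observed)
        if err1 < err2:
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 3.0, 4.0]
    observed = toy_model(0.3, times)  # synthetic data, not patient data
    k = fit_rate_constant(times, observed)
    print(k, sum_squared_error(toy_model(k, times), observed))
```

With an objective like this, a smaller final error means a closer match between simulation and experiment, which is one plausible reading of the accuracy values quoted for the three models.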
Results
The parameters obtained from the individual optimizations were analyzed using methods of mathematical statistics in R. On the whole, the parameters correlate only weakly with one another. Two parameters, the turnover number of beta-1,4-mannosyl-glycoprotein 4-beta-N-acetylglucosaminyltransferase (GnTIII, EC 2.4.1.144) and the Michaelis constant for glycan structure A2[3]BG2S1, correlate with patient age (coefficients 0.1095195 and 0.131857, respectively). The Random Forest algorithm applied to the optimization results can predict about 15% of the occurring processes (versus 60% in the original parameter space).
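The type of correlation coefficient is not stated, and the original analysis was done in R. As a sketch of how a coefficient between a fitted parameter and patient age could be computed, the Python function below assumes the Pearson coefficient; the ages and parameter values in the usage example are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical ages and fitted turnover numbers, for illustration only.
    ages = [25, 34, 47, 52, 61, 70]
    k_cat = [1.10, 0.95, 1.30, 1.05, 1.40, 1.20]
    print(pearson_r(ages, k_cat))
```

Coefficients near 0.11 and 0.13, as reported above, indicate a weak linear relationship.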
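References
1. en.wikipedia.org/wiki/Glycan (http://en.wikipedia.org/wiki/Glycan#N-Linked_glycans)
2. N-Glycan biosynthesis - Reference pathway
3. KEGGtranslator (http://www.cogsys.cs.uni-tuebingen.de/software/KEGGtranslator/)
4. BRENDA (http://www.brenda-enzymes.org/)
5. Systems Analysis of N-Glycan Processing in Mammalian Cells (http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0000713)
6. High throughput isolation and glycosylation analysis of IgG-variability and heritability of the IgG glycome in three isolated human populations (http://www.ncbi.nlm.nih.gov/pubmed/21653738)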