Difference between revisions of "Polynomial Regression analysis"

From BioUML platform

Revision as of 11:48, 6 May 2013

Analysis title
Polynomial Regression analysis
Provider
Institute of Systems Biology
Plugin
ru.biosoft.analysis (Common methods of data analysis plug-in)

Polynomial regression analysis.

Regression analysis is performed independently for each row of the experimental data. Consider:

  • Y = {Y1...Ym} - gene expression values.
  • X = {X1...Xm} - corresponding time points.

The value Yi is measured at the time point Xi. The analysis constructs a polynomial regression:

Statistics-Polynomial-Regression-analysis-r1.png
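The formula image is not reproduced in this text. As a hedged reconstruction, the model presumably has the standard polynomial form shown below. The symbols are assumptions rather than a copy of the original image: p stands for the regression power, β0...βp for the coefficients, and e_i for the error term.

```latex
Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \dots + \beta_p X_i^p + e_i,
\qquad i = 1, \dots, m
```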

A P-value is calculated for each estimated regression coefficient, but the P-value threshold is applied only to the last coefficient (the one with the highest power).

Parameters:

  • Experiment - experimental data for analysis.
    • Table - a table data collection stored in the BioUML repository.
    • Columns - the columns from the table which should be taken into account for further analysis. Note that, to ensure correct analysis, you should specify the corresponding time point for each column. Time points must also be in ascending order.
  • Regression power - a positive value giving the power (degree) of the regression polynomial.
  • P-value threshold - threshold for the P-value (only elements with a lower P-value will be included in the result table).
  • Outline boundaries - lower and upper boundaries for values from the input table. Values outside these boundaries are treated as outliers and ignored.
  • Calculate FDR - the test method for calculating the False Discovery Rate (FDR), the average rate of mistakenly built regressions at the given P-value threshold. It randomly permutes the data 50 times and applies regression analysis to each randomized data set. FDR is calculated according to the formula:
    Statistics-Polynomial-Regression-analysis-fdr.png
  • Output table - the path in the BioUML repository where the result table will be stored. If a table with the specified path already exists, it will be replaced. The table will contain the sum of squared errors, the coefficients with their scores (log10(P-value)) and graphs of the original and approximated profiles.
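The permutation procedure described for the Calculate FDR option can be illustrated with a short sketch. The exact FDR formula used by the plug-in is given only as an image, so the estimate below is an assumption: it uses a common permutation estimate, the average number of rows passing the threshold in shuffled data divided by the number passing in the real data. The names `permutation_fdr` and `p_value_fn` are hypothetical and are not part of BioUML.

```python
import random


def permutation_fdr(rows, p_value_fn, threshold, n_permutations=50, seed=0):
    """Rough permutation-based FDR estimate (an assumed formula, not BioUML's).

    rows        -- list of rows, each a list of values ordered by time point
    p_value_fn  -- hypothetical callable returning the P-value for one row
    threshold   -- rows with a P-value below this count as significant
    """
    rng = random.Random(seed)
    # Number of rows called significant on the real data.
    observed = sum(1 for row in rows if p_value_fn(row) < threshold)
    if observed == 0:
        return 0.0
    false_hits = 0
    for _ in range(n_permutations):
        for row in rows:
            shuffled = list(row)
            rng.shuffle(shuffled)  # break the link between values and time points
            if p_value_fn(shuffled) < threshold:
                false_hits += 1
    # Average number of significant rows per randomized data set.
    expected_false = false_hits / n_permutations
    return min(1.0, expected_false / observed)


# Toy usage: a stand-in P-value function that calls every row significant,
# so shuffled rows pass as often as real ones.
rows = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]]
fdr = permutation_fdr(rows, lambda row: 0.0, threshold=0.05)  # 1.0
```

In practice `p_value_fn` would run the regression on a row and return the P-value of the highest-power coefficient; the stand-in above only exercises the counting logic.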

Details

In matrix form, building the regression can be described as:

Statistics-Polynomial-Regression-analysis-r2.png

where e is distributed as N(0, σ²).

Let us consider:

Statistics-Polynomial-Regression-analysis-r3.png

We find the coefficients βi from the condition S → min, i.e. we solve the system of equations:

Statistics-Polynomial-Regression-analysis-r4.png
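As an illustration of the minimization step, here is a minimal sketch in plain Python. It assumes that S is the usual sum of squared residuals, so that the system to solve is the standard set of normal equations. The function name `fit_polynomial` is hypothetical, and this is not the plug-in's actual implementation.

```python
def fit_polynomial(x, y, power):
    """Least-squares coefficients [b0, b1, ..., b_power] for y ~ sum(b_k * x**k).

    Builds the normal equations A b = r, where A[j][k] = sum(x_i ** (j + k))
    and r[j] = sum(y_i * x_i ** j), and solves them by Gaussian elimination.
    Assumes distinct time points; duplicates can make the system singular.
    """
    n_coef = power + 1
    if len(x) != len(y) or len(x) < n_coef:
        raise ValueError("need at least power + 1 matching (x, y) points")
    a = [[float(sum(xi ** (j + k) for xi in x)) for k in range(n_coef)]
         for j in range(n_coef)]
    r = [float(sum(yi * xi ** j for xi, yi in zip(x, y))) for j in range(n_coef)]
    # Forward elimination with partial pivoting.
    for col in range(n_coef):
        pivot = max(range(col, n_coef), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        r[col], r[pivot] = r[pivot], r[col]
        for row in range(col + 1, n_coef):
            factor = a[row][col] / a[col][col]
            for k in range(col, n_coef):
                a[row][k] -= factor * a[col][k]
            r[row] -= factor * r[col]
    # Back substitution.
    coef = [0.0] * n_coef
    for i in reversed(range(n_coef)):
        known = sum(a[i][k] * coef[k] for k in range(i + 1, n_coef))
        coef[i] = (r[i] - known) / a[i][i]
    return coef


# Toy row: expression values that follow 1 + t**2 exactly.
time_points = [0, 1, 2, 3, 4]
expression = [1.0, 2.0, 5.0, 10.0, 17.0]
coefficients = fit_polynomial(time_points, expression, 2)
```

Solving the normal equations directly is the simplest approach to show here; for high powers or widely spaced time points the system becomes ill-conditioned, and a QR-based solver is usually preferred.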

We also need to estimate the variance of the coefficients, which can be derived from the equation:

Statistics-Polynomial-Regression-analysis-r5.png

An estimate of σ² is:

Statistics-Polynomial-Regression-analysis-r6.png
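For orientation, the standard ordinary least squares results that these steps most likely correspond to are sketched below. This is an assumption, since the formula images are not reproduced in this text. Here X denotes the design matrix with rows (1, X_i, ..., X_i^p), n the number of measured time points, p the regression power, and S the residual sum of squares; all of these symbols are assumed rather than taken from the original images.

```latex
\hat{\beta} = (X^{T} X)^{-1} X^{T} Y, \qquad
\operatorname{Cov}(\hat{\beta}) = \sigma^{2} (X^{T} X)^{-1}, \qquad
\hat{\sigma}^{2} = \frac{S(\hat{\beta})}{n - p - 1}
```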

To test the hypothesis H0: {βi = 0}, we use the statistic:

Statistics-Polynomial-Regression-analysis-r7.png

If H0 is true, this statistic follows Student's t-distribution with n − p − 1 degrees of freedom. So we estimate the P-value:

Statistics-Polynomial-Regression-analysis-r8.png

where:

Statistics-Polynomial-Regression-analysis-r9.png
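The test can be sketched in plain Python under stated assumptions. The statistic is taken to be the coefficient estimate divided by its standard error, the P-value is taken to be two-sided, and the degrees of freedom are taken to be the number of points minus the power minus one. None of this is confirmed by the formula images, and the function names are hypothetical.

```python
import math


def t_density(t, df):
    """Probability density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def two_sided_p_value(t, df, steps=2000):
    """P(|T| > |t|), using Simpson's rule on the density over [0, |t|].

    steps must be even. Accuracy is limited for very small P-values,
    because the result is computed as 1 minus a number close to 1.
    """
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, df)
    central_mass = 2.0 * total * h / 3.0  # P(-|t| < T < |t|), by symmetry
    return max(0.0, 1.0 - central_mass)


def coefficient_p_value(coefficient, std_error, n_points, power):
    """P-value for H0: coefficient = 0 (assumed df = n_points - power - 1)."""
    df = n_points - power - 1
    if df <= 0:
        raise ValueError("not enough points for this regression power")
    return two_sided_p_value(coefficient / std_error, df)


# Toy usage: a coefficient of 2.0 with standard error 0.5, 10 time points, power 2.
p_value = coefficient_p_value(2.0, 0.5, n_points=10, power=2)
```

The numerical integration keeps the sketch dependency-free; a statistics library with a t-distribution survival function would be the more accurate choice for very small P-values.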