A Brief Overview of Statistical Applications in Cell and Developmental Biology, with an Example
Statistical applications in genetics and molecular biology are developing rapidly as researchers apply statistical ideas to problems in computational biology. Statistics in biology is used mainly to test hypotheses, while more sophisticated methods are used to design experiments and interpret results. Biology and statistics have been interconnected for a long time. While biology focuses on living organisms, statistical analysis provides crucial insight into many biological processes. An important part of any biological experiment is choosing an appropriate sample size and selecting the correct trial. A larger sample size is generally preferred in statistics, but in clinical trials it is often not possible to collect one. Larger samples increase statistical power, that is, they reduce the type II error; the type I error rate is fixed by the chosen significance level. At the end of a study, researchers want to know whether the data support their hypothesis and whether the conclusion is statistically significant. If the data are tightly clustered around the mean, the mean is the best indicator of the centre; if the data are widely spread or contain outliers, the median is a better indicator because it is far less affected by outliers. After the experiment the results must be interpreted carefully, which may need expert advice. Statistical software such as SAS, SPSS and R can help carry out the analysis.
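As a small illustration of the point about the mean and the median, the sketch below compares the two on a tightly clustered sample and on the same sample with one outlier added. The values are hypothetical cell-division times invented for this example, and the helper name `summarize` is likewise an assumption.

```python
import statistics

# Hypothetical cell-division times in hours (invented for illustration).
clustered = [20.1, 19.8, 20.4, 20.0, 19.7]
with_outlier = clustered + [48.0]  # one extreme measurement added


def summarize(values):
    """Return (mean, median) for a list of measurements."""
    return statistics.mean(values), statistics.median(values)


for label, data in (("clustered", clustered), ("with outlier", with_outlier)):
    mean, median = summarize(data)
    print(f"{label}: mean = {mean:.2f}, median = {median:.2f}")
```

In the clustered sample the mean and median coincide, whereas the single outlier pulls the mean upwards by several hours and barely moves the median.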
Seven Steps that must be followed:
- Experimentalists ordinarily make measurements to estimate a property, or “parameter”, of the population from which the data were drawn, such as a mean, rate, proportion or correlation. One should be aware that the true parameter has a fixed but unknown value in the population. Take the example of a population of cells, each dividing at its own rate. At a given point in time, the population has a true mean and variance of the cell division rate. Neither of these parameters is knowable. When one measures the rate in a sample of cells from this population, the sample mean and sample variance are estimates of the true population mean and variance. Such estimates can differ from the true values for two reasons. First, the measurement method may be biased, which accurate methods help to avoid. Second, the sample may not be representative of the population. Estimates tend to be closer to the true values when more cells are measured, and they vary each time the experiment is repeated. By accounting for the variability in the sample mean and variance, one can test a hypothesis about the true mean in the population or estimate its confidence interval (A’Brook & Weyers, 1996).
- A difficult task in biology is translating a biological problem into a statistical hypothesis, that is, into a null and an alternative hypothesis, because the biological question is often a qualitative statement about the effect of a treatment, such as a genotype, phenotype or condition, whose relationship to the response we want to test or predict. In biology the effect of a treatment can rarely be assumed to go in only one of the two possible directions, so a two-sided hypothesis is used most of the time. Hypothesis testing lets researchers assess quantitatively whether their data support or contradict the biological hypothesis (Ditty et al., 2010).
- All the variables in a data sheet may matter, since each can contribute to the outcome variable. Variables fall into two major types, numerical and categorical, and the type determines both how the data are recorded and which statistical analysis applies. A range of inhibitor concentrations or the time after adding a drug are examples of numerical treatments. Comparing wild-type versus mutant cells, or control cells versus cells depleted of an mRNA, are examples of categorical treatments; these cannot be measured numerically and are recorded as string variables. A decision tree can help select the most suitable test for the relationship between a response and one or more treatments.

The real challenge is to understand the experiment well enough to randomize treatments effectively across potential confounding factors. Biological replicates are used for parameter estimation and statistical analysis because they describe the variation in the population. Technical replicates, that is, repeated measurements of the same biological replicate, are used to improve the measurement estimate for each biological replicate. Treating technical replicates as biological replicates is called pseudoreplication; it often produces underestimates of the variance and erroneous test results. The distinction between technical and biological replicates depends on how one defines the population of interest.
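The effect of pseudoreplication can be sketched with a few lines of Python. The example assumes three hypothetical cultures (biological replicates), each measured four times (technical replicates); all values and names are invented for illustration. Averaging the technical replicates first gives one value per biological replicate, whereas pooling all twelve readings pretends that n = 12.

```python
import math
import statistics

# Hypothetical data: 3 biological replicates x 4 technical replicates each.
measurements = {
    "culture_1": [10.1, 10.3, 9.9, 10.1],
    "culture_2": [12.0, 11.8, 12.2, 12.0],
    "culture_3": [8.9, 9.1, 9.0, 9.0],
}


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Appropriate: one averaged value per biological replicate, so n = 3.
bio_means = [statistics.mean(v) for v in measurements.values()]
correct_sem = sem(bio_means)

# Pseudoreplication: every technical reading treated as independent, n = 12.
pooled = [x for v in measurements.values() for x in v]
pseudo_sem = sem(pooled)

print(f"SEM from biological replicates (n = {len(bio_means)}): {correct_sem:.3f}")
print(f"SEM from pooled readings (n = {len(pooled)}): {pseudo_sem:.3f}")
```

With these numbers the pooled calculation reports a much smaller standard error than the biological replicates justify, which is the kind of underestimated variability the paragraph above warns about.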
- The data should not deviate strongly from the assumptions of the chosen statistical test, because strong deviations lead to inaccurate test results. No matter how well the experiment is designed, the analysis plan may need to be adjusted at the end of the study if the data do not conform to its assumptions. The sample should be as large as is practical and must at least reach the minimum required sample size. The most commonly used and most powerful tests are parametric tests, which rest on several assumptions, one of the most important being normality. If the data are not normally distributed, a parametric test should not be used; its non-parametric analogue, which makes fewer assumptions, should be used instead.
- A hypothesis test calculates the probability of observing the experimental data if the null hypothesis is true. Different kinds of data call for different tests: for example, the t-test is used for a continuous numerical response, and the chi-square test is used to find an association between two categorical variables. The result is interpreted using the p-value: if the p-value is < 0.05, reject the null hypothesis; if the p-value is ≥ 0.05, do not reject the null hypothesis. Tests also come in different forms, such as one-sample and two-sample tests (Blackboard, 2020).
- One must be very careful before concluding that two treatments are different or that any detected difference is meaningful in the biological context. Most of the time statistical and clinical significance agree, with a p-value below 0.05 accompanying a meaningful difference. Occasionally they do not: a small difference that is not statistically significant may still have a large clinical impact, and in such cases the conclusion should be drawn with expert advice. Researchers increasingly recognise that the p-value is highly data driven and should not be the sole basis for a decision; confidence intervals and effect sizes provide better evidence when the statistical tests are well powered.
- Presentation of the results plays a very important role. The nature of the experiment and the statistical test should guide the choice of presentation. Some types of data are better presented in a table than in a figure, for example frequencies (categorical data). Other types of data may require more sophisticated graphs or figures. The statistical tests and any transformations applied must be stated clearly when reporting results. A common example of improper presentation is reporting the SEM without the number of measurements, so that the actual variation is not revealed. It is always better to show the raw data with the results.
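The testing, effect-size and reporting steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed analysis: the wild-type and mutant values are hypothetical, the function names are invented for this example, and an exact permutation test is swapped in for the t-test so that only the Python standard library is needed (with SciPy installed, `scipy.stats.ttest_ind` would be the usual choice).

```python
import math
import statistics
from itertools import combinations

ALPHA = 0.05  # significance threshold used in the text


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference in means."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabelling the pooled values into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        n_splits += 1
    return extreme / n_splits


def cohens_d(a, b):
    """Effect size: difference in means in units of the pooled SD."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


def mean_sem_n(values):
    """Return mean, SEM and n so the SEM is never reported without n."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n), n


# Hypothetical division rates (arbitrary units), one value per biological replicate.
wild_type = [1.02, 0.98, 1.05, 1.01, 0.97, 1.03]
mutant = [0.88, 0.91, 0.85, 0.93, 0.87, 0.90]

p = permutation_p_value(wild_type, mutant)
d = cohens_d(wild_type, mutant)
decision = "reject" if p < ALPHA else "do not reject"
print(f"p = {p:.4f} -> {decision} the null hypothesis; Cohen's d = {d:.2f}")
for name, group in (("wild type", wild_type), ("mutant", mutant)):
    m, sem, n = mean_sem_n(group)
    print(f"{name}: mean = {m:.3f}, SEM = {sem:.3f}, n = {n}")
```

Reporting the effect size and the mean with its SEM and n alongside the p-value follows the advice in the last two steps: the p-value alone does not say how large the difference is or how many measurements support it.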
Cellular and molecular biologists can use statistics effectively when analysing and presenting their data if all of the above steps are followed properly; doing so avoids common mistakes.
References
- A’Brook, R. & Weyers, J.D.B. (1996). Teaching of statistics to UK undergraduate biology students in 1995. Journal of Biological Education. [Online]. 30 (4). pp. 281–288. Available from: https://www.tandfonline.com/doi/abs/10.1080/00219266.1996.9655518.
- Blackboard (2020). Scaling to Meet the Needs of a Changing Environment. [Online]. 2020. Available from: https://www.blackboard.com/. [Accessed: 23 March 2020].
- Ditty, J.L., Kvaal, C.A., Goodner, B., Freyermuth, S.K., Bailey, C., Britton, R.A., Gordon, S.G., Heinhorst, S., Reed, K. & Xu, Z. (2010). Incorporating genomics and bioinformatics across the life sciences curriculum. PLoS Biology. [Online]. 8 (8). Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2919421/.