Biological organisms can be viewed as a special kind of information-processing system. From this point of view, Chapter Two discusses information coding in genomic DNA sequences, which are the largest information resource in biological cells. First, we focus on the coding methods of protein-coding regions in genomes. Second, after a brief review of the regulation of gene expression at the transcriptional level, we discuss the coding methods of regulatory regions in genomes. Chapter Three presents our theoretical study, carried out using openly available bioinformatics resources on the Internet, of a possible coding method of DNA regulatory regions in genomes, termed the BSP (Base Spatial Pattern) of a double-stranded DNA segment. First, we find a relationship between 4 and 20 mediated by triple BSP, which we call the 4-BSP-20 relation. We speculate that the 4-BSP-20 relation might be a kind of recognition code in some DNA-protein interactions. Second, a study of experimental data on DNA-protein interactions in the transcriptional regulation of genes in eukaryotic cells suggests that the specific recognition of double-helical nucleic acids by some DNA-binding proteins might involve BSP. We have found seven sets of experimental data, published by other biologists, that support this hypothesis. Finally, we investigate the formal description of BSP. We find a formal BSP representation based on a mathematical model of equivalence classes and develop a corresponding computer-processing algorithm. This representation is succinct and easy to process by computer. More importantly, it provides a method by which BSP or the 4-BSP-20 relation can be generalized to single-stranded DNA (ssDNA) and ssDNA-protein interactions. Furthermore, BSP or the 4-BSP-20 relation can be generalized to RNA and RNA-protein interactions. In the Conclusion, we discuss the prospects for future BSP research.