This method involves the complete lysis of all microorganisms and provides the highest DNA yields. The first step is the disruption of the microbial or plant cell wall, which releases the nucleic acid content directly into the extraction buffer. The second step is the separation of the extraction buffer from the soil particles, followed by extraction of the nucleic acids. This is the most challenging step, as many contaminants such as humic acids, metal ions, and proteins are extracted along with the nucleic acids. Cell disruption usually combines physical, thermal, chemical, and enzymatic lysis. Physical treatments such as bead-beating homogenization, sonication, and vortexing (Steffan et al., 1988; Miller et al., 1999; Maarit Niemi et al., 2001; Miller, 2001), and thermal shock (Tsai et al., 1991; More et al., 1994; Porteous et al., 1997; Orsini & Romano-Spica, 2001) destroy the soil structure, giving access to the whole microbial community, including microbes hidden deep within soil microaggregates. These treatments are also efficient at disrupting vegetative forms, small cells, and spores, but extensive lysis often results in significant DNA shearing (More et al., 1994). Physical methods require preliminary crushing and grinding of the material so that the extraction or lysis buffer can access the cells properly. Chemical lysis has also been used, alone or in combination with physical methods. The most common chemical is sodium dodecyl sulfate (SDS), which dissolves the hydrophobic part of cell membranes. Detergents have often been used in combination with chelating agents such as EDTA and Chelex 100 (Robe et al., 2003) and buffers such as Tris and sodium phosphate (Krsek & Wellington, 1999), along with heat treatment.
Increasing the EDTA concentration has been observed to increase the yield but lower the purity of the nucleic acid isolates. Other chemical reagents include cetyltrimethylammonium bromide (CTAB), which can partially remove humic acids (Zhou et al., 1996) and forms insoluble complexes with denatured proteins, polysaccharides, and cell debris (Saano et al., 1995). Polyvinylpolypyrrolidone (PVPP) can also remove humic acids during lysis, but it lowers the DNA yield. Enzymatic methods digest the sample with different enzymes while affecting the DNA as little as possible, and they are used particularly for Gram-positive bacteria, which resist physical and chemical isolation methods. Enzymes can also be used to inactivate nucleases and to remove RNA. The most commonly used enzymes are lysozyme, proteinase K, and RNase A (Tsai et al., 1991; Tebbe & Vahjen, 1993; Zhou et al., 1996; Maarit Niemi et al., 2001).
This approach begins with the extraction of cells from the environmental sample prior to cell lysis, which forms the first step of the indirect DNA isolation method. This can be done by centrifugation or filtration of the samples. The next step is cell lysis, followed by isolation and purification of the DNA as in the direct isolation methods. Regardless of whether a direct or an indirect method of nucleic acid isolation is applied, the extract will always contain a certain amount of contamination with proteins, humic acids, polysaccharides, minerals, or eukaryotic DNA (Kozdrój, 2010). To remove these unwanted contaminants, additional protocols have been developed and adopted at different steps. Many of these protocols appear to be very specific and effective only for the type of soil for which they were developed.
Purification of metagenomic DNA
The most common contaminant in soil-associated DNA is humic acid. It carries a charge similar to that of DNA, which leads to its co-separation with the nucleic acids (Sharma et al., 2007). Humic acids also interfere with DNA quantification because they absorb at both 230 nm and 260 nm (Sharma et al., 2007). The 260/230 nm absorbance ratio is widely used to evaluate the purity of metagenomic DNA, which is why humic contaminants must be taken into account. Most DNA purification methods are based on precipitation using chemicals such as potassium acetate, isopropanol, chloroform, and polyethylene glycol, alone or in combination. Ion exchange chromatography, agarose, Sephadex gel filtration, and PVPP/PVP gel electrophoresis have been used to remove proteins and humic substances from the crude DNA extract (Cullen & Hirsch, 1998). Cesium chloride density gradient centrifugation has often been used to purify high-quality DNA (Robe et al., 2003). Because this method is time-consuming, faster alternatives have been developed, including DNA extraction and purification kits that can process different types of soil samples and yield relatively pure DNA within a short time.
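The 260/230 nm purity check described above can be sketched in a few lines. This is a minimal illustration rather than a validated protocol: the function names, the example absorbance readings, and the default cutoff of 1.8 are all assumptions introduced here, and a suitable cutoff would depend on the instrument and the soil type.

```python
def a260_a230_ratio(a260: float, a230: float) -> float:
    """Return the A260/A230 absorbance ratio of a DNA extract."""
    if a230 <= 0:
        raise ValueError("A230 must be positive to compute a ratio")
    return a260 / a230


def looks_humic_contaminated(a260: float, a230: float, cutoff: float = 1.8) -> bool:
    """Flag an extract whose A260/A230 ratio falls below an assumed cutoff.

    Humic acids absorb at 230 nm as well as 260 nm, so carry-over tends to
    lower the ratio. The default cutoff is a hypothetical value chosen only
    for illustration.
    """
    return a260_a230_ratio(a260, a230) < cutoff


# Hypothetical readings for a crude and a purified extract.
crude = {"a260": 0.90, "a230": 1.20}
purified = {"a260": 0.80, "a230": 0.40}

for name, reading in (("crude", crude), ("purified", purified)):
    ratio = a260_a230_ratio(**reading)
    flagged = looks_humic_contaminated(**reading)
    print(f"{name}: A260/A230 = {ratio:.2f}, flagged = {flagged}")
```

Because humic acids also absorb at 260 nm, a DNA concentration derived from A260 alone is likely to be overestimated in a flagged extract, so the ratio is better treated as a screening step than as a quantification.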
Modern omics approaches target the quantification and characterization of large numbers of biological molecules in order to decipher the structural and functional dynamics of one or more organisms. Integrating the knowledge from omics-based research is an emerging challenge as researchers seek to gain biological insights and promote translational research. The techniques depend on each other and on the analysis of the data they generate, and together they provide deep insight into the potential activities of plant microbial communities. An integrative ‘omics’ approach comprising genomics, transcriptomics, proteomics, and metabolomics is needed to understand the basic principles of function at different cellular levels.
The prefix meta- indicates a collection of similar items, and genomics means the study of genomes. The term metagenomics thus covers the genomic analysis of assemblages of organisms. The idea of cloning DNA directly from environmental samples was first proposed by Pace (Pace, Stahl, et al. 1985), and in 1991 the first such cloning in a phage vector was reported (Schmidt, DeLong, and Pace 1991). The direct isolation of genomic DNA from an environment circumvents the need to culture the organisms under study.
Metagenomics unravels the functional potential of a microbiome in terms of the wealth of genes involved in particular metabolic processes. Next-generation sequencing (NGS) has made metagenomic studies comparatively easier and has catalyzed the rapid, unprecedented characterization of microbiomes (Akinsanya et al. 2015). Metagenomic analysis involves isolating DNA from an environmental sample, cloning the DNA into a suitable vector, transforming the clones into a host bacterium, and screening the resulting transformants. The clones can be screened for phylogenetic markers or “anchors,” such as 16S rRNA and recA, or for other conserved genes by hybridization or multiplex PCR (Staley and Konopka 1985), or for expression of specific traits, or they can be sequenced (Tyson et al. 2004; Venter et al. 2004). Metagenomic and comparative studies give deeper insight into the function and phylogeny of plant-associated communities. The development of NGS technologies, such as pyrosequencing of 16S rRNA genes, provided much greater sampling depth than traditional approaches such as T-RFLP or 16S rRNA gene clone libraries.
Sequencing the metagenomic DNA
Automated Sanger sequencing (Sanger et al., 1977) revolutionized the study and classification of microorganisms. However, because of certain limitations, more advanced technologies were developed. One of the technologies that revolutionized genomics and metagenomics was the 454 sequencing platform, or “pyrosequencing.” The principle of this technology is a one-by-one nucleotide addition cycle in which the pyrophosphate (PPi) released by the DNA polymerization reaction is converted into a light signal. The light emitted from a plate with millions of microwells, each containing a given DNA fragment, is detected by the machine and translated into nucleotide sequences with an associated base quality value (Margulies et al., 2005). This technology offered a higher yield than Sanger sequencing at a lower cost, but with shorter read lengths. Its main bias is artificial insertions and deletions caused by long homopolymeric regions. Despite the advantages it brought to metagenomics, this technology is now obsolete. The Ion Torrent platform is analogous to 454 and produces a yield and read length similar to those that 454 achieved at the middle stage of its development. The Ion Torrent PGM is regarded as the smallest potentiometer in existence; it detects the change in hydrogen potential generated each time a proton is released after a nucleotide is added in the sequencing reaction occurring in millions of microwells (Rothberg et al., 2011). The maximum Ion Torrent yield is ~500 million reads with a mode length of 400 bp (Glenn, 2014). The benefit here is cost reduction, since Ion Torrent sequencing costs just a tenth as much as pyrosequencing (Whiteley et al., 2012).
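The homopolymer bias noted above for pyrosequencing suggests a simple screening step: locating long single-base runs in a read, where insertion and deletion errors are most likely to cluster. The sketch below is only an illustration of that idea; the function name `homopolymer_runs`, the minimum run length of 4, and the example read are assumptions made here and do not come from any sequencing vendor's software.

```python
from itertools import groupby


def homopolymer_runs(seq: str, min_len: int = 4) -> list[tuple[int, str, int]]:
    """Return (start, base, length) for every run of identical bases.

    Only runs of at least `min_len` bases are reported. The threshold is an
    assumed value for illustration, not a platform specification.
    """
    runs = []
    position = 0
    # groupby yields one group per stretch of consecutive identical bases.
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((position, base, length))
        position += length
    return runs


# Hypothetical read containing two homopolymer stretches.
read = "ACGTTTTTAGGGGC"
for start, base, length in homopolymer_runs(read):
    print(f"run of {length} x {base} starting at position {start}")
```

Positions are zero-based. Reads with many long runs could be set aside for closer inspection when data from flow-based platforms are analyzed.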
Some platforms accept shorter reads and higher error rates in exchange for higher yields in order to reduce sequencing costs. This is the case for Illumina technology, which has become one of the most popular because of its low cost and high yield. Illumina chemistry is based on reversible-termination sequencing by synthesis with fluorescently labeled nucleotides. In summary, DNA fragments are attached and distributed in a flow cell, where the sequencing reaction proceeds by the addition of a labeled nucleotide. When the labeled nucleotide is incorporated and its fluorescent molecule is excited by a laser, the signal is registered by the machine. The fluorophore is then removed and the next nucleotide can be incorporated. DNA fragments can be sequenced from one or both ends, giving single-end or paired-end sequencing, respectively, with a maximum read length of 300 base pairs per read (Bennett, 2004). The output of this technology is currently the highest among the second-generation sequencing technologies, which makes it suitable for multiplexing hundreds of samples (Glenn, 2014).
The technologies described above are currently the most widely used for metagenome projects, but sequencing development has continued over the last five years to address their known biases and to offer a better trade-off between yield, cost, and read length. The so-called third-generation sequencing technologies, such as the PacBio RS from Pacific Biosciences (Fichot and Norman, 2013) and Oxford Nanopore (Kasianowicz et al., 1996), are single-molecule, real-time technologies that reduce amplification bias and the short read length problem. The time and cost reductions they offer are also valuable. Their error rate is higher than that of other technologies, but it is correctable if the sequencing depth is high enough. In terms of computational tools, virtually no software is yet available for metagenomic analysis of the data from these platforms.
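The statement that a higher error rate is correctable with sufficient depth can be illustrated with a deliberately simplified model. The sketch assumes that each read covering a position is wrong independently with the same probability, that all erroneous reads agree on the same wrong base (a pessimistic assumption), and that the consensus is called by strict majority vote. The function name `consensus_error` and the example error rate of 0.15 are hypothetical choices made for this illustration; real error profiles are neither independent nor uniform.

```python
from math import comb


def consensus_error(depth: int, per_read_error: float) -> float:
    """Probability that a strict majority of reads is wrong at one position.

    Simplified binomial model: reads err independently with probability
    `per_read_error`, and every erroneous read reports the same wrong base.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if not 0.0 <= per_read_error <= 1.0:
        raise ValueError("per_read_error must lie between 0 and 1")
    # Smallest number of wrong reads that outvotes the correct ones.
    first_losing = depth // 2 + 1
    return sum(
        comb(depth, k) * per_read_error**k * (1 - per_read_error) ** (depth - k)
        for k in range(first_losing, depth + 1)
    )


# Assumed per-read error rate, used only to show the trend with depth.
for depth in (1, 5, 15, 31):
    print(f"depth {depth:>2}: consensus error = {consensus_error(depth, 0.15):.2e}")
```

Under these assumptions the consensus error falls quickly as odd depths increase, which is the intuition behind correcting noisy long reads by coverage.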
One of the great improvements of second- and third-generation sequencing technologies is that library preparation does not require DNA cloning vectors or bacterial hosts, which simplifies the procedure and reduces DNA contamination from organisms that are not part of the metagenome. Although these sequencing technologies are powerful and have made it possible to discover novel microbial worlds and explore new environments, they have particular limitations and biases that must be circumvented. Data from second- or third-generation sequencing technologies also impose computational requirements for their analysis. The larger the dataset, the greater the computational resources and the more complex the bioinformatic analyses required. In addition, large data storage is needed to archive and process the data (Logares et al., 2012). Bioinformatic analysis requires not only high-end servers but also UNIX operating system skills. Programming and scripting knowledge is desirable for installing and running the available metagenomics software and for parsing and interpreting the results. It is therefore suggested that biologists develop basic computational skills in order to take advantage of metagenomic data.
Transcriptomics and metatranscriptomics
Transcriptome analysis is one of the important omics analyses for understanding plant responses to environmental changes and has contributed greatly to clarifying plant biology. Transcriptomic and metatranscriptomic approaches are considered a realistic way to study the microbial communities associated with different plants (Sheibani-Tezerji et al., 2015). Transcriptomic studies help to understand microbial interactions and responses to various environmental conditions through comparative analysis of the transcriptomes of an assembly of interacting species. Metatranscriptomic analysis of soybean revealed many small RNA (siRNA) sequences distinct from the soybean genome. Comparative study of these sequences revealed the presence of various pathogenic, symbiotic, and free-living microbes in different samples of the soybean plant (Molina et al. 2012). Metatranscriptomic analysis of soil and of the rhizospheres of wheat, oat, pea, and an oat mutant revealed profound kingdom-level changes in the rhizosphere microbiome. Pea and oat showed a five-fold higher microbial load in the rhizosphere than wheat, with the pea rhizosphere highly enriched in fungi (Turner et al. 2013). Metagenomic and metatranscriptomic analysis of the active microbiome and gene expression in different watermelon cultivars suggested an important role for different classes of bacteria at the ripe stage (Saminathan et al. 2018). Integrated studies of transcriptomics and metabolomics are increasingly being used to generate deep insights into abiotic stress responses. Analysis of different pathways in transcriptome and metabolome studies of Astragalus membranaceus Bge. var. mongolicus (Bge.) revealed specific genotypic responses to different levels of drought (Jia et al. 2016).
However, metatranscriptomic studies initially faced several challenges, including the low recovery of high-quality mRNA from environmental samples, the short half-lives of mRNA species, and the need to separate mRNA from other RNA species (Simon and Daniel 2011). These limitations can be overcome by direct cDNA sequencing using NGS technologies. So far, metatranscriptomics has been used to analyze microbial communities from ocean surface waters (Frias-Lopez et al. 2008), coastal waters (Poretsky et al. 2009), and soil samples (Urich et al. 2008).
Proteomics and metaproteomics
Genome-based analyses cannot uncover the actual function of microbial diversity in situ, so post-genomic analyses have become well accepted as technologies advance. Proteomics involves the large-scale study of the proteins expressed by an organism (Wilkins et al. 1995), whereas metaproteomics entails identification of the functional expression of the metagenome and the metabolic activities occurring within a community at the moment of sampling. Proteins shape the phenotypic traits of a plant, since they play an important role in the plant stress response. Proteomic studies have therefore become potent tools for examining physiological metabolism and protein–protein interactions in microbes and plants. Proteomic analyses are important for studying host–microbe interactions and many other intra- and inter-species microbial interactions, revealing the different signaling responses of these microorganisms (Kosova et al. 2015). A large number of proteomic studies have focused on stress responses in different plants, including Arabidopsis, wheat (Triticum aestivum), durum wheat (Triticum durum), barley (Hordeum vulgare), maize (Zea mays), rice (Oryza sativa), soybean (Glycine max), common bean (Phaseolus vulgaris), pea (Pisum sativum), oilseed rape (Brassica napus), potato (Solanum tuberosum), and tomato (Lycopersicon esculentum) (Liu et al. 2015; Kosova et al. 2015; Xu et al. 2015; Wang et al. 2016). These studies reflect dynamic changes in protein functional groups, signaling and regulatory pathways, protein metabolism, protein–protein interactions at the interface, proteins and enzymes conferring several stress-related compounds, the functions of structural proteins associated with the cell wall and cytoskeleton, and the identification of putative proteins using bioinformatics tools (Kosova et al. 2015).
Metaproteogenomics is the combined study of the metagenome and the metaproteome of environmental samples, which identifies more proteins (functions) than proteomics alone.
The single-cell genomics approach is based on amplifying DNA directly from individual cells without requiring growth in culture (Lasken 2007). Generating a single-cell genome sequence involves the following major steps: sample preparation, single-cell isolation, cell lysis, whole genome amplification, library preparation, sequencing, and data analysis (Blainey et al. 2003; Stepanauskas 2012). The advantage of sequencing complete genomes from single cells is that it gives access to the uncultured microbial majority and facilitates the analysis of cell-to-cell variability in microbial populations (Kvist et al. 2007). The first demonstration of sequencing from single uncultured cells was done by micromanipulation (Raghunathan et al. 2005). Micromanipulation provided a high degree of certainty that only one cell was selected and also provided morphological information, as the cell could be viewed and photographed during isolation (Brehm-Stecher and Johnson 2004). The few femtograms of DNA in a bacterium can be amplified by multiple displacement amplification (MDA) up to micrograms of high molecular weight DNA suitable for DNA library construction and sequencing (Lasken 2007). A standard single-cell genomics experiment typically comprises four basic steps: (1) preparation of microbial cell fractions from the environment, (2) isolation of single cells, (3) amplification of DNA by MDA, and (4) single-cell genomic sequencing (Ishoey et al. 2008). Efforts continue to improve the enzymology of the MDA reaction in order to reduce bias and chimeric rearrangements, and some practical approaches to single-cell sequencing have been developed (Ishoey et al. 2008; Chitsaz et al. 2011). In 2010, an individual “Candidatus Sulcia muelleri” cell was isolated from the green sharpshooter Draeculacephala minerva and its whole genome was amplified by MDA; the genome sequence was then generated (Woyke et al. 2010). More recently, Blainey et al. (2011) reported the genome sequence of an ammonia-oxidizing archaeon in an enrichment culture from low-salinity sediments in San Francisco Bay using single-cell genomics.
Raman microspectroscopy and NanoSIMS
Recently, several single-cell-based technologies such as Raman microspectroscopy (Huang et al. 2004, 2010b) and NanoSIMS (Kuypers and Jorgensen 2007; Oehler et al. 2010) were developed to explore the behavior and functions of unculturable microorganisms (Brehm-Stecher and Johnson 2004). Among them, Raman microspectroscopy offers a unique opportunity to interrogate microbes at the biochemical level and to manipulate them at the single-cell level in their natural habitat (Huang et al. 2004, 2010b). Raman microspectroscopy is a noninvasive technique for acquiring chemical signals from a small volume of sample.
These omics approaches, based on the extraction of molecules (such as nucleic acids or proteins) directly from environmental samples, broadened the spectrum of potentially targeted organisms by also including the rare microbiome. On the other hand, they lose spatial information, as the microbial cells are physically removed from their original location. For this reason, methods allowing the localization and visualization of microbes in microbe-host systems have also progressed during the past two decades, in parallel with molecular microbiology methods. These approaches include confocal laser microscopy