Functional annotation clustering in DAVID was used to minimize redundancy in the GO terms

As an extension of our previous work, the objective of the present study is to employ a toxicogenomics approach to compare and contrast the molecular pathways that are perturbed by MSC and TSC. However, subtle differences in gene expression provide insight into mechanisms underlying the observed differences in toxicities.A reference design with arrays as blocks of size 2 was used to analyze the median signal intensities of the two-color micro-array data. Five biological replicates per condition were used for each of the eight conditions, for a total of 80 micro-arrays. Six MSC and four TSC “outlier” micro-arrays were removed based on quality control checks , leaving a minimum of 3 replicates per group. The background signal intensity for each array was estimated using the 1533xSLv1 negative controls present on each array. All pre-processing of the data was conducted using R. The data were normalized using the LOWESS normalization method in the R library “MAANOVA”. Differential expression between the control and exposed samples for each of the three dose levels at each of the two time points was tested using the MAANOVA library. The ANOVA model was fitted to include the main effects of dose and time, with a dose by time interaction term and the array as a blocking variable. The Fs statistic , a shrinkage estimator, was used for the gene-specific variance components, and the associated p-values for all the statistical tests were estimated using the permutation method. These p-values were then adjusted for multiple comparisons using the false discovery rate approach. The least squares mean , a function of the model parameters, was used to estimate the fold change for each pairwise comparison of the six pairwise comparisons of interest among the eight treatment-by-time groups. The micro-array data for this experiment has been submitted to the Gene Expression Omnibus repository and can be accessed under record number GSE44603. Visualization and analysis of significantly changing genes was performed using Gene Spring GX 7.3. Important pathways containing significantly expressed genes were identified using Ingenuity Pathway Analysis. 

Genes were assigned to functional categories using gene ontology in the Database for Annotation, Visualization and Integrated Discovery. Analyses were performed on genes that were identified as statistically significant by one-way ANOVA using four models: Hill, Power, Linear and 2◦ Polynomial. Models that described the data with the least complexity were selected. A nested chi-square test,vertical rack with cutoff of 0.05, was first used to select among the linear and 2◦ polynomial model, followed by comparison of Akaike information criterion , which measured the relative goodness of fit of a statistical model, between nested models and the power model. The model with the lowest AIC was selected as the best fit. A maximum of 250 iterations and a confidence level of 0.95 were selected. For functional classifications and analyses, the resulting BMD datasets were mapped to KEGG pathways with promiscuous probes removed. BMDs that exceeded the highest exposure dose were removed from the analysis.Three RT-PCR pathway specific arrays were used to validate the expression of specific micro-array genes. Eight nanograms of total RNA, from the same samples that were used for the micro-array study, were reverse transcribed to cDNA using an RT2 First Strand Kit. cDNA was mixed with the RT2 qPCR Master Mixes and aliquoted into 96-well plates containing primers for 84 pathway specific genes. Expression levels were evaluated using a CFX96 real-time Detection System. Relative gene expression was normalized to the Gapdh housekeeping gene, which remained unaffected under experimental conditions. Fold changes and statistical significance were calculated using the REST method for statistical significance.For the micro-array study, FE1 cells were exposed to 2.5, 5 and10 g/ml of MSC and 25, 50 and 90 g/ml of TSC. Exposed samples were compared to their matched controls, and genes were considered significantly differentially expressed if they had a fold change ≥2 with an FDR-adjusted p-value ≤0.05. A total of 1020 unique probe identifiers were significantly differentially expressed following exposure to MSC, and of these, 979 were deemed “present”. 

Following exposure to TSC, 557 probes were significantly differentially expressed and 527 were deemed “present”. Of these, 356 were common to both MSC and TSC exposures. The number of significantly up- and down-regulated genes at each time point and concentration is shown in Table 1. Overall, there was an increase in the number of differentially expressed genes with increasing concentration of condensate, and there were more genes changing after the four hour recovery. At the highest concentration for both time points, cells exposed to MSC had a greater number of changing genes as compared to cells exposed to TSC. Gene expression was most altered for cells exposed to the highest concentration of MSC at the 6 + 4 h time point. Whether separated by dose or considered all together , Venn diagrams show considerable overlap in the genes that are significantly expressed at each time point following MSC or TSC exposure. Hierarchal clustering using all genes that were statistically significant revealed that the controls and the marijuana high concentration clustered independently from the rest of the samples. The remaining samples clustered first by concentration ,then by condensate type , with the last branching resulting from time. When cells exposed to TSC and MSC were analyzed separately, samples clustered first by concentration and then by time point, suggesting that concentration has the largest overall effect on gene expression. For MSC, the high concentration samples were on the first main branch, followed by control, low and medium concentrations. The results indicate that the expression profiles of the high concentration MSC exposed cells are quite distinct. For TSC, the controls branched separately from all the treatment groups. The top 10 genes with the largest overall fold changes are listed in Table 2. All of the top 10 genes were significantly up-regulated with the exception of low density lipoprotein receptor , which was down-regulated in MSC exposed cells. Of the top 10 changing genes, five genes were common to both MSC and TSC. The GO terms associated with these commongenes included multicellular organismal development, vasculogenesis, regulation of transcription, and regulation of inflammatory response. Ingenuity Pathway Analysis was used to define the pathways that were significantly altered following exposure to MSC or TSC. Fig. 3 shows the overlap in all the significant pathways between the two condensate types.

The top five most significantly altered pathways for cells treated with MSC or TSC are listed in Table 3. NRF2-Mediated Oxidative Stress Response was the most significant pathway for cells exposed to TSC at all concentrations and time points, with the exception of lowest concentration at time 6 + 4 h where LXR/RXR Activation was the most significant. For cells exposed to MSC, the most significantly altered pathways were Biosynthesis of Steroids, as well as NRF2-Mediated Oxidative Stress Response, Aminoacyl-tRNA Biosynthesis and HMGB1 Signaling. Some ofthe top five pathways were common to both the MSC and TSC including those related to oxidative stress and xenobiotic metabolism. However, inflammation pathways were more predominant for the MSC, whereas cell cycling and cancer signaling pathways were more predominant for the TSC. To further elucidate differences between the two smoke condensates, the genes that were uniquely expressed following TSC exposure or uniquely expressed following MSC exposure at the highest concentrations for the two separate time points were compared in IPA. The findings confirm the importance of inflammation and steroid biosynthesis pathways in MSC exposed cells and highlight the significance of apoptotic pathways particularly at the 6 h time point. For cells exposed to TSC,Mphase cell cycle pathways appear to be of particular importance. Gene Ontology in the Database for Visualization, Annotation and Integrated Discovery was used to apply functional annotation to all the significantly differentially expressed genes for each condensate. The full results are shown in Supplementary Tables 1 and 2. For cells exposed to MSC, significant perturbations were associated with steroid/cholesterol/lipid biosynthesis, NOD like receptor signaling , tRNA aminoacylation, transcription regulation, unfolded protein response and DNA binding. Like MSC, cells exposed to TSC had significant perturbations in transcription regulation, unfolded protein response and DNA binding. In addition, perturbations in cell cycle, p53 signaling, oxidative stress, and cancer signaling were also noted in TSC exposed cells. Fig. 5 shows the overlap of all the significantly affected ontologies between the two condensate types.This analysis revealed 19 clusters with enrichment scores greater than 2 for MSC and 19 clusters for TSC. The top clusters for MSC relevant to toxicological processes include lipid/steroid biosynthesis , RNA processing ,indoor hydroponic system cellular response to unfolded protein , tRNA aminoacylation , and positive regulation of transcription. 

The top clusters for TSC relevant to toxicological processes include cellular response to unfolded protein , cell cycle , positive regulation of transcription , response to steroid hormone stimulus , and positive/negative regulation of apoptosis and cell death. To investigate early versus downstream effects, functional annotation was applied to significantly differentially expressed genes at the two separate time points. The results are shown in Supplementary Tables 5–8. For cells exposed to MSC at the 6 h time point, the analyses revealed 79 significant terms including those related to transcription activity, DNA binding, and steroid/cholesterol biosynthesis. Four KEGG pathways and 1 Biocarta pathway were also deemed significant at this time point. At the 6 + 4 h time point, 76 significant terms were identified. These terms included unfolded protein response, and tRNA aminoacylation, as well as steroid/cholesterol biosynthesis which was found at the 6 h time point. Three KEGG pathways were significant at this time point including Steroid Biosynthesis, Terpenoid Backbone Biosynthesis, and Aminoacyl-tRNA Biosynthesis. Analyses of cells exposed to TSC at the 6 hr time point revealed 67 significant terms including those associated with oxidative stress, cell death, protein unfolding, transcription regulation, DNA binding and cell cycle. In addition, 2 KEGG pathways were significant. At the 6 + 4 h time point, 32 GO terms were identified as significant with oxidative stress being the only relevant toxicological endpoint. In addition, only one KEGG pathway was significant. Overall for MSC, the DAVID analyses confirmed many of the significant pathways identified by IPA including steroid biosynthesis, tRNA aminoacylation, inflammation and apoptosis. In addition, the analyses highlighted transcription regulation, DNA binding and unfolded protein response as also significant. For TSC, the DAVID analyses confirmed the significance of IPA pathways related to oxidative stress and cell cycle. As with the MSC, the DAVID analyses also further highlighted the importance of transcription regulation, DNA binding and unfolded protein response, as well as cell death. Transcription regulation and DNA binding were significant terms common to both MSC and TSC at the 6 h time point, whereas no common terms existed for the two condensates at the 6 + 4 h time point.In our previous genotoxicity study we showed that MSC and TSC were both cytotoxic and genotoxic. However, quantitatively, MSC was more cytotoxic and mutagenic than TSC, and TSC appeared to induce chromosomal damage in a concentration-dependent manner whereas MSC did not. Our earlier chemical analyses of MSC and TSC noted that aside from the nicotine in tobacco and the cannabinoids in marijuana, the two smoke condensates contained mixtures of chemicals that were qualitatively similar though quantitatively different. The similarities in the chemical profiles and some of the toxicity findings suggested that the two smoke condensates might elicit somewhat comparable gene expression profiles. Hierarchal clustering of all the MSC and TSC exposed samples in the present study supported this notion and samples clustered first by concentration as opposed to smoke type. In addition, analysis of the top ten greatest gene expression changes relative to control revealed that half of the genes were common to both marijuana and tobacco. A number of previous studies have examined gene expression changes in pulmonary cells following exposure to tobacco smoke. Generally, these studies have shown thattobacco smoke stimulates xenobiotic metabolism, and that metabolized smoke constituents contribute to DNA damage. Following early insult, DNA damage leads to disruptions in the cell cycle such as arrest at the G2 checkpoint to allow time for response. Cellular response can include DNA repair, mutation induction through faulty repair or lack of repair, and programmed cell death of heavily damaged cells.