Main

Neuroblastoma is a childhood tumour with a wide range of clinical courses, spanning from spontaneous differentiation to rapid progression and fatal outcome. A few genetic alterations have been described to be causative for neuroblastoma including mutations of the genes encoding ALK (Chen et al, 2008b; George et al, 2008; Mosse et al, 2008) or PHOX2B (reviewed in Janoueix-Lerosey et al (2010); Maris (2010)). Genomic sequencing of primary neuroblastoma recently revealed an enrichment for mutations in genes involved in neuritogenesis (Molenaar et al, 2012). However, the origin of the majority of spontaneous neuroblastomas remains unclear (Park et al, 2008). One subgroup of aggressive neuroblastomas is characterised by amplification of the MYCN oncogene, which occurs in about 20–25% of all primary neuroblastoma (Seeger et al, 1985). In the absence of the MYCN amplification, it remains difficult to define the individual risk of relapse and assign adequate therapy (Maris et al, 2007). Expression profiling has been performed in recent years to identify signatures predictive of outcome, and these signatures are still being refined for the use in clinically applicable settings (Wei et al, 2004; Ohira et al, 2005; Schramm et al, 2005b; Oberthuer et al, 2006, 2010). A comparison of published microarray studies has revealed little overlap of genes within signatures identified to predict neuroblastoma outcome, although integration of data seemed possible (Chen et al, 2008a). Several explanations for the diversity of genes in these signatures have been suggested such as the use of different technical platforms for expression profiling, the choice of prediction algorithms and inter-laboratory variation (Irizarry et al, 2005). It has also been argued that genes in the same pathways are co-regulated implying a similar predictive value and redundancy of information. Still, the fact that genes are used by one predictor and are absent in others may be a function of the probes being derived from different parts of the same gene for the different technical platforms. Methods to overcome this limitation of gene-based expression profiling include RNA sequencing or exon-level mRNA expression analyses. Here, we chose Affymetrix HuEx1.0 arrays that were designed to interrogate every known and predicted exon in the human genome. Reporter sequences that extend beyond protein coding regions to other areas of the transcribed genome are also included on these arrays. In principle, the combination of coverage and precise localisation information allows for analyses of regulation on the sub-gene level. This is important as a substantial proportion of protein coding genes are predicted to be alternatively spliced. In neuroblastomas, alternative splicing of the NTRK1/TrkA receptor has already been reported to have an impact on tumour aggressiveness (Tacconelli et al, 2004), suggesting that the phenomenon of differential transcripts use has a major impact on neuroblastoma biology. In this study, we wanted to explore whether analyses of mRNA expression resolved to the exon level was capable of improving prognostic prediction of neuroblastoma patient outcome above the level currently achieved using gene expression-based predictors. We also aimed to identify factors associated with alternative exon use in neuroblastoma and sought to gain novel biological insights into neuroblastoma biology by exon-level gene expression analyses.

Materials and methods

Patient cohort

The 113 primary neuroblastoma specimens were from tumour banks in Cologne and Essen, Germany, Ghent, Belgium and Valencia, Spain. All patients were diagnosed between 1998 and 2007 and treated according to the German neuroblastoma trials NB97, NB 2004 or the respective SIOPEN protocol. Written informed consent was obtained from patients or their parents. As many patients were identified within general neuroblastoma screening in Germany (Schilling et al, 2002), patients were selected to recreate a representative distribution of tumours stages. Of the 113 NB patients enclosed in our study, 57 presented with localised disease (stage 1: n=31; stage 2: n=11; and stage 3: n=15) and 56 with metastatic disease at the time of diagnosis. The mean and median ages of our patient cohort at diagnosis were 934 and 472 days, respectively. Approximately two third of patients were >1 year of age at the time of diagnosis (n=72) and the remainder (n=41) were <1 year. Amplification of the MYCN oncogene occurred in 21 patients. Median and mean follow-up were 2066 and 1976 days, respectively. In total, 26 patients died of their disease. Patient characteristics are summarised in Table 1.

Table 1 Clinical and biological characteristics of the 113 patients with primary NB included in this study

RNA preparation

Representative areas of histologically confirmed, snap-frozen neuroblastomas were cut on dry ice. No preselection for tumour cell fraction or microenvironment was performed. Total RNA extraction of NB tumour samples was performed in by silica gel-based membrane purification methods (RNeasy Mini kit or MicroRNeasy kit, Qiagen, Hilden, Germany) according to the manufacturers’ instructions. RNA quality was controlled by capillary gel electrophoresis using either the Experion system (high-sensitivity chips, Bio-Rad, Munich, Germany) or the Bioanalyser 2100 (RNA nano kits, Agilent, Palo Alto, CA, USA).

Array hybridisations

Fragmentation of cRNA and hybridisation to HuEx1.0 microarrays (Affymetrix, Santa Clara, CA, USA), washing, staining and scanning of the arrays in a GeneArray scanner (Agilent) were basically performed as described previously (Schramm et al, 2005a, 2005b). Quality control of hybridisations and RMA normalisation was performed using Genomics Suite (Partek Inc., St. Louis, MO, USA) using the default settings. Data were stored in the R2 microarray analysis and visualisation platform (AMC Amsterdam, http://r2.amc.nl, Amsterdam, The Netherlands) for exploration and visualisation. Microarray data have been submitted to the GEO database (acc. no. GSE32664).

Relapse prediction using shrunken centroids

A shrunken centroids (prediction analyses for microarrays (PAMs); Tibshirani et al, 2002) model was used to predict event-free survival (EFS) or relapse as well as overall survival (OS) or death using exon- and gene expression values. Two filter steps were applied before running the classification algorithms. First, all features of the core probe set having a 0.3-quantile score >5.0 were discarded. Then features that were significantly differentially expressed between two classes were identified using an F-test (significance level 0.05) to check for equal variances in each class for every exon or gene. Student’s t-test was performed for exons or genes with equal class variance and Welch's t-test was used for exons or genes with non-equal class variance to calculate P-values. These steps were repeated 20 times on bootstrap samples (drawing with replacement) from the data. Only exons and genes having P-values 0.05 in at least 95% of the rounds were kept as significantly associated to the outcome or label and these data were used to build the shrunken centroids model, which further reduced the number of features by shrinking the centroids to the relevant features. Cross-validation delivered a shrinkage parameter of 1.2 as a sensible trade-off between prediction accuracy classifier size for both gene and exon data.

Shrunken centroids analysis

Implementations of support vector machines (SVMs) and PAMs in Rapid Miner 5.1 (Rapid-I, Dortmund, Germany) were used to predict patient outcome, and patients were divided into a training (n=73) and a test set (n=40). RapidMiner was applied in conjunction with the RapidMiner Feature Selection Extension. Analyses of maximum divergent courses (MDCs) were carried out on a subset of 63 patients. MDC was defined by patients with localised disease without MYCN amplification, which had no relapse on the one hand and patients with metastatic (INSS stage 4) disease who died of their disease on the other hand. Test set data were not used in any of the validation steps until the predictor was applied.

Detection of differential gene or exon expression

The ‘apt-probeset-summarise’ tool (Affymetrix Power Tools 1.10.0) was run to compute exon and gene expression values using the following analysis parameters on the core probe set: rma-bg (background correction as described in Irizarry et al (2003)), quant-norm (quantile normalisation), pm-only (no mismatch probe adjustments) and med-polish (median polish as described in Irizarry et al, (2003)). Genes and exons were discarded if their normalised log2 expression count was <5 for at least 1/3 of the patients. Relative exon expression was computed for each gene with at least two exons represented, and utilised all exons representing a given gene as r=(eg)/g, where e is the normalised log2 exon expression and g is the normalised log2 expression of the gene. Differential expression between classes was determined by Welch's t-test, followed by false discovery correction according to Benjamini–Hochberg. Differential exon use in transcripts from the same coding gene was determined on relative exon expression levels. Genes or exons with a false discovery rate <0.05 were reported as significantly differentially expressed. To estimate differential exon use, significantly differentially expressed genes were discarded and differential exon expression between groups was then redetermined.

Cell culture and in vitro analyses

Cultivation of the human neuroblastoma cell lines IMR32, NB69, SH-EP, WAC2 and SH-EP MYC-ER was performed as described previously (Schramm et al, 2005c; Schulte et al, 2008). Induction of MYCN in SH-EP MYC-ER was achieved using 4-hydroxy tamoxifen as described (Schulte et al, 2008). Propodium iodide-based FACS analyses of DNA content was carried out as in (Eggert et al, 2001). Cell viability assays using 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazoliumbromide were performed as described in (Schulte et al, 2005). RNA was harvested from subconfluently growing cells using the RNeasy Mini Kit (Qiagen) and cDNA was subsequently prepared as described (Schulte et al, 2005). Semiquantitative qPCR was performed using standard protocols on a ABI StepOnePlus Real-Time PCR system (Life Technologies, Carlsbad, CA, USA) using primers spanning INCENP exons 1–2 and 10–11, respectively (Life Technologies).

Results

Association of known risk factors with outcome in our study cohort

We first evaluated the distribution of known risk factors (Table 1) and their association with clinical outcome in the cohort of 113 neuroblastomas that were analysed by expression profiling in this study. The relative number of patients with MYCN amplification and metastatic disease at time of diagnosis resembled the situation found in the overall patient population. As expected, MYCN amplification, metastatic disease, age at diagnosis >1 year and low NTRK1 expression were all associated with a significantly worse outcome in univariate analyses (Figure 1).

Figure 1
figure 1

Significant association of clinical and biological features with outcome (OS) as visualised by Kaplan–Meier analyses. (A) MYCN status (MYCN sc=MYCN single copy; MNA=MYCN-amplified); (B) metastatic spread (local vs metastatic); (C) age at diagnosis (age < or >1 year at diagnosis); and (D) TrkA expression (TrkA expression < or > median).

Comparison of the predictive power of gene-level or exon level-based predictors

In contrast to other studies focussing on only gene expression, we evaluated both gene- and exon-level expressions as the basis for a reliable predictor of neuroblastoma outcome. Tumour expression profiles from the 113 neuroblastoma patients (Table 1) were obtained using Affymetrix HuExST1.0 arrays. Expression data for both the gene and exon levels were processed and normalised as described in the Materials and methods section. Several methods were evaluated for prediction, including linear SVM and random trees/forests (data not shown), but PAM-based prediction was the most stable and reliable method. After filtering and enrichment for significantly differentially expressed genes, SVM- or PAM-based methods were used to build a predictor for EFS or OS using gene or exon expression data. Support vector machine-based predictors were less stable when data were split into a training set (n=73) and a test set (n=40, see also Materials and methods), which resulted in decrease in accuracy and specificity in the test set (data not shown). The PAM-based signature predicted OS with an accuracy of 83.1% (sensitivity 84.7%, specificity 82.6%, Table 2). When a published PAM-based gene set was used as input for prediction (Vermeulen et al, 2009a), accuracy was comparable (86.1%) but this was accompanied by a decrease in sensitivity, which dropped to 79.6%, whereas specificity was higher (89.4%). Prediction analyses for microarray utilising only the patients with MDCs only (n=63) produced the highest prediction accuracy (89.2%, sensitivity=85.2%, specificity=91.4%). Interestingly, both prediction accuracy and the individual classification were highly similar using either expression from the exon or gene levels and the same prediction method and data sets (Table 2).

Table 2 Precision of outcome prediction using PAM analysis of exon- or gene-level-derived signatures

Gene- and exon-derived PAM signatures are independent prognostic markers of survival in multivariate analyses

As we wanted to test the robustness of our predictors and to correlate them with known prognostic markers, backward logistic regression was performed. As the gene- and exon-derived signatures functioned similarly in predicting the individual patient outcome, we analysed their performance separately when building the regression models. Both gene- and exon-based predictors were the only significant independent predictors of OS in our cohort in multivariate analyses (Table 3, P=0.007, odds ratio=9.7, for the gene-based predictor; P=0.005, odds ratio=10.7, for the exon-based predictor).

Table 3 Multivariate analyses identified gene- and exon-based signatures as independent prognostic markers for OS

Identification of factors associated with differential exon usage in neuroblastoma

We analysed the impact of clinical and biological features on both differential gene and exon expression with the aim to identify master regulators governing alternative transcript use. Using the filtering steps described in the Materials and Methods section, we first determined the number of differentially expressed genes associated with either INSS stages, age at diagnosis, MYCN amplification, survival or NTRK1 expression (Table 4). Survival, MYCN amplification and NTRK1 expression were associated with a massive shift of the transcriptome, reflected by significantly differential expression of >1000 genes each. This also held true when analyses were performed using expression at the exon level, which resulted in >10 000 significantly differentially used exons for each group. Interestingly, when differential exon use was assessed for genes, whose expression was not significantly altered between groups, NTRK1 expression and MYCN amplification status were strongly associated with alternative exon use. When tumours with high NTRK1 expression were compared with tumours with low NTRK1 expression, 471 exons were found to be differentially expressed, whereas MYCN amplification correlated with 306 differentially used exons. This finding suggests that both, MYCN and NTRK1, could be involved in governing alternative transcript use.

Table 4 Association of clinical and biological features with differential gene and exon expression

Validation of known and novel transcript variants in primary NB

We first validated a variant of the neurotrophin receptor NTRK1, designated TrkA-III, which has been previously associated with aggressive tumour behaviour in NB (Tacconelli et al, 2004), using our expression data. This variant lacks exons six, seven and nine resulting in altered NTRK1 signalling independent of ligand binding (Tacconelli et al, 2004). Differential expression of exons six and seven was already visually detectable from the individual tumour profiles plotted as a heatmap (Figure 2A). NTRK1 gene expression was highly correlated with survival, as expected, whereas expression of exons six or seven was not (Figures 2B and C). Notably, expression of NTRK1 is not only correlated to prognosis, but also to patient age: younger patients have higher NTRK1 levels than older patients. Thus, we could validate that differential NTRK1 variants are expressed in NB and that these variants correlate with different patient outcome for the first time in a large patient cohort.

Figure 2
figure 2

Association of alternative exon usage in NTRK1 with outcome. (A) The heat map shows NTRK1 exon-level expression (rows, exon 1 on top) in our patient cohort. Exon 6/7 depicts the exons, for which the respective probes on the array are located. Red indicates exon upregulation and green indicates downregulation of an exon as compared with the median expression of the respective exon across all patients in our study. High NTRK1 expression was also correlated with young (<1 year) patient age. Age=patient age at diagnosis; green >1 year; red <1 year. (B) Differential association of NTRK1 exon expression with EFS or relapse. Left: NTRK1 exon 2 expression is high in EFS and low in patients with relapse (P=9 × 10−15). Right: correlation between NTRK1 exon 6 expression in NB patients with EFS is less pronounced although still significant (P=0.002). (C) Real-time PCR of NTRK1 exons 5–8 shows preferential expression in patients with EFS when compared with patients who experienced relapse (M=marker, − and + depict the negative and positive controls, respectively).

We also wanted to gain first insights into transcripts and novel transcript variants associated with unfavourable clinical features as these could serve as novel therapeutic targets. For this reason, we focussed on genes and exons, whose expression was linked to survival and the amplification status of MYCN (Table 4). Patterns of alternative exon use included different transcriptional start and stop sites as well as differential expression of isoforms from the same coding gene (Supplementary Figure 1). To validate the MYCN-induced regulation of one of these transcript isoforms, we selected the gene coding for INCENP, a member of the chromosomal passenger complex involved in mitotic progression. Statistical analyses suggested a significant upregulation of INCENP exons 1–5 corresponding to the INCENP transcript variant, NM_020238, in patients with MYCN-amplified tumours (Figure 3A). This is exemplified for exon 5, whose expression was significantly associated with MYCN amplification (P<0.001, Figure 3B), whereas no difference in INCENP exon 15 expression was observed between tumours with and without MYCN amplification (Figure 3C). INCENP was also induced upon MYCN activation or stable overexpression in vitro in two cell model systems based on the SH-EP NB cell line, which does not express detectable levels of MYCN. We used SH-EP derivates engineered to stably or inducibly overexpress MYCN (designated WACII and SH-EP MYCN-ER, respectively, Figure 3D). Although PCR-based assays validated INCENP as a MYCN-inducible target in vitro, specific isoforms were not differentially induced in assays specifically measuring both INCENP transcript variants compared with the longer INCENP transcript variant (NM_020238_2, Figure 3D, right). Thus, INCENP could be validated as a MYCN-inducible target in vitro, confirming the data obtained from the primary tumours, but an isoform-specific induction of INCENP variants by MYCN remains to be confirmed.

Figure 3
figure 3

Expression pattern of the INCENP gene in primary NB and in in vitro NB models. (A) A shorter INCENP isoform comprising only exons 1–5 (NM_020238) was preferentially expressed in MYCN-amplified patients, whereas expression differences were less pronounced for the longer INCENP transcript NM_020238_2. Data are presented as log2 of mean expression in each group ±s.e.m. (B) Expression of INCENP exon 5 was significantly elevated in MYCN-amplified patients (MNA) when compared with patients with normal MYCN (MYCNsc), whereas expression of exon 15 was not (C). (D) Semiquantitative real-time PCR revealed that expression of INCENP exons 1–2 and 10–11 was upregulated in vitro in a MYCN-negative NB cell lines, SH-EP, upon MYCN overexpression (WAC2) or MYCN induction (SH-EP MYCN+).

Interestingly, alternative exon usage of the first exons of the cell cycle regulating genes CDKN2B and CCNB1 was significantly different between aggressive, deadly or MYCN-amplified tumours and patients with EFS or normal MYCN status (Figures 4A and B). The most 5′ probe sets of CDKN2B are expressed at very low levels, so a functional role for the observed expression difference remains to be determined. The expression of a shorter CCNB1 isoform was verified in neuroblastoma cell lines (Figure 4C) and inhibition of Cdk1/CCNB1 interaction using the RO3306 small molecule inhibitor resulted in an increase in the fraction of apoptotic cells (Figure 4 D). Taken together, differential expression of transcript variants suggested from the analyses of primary tumours could be validated in the in vitro models.

Figure 4
figure 4

Differential transcript expression of the cell-cycle-associated genes, p15INK4B (CDKN2B) and cyclin B1 (CCNB1). (A) CDKN2B isoforms are differentially regulated between tumours with and without MYCN amplification, but not between tumours from patients with EFS and non-survivors (DoD). (B) A short isoform of CCNB1 (red line) is preferentially expressed in tumours from non-survivors (DoD). For both, A and B, data are presented as log2 of mean expression in each group ±s.e.m. (C) Higher expression of this shorter CCNB1 isoform was also detected in NB cell lines. (D) Inhibition of the CCNB1/Cdk1 interaction by a small molecule inhibitor induced apoptosis in these cell lines.

Discussion

Prediction of the individual risk and consequent adaptation of treatment is one promise of personalised cancer medicine. Using gene expression profiling for individual treatment decisions has been investigated for a decade now for many cancer types and also for neuroblastoma. Although early studies published in this field suffered from data overfitting and lack of general applicability, large studies have meanwhile provided evidence for the high sensitivity and accuracy of gene expression-based classifiers using different platforms (Vermeulen et al, 2009a, 2009b; Oberthuer et al, 2010). International consortia such as MAQC have defined standard operating procedures for validating microarray data (Shi et al, 2010). Although advances have been achieved for both technical improvements as well as standardisation, concerns still remain on the clinical usability of classifiers derived from expression profiling. Here, we addressed whether the higher information density of expression profiling at the exon level rather than the gene-level expression could increase the power of tumour expression-based outcome prediction. For this purpose, we used a cohort of 113 neuroblastoma patients, which were profiled using Affymetrix Exon Arrays (HuEx1.0ST). Using exon-level expression achieved comparable prediction accuracy, sensitivity and specificity for both OS and EFS compared with using gene-level expression. These findings also held true, when a published PAM-derived signature of 59 genes (Vermeulen et al, 2009a) and the exons making up these genes were used to predict outcome in this cohort. Prediction accuracy increased when patient cohorts with maximum diversity (MDCs) in clinical courses were used. For the best clinical courses, we only included patients with localised disease without MYCN amplification, which had never experienced a relapse. Patients with the worst clinical course had metastatic spread at time of diagnosis (INSS stage 4) and were eventually succumbing to their disease. As previously noticed, survival prediction is possible with near perfect accuracy whereas prediction of relapse is more difficult. One reason for this phenomenon could be that MYCN amplification is a major risk factor for relapsing neuroblastoma and results in massive reprogramming of the tumour transcriptome to create this aggressive tumour phenotype. However, MYCN amplification alone cannot account for all aggressive, relapsing courses of the disease. Given the heterogeneity of transcriptomes from tumours with and without MYCN amplification, the definition of a perfect relapse signature is still an unresolved task. Nevertheless, the gene- and exon-derived signatures in our study were the only independent markers of survival in multivariate analyses, corroborating their potential usefulness for patient stratification and, in the long run, individually adapted risk assessment.

Exon-level profiling promises to yield unprecedented insights into the fine tuning of transcript use associated with aggressive or benign tumours. MYCN amplification has long been linked to aggressiveness in NB (Schwab et al, 1983; Seeger et al, 1985), but has only recently been associated with alternative exon use in primary NB (Guo et al, 2011). First mechanistic insights into MYC-driven splice site usage were provided by linking alternative splicing of the Raf kinase to differential MYC expression (Rauch et al, 2011). Here, we identified NTRK1 as a second biological parameter governing gene composition, when patients with tumours strongly expressing NTRK1 were compared with patients whose tumours weakly expressed NTRK1. High NTRK1 expression, which has previously been shown to be associated with outcome in NB (Nakagawara et al, 1993), was also correlated with excellent outcome in our data set (Figure 2A). Of note, expression of NTRK1 exons 6 and 7 was even lower in advanced stage tumours (Figure 2C), confirming previous results (Tacconelli et al, 2004).

We identified differential expression of >200 predicted alternative transcripts between neuroblastomas with and without MYCN amplification or with high and low NTRK1 expression. Thus, we are still at the beginning to understand the global changes induced by MYCN and NTRK1 in neuroblastoma. The interplay of these factors is complicated by the fact that MYCN can directly bind to the NTRK1 promoter and negatively regulate NTRK1 expression (Iraci et al, 2011). Although first attempts have been made to catalogue alternative splice site usage in oncogenic kinases (Druillennec et al, 2012), we are still far from having an overview of the functional relevance of all splicing changes intrinsic to aggressive cancer cells. Moreover, it remains to be determined whether alternative transcript use rather than alternative splicing has a role in determining the expression profiles underlying tumour aggressiveness. In neuronal development, alternative transcript use was identified as a major cause for transcriptome diversity (Pal et al, 2011), and similar patterns were observed in medulloblastomas (Menghi et al, 2011). It is evident that the altered tumour transcriptome results from aberrant transcripts due to mutations, aberrant transcript amounts from changes in regulatory networks and also from alternative transcript use, but how the latter is produced must be analysed in more tumour entities to evaluate its contribution to cancer biology.

In neuroblastoma, there are hints linking functional modules such as cell cycle regulation (Murphy et al, 2011) and protein biosynthesis (Boon et al, 2001) to MYCN and aggressive tumour behaviour. Moreover, gene expression of members of the chromosomal passenger complex, most prominently survivin, has been previously linked to NB. Survivin upregulation was linked to neuroblastoma aggressiveness (Islam et al, 2000), but this could be attributed to an unspecific gain of chromosome 17q, which occurs than half of all neuroblastomas (Bown et al, 1999). Here, we provide first evidence that a second member of the chromosomal passenger complex, INCENP, is a potential MYCN target in neuroblastoma as INCENP mRNA was upregulated upon MYCN induction in vitro as well as in patients having tumours with amplified MYCN. In addition, exon-level analyses of the INCENP coding region suggested a shorter hypothetical INCENP isoform differentially expressed between tumours with and without MYCN amplification. However, the existence of such an isoform, for which only a cDNA clone has previously been described, remains to be proven experimentally. In vitro studies corroborated MYCN-induced upregulation of INCENP exons 1–2 and 11–12. Nevertheless, it is intriguing to speculate that transcriptional reprogramming of aggressive tumour cells may not only cause differential gene expression but also alternative transcript use, which could here be exemplified for a CCNB1 variant.

In summary, expression signatures based on the gene or exon level proved to be equally powerful in predicting the clinical outcome of neuroblastoma patients, rendering gene-based assays as sufficiently precise for potential diagnostic purposes. The power of exon-based expression analyses lies in the identification of alternatively used transcripts in tumours, which can provide new insights into the aberrant circuits of the most aggressive tumour fraction, harbouring a plethora of options for therapeutic interventions.