Below is a list of steps that should help you use the Bioconductor affylmGUI package to analyze Affymetrix microarray data.
Using the affylmGUI package (Graphical User Interface for limma with Affy data)
The purpose of the affylmGUI package is to make the features of the affy and limma Bioconductor packages easier to use by providing a graphical interface, eliminating the need to type R commands to invoke the functions in these packages. There is a webpage for affylmGUI at http://bioinf.wehi.edu.au/affylmGUI/ and the Worked Examples there are a very good place to start!
1. Go to http://www.bioconductor.org/packages/release/AnnotationData.html and look for the annotation package corresponding to the chip you are working with, e.g. drosgenome1. Click on the package link and download the package to your computer.
2. Start the Rgui.exe program.
3. Choose Packages->Install from local zip files to install the desired genome annotation package.
4. Type: library(affylmGUI) and answer Yes to start the GUI. Minimize the R GUI to keep it from stealing the focus from the affylmGUI.
5. It is safe to ignore this error message if it appears:
Error in eval(expr, envir, enclos): Object "PlotOptions" not found
6. Choose File->New and specify the folder containing the .CEL files. (Note: .CEL file names cannot contain # or any other special characters besides underscore, dash, or dot.) This folder must also contain a Targets file, which is a tab-delimited text file specifying sample names, file names, and chip replicate information. For example:
Name        FileName        Target
WPPCNSf.1   WPPfCNS1.CEL    WPPCNSf
WPPCNSf.2   WPPfCNS2.CEL    WPPCNSf
WPPCNSf.3   WPPfCNS3.CEL    WPPCNSf
WPPCNSm.1   WPPmCNS1.CEL    WPPCNSm
WPPCNSm.2   WPPmCNS2.CEL    WPPCNSm
WPPCNSm.3   WPPmCNS3.CEL    WPPCNSm
In this file the Target column indicates that the first three samples are replicates, as are the last three.
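If you prefer to generate the Targets file rather than type it by hand, the following is a minimal Python sketch (standard library only). It checks the .CEL file names against the naming rule in step 6 and builds the tab-delimited text for the example above. The helper names and the output file name Targets.txt are assumptions made for illustration, as is the reading of "special characters" as anything other than letters, digits, underscore, dash, or dot.

```python
import csv
import io
import re

# Assumed reading of the naming rule: letters, digits, underscore, dash, dot only.
VALID_CEL_NAME = re.compile(r"^[A-Za-z0-9_.-]+\.CEL$", re.IGNORECASE)


def invalid_cel_names(filenames):
    """Return the file names that break the naming rule (hypothetical helper)."""
    return [name for name in filenames if not VALID_CEL_NAME.match(name)]


def targets_text(rows):
    """Build tab-delimited Targets contents from (Name, FileName, Target) tuples."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Name", "FileName", "Target"])
    writer.writerows(rows)
    return buffer.getvalue()


# The six samples from the example: three female and three male replicates.
rows = [
    (f"WPPCNS{sex}.{i}", f"WPP{sex}CNS{i}.CEL", f"WPPCNS{sex}")
    for sex in ("f", "m")
    for i in (1, 2, 3)
]

bad = invalid_cel_names([row[1] for row in rows])
if bad:
    raise ValueError(f"Rename these files before loading them: {bad}")

text = targets_text(rows)
print(text)

# To save the file next to the .CEL files (the file name is an assumption):
# with open("Targets.txt", "w", newline="") as handle:
#     handle.write(text)
```

Rows that share a value in the Target column are treated as replicates, so the only thing to adapt for your own experiment is the list of tuples.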
7. After selecting the Targets file, press OK and the .CEL files will be read in. Depending on the number of .CEL files, this may take a few minutes. You will be prompted to enter a name for the data set. At this point you can click on the RawAffyData object in the status window on the left side of the GUI and you will see that the Raw Data is Available.
8. Next you can select Plot->Intensity Density Plot to check the distribution of Perfect Match intensities across each chip. The Density plot will show whether a sufficient range of intensities appears on the chip (i.e. the chip is neither saturated nor too dim overall). The Plot menu also has an option to display Image Array Plots, which can be plotted in an R window to save memory. The Image Array plots will show any scratches or saturated areas on the chips. If some images look saturated you may wish to exclude these chips from your analysis. Remember to minimize the R GUI after you have looked at the image plots.
9. Choose Normalization->Normalize and select RMA or GCRMA. (Each time normalization is run, the resulting Normalized Data object will overwrite any existing Normalized Data.) Remember that you are working with a large number of data points, so some steps may take a few minutes to complete. If you wish to export the normalized expression values, you will find this feature under the Normalization menu as well.
10. Under the Plot menu, choose Raw Intensity Box Plot and you will see how the arrays compare to each other before normalization. Then choose Normalized Intensity Box Plot to see the effect of normalization.
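To see why the box plots line up after normalization, it may help to look at quantile normalization, which is one of the steps inside RMA (alongside background correction and probe-set summarization). The sketch below is a simplified pure-Python illustration with made-up intensities, not the code the affy package runs; ties are broken by position, which is a simplification.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length lists of intensities, one list per array.

    Illustrative sketch only: every array is forced onto the same reference
    distribution, so their box plots become identical.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(values) for values in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(values[k] for values in sorted_arrays) / len(arrays) for k in range(n)
    ]
    normalized = []
    for values in arrays:
        # Indices of this array ordered from smallest to largest intensity.
        order = sorted(range(n), key=lambda i: values[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical raw intensities for three arrays and four probes.
raw = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
norm = quantile_normalize(raw)
for before, after in zip(raw, norm):
    print(before, "->", [round(x, 2) for x in after])
```

Within each array the ordering of the probes is preserved; only the values they map to are shared across arrays.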
11. Choose Linear Model->Compute Linear Model Fit, then Compute Contrasts. You will be prompted to select the contrast(s) of interest and then to name this set of contrasts. If replicate arrays have been specified, Empirical Bayes statistics will be computed in this analysis. The B statistic is the log odds that the corresponding gene is differentially expressed. Its calculation takes into account the variability between replicate arrays. The higher the B value, the stronger the evidence of differential expression for a given gene. The original B statistic is described in Lönnstedt and Speed's "Replicated Microarray Data":
http://stat-www.berkeley.edu/users/terry/zarray/TechReport/Baypap4d.pdf
(In particular, see Appendix A.1)
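Because B is a log odds score on the natural-log scale, it can be converted to a probability with p = exp(B) / (1 + exp(B)). The short Python sketch below shows the arithmetic; the function name is made up for illustration, and the resulting probability depends on the prior assumptions built into the B statistic, so treat it as a rough guide rather than an exact figure.

```python
import math


def b_to_probability(b):
    """Convert a natural-log odds score B to a probability: exp(B) / (1 + exp(B))."""
    return 1.0 / (1.0 + math.exp(-b))


# B = 0 means even odds; positive B favors differential expression.
for b in (-2.0, 0.0, 1.5, 3.0):
    print(f"B = {b:+.1f} -> probability = {b_to_probability(b):.3f}")
```

For example, B = 1.5 corresponds to odds of about 4.5 to 1, or a probability of roughly 0.82.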
12. Choose TopTable->Table of Genes Ranked… then specify the contrast you are interested in, the number of genes to be shown in the table, the statistic to sort by, and the Adjust method. When the table is displayed it will include a menu bar from which the data can be saved or copied for pasting into another application.
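The Adjust method corrects the p-values for the fact that thousands of genes are tested at once. One widely used choice is the Benjamini-Hochberg false discovery rate adjustment; whether it is the option you select in your version of the GUI is an assumption here. A minimal Python sketch of that adjustment, using made-up p-values:

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices ordered from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.20]  # hypothetical raw p-values for four genes
print([round(p, 4) for p in bh_adjust(pvals)])
```

An adjusted value never falls below the raw p-value, and the ranking of the genes is unchanged.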
13. From the Plot menu, select M A Plot and select a contrast to plot. An M vs. A plot is a graphical way to see ratios and fluorescence intensity at the same time (see Dudoit et al. Statistica Sinica (2002) 12:111). The top ten differentially expressed genes will be labeled in this plot. A Q-Q Plot can be rendered to show how much evidence of differential expression is present in a given contrast. A Log Odds (or Volcano) plot displays the relationship between the log odds of differential expression (B statistic) and the log fold change (M in the M A plot).
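For a simple contrast between two groups of replicate arrays, the two axes of the M A plot can be worked out by hand: M is the difference between the group means of the log2 expression values (the log fold change), and A is the average log2 expression across all arrays. The Python sketch below uses invented values for a single gene; the function name is hypothetical.

```python
from statistics import mean


def m_and_a(group1_log2, group2_log2):
    """Return (M, A) for one gene in a two-group contrast of log2 expression values.

    M = mean(group1) - mean(group2)  (log2 fold change)
    A = mean over all arrays         (average log2 expression)
    """
    m = mean(group1_log2) - mean(group2_log2)
    a = mean(list(group1_log2) + list(group2_log2))
    return m, a


# Hypothetical normalized log2 expression values for one gene.
females = [8.0, 8.2, 7.8]
males = [10.0, 10.1, 9.9]

m, a = m_and_a(males, females)
print(f"M = {m:.2f} (fold change about {2 ** m:.1f}x), A = {a:.2f}")
```

A gene far from M = 0 at a reasonable A value is a candidate for differential expression, which is why the top-ranked genes tend to sit at the edges of the plot.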
14. If you wish to produce a web page containing a summary of this analysis, the File->Export HTML Report menu item may be used.
15. When more than one contrast is set up, Heat Maps and Venn Diagrams can be plotted.
16. Use the File menu to Save the Data Set. This generates a .lma file that can be loaded into the affylmGUI package at a later time.
Although easy to use, affylmGUI provides only basic analyses. For an example of a more detailed time-course study that uses the limma package directly, see: http://bioinf.wehi.edu.au/marray/jsm2005/lab5/lab5.html