About genArise

genArise is a package that contains specific functions to perform an analysis of
microarray obtained data to select genes that are significantly differentially 
expressed between classes of samples. Before this analysis, genArise carry out a 
number of transformations on the data to eliminate low-quality measurements and to 
adjust the measured intensities to facilitate comparisons.

You need install the packages tkrplot and locfit in order to be able to use this 
package. This can be done as follows:

> install.packages(c("locfit","tkrplot","xtable"))

genArise is provided with functions that can be applied from R prompt in every step 
of analysis; however, there is also a graphical user interface to facilitate the use 
of all the functions in the package.
*
Project

In order to save the history of all the performed actions in an analysis, genArise 
create a project that consist of a directory with the same name of the project name. 
This directory will contain two directories, one for the saved plots and one for the 
output files saved while the analysis.
*
Creating a New Project

To create a new project:

1. Open the File menu and choose Project.
2. Click on the New Project submenu.


A new window will be displayed where you must specify the Project Configuration.

*Input:
	In the Location Spot File text entry, type the path of the file with the 
	results of the experiment or click on the Browse button and search the file
	in the disk. Next, you must choose the correct option: the experiment was
	performed by the IFC in UNAM or is an experiment performed in a different 
	place.

	Then you must to specify: the location of the column that contain the query 
	sample in the array (Cy3), the location of the column that contain the reference
	sample in the array (Cy5), the location of the column that contain the back-
	ground correction for Cy3 (BgCy3), the location of the column that contain the 
	background correction for Cy5 (BgCy5), the location of the column that contain 
	the ids for the distinct elements in the array.

	Now, if the experiment was not performed in the IFC you must specify the grid 
	dimensions, so, you must type the values of the fields: rows, columns, 
	meta-rows and meta-columns.
	
*Output:
	In the Project Name text entry, type the name of the project (this will create
	a directory with this name in the current directory) or click on the Browse 
	button and go to the location where you want to create the project. In the 
	Location for Plots text entry type the name of the directory where you want to 
	save your plots and in the Location for Results type the name of the directory 
	where you want to save the output files that you obtain while the experiment.
*
File Format

	The input file format must be in columns of data which values are tab-delimited
	which can usually be exported from Excel, R or Splus.

	The input file must contain only a row for each spot (gene) in the microarray, 
	at least that the experiment was performed by the IFC in UNAM (the 13 last 
	rows contain statistical information about the experiment) and probably a 
	header row.
*
Opening an Old Project

Select the option "File" on the main menu of genArise and follow the next steps:

      1.- Select the option "Project"
      2.- Once in the "Project" option select "Open Project"
      3.- Then choose a file with extension ".prj" and click "Open"
 
     The three steps above will diplay the operations and graphics from the selected
     Project.
*
Diagnostic Plots

	Previous to any kind of analysis, once you have load the input, the next 
	window contain a preliminary view of the main grid displayed as an image 
	plot in a green to red scale representing the log2 intensity ratio for 
	each spot on the array. You can also see the plot of the green and red 
	levels just by selecting the corresponding option in the lower right
	side of the window.

	It is important to clarify that these image plot one does not replace the
	original TIFF image from the microarray experiment.

	This feature was completly based in the imageplot function from the limma 
	package.

	Gordon K. Smyth (2004), "Linear Models and Empirical Bayes Methods for 
	Assessing Differential Expression in Microarray Experiments", Statistical
	Applications in Genetics and Molecular Biology: Vol. 3: No. 1, Article 3. 
	http://www.bepress.com/sagmb/vol3/iss1/art3

	genArise shows all this plots in the window inmediatly after create a new 
	project.
*
Intensity Ratio Plot

	Previous to any kind of analysis, once you have load the input, the next 
	window contain a preliminary view of the main grid displayed as an image 
	plot in a green to red scale representing the log2 intensity ratio for 
	each spot on the array. You can also see the plot of the green and red 
	levels just by selecting the corresponding option in the lower right
	side of the window.

	It is important to clarify that these image plot one does not replace the
	original TIFF image from the microarray experiment.

	This feature was completly based in the imageplot function from the limma 
	package.

	Gordon K. Smyth (2004), "Linear Models and Empirical Bayes Methods for 
	Assessing Differential Expression in Microarray Experiments", Statistical
	Applications in Genetics and Molecular Biology: Vol. 3: No. 1, Article 3. 
	http://www.bepress.com/sagmb/vol3/iss1/art3

	genArise shows all this plots in the window inmediatly after create a new 
	project.
*
Background of Cy3 Plot
	
	Previous to any kind of analysis, once you have load the input, the next 
	window contain a preliminary view of the main grid displayed as an image 
	plot in a green to red scale representing the log2 intensity ratio for 
	each spot on the array. You can also see the plot of the green and red 
	levels just by selecting the corresponding option in the lower right
	side of the window.

	It is important to clarify that these image plot one does not replace the
	original TIFF image from the microarray experiment.

	This feature was completly based in the imageplot function from the limma 
	package.

	Gordon K. Smyth (2004), "Linear Models and Empirical Bayes Methods for 
	Assessing Differential Expression in Microarray Experiments", Statistical
	Applications in Genetics and Molecular Biology: Vol. 3: No. 1, Article 3. 
	http://www.bepress.com/sagmb/vol3/iss1/art3

	genArise shows all this plots in the window inmediatly after create a new 
	project.
*
Background of Cy5 Plot

	Previous to any kind of analysis, once you have load the input, the next 
	window contain a preliminary view of the main grid displayed as an image 
	plot in a green to red scale representing the log2 intensity ratio for 
	each spot on the array. You can also see the plot of the green and red 
	levels just by selecting the corresponding option in the lower right
	side of the window.

	It is important to clarify that these image plot one does not replace the
	original TIFF image from the microarray experiment.

	This feature was completly based in the imageplot function from the limma 
	package.

	Gordon K. Smyth (2004), "Linear Models and Empirical Bayes Methods for 
	Assessing Differential Expression in Microarray Experiments", Statistical
	Applications in Genetics and Molecular Biology: Vol. 3: No. 1, Article 3. 
	http://www.bepress.com/sagmb/vol3/iss1/art3

	genArise shows all this plots in the window inmediatly after create a new 
	project.
*
Microarray Analysis

	Because many proteins have unknown functions, and because many 
	genes are active all the time in all kinds of cells, researchers 
	usually use microarrays to make comparisons between similar cell 
	types. For example, an RNA sample from brain tumor cells, might 
	be compared to a sample from healthy neurons or glia. 
	Probes that bind RNA in the tumor sample but not in the healthy one 
	may indicate genes that are uniquely associated with the disease. 
	Typically in such a test, the two sample's cDNAs are tagged with two 
	distinct colors, enabling comparison on a single chip. Researchers hope
	to find molecules that can be targeted for treatment with drugs among 
	the various proteins encoded by disease-associated genes.

	An important step of this analysis is the "task of normalizing data 
	from individual hybridizations to make meaningful comparisons of 
	expression levels, and of 'transforming' them to select genes for 
	further analysis and data mining" (Quackenbush, 2002).
	All this tasks are provided by genArise in a complete set of 
	functions.

	You can perform all this tasks in just one step as follow:
	
	1. Create a new project.
	2. In the window that shows the diagnostic plots, open the Analyze
	   menu.
	3. Click Yes to the dialog asking if you want to follow the wizard.

	If you want to perform the operations one by one and in a different 
	order, click No to the dialog and a new window is displayed that 
	includes in the menu bar a menu for each operation.
*
Background Correction

	Local background correction involves subtracting the estimated background 
	intensity surrounding a microarray spot from the estimated spot intensity.

	That is performed just with a subtraction (Cy3 - Background of Cy3) and 
	(Cy5 - Background of Cy5) for each spot in the microarray.

	If you don't follow the wizard, when you click on the Normalize menu, a
	dialog asking if you want to correct the background is open. You must click
	Yes.
*
Normalization

	Data normalization can be done by grids or in a global way, and each method 
	returns different results for the same input data set. We must remark that 
	the grid normalization can just be applied to the complete data set, so you 
	must not eliminate spots in order to avoid errors executing this function.  

	In the normalization procedure any observation in which the R value is zero 
	will be eliminated.

	Lowess detects systematic deviations in the R-I plot and corrects them by 
	carrying out a local weighted linear regression as a function of the log10
	(intensity) and subtracting the calculated best-fit average log2(ratio) from 
	the experimentally observed ratio for each data point. Lowess uses a weight 
	function that de-emphasies the contributions of data from array elements that
	are far (on the R-I plot) from each point.

*
By Grid

	The normalization by grid normalize R and I values and fit the value of Cy5 
	for each grid in the array that it receives as argument. 
	The dimension of each grid is (meta-row * meta-column).

*
Global

	Global normalization, normalize R and I values and fit the value of Cy5 
	from his argument. In this function the normalize algorithm will be
	applied to all observations to get the lowess factor and then fit Cy5 
	with this factor. The observations with values R = 0 are deleted because 
	they have no change in their expression levels.
*
Filter
	
	If one examines several representative R-I plots, it becomes obvious that 
	the variability in the measured log2(ratio) values increases as the measured 
	hybridization intensity decreases. This is not surprising, as relative error 
	increases at lower intensities, where the signal approaches background. 
	
	A commonly used approach to address this problem is to use only array 
	elements with intensities that are statistically significantly different from
	background. 

	If the average local background near each array element and its standard 
	deviation are measured , we would expect at 95.5% confidence that good 
	elements  would have intensities more than two standard deviations above 
	background. 
	By keeping only array elements that are confidently above background, we can
	increase the reliability of measurements.
	 
	This function keep only array elements with intensities that are 2 standard 
	deviation above background.
*
Duplicates Analysis

	"Replication is essential for identifying and reducing the variation in any 
	experimental assay, and microarrays are no exception. Biological replicates 
	use RNA independently derived from distinct biological sources and provide 
	both a measure of the natural biological variability in the system under 
	study, as well as any random variation used in sample preparation. 
	Technical replicates provide information on the natural and systematic 
	variability that occurs in performing the assay. Technical replicates 
	include replicated elements within a single array, multiple independent 
	elements for a particular gene within an array (such as independent cDNAs 
	or oligos for a particular gene), or replicated hybridizations for a 
	particular sample. The particular approach used will depend on the 
	experimental design and the particular study underway" (Quackenbush, 2002).

	genArise provide three different approaches for this task.
*
Mean Replicates Filter

	This function allows to remove from the array repeated Id's. Before 
	moving one of the repeated Id's the function compute the average 
	of Cy3 intensity and Cy5 intensity.
*
Non-extreme Values Filter

	This function allows to remove from the spot repeated Id's. Before
	moving one of the repeated Id's the function compute the log ratio of
	both values with the same Id and delete the least absolute value if both
	of them are positive or negative. In other case delete both observations. 
*
Geometric Mean Filter

	We consider replicate measures of two samples and adjust the
	log2 ratio measures for each gene so that the transformed 
	values are equal. To do this we take the geometric mean.

	This procedure can be extended to averaging over n replicates 
	and is the procedure performed by default if you follow the 
	wizard.
*
Graphics

	The menu Graphics allows you change the type of the graphic that is 
	displayed. 

	The options are:  
	
	1. Cy3 -vs- Cy5
	   (query sample versus reference sample)
	2. R -vs- I  
	   (ratio R = log_2(R/G) versus intensity I = log_10(G/R))
	3. M -vs- A  
	   (log intensity ratios M = log_2(R/G) versus
	     average log intensities A= log_2(R*G)/2)
	
	By default, genArise show the R -vs- I plot.
*
R -vs- I

	Log ratio values can have a systematic bias on intensity and 
	the simpler normalisation techniques cannot completely account 
	for these (Quackenbush, 2002). An RI Plot shows these 
	intensity-specific effects by plotting the log ratio of each 
	gene as a function of the product of the individual intensities 
	(i.e. Red and Green).

	1. In the window after the microarray analysis select the option you 
	want to plot.
	2. Open the Graphics menu.
	3. Click on the R -vs- I submenu and the plot will be displayed.
*
M -vs- A

	Like RI plots, MA plots can show the intensity-dependant ratio of 
	raw microarray data. The plots differ in the axes used. The MA 
	plot uses M as the y-axis and A as the x-axis where M = log_2 (R/G) 
	and A = log_2(R*G)/2, R and G represent the fluorescence intensities
	in the red and green channels, respectively. 

	Logarithms base 2 are used instead of natural or decimal logarithms 
	as intensities are typically integers between 1 and 2^16.

	1. In the window after the microarray analysis select the option you 
	want to plot.
	2. Open the Graphics menu.
	3. Click on the M -vs- A submenu and the plot will be displayed.
*
log2(Cy3) -vs- log2(Cy5)

	Plot the original intensities. Query sample versus reference sample.

	1. In the window after the microarray analysis select the option you 
	want to plot.
	2. Open the Graphics menu.
	3. Click on the log2(Cy3) -vs- log2(Cy5) submenu and the plot will be displayed.
*
Write Output

	Write the values for observations after perform any operation in an
	output file. This values are writen in columns with the follow order:
	Cy3, Cy5, Cy3 Background, Cy5 Background and finally Ids.

	If you follow the wizard, the partial results after each operation
	are saved in the Results directory on the project.

	If you don't follow the wizard you must:

	1. In the window after the microarray analysis select the option you 
	want to save their data as output file.
	2. Open the Options menu.
	3. Click on the Write as outputfile submenu.
	4. Type the name of the output file.

	If you want to save in an output file the values for observations 
	after perform the Z-score, this values are writen in columns with the
	follow order: Cy3, Cy5, Cy3 Background, Cy5 Background, Ids and 
	finally the Zscore value. 

	1. In the window after the Z-score select the option you want
	to save their data as output file.
	2. Open the Options menu.
	3. Click on the Write as outputfile submenu.
	4. Type the name of the output file.
*
Z-score

	This function identify differential expressed genes by calculating 
	an intensity-dependent Z-score. This function use a sliding window 
	to calculate the mean and standard deviation within a window 
	surrounding each data point, and define a Z-score where Z measures 
	the number of standard deviations a data point is from the mean.
*
Post-Analysis

	With this function you can obtain which genes are shared between two or 
	more projects.

	You must follow the next steps:
	
	1. From the genArise's main window in the File menu, click on Post-analysis.
	2. Click on the Add file button to add all the .prj files one by one.
	3. Select the range for the z-score.
	4. Select which set you want to compare (up or down-regulated).
	5. Type the name of the direcory where the results will be save.
*
genArise Keyboard Shortcuts

	COMMAND			SHORTCUT
	New Project			Ctrl+N
	Open an Old Project		Ctrl+O
	Postanalysis			Ctrl+P
*
