Pathway Analysis Using the Gene Set Enrichment Analysis (GSEA) Tool

Gene set enrichment analysis is one of many approaches to the analysis of gene expression profiling data and is described in a paper from workers at the Broad Institute.

The basic concept was motivated by the observation that studying the individual genes showing the most significant difference in expression level between two states or phenotypes gives little mechanistic insight. Instead, it makes more sense to take a set of genes sharing some biological link and ask the question: does the set show statistically significant enrichment in genes that are differentially expressed?

The gene set can be chosen, a priori, for a number of reasons, e.g. the set of genes known to be influenced by over- or under-expression of a micro-RNA, or perhaps a set chosen based on chromosomal location, or genes for which molecular function, cellular component and/or biological process have been assigned using the controlled vocabularies of the Gene Ontology.

An advantage of the GSEA approach is that it is possible to incorporate your complete data set, not just the transcripts that pass an arbitrarily chosen differential expression threshold. I am sure that many people reading this will think: "How can it be right to use the complete data set? Normally I would only consider genes with >2-fold (or other preferred value) differential expression". The reason the approach is valid is that genes expressed at low levels or with large variation between replicates do not contribute to the principal metric used by GSEA, the 'enrichment score' (ES).

GSEA works by first ranking the expression value of each gene by signal-to-noise ratio, calculated as the difference between the mean values for the samples representing each phenotype, scaled by the sum of the standard deviations. This means that genes with large differences in expression level between the different states and little variation between biological replicates are ranked highly.
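The ranking step can be illustrated with a minimal Python sketch. It assumes two phenotypes labelled "A" and "B" and uses invented toy data; the function names and gene names are hypothetical, and the real GSEA software may apply further adjustments that are not reproduced here.

```python
from statistics import mean, stdev

def signal_to_noise(group_a, group_b):
    """Difference of the class means scaled by the sum of the class
    standard deviations. Fails if both standard deviations are zero."""
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))

def rank_genes(expression, labels):
    """Return (gene, score) pairs sorted from highest to lowest score.

    expression: dict mapping gene name -> list of values, one per sample
    labels: list of phenotype labels ("A" or "B"), one per sample
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == "A"]
        b = [v for v, lab in zip(values, labels) if lab == "B"]
        scores[gene] = signal_to_noise(a, b)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: three samples per phenotype
labels = ["A", "A", "A", "B", "B", "B"]
expression = {
    "gene1": [10.0, 11.0, 12.0, 4.0, 5.0, 6.0],  # large difference, little variation
    "gene2": [10.0, 2.0, 18.0, 4.0, 12.0, 1.0],  # similar difference, very noisy
    "gene3": [5.0, 6.0, 7.0, 5.0, 6.0, 7.0],     # no difference
}
ranked = rank_genes(expression, labels)
```

In this toy example gene1 ranks first because its difference in means is large relative to its replicate variation, while the noisy gene2 scores much lower despite a comparable difference in means.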

The next step is that the ES, the primary statistic generated by GSEA, is calculated for each gene set. The GSEA manual, which documents the software excellently, states:

"All genes are first ranked by their signal-to-noise ratio, then the ES is calculated by "walking" down the ranked list of genes, increasing a running-sum statistic when a gene is in the gene set and decreasing it when it is not. The magnitude of the increment depends on the correlation of the gene with the phenotype. The ES is the maximum deviation from zero encountered in walking the list. A positive ES indicates gene set enrichment at the top of the ranked list; a negative ES indicates gene set enrichment at the bottom of the ranked list."

The ES values are normalised based on gene set size, and then a false discovery rate is calculated to give an estimate of the probability of false positives. GSEA uses a very relaxed default of 25%, which is suitable for hypothesis generation with a relatively large number of biological replicates.
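The running-sum walk and the normalisation step can be sketched in Python as follows. This is an illustration under stated assumptions, not the GSEA implementation: the function names, the exponent `p` and the toy scores are hypothetical, and the null distribution here comes from random gene sets of the same size rather than from the phenotype-label permutation that the GSEA software offers. The false discovery rate calculation is not reproduced.

```python
import random

def enrichment_score(ranked, gene_set, p=1.0):
    """Walk down the ranked list and return the maximum deviation from zero.

    ranked: list of (gene, score) pairs sorted from highest to lowest score
    gene_set: set of gene names; p: exponent weighting each hit by |score|
    """
    in_set = [gene in gene_set for gene, _ in ranked]
    total = sum(abs(score) ** p for (_, score), hit in zip(ranked, in_set) if hit)
    n_miss = in_set.count(False)
    if total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")
    running, es = 0.0, 0.0
    for (_, score), hit in zip(ranked, in_set):
        if hit:
            running += abs(score) ** p / total  # increment depends on the gene's score
        else:
            running -= 1.0 / n_miss             # fixed decrement for genes outside the set
        if abs(running) > abs(es):
            es = running
    return es

def normalized_es(ranked, gene_set, n_perm=500, seed=0):
    """Divide the observed ES by the mean same-sign ES of random same-size sets
    (an assumed simplification of the normalisation step)."""
    rng = random.Random(seed)
    genes = [gene for gene, _ in ranked]
    size = sum(1 for gene in genes if gene in gene_set)
    observed = enrichment_score(ranked, gene_set)
    null = [enrichment_score(ranked, set(rng.sample(genes, size)))
            for _ in range(n_perm)]
    same_sign = [abs(x) for x in null if (x > 0) == (observed > 0)]
    if not same_sign:
        raise ValueError("no null scores share the sign of the observed ES")
    return observed, observed / (sum(same_sign) / len(same_sign))

# Hypothetical ranked list: 20 genes with scores from 9.5 down to -9.5
ranked = [(f"g{i + 1}", 9.5 - i) for i in range(20)]
es_top = enrichment_score(ranked, {"g1", "g2", "g3"})
es_bottom = enrichment_score(ranked, {"g18", "g19", "g20"})
observed, nes_top = normalized_es(ranked, {"g1", "g2", "g3"})
```

A set concentrated at the top of the list gives a positive ES and one concentrated at the bottom gives a negative ES, matching the description quoted from the manual.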

Scientists working on data from non-human samples can still use GSEA, but need to take care: the gene symbols used by GSEA are "translated". That is, the identifiers used for the genes of your species of interest represented on the microarray are converted into the symbols for their human orthologues, which are then used in the analysis. Subramanian and colleagues claim that this conversion has little or no effect on the usefulness of GSEA; it has been used successfully in multiple non-human species, but of course this should be borne in mind when investigating results in detail.

For an excellent, thorough review of pathway tools, see:

Khatri, P., Sirota, M., & Butte, A. J. (2012). Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Computational Biology, 8(2), e1002375. doi:10.1371/journal.pcbi.1002375

Another good source of advice on pathway analysis, especially for those familiar with the R statistics package, is here.

Further reading

Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102:15545-15550

Xie X, Lu J, Kulbokas EJ, Golub TR, Mootha V, Lindblad-Toh K, Lander ES, Kellis M (2005) Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals. Nature 434:338-345

Posted in Pathway analysis | 1 Response

Joys of editing academic science books

Image courtesy of ningmilo / FreeDigitalPhotos.net

Or: “A beginner’s guide to herding cats”.

Consider this scenario: you are an academic scientist, in a busy research institute and your boss is invited to edit a book, but declines due to pressure of work; then suggests that it would look good on your CV. You agree, it would look good on your CV, so you commit yourself to editing your first multi-author academic science book.

So why is that a problem?

Getting authors on board

You want the best people to write the chapters. You Google some big-name experts and invite them to contribute a chapter to your book. They almost all decline, or fail to reply to your email. But, somewhat to your amazement, one agrees. However, this paragon of science then never, ever replies to any future contacts. So, you lower your sights and aim for good scientists, but not Nobel Prize winners. Finally, you get enough authors together to write the chapters around the topic that the publishers have given you – phew!

Getting authors to agree a deadline

Assuming it’s not unreasonable, everyone is usually relaxed about the deadline set. However, the real challenge is:

Getting them to meet the deadline

  1. This should be easy, right? Scientists are grown-up, professional people. Aren’t they? Well, sort of. In reality, academics typically over-commit themselves, doing not only research and teaching, but also writing grant funding applications, papers, reviews, book chapters, etc, etc. After all, the scientific mission statement is “publish OR be damned.”
  2. As the deadlines go past – “wooshh”, like passing cars, half your authors have submitted their chapters, the rest not. Now another sticky moment arrives – these are meant to be cutting edge reviews. State-of-the-Art. But this delay now means that the ‘good’ authors’ work is rapidly reaching its sell-by date. You may have to go crawling back to them to ask for updates. Which they are usually not too unhappy about, but you hate the loss of face.
  3. One more thing that I forgot to mention; as the editor, you have to READ these chapters. Worse still, you are expected to produce cogent critiques – what the author needs to add, remove, expand or contract. Even if the topic is on the fringe of your main expertise.

What happens if authors go AWOL?

What do you do when one of your authors decides that they are NOT going to write their chapter? Not simply procrastinate, fail to meet deadlines, but stop all communication. Disappear off the map. So, now you’re stuck – find another author(s) – more delay – write the chapter yourself? – but it’s too far outside your own area of expertise. So, eventually, you find someone else. Which means yet more delay.

Writing your own chapter

Oh, yes, you forgot that you agreed to write one of the chapters yourself. Oops. Oh well, not a problem. Offer co-authorship to one of your PhD students – they’ll be falling over themselves to get another publication on their CV. Or maybe not: no, they are not interested after all; obviously suspecting (correctly) that your aim is to let them write the whole thing, then submit the chapter to you for a little light editorial polishing.

Pleading with the publishers for more time

  1. You now hold the dubious record for the longest gestation period of a multi-author academic book in human history, excluding the Bible.
  2. ‘Please, sir, I want some more.’
  3. The publishers are not impressed, but quietly resigned, telling you to go away and come back when you meet a new deadline.

Losing your marbles and giving up completely

It’s all taking SO LONG – too few authors have submitted first drafts of their chapters. You start to get desperate – the original deadline was so long ago that you’ve forgotten it – the “new” deadline is also now history. You consider giving the whole thing up – apologise to the authors and the publishers and say the book can’t be finished. But your co-editor and the authors who have delivered on time are indignant – naturally enough they don’t want to see their work wasted – and insist that you go back to the recalcitrant scientists with a big stick. How do you threaten authors with a stick by email? Or by phone? However, a combination of the metaphorical big stick, pleas for mercy and piling on the guilt eventually works and all the chapters are delivered! Hooray.

Hooray!

So, now, you’re on the last lap. Or the last dregs – the soul-destroying process of assembling the index and proofreading. Once, a sub-editor with a scientific background might have written an index, but not now. Academic publishers want their pound of flesh, so this task is delegated to authors and editors. Authors select keywords from their chapters, with varying degrees of enthusiasm or accuracy, then the editor attempts to assemble them into something useful to the reader. Finally, a draft proof arrives by email. You are now heartily sick of every word, but a final spurt of enthusiasm drives you on and the book is finished.

One more thing – did I forget? – you don’t get paid – but you are given a few free copies of your own book. Such fun!

 

Posted in Light Relief | 1 Response

How does a mutation in a transcription factor cause glue ear?

Acute otitis media, sometimes known as “glue ear”, is the most common bacterial infection in children and by 1 year of age about 60% of children will have had one episode. In some cases, children develop a chronic condition in which, despite the infection being cured, the “glue” doesn’t go away and causes deafness. In an inherited mouse model of chronic glue ear the causative mutation has been shown to be in a gene encoding a transcription factor, Evi1.

The EVI1 protein has multiple domains, can repress or enhance expression of target genes and interact with many other proteins. Indeed, the multiplicity of known and potential interactions is a challenge to determining the role of the mutation.  There were clues, however, as to how this mutation might lead to disease from differences in phenotype e.g. mutant mice raised in a “clean” SPF animal facility were less likely to become deaf than those kept in the older, “dirty” animal house.

Did this mean that gene-environment interactions, e.g. between immune system and microbes, influence disease susceptibility? It was also known that mutant mice showed high levels of influx of neutrophils into their middle ear cavities (inflammation), but it was unclear whether EVI1 was acting directly or indirectly in this process. Possible answers to these questions came recently from studies in cultured cells, showing that EVI1 can act as an inhibitor of one of the key proteins regulating inflammation, another transcription factor, nuclear factor kappa B (NFkB). EVI1 binds to one of the subunits of NFkB and interferes with a critical protein modification, acetylation. However, EVI1 does not acetylate proteins directly, so other factors must be involved. What were those other factors?

I combined public and unpublished data using literature searches and open source software, e.g. iRefWeb, in order to identify steps in the NFkB signalling pathways that might be disturbed by the mutation in EVI1. The novel target proteins and starting points for drug development I discovered are suitable for testing in this preclinical model of chronic otitis media.

Read our testimonial from Dr Michael Cheeseman.

 

Posted in Pathway analysis, Target discovery | Leave a comment

Target discovery in childhood asthma

Asthma is caused by a combination of environmental and genetic influences, but the specific factors are poorly understood. A significant “hit” detected in a genome-wide association scan (GWAS) for childhood asthma led a client to believe that one gene might be partially responsible. Proving that this genetic association really was causing asthma was, however, difficult. Firstly, no one knew the function of the protein made by the gene and secondly, changing genes in humans to test a hypothesis, rather than as therapy, is technically challenging & ethically questionable, especially in children. Fortunately, mice share about 90% of their genes with humans, so scientists “knocked-out” the equivalent gene, then tested whether these animals behaved like children with asthma. The short answer is – they didn’t. In lung-function tests that would have had asthmatics reaching for their inhalers, the knock-out mice were completely normal. So, what was going on? Were mice not enough like humans? Was this the wrong gene?

For this project, I went back to first principles – what was the evidence supporting the idea that this gene was responsible for increased asthma risk? Digging through the online literature, in particular papers from other groups studying the same gene and supplementary material not available in print, there were suggestions that the genetic effects were more complex. I found evidence that two other genes nearby were either more or less transcriptionally active in asthmatics and so might play a role in susceptibility to asthma. Furthermore, using data from the ENCODE project, I found that the regulatory element predicted to control these genes was conserved in mice, so it would be possible to test the predictions experimentally.

This suggested a novel therapeutic target – altering the activity of a cluster of genes, rather than just one, might alter disease risk.

Testimonial

Posted in Target discovery | Leave a comment

Pathway analysis of gene expression data – reduced male fertility / sterility

A group of animals that can interbreed and produce fertile offspring is one of the definitions of a species.

This means that the biological mechanisms of fertility and infertility are of interest not only to evolutionary biologists, but also to clinicians and of course to the general public. At the Institute of Molecular Genetics in Prague, Prof. Jiri Forejt is studying what controls fertility in the hybrid offspring produced by mating mouse subspecies. He wanted to know why some male mice were infertile: he knew that genes in one particular region of the genome were important, but not how those genes influenced expression of the rest of the genome.

This is where I was recruited to the team, to help identify the classes of genes altered in the mice with reduced fertility. Scientists in his group had produced Affymetrix gene expression results from the testes of fertile, sub-fertile and infertile mice, and I analysed these genome-wide data for differentially expressed transcripts. Using the Broad Institute's wonderful GSEA tool, I assessed the statistical evidence that specific Gene Ontology terms and pathways were over-represented and also whether the differential genes were located in particular regions of the genome. This analysis uncovered evidence that particular, functionally related gene sets were over-represented in the expression data and helped to develop new hypotheses about the causes of reduced fertility.

Posted in Pathway analysis, Target discovery | 1 Response

Target discovery in inherited muscle weakness

Muscle weakness can be caused by a rare inherited disease called myofibrillar myopathy. Gonzalo Blanco’s team found a mouse model of this disease and wanted to identify the underlying cause of the severe muscle weakness. Their aim was to discover potential therapeutic targets to translate into pre-clinical and clinical studies.

Before I became involved, the disease had been mapped to a large region of one chromosome and Dr Blanco’s team were planning to use conventional positional cloning methods to find the mutation. I proposed that a faster approach would be to use next-generation sequencing targeted at genes in the region. I designed a set of probes to enrich specific DNA fragments and I worked with a bioinformatician, Dr. Michelle Simon, to design a software pipeline to find and characterise mutations.

At the end of the design process, the pipeline was used to identify mutations in the muscle weakness mutants and predict that they altered the coding sequences of two genes: Myh4 and Pmp22. Two lines of evidence suggested that the mutation in Myh4, which codes for a muscle myosin protein, was the most likely cause of the weakness. Firstly, our colleagues found that mice carrying only the myosin mutation still had the trait and secondly, abnormal protein aggregates from affected mice contained large amounts of the myosin.

Scientists at the MRC’s Mammalian Genetics Unit have used the same approach that Michelle Simon and I pioneered to find mutations in other disease models.

Publication in Human Molecular Genetics

Testimonial from Dr. Gonzalo Blanco

Posted in Target discovery | Leave a comment