CRISPR, Gene editing, gene interference and gene expression

Richard Norris
Nov 4, 2016
11 min read

This article is composed of a few short sections so it’s easy to get to the bit that interests you most:

1: What is CRISPR?
2: Genome editing
3: Beyond genome editing
4: Gene interference and activation
5: Investigating the non-coding genome
6: Latest developments
7: Systematic interrogation of gene regulatory processes
8: Problems/things to consider

Preamble

Many of you will have heard of CRISPR as it has been in the news a fair bit recently. If you haven’t, it’s a fairly safe bet you’ll be hearing quite a lot about it in the future. There has been a lot written about it already but my aim with this article is not only to write something interesting and informative on the subject but to try to make it more accessible to those who aren’t so familiar with the scientific terms. So, here we go…

CRISPR technology was initially developed as a tool for gene editing (making specific changes to an organism’s DNA), and its use has the potential to treat many genetic diseases. However, it is also a very powerful tool for investigating how genomes work. This includes studying the regulation of gene expression - the mechanism by which the information in a gene is used to produce whatever the gene codes for (a protein or a non-protein-coding RNA molecule) - which is crucial for a multitude of cellular processes. Defects in gene expression contribute to the pathology of several diseases, so being able to conduct experiments where gene expression is selectively switched on or off, and to search for and learn more about the parts of the genome that control this process, is really important. CRISPR is enabling scientists to do just that, and in this article I shall endeavour to explain how…

What is CRISPR?

CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats, which is a bit of a mouthful so let’s break it down a bit. The CRISPR/Cas system is based on a region of DNA found in many bacteria and archaea (another group of single-celled organisms). As the name suggests, it contains short repeats - the same short stretch of bases turning up again and again at regular intervals. But the crucial bits are the sections of DNA in between the repeats - the spacers. These are copies of snippets of viral DNA that act as a memory of past infections - essentially an adaptive immune system …
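If it helps to see that layout concretely, here is a minimal Python sketch of a toy CRISPR region. The repeat and spacer sequences are made up for illustration (real ones are longer), and the function name extract_spacers is hypothetical, not part of any real tool.

```python
def extract_spacers(array, repeat):
    """Return the sequences found between consecutive copies of `repeat`."""
    pieces = array.split(repeat)
    # pieces[0] is whatever sits before the first repeat and pieces[-1] is
    # whatever follows the last one; only the pieces in between are spacers.
    return [piece for piece in pieces[1:-1] if piece]


# Made-up sequences, assumptions for illustration only.
REPEAT = "GTTTTAGAGC"
SPACERS = ["ACGTACGTTA", "TTGACCATGA", "CCATGGTTAC"]

# Build the toy region: repeat, spacer, repeat, spacer, repeat, spacer, repeat.
toy_array = REPEAT + "".join(spacer + REPEAT for spacer in SPACERS)

print(extract_spacers(toy_array, REPEAT))
```

Each spacer recovered this way stands for one remembered virus; the repeats simply mark where one memory ends and the next begins.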

So… when a bacterium is attacked by a bacteriophage (a virus that infects bacteria), the virus injects its DNA into the bacterial cell, hijacks the cell to make copies of itself and eventually kills it. But the bacteria can fight back! They produce Cas proteins (CRISPR-associated proteins), of which there are several different types. There are also many types of bacteriophage with different genomes. If the bacterium has a copy of part of the viral DNA - a spacer - in its CRISPR region (either from a previous infection or inherited, as bacteria can pass CRISPR DNA to their progeny) it transcribes this into CRISPR RNA (crRNA). The Cas protein and the crRNA then form a complex that acts like a GPS, scanning the bacterial cell for viral DNA that matches the sequence of the crRNA, and then breaks down the viral DNA (the Cas protein is a nuclease - it cuts the DNA like a pair of molecular scissors) to stop the infection. If there is no spacer that matches the viral DNA, a different set of Cas proteins can take a fragment of the viral DNA and insert it into the CRISPR region of the bacterial genome as a new spacer, so that next time the virus is easier to fight - cool huh?! Some aspects of this process have not been fully elucidated but hopefully that gives you the gist of what’s going on. There’s also a very good video by Paul Andersen at Bozeman Science on YouTube.

Note: some CRISPR activation pathways involve another RNA molecule called tracrRNA which may be required for the processing of crRNA post-transcription. TracrRNA includes a base sequence complementary to the repeat part of the crRNA, so it is able to bind to it, forming an RNA duplex.

Figure 1: The adaptive immune system in bacteria featuring the CRISPR/Cas locus (region of DNA) image from Marraffini, 2015.

As my old supervisor used to say – pretty pictures are always good!

You may be wondering, how do the bacteria focus on the viral DNA and avoid destroying the corresponding spacer DNA in their own genome? Well, this is the function of the final piece of the jigsaw - PAM (protospacer adjacent motif). PAM is a 2-6 base pair DNA sequence immediately following the viral DNA sequence targeted by the CRISPR/Cas complex. The Cas nuclease will only cut if the PAM sequence is present, and crucially the PAM sequence is only present next to the target in the viral DNA and not next to the copy that takes the form of a spacer in the bacterial genome. Clever, eh?!
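The PAM check can be sketched in a few lines of Python. This is a simplified illustration, not how any real tool works: it assumes the NGG PAM (N = any base) recognised by the Cas9 from Streptococcus pyogenes, searches only one DNA strand, and uses made-up sequences and hypothetical function names.

```python
def is_ngg(pam):
    """True if `pam` is three bases long and ends in GG (any first base)."""
    return len(pam) == 3 and pam[1:] == "GG"


def cuttable_sites(dna, protospacer):
    """Positions where `protospacer` occurs AND is directly followed by a PAM."""
    sites = []
    start = dna.find(protospacer)
    while start != -1:
        end = start + len(protospacer)
        if is_ngg(dna[end:end + 3]):
            sites.append(start)
        start = dna.find(protospacer, start + 1)
    return sites


# Made-up sequences, assumptions for illustration only.
PROTOSPACER = "ACGTACGTTAGGCATCAGTC"
REPEAT = "GTTTTAGAGC"

viral_dna = "TTAC" + PROTOSPACER + "TGG" + "CATA"   # target followed by a PAM
bacterial_array = REPEAT + PROTOSPACER + REPEAT     # same sequence, no PAM

print(cuttable_sites(viral_dna, PROTOSPACER))
print(cuttable_sites(bacterial_array, PROTOSPACER))
```

The same 20 bases appear in both sequences, but only the viral copy is followed by a PAM, so only the viral copy is reported as cuttable. In the toy array the bases after the spacer belong to the next repeat instead.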

From immune system to genome editing

Several types of CRISPR/Cas system have been identified so far and we know most about the type II system. This system is notable in that it needs only one Cas protein (Cas9), which helps to process crRNA and breaks down the viral DNA. In 2012 a research team led by Emmanuelle Charpentier and Jennifer Doudna showed that this system could be reprogrammed to cut specific DNA sequences, paving the way for its use in the genomes of model organisms. They fused a trans-activating crRNA (tracrRNA) with a crRNA to create an RNA chimera known as a single guide RNA (sgRNA). This sgRNA contains a guide sequence that recognises the matching target sequence (the protospacer) in the DNA; the target itself must sit next to a PAM sequence. Using a Cas9 enzyme from Streptococcus pyogenes they created a sgRNA/Cas9 complex which is able to create a double-strand break (cutting both DNA strands) in the target sequence. This break is then repaired by the cell machinery by one of two mechanisms: non-homologous end joining (NHEJ) or homology directed repair (HDR). NHEJ gives rise to insertion/deletion mutations (indels) and HDR allows for the addition of desired sequence edits (Doudna and Charpentier, 2014; Savic and Schwank, 2016). For genome editing the sgRNA/Cas9 complex is delivered in the form of a plasmid vector containing DNA sequences that code for the Cas9 endonuclease and the RNA components of the complex along with a promoter.
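As a rough illustration of cut-then-repair, here is a toy Python sketch. It leans on one piece of background that is not spelled out above: the S. pyogenes Cas9 cuts about three bases before an NGG PAM in the target DNA. The guide is written in DNA letters for simplicity, the sequences are made up, and the two repair functions are hypothetical cartoons of NHEJ and HDR rather than models of the real cell machinery.

```python
def cut_position(dna, guide):
    """Index of the break: 3 bases before the PAM, or None if no valid site."""
    start = dna.find(guide)
    if start == -1:
        return None
    end = start + len(guide)
    pam = dna[end:end + 3]
    if len(pam) != 3 or pam[1:] != "GG":   # require an NGG PAM after the target
        return None
    return end - 3


def repair_nhej(dna, cut, deleted=2):
    """Toy NHEJ: a few bases are lost at the break (one possible indel)."""
    return dna[:cut] + dna[cut + deleted:]


def repair_hdr(dna, cut, left_arm, insert, right_arm):
    """Toy HDR: add `insert` only if the template arms match both flanks."""
    if dna[:cut].endswith(left_arm) and dna[cut:].startswith(right_arm):
        return dna[:cut] + insert + dna[cut:]
    return dna


# Made-up sequences, assumptions for illustration only.
GUIDE = "ACGTACGTTAGGCATCAGTC"
dna = "GGCATTAC" + GUIDE + "TGG" + "CATAGGAT"

cut = cut_position(dna, GUIDE)
after_nhej = repair_nhej(dna, cut)
after_hdr = repair_hdr(dna, cut, dna[cut - 5:cut], "AAA", dna[cut:cut + 6])

print(cut)
print(after_nhej)
print(after_hdr)
```

The NHEJ result is shorter than the original (a small deletion), while the HDR result carries the chosen insert exactly at the break, which is the difference between a random indel and a designed edit.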

Figure 2: Gene editing using the CRISPR/Cas9 system. Image from Savic and Schwank, 2016.

In terms of genome editing and gene therapy, CRISPR technology offers the exciting prospect of being able to fix the genetic mutations behind diseases like cystic fibrosis and Huntington’s, to cut out HIV DNA from the genomes of infected patients, and even to treat cancer! It has been successfully used in all common model organisms and promises some exciting future developments. But CRISPR offers much more than just a new technique for gene therapy…

Beyond genome editing

An equally exciting (at least for this blog writer) aspect of this technology is what you can do when you break the molecular scissors i.e. the basic components of the system are still there but the nuclease activity of the CRISPR enzyme is disabled.

A year or so after the development of CRISPR/Cas9 genome editing, another group led by Stanley Qi engineered a version of Cas9 which was effectively ‘dead’ - it couldn’t cut anything at all. Why would you want to do that, I hear you ask...

Well, it turns out that RNA molecules aren’t the only thing you can attach to a Cas protein. In fact, it can act as a platform for a plethora of other proteins and molecules, including activators and repressors (proteins that turn genes on and off respectively) or even substances that make a region glow so you can see its location. So, with the nuclease activity disabled, you have a delivery system that can control any gene of interest.

Gene interference and activation

Why is this important? Well, by reducing or stopping the expression of a gene so the cell makes less of the protein (or ncRNA) it codes for, or stops making it altogether, you can gain insight into the function of that gene (we’re still not sure what about a quarter of the genes in the human genome do!) or you can create a cellular model of a disease system where the gene doesn’t work as it should. Two important steps in gene expression are the binding of a transcription factor to a regulatory site upstream of the gene, and the production of an RNA copy (messenger RNA) of the DNA by an enzyme called RNA polymerase. Both of these processes can be disrupted using CRISPR interference (CRISPRi). The Cas9 gene in the vector (as described above) carries mutations that prevent the Cas9 enzyme from being able to cut DNA. The guide RNA carries a sequence which recognises a region close to the start of the gene of interest, and the complex can basically sit in the way of the transcription factor or the RNA polymerase and block these processes. It is possible to achieve the same outcome using the original CRISPR/Cas system, by cutting out a bit of the DNA that is important for gene expression, but CRISPRi has been reported to be more accurate and efficient. It is also possible to activate genes by using the ‘dead’ Cas9 to deliver proteins that will activate gene expression (CRISPRa). This may be important in developing treatments for diseases or to be able to reprogram adult cells into stem cells.
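To give a feel for what "a region close to the start of the gene" means in practice, here is a toy Python sketch that lists possible guide sites near a gene start. Everything in it is an assumption for illustration: the 20-base guide length and NGG PAM are borrowed from the S. pyogenes Cas9, only one strand is searched, and the window size, sequences and function name are hypothetical rather than a real design rule.

```python
import re

# 20 bases followed by an NGG PAM. The lookahead (?=...) matches without
# consuming characters, so overlapping candidate sites are all reported.
SITE = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")


def candidate_sites(dna, tss, window):
    """(position, 20-mer) pairs whose start lies within `window` of `tss`."""
    sites = []
    for match in SITE.finditer(dna):
        if abs(match.start() - tss) <= window:
            sites.append((match.start(), match.group(1)))
    return sites


# Made-up sequence with a single PAM in the middle.
filler = "ATCATCATCA" * 3
toy_gene = filler + "TGG" + filler

# Pretend the gene starts at index 35, then at index 60 (both hypothetical).
print(candidate_sites(toy_gene, tss=35, window=30))
print(candidate_sites(toy_gene, tss=60, window=30))
```

With the gene start placed near the only available site, that site is kept; move the gene start further away and nothing qualifies, which is the kind of filtering needed before a ‘dead’ Cas9 can be parked where it will get in the way.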

Figure 3: CRISPRi – blocking gene expression using the CRISPR/Cas9 system

Key: Txn = transcription factor; Pol II = RNA polymerase; dark blue bit that looks like a train track = DNA; orange train track = guide RNA; big blue bubble = Cas9. Image taken from Dominguez, Lim & Qi, 2016.

The CRISPR/Cas9 system is extremely versatile – it can support the delivery of multiple guide RNA sequences at the same time, enabling the control of multiple genes. You can even combine CRISPRi and CRISPRa so that you can switch off some genes whilst activating others. Some scientists have also attempted to deliver enzymes that can modify the epigenetic landscape of the genome (epigenetic marks are chemical modifications which alter gene expression but don’t change the base sequence).

Investigating the non-coding genome

Until recently, efforts to identify the genetic causes of disease have focused primarily on the 1 – 2 percent of the genome that codes for proteins. But genome-wide association studies (GWAS) – basically large scale experiments comparing the DNA of patients with a particular disease to those without that disease – have revealed that up to 93% of human disease and trait-associated genetic variants lie outside of the protein-coding sequence (Maurano et al, 2012) – broadly termed the non-coding regions. Then of course there’s the ENCODE (Encyclopedia of DNA Elements) project which has provided a vast amount of information about the biochemical activity of various parts of the genome, again shedding more light on the non-coding sequence. However, once again we don’t know what all of it does, and finding out is more difficult than it is for the protein-coding regions. If you have a mutation in a protein-coding region you can see what effect it has on the protein and predict the consequences for the cell. For non-coding regions the function is less clear cut. A region might be a binding site for a transcription factor, in which case a mutation there could affect gene expression, but the regulatory sequences of a gene can be a long way (in DNA terms) from the gene(s) that they control so it can be difficult to figure out what is going on.

Latest developments: hot off the press – very exciting times!

Two research teams based at the Broad Institute of MIT and Harvard have recently published the findings from their work using CRISPR to study the non-coding parts of the genome. One team, led by Professor Feng Zhang, used gene editing (with an active Cas9) to make precise mutations in non-coding regions around three genes associated with resistance to chemotherapy. The other, led by Professor Eric Lander, used CRISPR interference (with a ‘dead’ Cas9) to block the non-coding sequences around two disease related genes. The Zhang lab identified hundreds of non-coding sites, mutations in which reduce the expression of the genes in the study, and a subset of these mutations resulted in resistance to a drug used to treat skin cancer (Sanjana et al, 2016). The Lander lab also identified hundreds of sites with regulatory potential, nine of which are involved in regulation of their genes of interest (Fulco et al, 2016). These studies could be followed by many more using a similar approach to find out more about the non-coding regulatory parts of the genome. As I have alluded to, CRISPR can be multiplexed, meaning you can make edits to or block multiple regions at the same time. As the technology develops, its scale could increase and perhaps eventually we could be able to scan the whole genome using these techniques – say whaaaaat?!!

Systematic interrogation of gene regulatory processes

The versatility and multiplexable nature of CRISPR/Cas-based technology opens up a world of possibilities for investigating the various processes involved in gene regulation. The expression of a gene is regulated by cis-regulatory modules, epigenetic modifications, biochemical pathways controlling the binding of transcription factors, chromatin state, the list goes on…..! In addition, many phenotypes (the physical characteristics resulting from the genetic make-up of an organism) may be influenced by several different genes and each gene may have a different level of influence. So rather than looking at each component individually it may be extremely useful to build models, be they experimental or computational, of how these processes function and interact. This approach is being used by many research projects already but CRISPR may prove to be a key tool in this area. For example, if you build a computational model to predict what would happen in a certain situation – if a gene regulatory process functions in a particular way – you could use CRISPR to edit that system in a cell culture or perhaps in a model organism to test the computational prediction. If you could build that system to mimic multiple situations – any combination of genome edits, gene interference or activation, or epigenetic modifications – you could learn a lot about that system and how perturbations in it result in disease phenotypes.

Problems/things to consider

So is CRISPR the holy grail of genome research and therapy?! Well it certainly has a lot of potential, but there are one or two things to consider… Some reports have suggested that Cas9 can cut in unwanted places if the guide RNA is not specific enough and that the guide RNA may bind to DNA sites that don’t precisely match its sequence. This is obviously a concern for gene therapy applications where the technology would be used on patients to correct disease-causing mutations. Any applications of this nature would have to ensure that the desired outcome (and only the desired outcome!!) will be achieved.
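The worry about near-matches can be illustrated with a small Python sketch that slides a guide along a stretch of DNA and counts mismatches at each position. It is deliberately naive: the guide is written in DNA letters, PAMs and the second strand are ignored, the sequences are made up (and shorter than a real guide), and near_matches is a hypothetical name rather than a real off-target prediction tool.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def near_matches(genome, guide, max_mismatches):
    """(position, mismatch count) for every window close enough to the guide."""
    hits = []
    width = len(guide)
    for i in range(len(genome) - width + 1):
        distance = mismatches(genome[i:i + width], guide)
        if distance <= max_mismatches:
            hits.append((i, distance))
    return hits


# Made-up sequences, assumptions for illustration only.
GUIDE = "ACGTACGTTAGG"
LOOKALIKE = "ACGTACGTTCGG"          # differs from GUIDE at a single base
genome = "TTT" + GUIDE + "CCC" + LOOKALIKE + "AAA"

print(near_matches(genome, GUIDE, max_mismatches=0))  # intended target only
print(near_matches(genome, GUIDE, max_mismatches=1))  # lookalike site as well
```

Allowing even one mismatch brings a second site into play, which is why guide sequences have to be checked against the rest of the genome before anyone considers using them on patients.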

Then there are the ethical considerations. If the technology develops to a point where it is relatively easy and cheap to make any ‘desired’ edits to the genome then we will head into the realm of designer babies. This will obviously require lengthy debates involving ethical and legal issues. Some might say that we have reached that point already, so it is important that the appropriate regulations are in place and that we are having useful conversations about the best ways to use the technology.

The end

I hope that I have achieved my aims with this article and, whilst there is obviously a lot more that could be written about CRISPR, perhaps it will have inspired you to search for more information about the technology and what it can do. If I have made any glaring errors or you have any comments about what I’ve written feel free to get in touch using the contact section of this website or via Twitter (@rnorris1260).

References

Dominguez, AA. Lim, WA. Qi, LS. (2016). ‘Beyond editing: repurposing CRISPR-Cas9 for precision genome regulation and interrogation’. Nat Rev Mol Cell Biol. Jan;17(1):5-15.

Doudna, JA. Charpentier, E. (2014) ‘The new frontier of genome engineering with CRISPR-Cas9’. Science, 346, p. 1258096

Fulco, CP. Munschauer, M. Anyoha, R. Munson, G. Grossman, SR. Perez, EM. Kane, M. Cleary, B. Lander, ES. Engreitz, JM. (2016). ‘Systematic mapping of functional enhancer-promoter connections with CRISPR interference’. Science. Sep 29. pii: aag2445.

Marraffini, LA. (2015) ‘CRISPR-Cas immunity in prokaryotes’. Nature. Oct 1;526(7571):55-61.

Maurano, MT. Humbert, R. Rynes, E. Thurman, RE. Haugen, E. Wang, H. Reynolds, AP. Sandstrom, R. Qu, H. Brody, J. Shafer, A. Neri, F. Lee, K. Kutyavin, T. Stehling-Sun, S. Johnson, AK. Canfield, TK. Giste, E. Diegel, M. Bates, D. Hansen, RS. Neph, S. Sabo, PJ. Heimfeld, S. Raubitschek, A. Ziegler, S. Cotsapas, C. Sotoodehnia, N. Glass, I. Sunyaev, SR. Kaul, R. Stamatoyannopoulos, JA. (2012) ‘Systematic localization of common disease-associated variation in regulatory DNA’. Science 337, 1190–1195

Sanjana, NE. Wright, J. Zheng, K. Shalem, O. Fontanillas, P. Joung, J. Cheng, C. Regev, A. Zhang, F. (2016). ‘High-resolution interrogation of functional elements in the noncoding genome’. Science Sep 30;353(6307):1545-1549.

Savić, N. Schwank, G. (2016) ‘Advances in therapeutic CRISPR/Cas9 genome editing’. Transl Res. Feb;168:15-21.
