Questions and Answers (continued)
11) How to find BACs for a particular human gene (example "FUS")
How and where do we find this information?
There are different routes to find clone IDs. We will first
explain one of these, which uses the University of California
at Santa Cruz (UCSC) Genome Browser (A). Another route is the
NCBI CloneFinder program (B), which was expanded in late 2008.
The UCSC and NCBI links will open the Human Genome Browser and
the Human Clone Finder program, respectively. If you're
interested in a different species, change the species name for
the Browser/CloneFinder and follow a similar procedure.
A) The following text applies only to the UCSC Browser. Type
the gene name in the [position or search term] window to replace
the default text. In this case, type FUS (as an example human
gene) and click the "Submit" button. A new page appears with
many search results relevant to "FUS". Make your best selection
based on your knowledge of the FUS gene. For instance, click the
link for the RefSeq entry [FUS at chr16:31098954-31110600] or
the entry under UCSC Genes labeled [Uncharacterized Protein FUS],
[FUS (uc010caj.1) at chr16:31107147-31109220]. Once you click
one of these links, you will get a map of the genomic features
in the corresponding segment of chromosome 16p11.2 (2,074 bp
for the uc010caj.1 entry). You can change the
display: zoom in or zoom out, display additional genomic features,
etc. You may also see a series of horizontal lines under the
heading "BAC End Pairs". The lines have arrowheads pointing to
the left or the right; this "arrow" polarity indicates the
relative orientation of the insert and the (BAC) vector
sequences. If you don't see any horizontal lines or a "BAC End
Pairs" heading, set the "BAC End Pairs" display option of the
Genome Browser to "Full" and hit the "Refresh" button. If you
see the BACs, how do you know the size and boundaries of the BAC
insert sequence? You can either zoom out until you see both
insert ends displayed, or click on one of the horizontal "BAC"
lines. When you click, a new page opens with many links to the
related BAC-insert End Sequences ("BES").
You may also see a horizontal "Green" bar indicating
clones (mostly BACs) which have been mapped to this location by
Fluorescent In Situ Hybridization (and other approaches).
All of the BACs displayed will be either from the RPCI-11
BAC library (created in our laboratory, previously at Roswell
Park Cancer Institute) or from the CTD (Caltech-D) BAC library.
With few exceptions, we distribute only the RP11 BAC clones.
In addition to BACs, you may also see fosmid clones (names
starting with "G248"), which are also distributed
from our BACPAC Resources Center. The fosmids are part of the
WIBR-2 human fosmid library created at MIT. If you don't see
the fosmids displayed within the UCSC Genome Browser, set the
"Fosmid_end Pairs" option to "Full", followed by "Refresh".
How much information is available for any of these BACs or
fosmids? All of them have been sequenced only very partially:
at most a few hundred base pairs from each insert end. The
sequence pairs have been mapped to the assembled human genome
and have been approved if the orientation and distance of
the pair are compatible with the size of a typical
BAC or fosmid clone.
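The coordinate arithmetic behind these steps can be sketched in a few lines of Python. This is only an illustration, not part of the UCSC Browser: the function names, the clone coordinates in the usage lines, and the size bounds passed to the end-pair check are all assumptions made for the example. It parses a browser position string, computes the segment length, tests whether a clone would contain a gene, and applies the kind of orientation-and-distance check described above.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


_POSITION = re.compile(r"^(chr[\w.]+):([\d,]+)-([\d,]+)$")


def parse_position(text: str) -> Region:
    """Parse a 'chr16:31107147-31109220' style position (commas allowed)."""
    match = _POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {text!r}")
    chrom, start, end = match.groups()
    region = Region(chrom, int(start.replace(",", "")), int(end.replace(",", "")))
    if region.start > region.end:
        raise ValueError(f"start is larger than end: {text!r}")
    return region


def clone_contains(clone: Region, gene: Region) -> bool:
    """True if the clone insert spans the whole gene region."""
    return (clone.chrom == gene.chrom
            and clone.start <= gene.start
            and gene.end <= clone.end)


def end_pair_is_consistent(left_strand: str, right_strand: str,
                           span: Region, min_bp: int, max_bp: int) -> bool:
    """Ends must face each other and span a plausible insert size.

    min_bp and max_bp are caller-supplied assumptions, not official limits.
    """
    return (left_strand == "+" and right_strand == "-"
            and min_bp <= span.length <= max_bp)


gene = parse_position("chr16:31107147-31109220")
print(gene.length)  # 2074

# Hypothetical clone coordinates, for illustration only.
clone = Region("chr16", 31_000_000, 31_180_000)
print(clone_contains(clone, gene))  # True
print(end_pair_is_consistent("+", "-", clone, 50_000, 300_000))  # True
```

The same length calculation gives 11,647 bp for the RefSeq entry at chr16:31098954-31110600.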
You may be lucky to find a BAC clone which has been completely
sequenced AND includes the complete locus. How does one identify
such a completely sequenced BAC? Check on the dashboard of
options (below the UCSC map) if the "Assembly" option
has been activated to "Full". If necessary, activate
this option and "Refresh" the browser. Then, inspect
the top of the map and look for horizontal brown bars under
the heading "Assembly from Fragments". The brown
bars represent sequence assemblies obtained from shotgun-sequenced
clones, in most cases BAC clones, and 75% are derived from
the "RP11" (a.k.a. "RPCI-11") BAC library.
All of the clone-derived assemblies are labeled with their NCBI
sequence accession numbers (for instance, for the FUS gene:
"AC009088.9").
Inspect the annotations in the NCBI sequence record for that
accession and you will find the name of the clone (in this case
"RP11-388M20").
Please note: the finished sequence is sometimes only a part
of the BAC, and the BAC may be bigger! Why? Because an overlapping
BAC clone may also have been sequenced, and only one sequence
of the overlap was finished and used to create the human
reference sequence. Also be aware that the clone names in
the sequence files are not always standard names in compliance
with the NCBI Nomenclature rules. The reason for the non-compliance
is historical: some clones were sequenced before the Nomenclature
rules were established. If you want to order a "non-compliant"
clone, translate the name into standard nomenclature
before ordering the clone through our electronic shopping
cart.
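If you have saved the sequence record as GenBank flat-file text, the clone name can also be pulled out programmatically. The sketch below assumes the name is stored in a /clone="..." qualifier, as is common for clone-derived records. The function name is made up for this example, and the record excerpt is a hand-written stand-in, not a copy of the actual AC009088.9 record.

```python
import re

_CLONE_QUALIFIER = re.compile(r'/clone="([^"]+)"')


def clone_names(genbank_text: str) -> list[str]:
    """Return the distinct /clone qualifier values, in order of appearance."""
    seen: list[str] = []
    for name in _CLONE_QUALIFIER.findall(genbank_text):
        if name not in seen:
            seen.append(name)
    return seen


# Hand-written stand-in for a record excerpt (illustration only).
EXAMPLE_RECORD = '''\
FEATURES             Location/Qualifiers
     source          1..1000
                     /organism="Homo sapiens"
                     /clone="RP11-388M20"
'''

print(clone_names(EXAMPLE_RECORD))  # ['RP11-388M20']
```

An empty result means the record does not carry the clone name in that qualifier, in which case read the annotations by eye as described above.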
B) An additional option for finding clones from many human
libraries became available in late 2008: the NCBI CloneFinder
program. The program is self-explanatory.
Please be aware that the program offers more libraries and
options than you may need. Activate only the RPCI-11
(RP11, BAC), CHORI-17 (CH17, BAC) and WI2 (fosmid) search
options if you want clones available through our BACPAC Resources
Center. After specifying BAC libraries and the search feature
(region spanning from "Fus to Fus"), you may obtain the following
weblink. If this is done for the "Fus" gene example, you will
obtain data indicating BAC clones CH17-103K03 and RP11-112L3,
and also 7 fosmid clones. Please remove the non-compliant
"zero" from the CH17 clone name before ordering
online from our BACPAC Resources Center: CH17-103K3 is
the correct name.
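The "remove the zero" correction can be automated if you have a longer list of clone names. The following Python sketch is an assumption-laden illustration, not an official implementation of the NCBI Nomenclature rules: it assumes names of the form library-plate/row/column (such as CH17-103K03), strips leading zeros from the plate and column numbers, and maps the full library names RPCI-11 and CHORI-17 to the abbreviations RP11 and CH17 used in this answer. The function name and the alias table are hypothetical.

```python
import re

# Assumed aliases, based on the abbreviations used in this answer.
LIBRARY_ALIASES = {"RPCI-11": "RP11", "CHORI-17": "CH17"}

# plate number, row letter, column number (leading zeros tolerated)
_WELL = re.compile(r"^0*(\d+)([A-Za-z])0*(\d+)$")


def standardize_clone_name(name: str) -> str:
    """Return the clone name without padded zeros, e.g. CH17-103K3."""
    library, sep, well = name.strip().rpartition("-")
    match = _WELL.match(well)
    if not sep or match is None:
        raise ValueError(f"unrecognized clone name: {name!r}")
    plate, row, column = match.groups()
    library = LIBRARY_ALIASES.get(library.upper(), library.upper())
    return f"{library}-{plate}{row.upper()}{column}"


print(standardize_clone_name("CH17-103K03"))  # CH17-103K3
print(standardize_clone_name("RP11-112L3"))   # RP11-112L3
```

Names that do not follow this pattern, such as the fosmid names starting with "G248", raise a ValueError instead of being guessed at.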
Following the instructions under (A) or (B), you
will likely have found some or many BACs and fosmids containing
the FUS locus (our example). Most of the displayed BACs have
a name starting with "RP" or "CH17". The
"RP11" and "CH17" BAC libraries were made
in our BACPAC laboratory and a more extensive description
can be found on the corresponding pages in our BACPAC
library browser (look for the full library names:
RPCI-11 and CHORI-17). You will discover that the first library is derived
from an anonymous "diploid" human donor for the
Human Genome Project. The second library is from a "haploid"
human DNA source, which has been sequenced at the Washington
University Sequencing Center. You might eventually be able
to obtain the complete sequence equivalent to CH17 BACs through
data mining (retrieval of the corresponding sequence from
the haploid sequence assembly). The WIBR-2
fosmid library was created at MIT to support the original
human genome assembly. The fosmids can also be ordered from
our BACPAC Resources Center.
Please let us know if any of the links in this answer are no longer
working, and we will update the text.