LSM2104 Group Project

Weiwei’s Page (U052166X)

CP000522

1. FASTA format for CP000522

    Data retrieved from NCBI Genome

2. Blast Efflux Pump   http://sf01.bic.nus.edu.sg/blast/

 

                 Regular blast

                 Setting: CP000522, blastx, Efflux_pump, No filter, Expect 10, BLOSUM62,

                 Query genetic code: bacterial (11). Database genetic code: bacterial (11).

                 Frame shift penalty: No. Graphical overview. Descriptions/alignments: 5000.

                 Remark: setting Descriptions/alignments to 50 would have been more than enough here.

 

Results: 13 hits in total

      Details of the hits are available here: Blast results.xls, Blast results webpage

 

 

Acinetobacter baumannii ATCC 17978 plasmid pAB1, complete sequence

3. Identify statistically significant regions

4. Find open reading frames

        http://www.ncbi.nlm.nih.gov/gorf/gorf.html

                 Setting: whole CP000522, FASTA, Bacterial code.
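What an ORF search of this kind does can be sketched in a few lines of Python. This is only a minimal illustration, not the algorithm used by the NCBI tool: it assumes ORFs run from ATG to the first in-frame stop codon, reports 1-based inclusive coordinates on the input sequence, counts the stop codon in the length, and numbers minus-strand frames from the 3' end of the input (conventions for minus-strand frames differ between tools). The names find_orfs and min_len are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=30):
    """Return (start, end, frame) for every ATG..stop ORF in all six frames.

    Coordinates are 1-based and inclusive on the input sequence; the
    length (end - start + 1) includes the stop codon. Frames are
    +1..+3 for the given strand and -1..-3 for the reverse complement.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for sign, strand_seq in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            start = None  # index of the first ATG of the ORF currently open
            for i in range(offset, n - 2, 3):
                codon = strand_seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    if i + 3 - start >= min_len:
                        if sign == 1:
                            orfs.append((start + 1, i + 3, offset + 1))
                        else:
                            # map reverse-strand indices back to the input
                            orfs.append((n - i - 2, n - start, -(offset + 1)))
                    start = None
    return orfs


# Toy sequence: CC + ATG AAA TTT TAG + GG
print(find_orfs("CCATGAAATTTTAGGG", min_len=12))  # [(3, 14, 3)]
```

On the real plasmid sequence the input would be the FASTA record from step 1 with the header line removed and the sequence lines joined.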

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

5. Scan putative functional annotation

Note: Only the ORFs obtained in 6.3.2 are used.

Question: There are 4 regions: 9879-9583, 7661-7266, 6567-6373, and 3954-2973.

        4*6=24 possible translated protein sequences.

        Is importing the above 9 ORFs enough to represent these 24 sequences?
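The figure 4*6=24 comes from reading each nucleotide region in three frames on each strand. A minimal sketch of a six-frame translation is given below, assuming the standard codon assignments (NCBI translation table 11 uses the same codon-to-amino-acid assignments as the standard code and differs in the permitted start codons). The function names are hypothetical, and codons containing characters other than A, C, G, T are shown as X.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codon table built in the usual TCAG order; '*' marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return {frame: protein} for frames +1, +2, +3, -1, -2, -3."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


for frame, protein in sorted(six_frame_translations("ATGAAATTTTAG").items()):
    print(frame, protein)
```

Since blastx reports the frame of each hit, one possible reading of the question is that only the translation in the reported frame of each region needs to be compared with the ORFs, rather than all 24.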

 

5.1 Prosite  http://au.expasy.org/prosite (results saved as webpage)

ORF 1: "Exclude motifs with a high probability of occurrence" unchecked: 18 hits.

        "Exclude motifs with a high probability of occurrence" checked: no hits.

 

ORF 2: "Exclude motifs with a high probability of occurrence" unchecked: 12 hits.

        "Exclude motifs with a high probability of occurrence" checked: 3 hits.

                Hits by patterns: [3 hits (by 3 distinct patterns) on 1 sequence]

                PS00867   CPSASE_2   Carbamoyl-phosphate synthase subdomain signature 2 :

                1 - 8:   MIEMNTRI

                PS00397   RECOMBINASES_1   Site-specific recombinases active site :

                9 - 17:   YLRASTkdQ

                PS00398   RECOMBINASES_2   Site-specific recombinases signature 2 :

                65 - 77:   GDtlLVeSIDRLS

 

ORF 3: "Exclude motifs with a high probability of occurrence" unchecked: 11 hits.

        "Exclude motifs with a high probability of occurrence" checked: no hits.

 

ORF 4: "Exclude motifs with a high probability of occurrence" unchecked: 17 hits.

        "Exclude motifs with a high probability of occurrence" checked: 2 hits.

                Hits by profiles: [1 hit (by 1 profile) on 1 sequence] score = 21.989

                PS50931   HTH_LYSR   LysR-type HTH domain profile :

                DNA_BIND 20 39 H-T-H motif (By similarity) [condition: none]

                Hits by patterns: [1 hit (by 1 pattern) on 1 sequence]

                PS00606   B_KETOACYL_SYNTHASE   Beta-ketoacyl synthases active site

 

ORF 5: "Exclude motifs with a high probability of occurrence" unchecked: 5 hits.

        "Exclude motifs with a high probability of occurrence" checked: no hits.

 

ORF 6: "Exclude motifs with a high probability of occurrence" unchecked: 2 hits.

        "Exclude motifs with a high probability of occurrence" checked: no hits.

 

ORF 7: "Exclude motifs with a high probability of occurrence" unchecked: 20 hits.

        "Exclude motifs with a high probability of occurrence" checked: 2 hits.

                Hits by patterns: [2 hits (by 2 distinct patterns) on 1 sequence]

                PS00878   ODR_DC_2_1   Orn/DAP/Arg decarboxylases family 2 pyridoxal-P attachment site :

                PS00879   ODR_DC_2_2   Orn/DAP/Arg decarboxylases family 2 signature 2 :

 

ORF 8: "Exclude motifs with a high probability of occurrence" unchecked: 8 hits.

        "Exclude motifs with a high probability of occurrence" checked: no hits.

 

ORF 9: "Exclude motifs with a high probability of occurrence" unchecked: 5 hits.

        "Exclude motifs with a high probability of occurrence" checked: no hits.
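A pattern hit such as the one reported for ORF 2 at 9 - 17 can be reproduced roughly with a regular expression. PROSITE patterns separate positions with "-", use x for any residue, [..] for allowed residues, {..} for forbidden residues, and (n) or (n,m) for repeats. The sketch below converts that subset of the syntax to a Python regex. The pattern string is an assumption written to resemble the site-specific recombinase active site entry and should be checked against the PROSITE record itself; the 17-residue sequence is assembled from the two fragments reported above for ORF 2 (1 - 8 and 9 - 17), with case ignored. Function names are hypothetical.

```python
import re

ELEMENT = re.compile(r"([A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern (common subset) to a regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # forbidden residues
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def scan_pattern(sequence, pattern):
    """Return (start, end, matched text) with 1-based inclusive positions."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.end(), m.group()) for m in regex.finditer(sequence.upper())]


# Assumed pattern, not copied from the PROSITE entry.
ASSUMED_PATTERN = "Y-[LIVAC]-R-[VA]-S-[ST]-x(2)-Q"
print(scan_pattern("MIEMNTRIYLRASTKDQ", ASSUMED_PATTERN))  # [(9, 17, 'YLRASTKDQ')]
```

Short patterns like this match by chance fairly often, which is consistent with the large difference in hit counts above between the checked and unchecked "Exclude motifs with a high probability of occurrence" option.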

 

 

5.2 Pfam (6.3.3 Scan putative functional annotation)

ORF 1: Trusted matches: DUF262

ORF 2: Trusted matches: Resolvase, HTH_7

ORF 3: Trusted matches: Sel1

ORF 4: Trusted matches: HTH_1, LysR_substrate

ORF 5: Potential matches: RNA_pol_Rpa2_4

ORF 6: No matches

ORF 7: Trusted matches: Orn_Arg_deC_N, Orn_DAP_Arg_deC

ORF 8: Potential matches: Metallothio_7

ORF 9: No matches

6.  Annotation

Regions            Frame

9879    9583       -2
7661    7266       -3
6567    6373       -2
3954    2973       1 or -3

Regions            Frame    Details        From    To      Length(na)   Length(aa)

9583    9879       -2       ORF 1 FASTA    8602    9732    1131         376
7266    7661       -3       ORF 2 FASTA    7074    7685    612          203
6373    6567       -2       ORF 3 FASTA    6340    6882    543          180
2973    3954       -3       ORF 4 FASTA    2523    3437    915          304
                            ORF 5 FASTA    3669    3896    228          75
                            ORF 6 FASTA    3903    4094    192          63
2973    3954       1        ORF 7 FASTA    3538    4788    1251         416
                            ORF 8 FASTA    3025    3426    402          133
                            ORF 9 FASTA    2743    2916    174          57
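The length columns can be cross-checked against the From and To coordinates. The sketch below assumes that coordinates are 1-based and inclusive, and that the nucleotide length includes the stop codon while the amino-acid length does not; both assumptions are consistent with all nine rows. The names ORF_TABLE, expected_lengths and inconsistent_rows are hypothetical.

```python
# (ORF number, From, To, Length(na), Length(aa)) as listed in the table
ORF_TABLE = [
    (1, 8602, 9732, 1131, 376),
    (2, 7074, 7685, 612, 203),
    (3, 6340, 6882, 543, 180),
    (4, 2523, 3437, 915, 304),
    (5, 3669, 3896, 228, 75),
    (6, 3903, 4094, 192, 63),
    (7, 3538, 4788, 1251, 416),
    (8, 3025, 3426, 402, 133),
    (9, 2743, 2916, 174, 57),
]


def expected_lengths(start, end):
    """Return (nucleotides, amino acids) for an inclusive coordinate pair."""
    nucleotides = end - start + 1
    amino_acids = nucleotides // 3 - 1  # the stop codon encodes no residue
    return nucleotides, amino_acids


def inconsistent_rows(rows):
    """Return the ORF numbers whose listed lengths disagree with the coordinates."""
    return [
        orf
        for orf, start, end, na, aa in rows
        if expected_lengths(start, end) != (na, aa)
    ]


print(inconsistent_rows(ORF_TABLE))  # [] when every row is consistent
```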