Auto Annotation is a great tool for curating your sequences. For example, if you receive an unannotated sequence, you can scan it against other sequences to find matching annotations. Blasting an unknown sequence, fetching the hits, then annotating the unknown sequence against those hits is also a good way to learn more about it. You can also use Auto Annotation to give your sequences a consistent look and feel.

For example, you could make all CDS features green, or display all AMP genes as blue arrows.

The following script will take a folder of blank sequences, then scan them against a curated folder of annotated sequences. It will add any annotation it finds.

The script is very simple and could be enhanced considerably, but it serves as a basis if you need to develop your own scripts. If you need any assistance with developing scripts for a specific purpose, please email support and we will be happy to help.

set inputFolder to (choose folder with prompt "Select Folder of MV files to annotate:")

The Analyze | Reverse Translation menu option lets you create a DNA sequence from a protein sequence, reverse translated using a specific genetic code (by default, the Universal Genetic Code). The default option creates a DNA sequence with N's and other ambiguity codes reflecting the degeneracy of the genetic code. This is great if you want to identify less ambiguous sections for designing probes or primers. In fact, MacVector will even display a list of probes with the fewest ambiguities.
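To make the idea of a degenerate reverse translation concrete, here is a minimal Python sketch. It is not MacVector's implementation: it simply collapses all codons for each amino acid in the standard genetic code into one codon written with IUPAC ambiguity codes. The function names are made up for illustration. Note that this naive per-position approach over-generalizes for amino acids with six codons (Leu, Ser, Arg), where the ambiguous codon also matches codons for other amino acids.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS = {}
for codon, aa in zip(("".join(p) for p in product(BASES, repeat=3)), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append(codon)

# IUPAC ambiguity codes, keyed by the set of bases each code stands for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def degenerate_codon(aa):
    """Collapse every codon for one amino acid into a single ambiguous codon."""
    return "".join(
        IUPAC[frozenset(codon[i] for codon in CODONS[aa])] for i in range(3)
    )


def reverse_translate(protein):
    """Reverse translate a protein string into a degenerate DNA string."""
    return "".join(degenerate_codon(aa) for aa in protein.upper())


print(reverse_translate("MFG"))  # ATGTTYGGN
```

Met and Trp each have a single codon, so stretches rich in those residues produce the least ambiguous DNA, which is why such regions are attractive for probe design.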

However, MacVector also offers an optimization function if you are interested in designing a gene with codon usage optimized for expression in a particular organism.

To use this function, you need to supply a codon usage table. A number of common tables are shipped with MacVector in the /Applications/MacVector/Codon Bias Tables/ directory. MacVector provides four different algorithms for optimizing codon usage:

• Most Frequently Used Codon – this simply uses the most commonly occurring codon for each amino acid. So if, for example, the most common Leu codon is CTC, all Leu codons will be CTC. This is perhaps only useful if you want to design a "best guess" primer and are willing to accept a certain failure rate. If you used this to optimize expression, the host would likely run out of that tRNA and you wouldn't see optimal expression.

• Frequency Distribution – this selects a random codon for each amino acid, biased towards the most commonly used codons that encode that amino acid. Each time you run the algorithm, a different, random set of codons will be selected. If you were to generate a new DNA sequence over and over again, you would eventually build up a collection of sequences whose average codon usage matches that of the .bias organism. But any individual reverse translation may, by chance, be quite different.

• Probability Distribution – this is probably the most powerful setting if you are interested in expression. Similar to the Frequency Distribution, this chooses a random codon, biased towards the most frequently used codons for each amino acid. However, this version tries to ensure that the final DNA sequence has a codon usage profile that matches the codon usage in the selected .bias file as closely as possible. Again, each time you invoke the algorithm, it will produce a different sequence. But because the overall codon usage in the DNA sequence is guaranteed to be as close as possible to the codon usage in the .bias organism, this should, in theory, give you the best chance of high expression.

• Uniform Distribution – this ignores the usage of each codon and randomly assigns an appropriate codon for each amino acid. It's similar to the default algorithm that uses ambiguities to create an "absolute" coding DNA, but here it just chooses a random codon with no regard for codon usage probability. Again, you will get a different sequence each time you invoke this.
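The differences between the four strategies are easier to see side by side. The Python sketch below is only an illustration of the ideas described in the list above, not MacVector's actual code. The usage table is a made-up example (counts per 100 codons for three amino acids), not data from a real .bias file, and all function names are hypothetical.

```python
import random
from collections import Counter

# Hypothetical codon usage: counts per 100 codons of each amino acid.
USAGE = {
    "L": {"CTG": 50, "CTC": 20, "TTA": 10, "TTG": 10, "CTT": 5, "CTA": 5},
    "K": {"AAA": 75, "AAG": 25},
    "M": {"ATG": 100},
}


def most_frequent(protein):
    """Always use the single most common codon for each amino acid."""
    return "".join(max(USAGE[aa], key=USAGE[aa].get) for aa in protein)


def frequency_distribution(protein, rng):
    """Pick each codon independently at random, weighted by its usage."""
    picks = []
    for aa in protein:
        codons = list(USAGE[aa])
        weights = [USAGE[aa][c] for c in codons]
        picks.append(rng.choices(codons, weights=weights)[0])
    return "".join(picks)


def uniform_distribution(protein, rng):
    """Pick each codon at random, ignoring codon usage entirely."""
    return "".join(rng.choice(list(USAGE[aa])) for aa in protein)


def probability_distribution(protein, rng):
    """Fix per-codon counts so overall usage matches the table as closely
    as possible (largest-remainder rounding), then shuffle their positions."""
    positions = {}
    for i, aa in enumerate(protein):
        positions.setdefault(aa, []).append(i)
    result = [None] * len(protein)
    for aa, idxs in positions.items():
        n, table = len(idxs), USAGE[aa]
        total = sum(table.values())
        counts = {c: k * n // total for c, k in table.items()}
        leftover = n - sum(counts.values())
        by_remainder = sorted(table, key=lambda c: table[c] * n % total, reverse=True)
        for c in by_remainder[:leftover]:
            counts[c] += 1
        pool = [c for c, k in counts.items() for _ in range(k)]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            result[i] = codon
    return "".join(result)


def split_codons(dna):
    """Split a DNA string into a list of three-base codons."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


protein = "M" + "K" * 4 + "L" * 20
rng = random.Random(0)  # seeded only so the example is repeatable
print(most_frequent(protein))
print(Counter(split_codons(probability_distribution(protein, rng))))
```

In this sketch the Probability Distribution variant decides up front how many times each codon will appear and only randomizes where, so every run has the same overall codon counts. The Frequency Distribution variant makes an independent weighted draw at each position, so a single run can drift well away from the table.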

All analysis results for an individual sequence are collected into a single tabbed result window to reduce window clutter. However, there are times when it is very convenient to have results displayed in side-by-side windows. For example, if you run a dot plot you can zoom in to view sections of the comparison by drag-selecting over a region of interest in the Matrix Plot tab and the Aligned Sequence tab will update to only display the text alignments across the new selection. Constantly toggling between the tabs to drill down to the region you are interested in (e.g. a potential splice site on a genome versus cDNA alignment) can be very frustrating.

All you need to do is to click on the title of a tab, hold down the mouse button, then drag the selected tab away from the parent window. When you let go of the mouse button, a new window will be created containing just that single tab.

Not only that, you can organize the tabs into multiple windows if you like. If you drag a tab from one window and drop it onto the tab bar of another window (this only works on the tab bar, you can’t drop on the content region of a window), then the tab will be added to the target window.

UPDATE – November 8, 2016 – Although the official switch off date is not until Wednesday the 9th, Entrez and BLAST are NOT currently working from MacVector 15.0.3 and earlier. We suspect this change is now permanent.

UPDATE – November 7, 2016 – We have just been notified by the NCBI that the server changes that will prevent Entrez and BLAST from working in MacVector 15.0 and earlier versions will be made this Wednesday, 9th November, NOT on the 1st of December as we were previously told.

The NCBI hosts one of the world's definitive sequence repositories and MacVector gives you direct access to sequences from these databases. Unfortunately, due to recently announced major infrastructure changes at the NCBI, MacVector's ability to access these services will be severely impacted from December of this year. These changes are completely out of our control, but our developers have been hard at work to resolve this. We are pleased to announce that we are about to release a new version that will restore this functionality: MacVector 15.1.

ENTREZ AND BLAST

MacVector gives you access to the NCBI's Entrez database. This allows you to search for and retrieve sequences directly to your desktop. Both simple and complex queries are allowed. For example, you can search for all human kinases or for specific accession numbers.

MacVector also allows you to perform BLAST searches directly from your desktop against the NCBI's BLAST databases. For example, you can search protein or gene sequences against published sequence databases, or even search a reverse translated protein against DNA databases. You can very easily retrieve any hits directly to your desktop as well as view the alignments.

From September to December the NCBI are making two major infrastructure changes that will impact MacVector's use of these two services.

HTTP to HTTPS

This change is part of a US government-wide move to ensure safe access to websites for all. HTTP is insecure and HTTPS is much safer. The US government website explains this further.

Accession number, GI numbers and versioning

From September onwards, the NCBI are gradually changing the way sequences are referenced, and therefore retrieved. Previously (since 1994), every sequence was assigned a GI number on submission as well as an accession number. An accession number never changes, but the GI number would change with each new version of a sequence submission. From September 2016, all new sequence submissions will be assigned a single identifier instead: the accession number AND version combined. The redundant GI number will no longer be assigned (although it will be a long time until older sequences have it replaced with the new form). The accession number and version will always be read together, so you now have a simpler way of referencing a sequence and, more importantly, a human readable way of determining the version.

NCBI Toolkit and Entrez Programming Utilities

The NCBI offer two toolkits for accessing these services from within a software application: NCBI Toolkit and Entrez Programming Utilities (E-Utils). The NCBI have released a new version of E-Utils that supports HTTPS connections and ACCESSION.VERSION numbering. However, NCBI Toolkit is not going to be updated. MacVector 15.0.3 uses NCBI Toolkit, and this unfortunately means that from December MacVector 15.0.3 will no longer be able to access Entrez or BLAST.

MacVector 15.1

This change was announced at short notice, and since the announcement our developers have been working extremely hard to migrate MacVector to use E-Utilities.
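For readers curious what the two NCBI changes look like in practice, here is a minimal Python sketch. It has nothing to do with MacVector's own code: it just splits an ACCESSION.VERSION identifier into its parts and builds an HTTPS request URL for the E-Utilities EFetch service. The helper names are made up for illustration, and the accession used is only an example value. The sketch builds the URL but does not contact the server.

```python
from urllib.parse import urlencode

# Base address of the NCBI EFetch utility, reached over HTTPS.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def split_accession(identifier):
    """Split 'ACCESSION.VERSION' into (accession, version).

    Returns (identifier, None) if no numeric version suffix is present.
    """
    accession, dot, version = identifier.rpartition(".")
    if dot and version.isdigit():
        return accession, int(version)
    return identifier, None


def efetch_fasta_url(identifier, db="nuccore"):
    """Build an HTTPS EFetch URL requesting one record in FASTA format."""
    params = {"db": db, "id": identifier, "rettype": "fasta", "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


print(split_accession("NM_000546.6"))   # ('NM_000546', 6)
print(efetch_fasta_url("NM_000546.6"))
```

The version is carried in the identifier itself, so a reader can tell at a glance which revision of a record is being requested, with no separate GI number involved.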

We're pleased to announce that MacVector 15.1 contains a whole new implementation of the BLAST and Entrez tools. It will be released very shortly, a few months before these services are finally switched off.

To avoid any interruption in service, please ensure that you have downloaded and installed MacVector 15.1 before the start of December.

MacVector 15.5 and beyond

Although our hand has been forced somewhat with the release of MacVector 15.1, we are still planning to release MacVector 15.5. Since the BLAST and Entrez tools have now been rewritten, expect to see even more enhancements to these tools in future releases.