The sequence rules require the use of standard symbols and a standard format for submitting sequence data in most patent applications that disclose nucleic acid or amino acid sequences. On an individual patent record in The Lens, the sequences tab lets you explore all the biological sequences disclosed within the patent. You can also check how many sequences are available for that patent and the source of the data.
Individual Patent Bibliography
From the results page or an individual patent record, look in the bibliography section of the patent for the additional information label captioned “Sequence”. Clicking this button will lead you to the sequence tab of the page.
On an individual patent or scholarly work, you can find the tab for the sequences. Here you can filter, refine and explore all of the disclosed biological sequences, using data obtained from EMBL-EBI Pat, DDBJ Pat, USPTO Fulltext and NCBI GenBank.
How to Use
You can filter the sequences by 1) their type (peptide (protein) or nucleotide (DNA or RNA)), 2) their length, with various length categories to choose from, and 3) their location in the document (claims, drawings, description, example, etc., whenever we were able to locate the sequence). You can also select a specific SEQ ID NO to view, which is handy when a patent document discloses a large number of sequences. Sequences referenced in the Claims section are usually important, since they may be critical to the invention and to the rights granted or to be granted, so filtering by location is particularly useful when a patent discloses hundreds of sequences.
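If you have copied the sequence information into your own script, the same three filters can be reproduced offline. The following is only a minimal sketch: the `SequenceEntry` class, its field names, the `filter_entries` function and the example values are all hypothetical and are not part of The Lens.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class SequenceEntry:
    """Hypothetical record for one disclosed sequence (not a Lens data type)."""
    seq_id_no: int
    seq_type: str              # "nucleotide" or "peptide"
    length: int                # number of nucleotides or amino acids
    locations: FrozenSet[str]  # e.g. {"claims", "description"}


def filter_entries(
    entries: Iterable[SequenceEntry],
    seq_type: Optional[str] = None,
    min_length: Optional[int] = None,
    max_length: Optional[int] = None,
    location: Optional[str] = None,
) -> List[SequenceEntry]:
    """Keep only the entries that satisfy every filter that was supplied."""
    selected = []
    for entry in entries:
        if seq_type is not None and entry.seq_type != seq_type:
            continue
        if min_length is not None and entry.length < min_length:
            continue
        if max_length is not None and entry.length > max_length:
            continue
        if location is not None and location not in entry.locations:
            continue
        selected.append(entry)
    return selected


# Made-up entries, for illustration only.
entries = [
    SequenceEntry(1, "nucleotide", 1200, frozenset({"claims", "description"})),
    SequenceEntry(2, "peptide", 399, frozenset({"description"})),
    SequenceEntry(3, "nucleotide", 24, frozenset({"example"})),
]

claimed = filter_entries(entries, location="claims")
print([e.seq_id_no for e in claimed])  # prints [1]
```

Leaving a filter as `None` means "do not filter on this property", which mirrors leaving a filter unselected on the page.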
You will see options to download various categories of the sequences as a single multi-FASTA file. FASTA is a standard text format for biological sequences. Note that the files may sometimes be saved as “.fasta.txt”, and you may need to rename them to just “.fasta” before loading them into bioinformatics programs.
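A downloaded multi-FASTA file can be renamed and read with a few lines of Python. This is a minimal sketch using only the standard library; the function names are hypothetical, and the header text in the example is made up, since the exact header layout of the downloaded files is not described here.

```python
from pathlib import Path
from typing import List, Tuple


def normalize_fasta_name(path: Path) -> Path:
    """Return the path with a trailing '.fasta.txt' shortened to '.fasta'."""
    if path.name.lower().endswith(".fasta.txt"):
        return path.with_name(path.name[: -len(".txt")])
    return path


def parse_multi_fasta(text: str) -> List[Tuple[str, str]]:
    """Split multi-FASTA text into (header, sequence) pairs."""
    records = []
    header = None
    chunks: List[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence data may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Illustrative input only; real headers will look different.
example = ">seq_1 example\nACGTAC\nGGTA\n>seq_2 example\nMKVLA\n"
for header, sequence in parse_multi_fasta(example):
    print(header, len(sequence))
```

To rename a file on disk, you could call `path.rename(normalize_fasta_name(path))` on a `Path` pointing at the download.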
Displaying the Sequence Information Table
All sequences in the document are displayed in a table format with the column headings:
- SEQ ID NO – the sequence ID number; clicking it will take you to that sequence’s specific page
- Length – the number of nucleotides or amino acids in the sequence
- Sequence type – nucleotide or protein
- Locations – where the sequence appears in the document (claims, drawings, description, summary, example, or undetermined)
- Declared Organism – the declared species of organism for this sequence, where applicable
- Determined Organism – the determined species of organism for this sequence
- NCBI Entrez GI – the GI number of, and a link to, this sequence in NCBI’s Entrez database, whenever available. More about GI numbers.
- Blast Search – quick links to search for sequences similar to this one, either within the Patent Sequences database using PatSeq Finder in The Lens or within NCBI’s GenBank
Sequence View Page
When you click on a sequence (SEQ ID NO X), you will be able to see the actual sequence information in FASTA format. You can download this sequence individually or copy it for use elsewhere.
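If you copy a sequence and need to turn it back into a FASTA record for another tool, a small helper like the one below can do it. This is a sketch under stated assumptions: the `to_fasta` function, the header text and the 60-character line width are illustrative choices, not something The Lens prescribes.

```python
def to_fasta(header: str, sequence: str, width: int = 60) -> str:
    """Format one sequence as a FASTA record with fixed-width sequence lines."""
    # Remove spaces and line breaks that may come along when copying.
    cleaned = "".join(sequence.split())
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


# Hypothetical header and sequence, for illustration only.
print(to_fasta("SEQ_ID_NO_1 example", "ACGT ACGT\nAC", width=4), end="")
```

The record starts with a `>` header line, followed by the sequence wrapped at `width` characters per line.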