ICTVdB - A Universal Virus Database

Descriptions of ICTV Approved Virus Taxa

C. Büchen-Osmond and M. J. Dallwitz

The ICTVdB Project

In 1991, at a meeting held in Atlanta, Georgia, USA, the Executive Committee of the ICTV made the extraordinary decision to commit the ICTV to developing a universal virus database (ICTVdB). This commitment was prompted, among other things, by a petition brought forward by the American Type Culture Collection (ATCC). The petition, in the form of a letter to the President of the ICTV, had been drafted during a workshop that Lois Blaine had organised at the ATCC, Rockville, Maryland, USA, in March 1990. At that workshop, leading virologists and database developers discussed the need to standardise virus descriptors, the cornerstone for the development of a comprehensive virus database. It was agreed unanimously that a standardised list of characters (descriptors) for all viruses and the development of a universal database were urgently needed. However, it was felt that the data model for such a database should be developed under the auspices of the ICTV.

The goal of the ICTVdB is to describe all viruses of animals (vertebrates, invertebrates, protozoa), plants (higher plants and algae), bacteria, fungi, and archaea from the family level down to strains and isolates. The lower levels are especially important, not only for applications in medicine and agriculture but also for the insight they give into evolutionary trends. The database will benefit research and applications, and people at all levels of expertise.

Accessibility of an identification tool, such as the WWW version of the ICTVdB, is of particular importance in remote areas, which commonly do not have access to the wealth of knowledge and equipment available at highly specialised medical and research centres, but do have access to the Internet. Thus descriptions at the low end of the taxonomic hierarchy must include all information needed for an unambiguous identification of, and differentiation between, strains and isolates. The precision of the identification becomes especially important when the unknown is a new, re-emerging or uncommon virus.

The heart of such a universal virus database is a standardised character list capable of describing unambiguously all viruses of humans, animals, plants, invertebrates, protozoa, bacteria and fungi. In 1991 the development of such a list began under the sponsorship of the American Type Culture Collection with the support of an NSF Grant (DIR-91-07464).

Although many virus descriptors are available and have been in use for many years, it has proven almost impossible to come up with a terminology acceptable to all cultures in virology. By the same token, the advent of molecular biology brought so many new aspects and data important for the concise description of a virus that it is difficult to pick only the most basic characteristics without losing the specificity that matters during the identification process. Thus it was necessary to rethink the concept of a list of standardised characters and the basic layout of a virus database.

The standardised character list for the ICTVdB should conform to the ICTV descriptions of virus families and genera, but should be extensible to allow the incorporation of the detailed information required to differentiate between strains and isolates. The currently accepted guidelines for the ICTV Reports provide the basis for the section layout of the character list and the framework for the virus descriptions in the database.

The ICTV descriptions of the virus families and genera are the source from which the single property statements are drawn to prepare a list of descriptors. The primary list of descriptors, which evolves from partitioning the different virus family descriptions into single property statements, will contain redundancies and non-standardised wording of similar features and characteristics. These lists have to be streamlined, and it takes hard decisions to reduce the number of descriptors without losing unique features and specificity. The different terminologies for similar or identical features prevalent in the different cultures of virology make the preparation of universally acceptable descriptors even harder. Thus progress in the development of a standardised character list, and in particular the reviewing of the list by the Data Subcommittee and Study Groups of the ICTV, has been much slower than anticipated.

The database is still in its infancy, and only the morphological and genomic aspects have been recorded for all virus families and genera, including their type species. The descriptions are drawn from the most recent ICTV Report on the "Classification and Nomenclature of Viruses" (Murphy et al. 1995). The description of the particle morphology has been tackled first because the shape of a virus is the most important characteristic for identifying a virus down to the genus level. Primary lists of descriptors exist for most other particle and biological properties; they await culling, amalgamation, sorting and reviewing.

The ICTVdB uses DELTA, the DEscription Language for TAxonomy, developed by Dallwitz et al. (1993). As the character list develops, in particular for the biological aspects, all known virus strains and isolates will be described. At this stage of development, the descriptions contain genome accession numbers and reference lists. Thus far, only some descriptions include images (mainly electron micrographs of the virus particle), but it is planned to add many more as the descriptions evolve, such as gene maps, distribution maps, images of host and vector, and images of symptoms and histopathology. The existing sections of the virus descriptions have been formatted for the World-Wide Web (WWW) and are accessible to everyone. We hope that this accessibility will improve and speed up the reviewing process by the Study Groups of the ICTV.

The WWW provides an ideal platform to pull together all presently available data related to viruses and thus to augment the still sketchy data of the ICTVdB. Many more links to other databases will be incorporated so that, for example, protein composition and structure can be looked up in a protein database instead of being incorporated in such detail into the ICTVdB itself. In the near future it will be possible to interrogate the database by an interactive keying system.

Decimal code - a tool for virus classification

It is often difficult to uniquely identify each virus, because virus names for the family, genus and species can be very similar and the vernacular name usually does not indicate the family or genus to which the virus belongs. Thus, for the purposes of the ICTVdB it was found convenient to use a decimal numbering system similar to that used for enzyme nomenclature. The families have been sorted in alphabetical order and each has been assigned a number which represents a particular family, or a genus if the genus is not yet assigned to a family (Table 1). The system can carry more levels to accommodate strains and isolates.

Table 1: Decimal classification illustrated from family to species in the Parvoviridae

This type of numbering system enables the user to follow the path linking the various features of one genus or family, and permits the presentation of similarities between different groups on more than one level. The numbering system also gives an internal structure to the database that indicates the descriptors needed for completing the coding of a specific virus or data set. With this numbering system the database is structured such that we do not need to repeat the same information between levels. When calling up a description of a strain we can supply the pointers to the next higher level where the full description of the species is stored as illustrated in Table 2.

Table 2: Natural language excerpt from ICTVdB illustrating the economic, non-redundant accumulation of information using the decimalised virus nomenclature.

50. Parvoviridae

50.1 Parvovirinae

50.1.1 Parvovirus mice minute virus
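
The pointer scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DELTA-based implementation used by the ICTVdB: the function names `ancestor_codes` and `full_description`, the `records` dictionary and its keys are all hypothetical, and only the three codes and taxon names are taken from Table 2. The idea is that each level stores just its own statements, and a complete description is assembled by following the decimal code upwards.

```python
def ancestor_codes(code: str) -> list[str]:
    """Return the decimal codes of all higher levels, nearest level first."""
    parts = code.rstrip(".").split(".")
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


def full_description(code: str, records: dict[str, dict[str, str]]) -> dict[str, str]:
    """Merge a taxon's own record with the records of its higher levels."""
    merged: dict[str, str] = {}
    # Start at the family and work downwards so that a lower level
    # can override a statement made at a higher level.
    for level in reversed(ancestor_codes(code)):
        merged.update(records.get(level, {}))
    merged.update(records.get(code.rstrip("."), {}))
    return merged


# Hypothetical records: names from Table 2, dictionary keys are illustrative.
records = {
    "50": {"family": "Parvoviridae"},
    "50.1": {"subfamily": "Parvovirinae"},
    "50.1.1": {"genus": "Parvovirus"},
}

print(ancestor_codes("50.1.1"))
print(full_description("50.1.1", records))
```

Because nothing is copied between levels, a correction made once at the family level would appear in every description assembled beneath it.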

The numbers assigned to each virus also serve as locator numbers within the database and as an unchanging reference. The locator number is easily transformed into a file accession number by removing the decimal points and filling the result up to 8 digits. In our example of the family Parvoviridae, the family level becomes 50000000, the genus level (the genus Parvovirus) becomes 50110000, and the type species, mice minute virus, becomes 50110001. These accession numbers are used throughout the ICTVdB as file names to access the computer-generated virus descriptions.
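
The transformation from locator number to accession number can be sketched as follows. The function name `accession_number` and its `width` parameter are hypothetical, and the sketch assumes that "filling up" means appending zeros on the right, which is consistent with the family and genus examples given above. The decimal code below the genus level is not shown in this text, so the species example is not reproduced here.

```python
def accession_number(locator: str, width: int = 8) -> str:
    """Turn a decimal locator such as '50.1.1' into a fixed-width accession number."""
    digits = locator.replace(".", "")
    if not digits.isdigit():
        raise ValueError(f"locator must contain only digits and dots: {locator!r}")
    if len(digits) > width:
        raise ValueError(f"locator {locator!r} does not fit into {width} digits")
    # Assumption: unused lower levels are filled with zeros on the right.
    return digits.ljust(width, "0")


print(accession_number("50."))     # family Parvoviridae -> 50000000
print(accession_number("50.1.1"))  # genus Parvovirus    -> 50110000
```

A fixed width keeps every file name the same length, and the leading digits still identify the family and genus to which a description belongs.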

All data presented here are drawn from

The Classification and Nomenclature of Viruses

"Fifth Report of the International Committee on Taxonomy of Viruses" (1991)

Eds. R.I.B. Francki, C.M. Fauquet, D.L. Knudson, F. Brown
Archives of Virology, Springer Verlag, Wien New York

"Sixth Report of the International Committee on Taxonomy of Viruses" (1995)

Eds. F.A. Murphy, C.M. Fauquet, M.A. Mayo, A.W. Jarvis, S.A. Ghabrial,
M.D. Summers, G.P. Martelli, D.H.L. Bishop
Archives of Virology, Springer Verlag, Wien New York

Created by C. Büchen-Osmond April, 95 Last updated: 30 November, 1995


© 1995-2007. D. Sander Established 5/95.