|Home | What's New? | Cephalopod Species | Cephalopod Articles | Lessons | Bookstore | Resources | About TCP | FAQs|
CephBase, OBIS and Standardization
By James Wood, Catriona Day and Ron O'Dor, Biology Department, Dalhousie University, Halifax, Canada B3H 4J1
What is CephBase?
CephBase is a dynamic HTML (DHTML), relational database-driven interactive web site, sponsored by the Sloan Foundation following the Workshop on Non-Fish Nekton in Boston, December 1997. CephBase was a demonstration project to initiate the process of consolidating life history, distribution, catch and taxonomic data on all living species of cephalopods (octopus, squid, cuttlefish and nautilus) to help define the goals of the Census of Marine Life. It is modeled loosely on FishBase, but was designed from the outset to operate dynamically from the web. We provide contact information for almost all of the world's cephalopod specialists because part of our role is to encourage data sharing and consolidation. Our goals are to provide reliable, referenced data on cephalopods and to provide a platform for collaboration both within the cephalopod community and with all marine sciences. Since CephBase went online in August 1998, we have had two citations in NetWatch (Science 282:587 and 285:2027).
How does CephBase work?
Under the hood of CephBase you will find a Microsoft Access database. This relational database holds the data in various tables. SQL (Structured Query Language) is used to manipulate these data, and an NT server running Cold Fusion (a web application server) builds dynamic web pages from the query results and serves them to users. This makes the full database accessible to anyone with a computer linked to the internet.
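To make the idea of tables queried with SQL concrete, here is a minimal sketch in Python. It is not CephBase's actual schema or code: SQLite stands in for Access, and the table names, column names and the helper functions build_demo_db and species_in_genus are all assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two related tables.

    Hypothetical schema: the table and column names are illustrative only
    and are not taken from CephBase.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE genus (
            genus_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL UNIQUE
        );
        CREATE TABLE species (
            species_id  INTEGER PRIMARY KEY,
            genus_id    INTEGER NOT NULL REFERENCES genus(genus_id),
            epithet     TEXT NOT NULL,
            common_name TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO genus (genus_id, name) VALUES (?, ?)",
        [(1, "Octopus"), (2, "Sepia"), (3, "Nautilus")],
    )
    conn.executemany(
        "INSERT INTO species (genus_id, epithet, common_name) VALUES (?, ?, ?)",
        [
            (1, "vulgaris", "common octopus"),
            (2, "officinalis", "common cuttlefish"),
            (3, "pompilius", "chambered nautilus"),
        ],
    )
    conn.commit()
    return conn


def species_in_genus(conn, genus_name):
    """Return (scientific name, common name) pairs for one genus.

    The genus name is passed as a bound parameter, never pasted into the SQL.
    """
    cursor = conn.execute(
        """
        SELECT g.name || ' ' || s.epithet, s.common_name
        FROM species AS s
        JOIN genus AS g ON g.genus_id = s.genus_id
        WHERE g.name = ?
        ORDER BY s.epithet
        """,
        (genus_name,),
    )
    return cursor.fetchall()
```

A web page generator would call something like species_in_genus(build_demo_db(), "Sepia") for each request and format the returned rows as HTML, which is roughly the role the application server plays between the database and the user's browser.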
What are the features of CephBase?
1) Classification of all known cephalopods down to subspecies level. The data are searchable at the genus and/or species level, and the classification, synonymies, type repositories, type localities, references and common names are listed.
Future Directions for CephBase?
1) Improved on-the-fly mapping system (OBIS?).
5) Maintain the International Directory of Cephalopod Workers.
OBIS and Web Philosophy
OBIS is an important concept and potentially provides a relational mapping function far superior to the Xerox PARC Map Viewer. It is critical to recognize at the outset that by adopting the distributed system approach OBIS will acquire a responsibility to a potentially vast clientele that will ultimately expect it to maintain and 'freeze' its service, just as Xerox PARC did. Once large numbers of people came to depend on the Map Viewer, Xerox was essentially 'forced' to maintain it. OBIS could also prove to be very attractive. We think it is generally desirable for OBIS to be as useful as possible, but, unless access is restricted in some way, it is important to establish a physical facility that anticipates considerable growth over the next decade.
The distributed approach makes it possible to think separately about database sites and functionally organized analytical sites like OBIS. Databases should likely be organized taxonomically because that is the way biological expertise is organized, but they should be compatible with many different types of analytical engines because that is the way analytical expertise is organized. Geographic analysis is only one way of looking at data. An important aspect of the distributed approach is that financial support for these various elements may come from a variety of sources, from phylum- or class-oriented societies to fisheries managers and oil companies.
One of the main meeting goals is "to exchange information on existing systems and databases on marine organisms." The classification on CephBase is an example of what can be done by a taxon-specialized research team. The databases should not just be taxonomic, however. They should include distribution, ecology, behavior, migrations, life history traits, fisheries catch statistics, population genetics, publications, researchers, etc. Obviously, different taxonomic groups have to be farmed out to specialists. Just as important, the data need to be in a consistent and expandable format. This does not mean that they have to be compiled on one system, although a central storage site is an option. The disadvantage of centralized databases is that they all go down at once and that remote management presents more of a security challenge. Perhaps the key from Sloan's perspective could be to use funding strategically to encourage maximum compatibility between various distributed databases and analytical sites. OBIS seems to be heading in the right direction, but the more open we make the model, the more collaboration there will be.
The concept of a distributed system is excellent and consistent with the general philosophy of web use. It is ridiculous to imagine that all of the world's best taxonomic authorities on various animal groups would ever be in one place. Therefore, the databases should be wherever the authorities can best manage them. Given the flexibility of the web, this is not a simple question. For example, when James Wood completes his PhD, should CephBase remain at Dalhousie, move to the location of his new job, or move to OBIS at Rutgers? Actually he could probably manage the dataset remotely, making any of these options plausible. We suggest that the best arrangement may be to handle taxa the way web domains are handled, i.e. responsibility is distributed hierarchically. OBIS (or some agency) assigns responsibility for phyla to authorities, who then assign responsibility for classes, etc. Authority would include ensuring that appropriate biologically qualified people were in control of databases, as well as ensuring that links were maintained as people and databases moved. It is equally important to establish some sort of systematic financial basis for such a network. There is currently no funding to maintain or expand CephBase at Dalhousie, for example. Although, generally speaking, maintenance costs are low compared to development costs, data continue to increase.
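The domain-style delegation suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the delegation table, the steward labels and the function responsible_authority are hypothetical, and the lineage is an abbreviated classification for the genus Octopus.

```python
# Abbreviated lineage for the genus Octopus, most general rank first.
LINEAGE = ["Mollusca", "Cephalopoda", "Octopoda", "Octopodidae", "Octopus"]

# Hypothetical delegation table: which taxa have been handed to a steward.
# Neither entry describes a real agreement.
DELEGATIONS = {
    "Mollusca": "phylum-level authority (hypothetical)",
    "Cephalopoda": "CephBase",
}


def responsible_authority(lineage, delegations, root="OBIS"):
    """Return the steward of the most specific delegated taxon in a lineage.

    Works like a domain lookup: start at the most specific name and walk
    toward the root until a delegation is found. If nothing in the lineage
    has been delegated, responsibility stays with the root agency.
    """
    for taxon in reversed(lineage):
        if taxon in delegations:
            return delegations[taxon]
    return root
```

With this table, a request about Octopus resolves to the class-level steward, a request about an undelegated class within Mollusca falls back to the phylum-level steward, and anything outside the table stays with the root. When a database moves, only its one entry in the delegation table has to change, which is the property that makes links easier to maintain as people and databases move.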
In the distributed system concept it should be clear that OBIS does not span the full range of analysis that managers of databases like CephBase or FishBase may wish to use. Predator/prey data could interact powerfully with distribution data to clarify ecosystem relations to the environment, for example. Sorting out current taxonomy is potentially a large and complex function on its own, but huge databases exist on fisheries catch statistics that are not specimen documented (people ate them!), which provide important information on temporal changes. Time series analyses are as important as spatial analyses, and, of course, we really need to think and analyze in four dimensions. Sites exist where time series analyses can be done, such as Ram Myers' Fish Stock-Recruitment site (http://www.mscs.dal.ca/~myers/welcome.html), but these, too, need expansion.
To help the transition, these are issues CephBase dealt with historically or faces now:
1) Currently everybody has separate databases; the longer this continues, the more difficult it will be to integrate them in the future. It is much easier to change, edit or integrate databases when they are few and small (think of an exponentially growing problem!). If we had a database 'form' to plug into, we would now have two or three times as much data stored.
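One way to picture a shared database 'form' is as a single record definition that every contributing database fills in and validates the same way. The sketch below is an assumption made for illustration, not a proposed standard: the OccurrenceRecord fields, the validation rules and the merge helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OccurrenceRecord:
    """Hypothetical shared 'form' for one occurrence of a species."""

    scientific_name: str
    latitude: float           # decimal degrees, north positive
    longitude: float          # decimal degrees, east positive
    year: Optional[int] = None
    reference: Optional[str] = None

    def __post_init__(self):
        # Reject records that could not be mapped or traced back to a taxon.
        if not self.scientific_name.strip():
            raise ValueError("scientific_name is required")
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")


def merge(*sources):
    """Combine records from several databases, dropping exact duplicates.

    Order is preserved, so the first database to report a record wins.
    """
    seen = set()
    merged = []
    for source in sources:
        for record in source:
            if record not in seen:
                seen.add(record)
                merged.append(record)
    return merged
```

Because every source produces the same record type, integrating them reduces to a call such as merge(database_a_records, database_b_records). Without a shared form, each new pair of databases needs its own conversion, which is why the problem grows so quickly as the number of separate databases increases.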
Information and goals
When setting out goals, keep in mind that this new medium is excellent at handling certain kinds of information and not so good at others. Databases do extremely well with hierarchical information such as classification. Standard fields such as weight or age at maturity also work well, but we must decide which population and what references to follow. Images can work well but take up huge amounts of space and download time. Data in odd formats, data fields of widely varying size, datasets with many missing fields, and repeated data can all cause problems. Basically, it is important to have some idea about what can and can't be done if you want to set realistic goals.
Right now CephBase, FishBase and OBIS are all using Access. Although we often have similar data fields, no doubt our tables and the relationships between them are very different. Left on its own, CephBase would move to a more secure client-server model using MS SQL Server 7, while OBIS has stated a preference for Oracle. Standards need to be agreed on and put in place as soon as possible, but funding must be made available to convert to those standards; otherwise the distributed database will shrink rather than grow.
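Converting to an agreed standard largely means renaming and reshaping each database's fields into the common ones. The following Python sketch shows that step under loud assumptions: the source names database_a and database_b, every field name, and the function to_standard are invented for illustration and do not describe the real CephBase, FishBase or OBIS tables.

```python
# Hypothetical field-name mappings from two source databases to a shared
# standard. None of these names reflect any real database's tables.
FIELD_MAPS = {
    "database_a": {
        "Genus": "genus",
        "Species": "species",
        "CommonName": "common_name",
    },
    "database_b": {
        "gen_name": "genus",
        "sp_name": "species",
        "vernacular": "common_name",
    },
}


def to_standard(row, source):
    """Rename the fields of one row to the shared standard names.

    Unmapped fields raise an error instead of being silently dropped, so
    gaps in the mapping are noticed during conversion.
    """
    mapping = FIELD_MAPS[source]
    unknown = set(row) - set(mapping)
    if unknown:
        raise KeyError(f"unmapped fields from {source}: {sorted(unknown)}")
    return {mapping[name]: value for name, value in row.items()}
```

Two rows that describe the same species with different column names come out identical after conversion, for example to_standard({"Genus": "Sepia", "Species": "officinalis"}, "database_a") and to_standard({"gen_name": "Sepia", "sp_name": "officinalis"}, "database_b"). Writing and maintaining mappings like these for real tables and relationships is the conversion work that needs funding.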
James B. Wood or Ron O'Dor tel: (902) 494-6697
The Cephalopod Page (TCP), © Copyright 1995-2015, was created and is maintained by Dr. James B. Wood, Associate Director of the Waikiki Aquarium which is part of the University of Hawaii. Please see the FAQs page for cephalopod questions, Marine Invertebrates of Bermuda for information on other invertebrates, and MarineBio.org and the Census of Marine Life for general information on marine biology.