CephBase, OBIS and Standardization

By James Wood, Catriona Day and Ron O'Dor, Biology Department, Dalhousie University, Halifax, Canada B3H 4J1

What is CephBase?

CephBase is a dynamic HTML (DHTML), relational database-driven interactive web page, sponsored by the Sloan Foundation following the Workshop on Non-Fish Nekton in Boston, December, 1997. CephBase was a demonstration project to initiate the process of consolidating life history, distribution, catch and taxonomic data on all living species of cephalopods (octopus, squid, cuttlefish and nautilus) to help define the goals of the Census of Marine Life. It is modeled loosely on FishBase, but was designed from the outset to operate dynamically from the web. We provide contact information for almost all of the world's cephalopod specialists because part of our role is to encourage data sharing and consolidation. Our goals are to provide reliable referenced data on cephalopods and to provide a platform for collaboration both within the cephalopod community and with all marine sciences. Since CephBase went online in August, 1998, we have had two citations in NetWatch (Science 282:587 and 285:2027).

How does CephBase work?

Under the hood of CephBase you will find a Microsoft Access database. This relational database holds the data in various tables. SQL (Structured Query Language) is used to query and manipulate these data, and a Windows NT server running ColdFusion (a web application server) builds dynamic web pages for users on request. This makes the full database accessible to anyone with a computer linked to the internet.
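The flow described above (tables, an SQL query, a page built on request) can be sketched in a few lines. This is only an illustration, not the CephBase implementation: Python's built-in sqlite3 module stands in for Access and ColdFusion, and the table names, column names and the `species_page` function are assumptions made for the example.

```python
import sqlite3
from html import escape

# sqlite3 stands in for the Access database; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE species (
        species_id INTEGER PRIMARY KEY,
        genus      TEXT NOT NULL,
        species    TEXT NOT NULL
    );
    CREATE TABLE common_name (
        species_id INTEGER REFERENCES species(species_id),
        name       TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO species VALUES (1, 'Octopus', 'vulgaris')")
conn.execute("INSERT INTO common_name VALUES (1, 'common octopus')")


def species_page(genus: str) -> str:
    """Query the tables and build an HTML fragment for one genus."""
    rows = conn.execute(
        """
        SELECT s.genus, s.species, c.name
        FROM species AS s
        LEFT JOIN common_name AS c ON c.species_id = s.species_id
        WHERE s.genus = ?
        ORDER BY s.species
        """,
        (genus,),  # bound parameter, so user input never becomes SQL text
    ).fetchall()
    items = "".join(
        f"<li><i>{escape(g)} {escape(sp)}</i> ({escape(name or 'no common name')})</li>"
        for g, sp, name in rows
    )
    return f"<ul>{items}</ul>"


print(species_page("Octopus"))
```

The page is assembled at request time from whatever the tables currently hold, which is what lets the site stay current as data are added.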

What are the features of CephBase?

1) Classification of all known cephalopods down to subspecies level. The data are searchable from the genus and/or species level and classification, synonymies, type repositories, type localities, references and common name are listed.

2) Distribution maps for 320 species of cephalopods. Maps are made on the fly using the Xerox Parc Map Viewer. All latitude and longitude data used to generate maps are from published sources and are listed in tables and referenced. In most cases, the individual specimens used to populate our database can be tracked to their repository location for verification.

3) Ecological data are needed to fit cephalopods into global models. Predator and prey species are listed for 69 and 80 cephalopod species, respectively.

4) We also maintain the International Directory of Cephalopod Workers to help foster global collaboration.

Future Directions for CephBase?

1) Improved on-the-fly mapping system (OBIS?).
2) Move to client-server architecture for increased security.
3) Add life-history and fisheries data to the database.
4) Continue to add predator, prey and location data to the database.

5) Maintain the International Directory of Cephalopod Workers.

OBIS and Web Philosophy

OBIS is an important concept and potentially provides a relational mapping function far superior to Xerox Parc Map Viewer. It is critical to recognize at the outset that by adopting the distributed system approach OBIS will acquire a responsibility to a potentially vast clientele that will ultimately expect it to maintain and 'freeze' its service, just as Xerox Parc did. Once large numbers of people came to depend on the Map Viewer, Xerox was essentially 'forced' to maintain it. OBIS could also prove to be very attractive. We think it is generally desirable for OBIS to be as useful as possible, but, unless access is restricted in some way, it is important to establish a physical facility that anticipates considerable growth over the next decade.

The distributed approach makes it possible to think separately about database sites and functionally organized analytical sites like OBIS. Databases should likely be organized taxonomically because that is the way biological expertise is organized, but they should be compatible with many different types of analytical engines because that is the way analytical expertise is organized. Geographic analysis is only one way of looking at data. An important aspect of the distributed approach is that financial support for these various elements may come from a variety of sources from phylum or class oriented societies to fisheries managers and oil companies.

One of the main meeting goals is "to exchange information on existing systems and databases on marine organisms." The classification on CephBase is an example of what can be done by a taxa-specialized research team. The databases should not just be taxonomic, however. They should include distribution, ecology, behavior, migrations, life history traits, fisheries catch statistics, population genetics, publications, researchers, etc. Obviously, different taxonomic groups have to be farmed out to specialists. But, just as important is that the data need to be in a consistent and expandable format. This does not mean that they have to be compiled on one system, although a central storage site is an option. The disadvantage of centralized databases is that they all go down at once and remote management presents more of a security challenge. Perhaps the key from Sloan's perspective could be to use funding strategically to encourage maximum compatibility between various distributed databases and analytical sites. OBIS seems to be heading in the right direction, but the more open we make the model, the more collaboration there will be.

The concept of a distributed system is excellent and consistent with the general philosophy of web use. It is ridiculous to imagine that all of the world's best taxonomic authorities on various animal groups would ever be in one place. Therefore, the databases should be wherever the authorities can best manage them. Given the flexibility of the web, this is not a simple question. For example, when James Wood completes his PhD, should CephBase remain at Dalhousie, move to the location of his new job, or move to OBIS at Rutgers? Actually he could probably manage the dataset remotely, making any of these options plausible. We suggest that the best arrangement may be to handle taxa the way web domains are handled, i.e. responsibility is distributed hierarchically. OBIS (or some agency) assigns responsibility for phyla to authorities, who then assign responsibility for classes, etc. Authority would include ensuring that appropriate biologically qualified people were in control of databases, as well as ensuring that links were maintained as people and databases moved. It is equally important to establish some sort of systematic financial basis for such a network. There is currently no funding to maintain or expand CephBase at Dalhousie, for example. Although, generally speaking, maintenance costs are low compared to development costs, data continue to increase.
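The domain-style delegation suggested above might look something like the following sketch. The `DELEGATIONS` table, the authority labels and the `responsible_authority` function are all hypothetical; the point is only that a taxon without its own assigned authority falls back to whoever is responsible for the level above it.

```python
# Hypothetical delegation table: taxon -> (parent taxon, assigned authority).
# An authority of None means responsibility is inherited from the parent.
DELEGATIONS = {
    "Animalia": (None, "OBIS (or some agency)"),
    "Mollusca": ("Animalia", "phylum authority"),
    "Cephalopoda": ("Mollusca", "class authority"),
    "Octopoda": ("Cephalopoda", None),
}


def responsible_authority(taxon: str) -> str:
    """Walk up the hierarchy until an assigned authority is found."""
    current = taxon
    seen = set()
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in delegation table at {current!r}")
        seen.add(current)
        if current not in DELEGATIONS:
            raise KeyError(f"unknown taxon {current!r}")
        parent, authority = DELEGATIONS[current]
        if authority is not None:
            return authority
        current = parent
    raise LookupError(f"no authority assigned above {taxon!r}")


print(responsible_authority("Octopoda"))
```

Because each level only records its own parent and authority, a database can move between institutions by updating a single entry, much as a domain can change hosts without its name changing.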

In the distributed system concept it should be clear that OBIS does not span the full range of analysis that managers of databases like CephBase or FishBase may wish to use. Predator/prey data could interact powerfully with distribution data to clarify ecosystem relations to the environment, for example. Sorting out current taxonomy is potentially a large and complex function on its own, but huge databases exist on fisheries catch statistics that are not specimen documented (people ate them!), which provide important information on temporal changes. Time series analyses are as important as spatial analyses, and, of course, we really need to think and analyze in four dimensions. Sites exist where time series analyses can be done, such as Ram Myers' Fish Stock-Recruitment site: (http://www.mscs.dal.ca/~myers/welcome.html), but these, too, need expansion.
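As a rough sketch of how predator/prey data could be combined with distribution data, the query below joins two tables to find predator/prey pairs whose records lie near each other. Everything here is assumed for illustration: the schema, the `overlapping_pairs` function, the crude degree-based distance test, and the rows themselves, which are invented placeholders rather than real observations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE distribution (species TEXT, latitude REAL, longitude REAL);
    CREATE TABLE predation (predator TEXT, prey TEXT);
""")
# Placeholder rows, invented for illustration only.
conn.executemany("INSERT INTO distribution VALUES (?, ?, ?)", [
    ("predator A", 44.0, -63.0),
    ("prey B", 44.5, -63.5),
    ("prey C", -10.0, 150.0),
])
conn.executemany("INSERT INTO predation VALUES (?, ?)", [
    ("predator A", "prey B"),
    ("predator A", "prey C"),
])


def overlapping_pairs(max_degrees: float = 1.0):
    """Return predator/prey pairs with records within max_degrees of each other.

    The comparison is a simple box in latitude and longitude; it ignores
    wrap-around at the 180 degree meridian and the shrinking of longitude
    degrees toward the poles.
    """
    return conn.execute(
        """
        SELECT DISTINCT p.predator, p.prey
        FROM predation AS p
        JOIN distribution AS dp ON dp.species = p.predator
        JOIN distribution AS dy ON dy.species = p.prey
        WHERE ABS(dp.latitude - dy.latitude) <= ?
          AND ABS(dp.longitude - dy.longitude) <= ?
        ORDER BY p.predator, p.prey
        """,
        (max_degrees, max_degrees),
    ).fetchall()


print(overlapping_pairs())
```

Adding a date column to the distribution table would extend the same join toward the time series analyses mentioned above.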

To help the transition, these are issues CephBase dealt with historically or faces now:

1) Currently everybody has separate databases - the longer this continues, the more difficult it will be to integrate in the future. It is much easier to change, edit or integrate databases when they are few and small (Think, exponentially growing problem!). If we had a database 'form' to plug into, we would now have two or three times as much data stored.

2) However, anyone who provides the 'form' (such as we were hoping to get from FishBase) has to be fully committed to it and offer support—otherwise people will continue to do their own thing. For example, CephBase could not sit around with 12 month funding, waiting nine months for FishBase to deliver a product. We realize that they have more than enough to do, but if someone were coordinating things many of us could use a standard set-up which could be integrated. The OBIS database is going in the right direction, but a model where those who donate data have more control over the system might attract more contributors. There is, of course, a delicate balance between making progress on your own project and supporting everyone else who wants to contribute.

3) Does one hand over all the hard work done on a web site when absorbed? If we hand over the database we have developed, do we also maintain the CephBase site? If not, what happens to specific but important functions like the International Directory? Maybe each taxonomic database needs its own 'face' with a visible organismal and taxonomic orientation, as well as an invisible database function.

4) We now use Xerox Parc as a graphing interface. It is free but unstable. A dedicated mapping web site that allows others to pass in information via a URL will attract many web developers. Should access be restricted?

5) It is possible to do quite a bit of database updating and editing with forms on the web - this separates the complexity from the content providers.

6) The content providers and administrators should have at least a vague idea of relational database function and complexity so that they don't have unrealistic expectations.
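Item 4 above mentions a mapping site that accepts information passed in via a URL. A minimal sketch of what a client of such a site might do is shown below. The endpoint `maps.example.org`, the parameter names and the `map_url` function are hypothetical; this is not the Xerox Parc Map Viewer or OBIS interface.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names; not a real mapping service.
MAP_SERVICE = "http://maps.example.org/render"


def map_url(points, width=600, height=300):
    """Encode (latitude, longitude) pairs into a single request URL."""
    for lat, lon in points:
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinate out of range: {(lat, lon)}")
    query = urlencode({
        "width": width,
        "height": height,
        # Points are joined as "lat,lon;lat,lon"; urlencode escapes the separators.
        "points": ";".join(f"{lat:.2f},{lon:.2f}" for lat, lon in points),
    })
    return f"{MAP_SERVICE}?{query}"


print(map_url([(44.65, -63.57)]))
```

Since any web page can build such a URL, the interface is easy to adopt, which is also why the question of restricting access arises.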

Information and goals

When setting out goals, keep in mind that this new medium is excellent at handling certain kinds of information and not so good at others. Databases do extremely well with hierarchical information such as classification. Standard fields such as weight or age at maturity also work well, but we must decide which population and what references to follow. Images can work well but take up huge amounts of space and download time. Data in odd formats, data fields of widely varying size, datasets with many missing fields, or repeated data can all cause problems and do not work well. Basically, it is important to have some idea about what can and can't be done if you want to set realistic goals.

Right now CephBase, FishBase and OBIS are all using Access. Although we often have similar data fields, no doubt our tables and the relationships between them are very different. Left on its own, CephBase would move to a more secure client-server model using MS SQL Server 7, while OBIS has stated a preference for Oracle. Standards need to be agreed to and put in place as soon as possible, but funding must be made available to convert to those standards, otherwise the distributed database will shrink rather than grow.
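To illustrate what converting to an agreed standard could involve, the sketch below maps records from two differently structured sources onto one common set of fields. The field names, the `FIELD_MAPS` table and the `to_common` function are assumptions made for the example; the actual CephBase, FishBase and OBIS table layouts are not described here.

```python
# Hypothetical field names; the real database schemas are not shown in this article.
FIELD_MAPS = {
    "cephbase": {"Genus": "genus", "Species": "species",
                 "Lat": "latitude", "Long": "longitude"},
    "other_db": {"genus_name": "genus", "sp_name": "species",
                 "y": "latitude", "x": "longitude"},
}
REQUIRED = ("genus", "species", "latitude", "longitude")


def to_common(source: str, record: dict) -> dict:
    """Rename a source record's fields to the shared standard and check completeness."""
    mapping = FIELD_MAPS[source]
    # Fields with no agreed equivalent are dropped rather than guessed at.
    common = {mapping[key]: value for key, value in record.items() if key in mapping}
    missing = [field for field in REQUIRED if common.get(field) is None]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return common


a = to_common("cephbase", {"Genus": "Octopus", "Species": "vulgaris",
                           "Lat": 44.65, "Long": -63.57})
b = to_common("other_db", {"genus_name": "Octopus", "sp_name": "vulgaris",
                           "y": 44.65, "x": -63.57})
print(a == b)
```

Renaming fields is the easy part; differences in table relationships, units and reference populations would need the same kind of explicit, agreed mapping.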

Contact:

James B. Wood or Ron O'Dor                       tel: (902) 494-6697
Biology Department, Dalhousie University     fax: (902) 494-3736
Halifax, Nova Scotia, Canada B3H 4J1         email:

The Cephalopod Page: http://www.thecephalopodpage.org/
CephBase: http://www.cephdev.utmb.edu/

The Cephalopod Page (TCP), © Copyright 1995-2024, was created and is maintained by Dr. James B. Wood, Associate Director of the Waikiki Aquarium which is part of the University of Hawaii. Please see the FAQs page for cephalopod questions, Marine Invertebrates of Bermuda for information on other invertebrates, and MarineBio.org and the Census of Marine Life for general information on marine biology.