Pre-Deployment System Development for the Visible Embryo Project

Lewis E. Berman
Leif Neve
Anne Altemus
May 30, 1996


1.0 Introduction

The Visible Embryo Project is a collaborative effort by the National Library of Medicine's (NLM) Communications Engineering Branch (CEB), the Human Developmental Anatomy Center (HDAC) at the National Museum of Health and Medicine at the Armed Forces Institute of Pathology (AFIP) in Washington, D.C., and the National Institute of Child Health and Human Development (NICHD), to show how a collection of existing histological material could enhance medical research. One such collection, the Carnegie Collection of Human Development, consists of 10,000 human embryos spanning the 23 stages of embryonic development. Many of the embryos have been sectioned, and a small subset has been reconstructed using 3D visualization software. Accompanying this collection is a small textual database describing characteristics of 552 of the embryos.

The emphasis of this project is to provide data from this collection to researchers and students over the ubiquitous Internet. To do this we have leveraged the rapid prototyping capability of the World Wide Web (WWW) with public domain software for the client/server interaction, image and data conversion, and data storage in a relational database. Coupling this with a RAID storage device and a T3 connection to the Internet has made this a paradigm for the distribution of biomedical image and text information.

1.1 Role of the NLM

Computer expertise from the National Library of Medicine's Communications Engineering Branch and Audiovisual Program Development Branch (APDB) was solicited by the National Institute of Child Health and Human Development (NICHD) to support the HDAC in creating a World Wide Web site for this data. The NLM's project proposal consisted of the following four subtasks:

  1. Subtask 1: Data Load/Quality Control
    NLM/CEB shall obtain all of the necessary text and images from AFIP and shall load the collection into a SPARC Storage Array RAID system for high speed access. Following the load phase the data will be viewed, in the presence of AFIP personnel, to ensure the quality of the images. CEB shall create suitable thumbnail representations of each image in a common image format such as GIF or TIFF, if necessary. In addition, suitable MPEG cine loops will be created and made available. CEB shall perform this subtask in the presence of designated AFIP personnel, as desired by AFIP.
  2. Subtask 2: HTTPD Configuration
    NLM/CEB shall configure the httpd server parameters as necessary. This will include setting up the appropriate user authorization, accounting, etc. CEB shall perform this subtask in the presence of designated AFIP personnel, as desired by AFIP.
  3. Subtask 3: HTML Development
    NLM/CEB shall integrate the text and images into archive, CEB's Web server. This subtask shall include verification of hypertext links, authoring of HTML pages, integration of an image-mapping function, and linkage of thumbnail images to high resolution images. The image mapping function will allow remote users to select a desired embryo slice. Links will also be made available for the 3D models and MPEG cine loops. CEB shall perform this subtask in the presence of designated AFIP personnel, as desired by AFIP.
  4. Subtask 4: Verification/Deployment
    While AFIP will be responsible for delivering all textual/image data to be disseminated, CEB will assist AFIP personnel in verifying the textual/image information for correctness after inclusion on the httpd server. After verification is complete, the system will undergo a 2-week period of beta testing with testers to be selected by AFIP and/or NLM. CEB shall perform this subtask in the presence of designated AFIP personnel, as desired by AFIP. After the beta testing phase, NLM will deploy the system.

It was expected that this entire task would take eight weeks to complete.

1.2 Deliverables

The development team spent a considerable amount of time cleaning the textual data, formatting HTML pages so that they would look consistent across different browsers and platforms, reformatting image data, determining which parts of the embryo matched the section data, and responding to tester inquiries. Perhaps the most difficult task was choosing a navigation model that was consistent with the inherent linear nature of the data set, while still utilizing the flexibility of the WWW.

The team was successful in completing all four subtasks of the contract. However, due to delays in receiving the entire data collection, incorrect data, incomplete design specifications, delays in receiving data after the alpha test was completed, and the inclusion of material outside the scope of the contract, the task took 12 weeks to finish. In addition, it was decided that we would extend the testing to include two stages: an alpha stage and a beta stage. The first stage, alpha testing, included mostly those people familiar with the collection. The second stage, beta testing, was extended to people outside of this small community, such as medical researchers, NLM staff, database distributors, and graduate students.

The alpha and beta test questionnaires have been delivered to HDAC and a transition plan has been formalized. The HDAC will host the embryo collection on their own SGI machine and we will lend technical support to the installation phase.

1.3 Work Environment

The development team consisted of several people from disparate backgrounds, which created a synergistic environment. One of the more compelling aspects of this project was the impact of the Internet on wide area collaboration. The team used ftp and email extensively to retrieve new pieces of data and report information to one another. One of the more exciting capabilities of this technology is the ability to rapidly prototype the look and feel of the site. This was used extensively during phone conversations and in-office meetings. Changes could be discussed and immediately introduced into a page. This was very productive since it allowed us to quickly assess how modifications would affect the navigation of the site.

Although the process of development was dynamic and iterative, involving all members of the development team, it should be noted that final decisions regarding the site design and organization were made by the HDAC. CEB personnel made several suggestions about the organization of the data, navigation design, and aesthetics that were not accepted by the HDAC. These issues will be covered in other sections of this report.

2.0 Alpha Testing

2.1 Scope of Audience

The scope of the audience was considered in the very early stages of project development. Specifically identified for the alpha testing were researchers familiar with the project and the contents of the collection, graphic designers, and computer and network personnel from the collaborating institutions. These people were chosen to ensure the integrity of the data being offered and for their in-depth knowledge and experience with the WWW and computer graphics.

2.2 Design Considerations

Design considerations from the outset of development dealt with the target audience, interface design, graphic elements, navigation and efficiency, transmission speed, transition from a Sun platform to an SGI platform, and an overall content design that would enhance the user's utilization of the collection via the Internet.

NLM suggested that the inherent architecture of the Carnegie Collection, i.e. its stage-based data organization, be used for presenting the data and giving the user access to it. NLM also suggested that multiple, simultaneous methods of navigation appear on every page, so that the site would accommodate the differing browsing styles of its visitors. The design issues common to other interactive media are overshadowed by the challenges inherent in the Internet's hypermedia capability and its lack of conventional document design. Graphics and text must be developed for clarity and legibility, and also for small file size, to give the user optimum speed and efficiency. Additionally, different platforms and screen sizes must be considered in the page layout and design.

2.2.1 Organization of the content

The inherent architecture of the collection database was recommended as an organizational guide for content design. As the embryos in the collection are classified by stage of development, presentation of the material in the appropriate stage category was logical and reinforced both the collection and web site structure. It was suggested that all other features of the site (movies, text, and the database) be offered as accessories to the stage pages.

2.2.2 Navigation

The principal objective in the development of the navigational model for this site was to assure ease of use and access to all image data. It was suggested by NLM team members that several navigational options be made available to the user, to offset disorientation and difficulty in accessing important data. Web users instinctively begin to build a mental model of site structure based on the presentation of material on the home page. Because spatial and conceptual metaphors in data organization on the Web are not intuitive or established, the development team felt the information presented on the home page should accurately represent the scope of material available at the site. In addition, navigational tools should be provided so that the user can readily investigate and understand how the information has been organized. The initial site architecture as proposed by the AFIP did not incorporate the aforementioned elements.

2.2.3 Database

The 552 specimens represented in the textual database were extensively reviewed by HDAC staff and selected for their significant value to research; they represent all 23 stages of human embryonic development. The database was designed so the user could conduct simple searches on the collection by entering allowable ranges for one or more fields in an HTML form. The backend processor at the server uses the form data to logically "AND" together these values and extract the matching records. The user also has control over the format of the resulting output by clicking on the appropriate check boxes in the form.
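
The search logic described above can be sketched as follows. This is a minimal illustration in Python, with the standard sqlite3 module standing in for the actual database back end; it is not the project's code. The table name "embryo", the field names, and the sample rows are all hypothetical, since this report does not list the database schema.

```python
import sqlite3

# Hypothetical schema: the report does not name the real database fields.
ALLOWED_FIELDS = ("case_number", "stage", "length_mm")


def build_query(ranges, output_fields):
    """Build a parameterized SELECT that ANDs one range test per field.

    ranges: dict of field name -> (low, high), as entered in the form.
    output_fields: fields whose check boxes were ticked for display.
    """
    if not output_fields:
        raise ValueError("at least one output field is required")
    for name in list(ranges) + list(output_fields):
        if name not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {name}")
    clauses, params = [], []
    for name, (low, high) in sorted(ranges.items()):
        if low > high:
            raise ValueError(f"range for {name} is reversed")
        clauses.append(f"{name} BETWEEN ? AND ?")
        params.extend([low, high])
    sql = f"SELECT {', '.join(output_fields)} FROM embryo"
    if clauses:
        # Every supplied range must hold for a record to match.
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def demo():
    """Run one search against a few invented rows held in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE embryo (case_number INTEGER, stage INTEGER, length_mm REAL)"
    )
    conn.executemany(
        "INSERT INTO embryo VALUES (?, ?, ?)",
        [(1, 10, 2.5), (2, 12, 4.0), (3, 12, None), (4, 23, 30.0)],
    )
    sql, params = build_query(
        {"stage": (10, 12), "length_mm": (3.0, 5.0)},
        ["case_number", "stage"],
    )
    return conn.execute(sql, params).fetchall()
```

In this sketch a record with a blank (NULL) value in a searched field never satisfies that field's range test, so it drops out of the result. Whether the original search facility treated blank fields the same way is not stated in this report.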

Often, the record describing a specific case number (embryo) is not complete. In some instances it is because data is still being entered into the database or there is incomplete data on the embryo. By default, searches always show all existing blank fields on the search results page, although this could be modified at the user's discretion.

The search facility was intended to return results quickly and to respond in a meaningful way if data was entered incorrectly. The tools used to construct it were to be portable and easy to install and maintain. To meet these requirements, a database management system (DBMS) called MiniSQL and companion Web "middleware" called Personal Home Page/Forms Interface (PHP/FI) were chosen to allow search and retrieval of embryo data from the Web browser. MiniSQL is a lightweight, low-cost DBMS which can perform quick searches but does not support the full range of SQL commands. PHP/FI, sometimes referred to as a CGI "wrapper", sits between MiniSQL and the Web browser to make database queries possible from HTML forms. It provides a flexible set of scripting commands which can be embedded in HTML forms to set up and query the database, control the logical flow of the HTML, do forms validity checking, and so on. Both tools run on most Unix boxes and require minimal administration overhead.

2.3 Aesthetic Features

The aesthetic elements of page design were considered at length. Encumbered page design impedes transmission, so design features not necessary to the organization and presentation of the data were strongly discouraged. Aesthetic features should be tempered with function and the appropriate use of media. The use of screen real estate was evaluated at each level of the site, and suggestions for optimizing it were put forth. The blue bar that ran the length of the left side of every page on the site seemed to serve little function and consumed valuable screen real estate. However, HDAC wanted to keep this graphical design because they felt it was more aesthetically pleasing.

2.4 Feedback and Corrective Action

Feedback from the alpha test phase can be divided into:

  • bug reports
  • design suggestions
  • requests for added features

Bugs ranged from those that arose from inherent machine limitations to those resulting from incorrect browser or server configuration. An example of a machine limitation was the heavier-than-usual demands placed on machine memory by the database search facility. The Netscape browser running on the Macintosh computer was particularly sensitive to this problem. To address this problem, the default search was made less memory-intensive by having it return fewer fields.

An example of a server configuration bug was an instance where the server was not aware of a certain MIME type. This resulted in data being downloaded to the client in an improper format when chapter reprints were requested. A more difficult problem was how to provide MPEG movies in a format that MPEG viewers on diverse platforms could handle. This arose because the initial software package used to generate the MPEG movies produced a file format not supported by all MPEG viewers. After substantial experimentation with different MPEG programs, a portable format was encoded using the MPEG utilities developed by Lawrence Rowe's group at the University of California at Berkeley. Still another server bug was an inaccurately mapped imagemap.
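
To illustrate how an imagemap can end up inaccurately mapped, the following Python sketch shows the kind of lookup a server-side imagemap performs: a click coordinate is tested against a list of rectangles, each tied to a target page. The region coordinates and page names are hypothetical, and the report does not say what the actual mapping error was; overlapping rectangles are shown only as one plausible cause.

```python
# Hypothetical regions: (left, top, right, bottom) in pixels -> target page.
REGIONS = [
    ((0, 0, 99, 49), "slice_001.html"),
    ((0, 50, 99, 99), "slice_002.html"),
]
DEFAULT_TARGET = "index.html"


def resolve_click(x, y, regions=REGIONS, default=DEFAULT_TARGET):
    """Return the target of the first region containing the click."""
    for (left, top, right, bottom), target in regions:
        if left <= x <= right and top <= y <= bottom:
            return target
    return default


def find_overlaps(regions=REGIONS):
    """Return pairs of targets whose rectangles overlap.

    With first-match lookup, an overlap means some clicks intended
    for the later region are silently sent to the earlier one.
    """
    overlaps = []
    for i, (a, target_a) in enumerate(regions):
        for b, target_b in regions[i + 1:]:
            if a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]:
                overlaps.append((target_a, target_b))
    return overlaps
```

A check such as find_overlaps can be run over a map definition before it is installed, which is cheaper than discovering a bad region through tester reports.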

Design suggestions can be divided into the categories of organization of content, screen layout, and navigational design.

How to organize the site was a hotly discussed topic during the design phase. One idea was to organize the material around each embryonic stage. Selecting "Movie", "Reprint" or "Model" would take one to the movie, reprint or model for that stage, and arrows would take one forward or backward in the sequence of stages. This approach suited the subject matter well but required that the navigational aids be easy to understand.

For a variety of reasons, this organization was only partially implemented, with the result that users were confused about how to navigate the site. Since a thorough implementation of a stage-based navigation model was not possible, the final design settled on an organization based on information type. In this design a user first selects what type of information to view (movie, reprint or model) and then selects the desired stage. Some of the original stage-based navigational model was retained, however, in that navigation between stages, and browsing between different information types for a specific stage, was still supported. Based on alpha user feedback and continued discussions on developing an effective navigational model, the beta release incorporated many development team suggestions that enhanced the original site structure and provided effective navigational tools. One example was the placement of button bars at the top of each page rather than the bottom. This was to ensure that a user unfamiliar with the browser interface would not get lost or stuck. A "page" on a web site does not as yet have a standard length, and the edges of the screen do not necessarily define the limits of the page. A novice who assumed otherwise would not know to scroll to the bottom of the page to find navigational aids such as button bars.

In the navigational design of the beta test version, access to the embryo section data exists via several routes, including a direct link from the home page. This site structure satisfies the principal development objective, assuring direct access to all image data for research and study.

Other significant organizational changes were to:

  • Remove ambiguous arrow buttons.
  • Turn the home page into an index to the entire site.
  • Create a new introduction section encompassing the historical materials.
  • Fold some of the original home page into a credits page.
  • Create a site-wide glossary.

The original aesthetic approach to overall screen layout remained basically unchanged even though concern was expressed over the loss of real estate due to the large headers and border.

Some significant design changes that did take place were to:

  • Move the navigational buttons to the top of each page.
  • Make the screen titles more prominent.
  • Rework the database search screen to be more intuitive.
  • Maintain consistent margins on the left side of the page.
  • Add a page revision date and the URL of the home page to the bottom of every page.

Requests for new features that arose from alpha testing were mostly postponed since they were beyond the original requirements of the task. One significant new feature, however, was incorporated: the enhancement of the individual stage pages to allow selective viewing of different anatomical structures. Also, a page was added that contained information about related WWW sites.

One original feature, the QuickTime movies, was actually removed. Although the quality of the QuickTime movies was found to be superior to that of the MPEGs, this benefit was thought to be offset by their enormous size and thus lengthy download time.

3.0 Beta Testing

3.1 Scope of Audience

The audience for the beta testing phase was expanded to include faculty and students in the medical and biological sciences; persons unrelated to any of the aforementioned groups, yet familiar with the Internet and Web browsing; and members of the alpha testing team. This group consisted of 50 people, of whom 38 participated in the beta experiment.

3.2 Feedback

The user feedback from this phase was greater (due to the larger number of participants) and more detailed. In addition, it lent support to the major organizational changes resulting from the alpha testing phase.

Some specific concerns were that:

  • Images, drawings and movies did not incorporate enough explanatory text and resolution information.
  • Navigation through serial microscope slides was not natural and could be enhanced both in terms of navigation and maintaining contextual information.
  • Downloading efficiencies might be achieved in some instances by closer cropping of movies and reduction of banner and margin sizes.
  • The database form could be made more intuitive.
  • The reprints were too large for the screen.

There was considerable support expressed for several features incorporated since the alpha test:

  • The index on the home page.
  • The facility for selective viewing of internal anatomical structures.
  • Movement of buttons to the top of each page.
  • Addition of related sites page.

3.3 Corrective Action

The AFIP will now decide what changes to make based on the feedback from the beta test.

3.4 Statistics

Page                                            Number of Hits   Percentage of Hits
All stage 10 pages, excluding region browser               259                 13.1
All stage 12 pages                                         192                  9.7
All stage 23 pages                                         181                  9.2
Database                                                   176                  8.9
All reprint pages                                          170                  8.6
All stage 13 pages                                         162                  8.2
All stage 16 pages                                         135                  6.8
All region browser pages                                   125                  6.3
All MPEG movies                                            106                  5.4
Main models page                                            46                  2.3
Total                                                     1979
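
The percentage column can be recomputed from the hit counts. The short Python sketch below assumes each percentage is the page's hits divided by the stated total of 1979, rounded to one decimal place; the report does not say how the figures were originally derived.

```python
# Hit counts as listed in the statistics table.
HITS = {
    "All stage 10 pages, excluding region browser": 259,
    "All stage 12 pages": 192,
    "All stage 23 pages": 181,
    "Database": 176,
    "All reprint pages": 170,
    "All stage 13 pages": 162,
    "All stage 16 pages": 135,
    "All region browser pages": 125,
    "All MPEG movies": 106,
    "Main models page": 46,
}
TOTAL_HITS = 1979  # total stated in the table


def percentage(hits, total=TOTAL_HITS):
    """Share of all hits, as a percentage rounded to one decimal place."""
    return round(100.0 * hits / total, 1)


def report():
    """Return (page, hits, percentage) rows in table order."""
    return [(page, hits, percentage(hits)) for page, hits in HITS.items()]
```

Under this assumption the recomputed values agree with the table except for the stage 23 row, which comes out at 9.1 rather than 9.2. The listed rows sum to 1552 hits, so the total of 1979 evidently includes pages that the table does not itemize.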

4.0 Conclusions and Recommendations

This project proved to be a valuable experience with respect to making a collection of multimedia biomedical information available over the Internet and the World Wide Web. We gained an understanding of how to effectively work in a wide area collaborative effort and how to use tools to support this effort.

In general this product will be useful to researchers and students, as noted in the alpha and beta surveys and feedback. The combination of images, 3D reconstructions, reprints, and especially the textual database will all help to promote learning in the area of human embryo development. This site could be improved with the following changes:

  • Modify the background design of the pages by removing the blue border in the left margin. This will save screen real estate and make it easier for people who have lower resolution monitors to view all of the available data on complex pages.
  • Consider a move to JavaScript to provide an index or table of contents type system within the frame paradigm.
  • The reprints should be converted to HTML or PDF format for easier viewing and consistent printing.
  • The models should be converted into a format such as VRML so that researchers could rotate and scale the embryos as desired.
  • Some of the section images currently available should be rescanned because of their poor quality. More section data is necessary so that researchers have access to data from all levels of embryo development and from all of the systems of the embryo.
  • An automatic registration procedure should be designed to register adjacent slices so that models can be created.
  • Dr. Hutchins' dataset should be included in the site.
  • The cardiac embryology project developed by the APDB should be converted to a hypertext document and included in this site.