Peer-reviewed article

Theoretical Considerations of Lifecycle Modeling: An Analysis of the Dryad Repository Demonstrating Automatic Metadata Propagation, Inheritance, and Value System Adoption

2009; Taylor & Francis; Volume: 47; Issue: 3-4; Language: English

10.1080/01639370902737547

ISSN

1544-4554

Authors

Jane Greenberg

Topic(s)

Advanced Database Systems and Queries

Abstract

The Dryad repository is for data supporting published research in the field of evolutionary biology and related disciplines. Dryad development team members seek a theoretical framework to aid communication about metadata issues and plans. This article explores lifecycle modeling as a theoretical framework for understanding metadata in the repository environment. A background discussion reviews the importance of theory, the status of a metadata theory, and lifecycle concepts. An analysis draws examples from the Dryad repository demonstrating automatic propagation, metadata inheritance, and value system adoption, and reports results from a faceted term mapping experiment that included 12 vocabularies and approximately 600 terms. The article also reports selected key findings from a recent survey on the data-sharing attitudes and behaviors of nearly 400 evolutionary biologists. The results confirm the applicability of lifecycle modeling to Dryad's metadata infrastructure. The article concludes that lifecycle modeling provides a theoretical framework that can enhance our understanding of metadata, aid communication about the topic of metadata in the repository environment, and potentially help sustain robust repository development.

KEYWORDS: metadata; metadata theory; repositories; Dryad; lifecycle modeling; automatic metadata propagation; metadata inheritance; value system adoption

This work is supported by National Science Foundation (NSF) grants #EF-0423641 and NSF/BDI #0743720. I thank Dryad development team members for their inspiration, input, and encouragement with this article.
