Open Source Book: EDM Issue #2

How do you differentiate a Collection of things from a Class of things?

Effective Data Modeling Book Outline:

  1. Class versus Instance.  The boundary problem.
  2. Collection versus Class. Understanding how they differ (in this article).
  3. Class versus attribute.
  4. Characteristic versus Association.
  5. Abstract Class versus Concrete Class.
  6. Transaction versus Entity.
  7. Role versus Class.
  8. Fixed versus dynamic enumerations.
  9. Identifiers and Identity.
  10. Opaque versus Explicit associations.
  11. Globally Unique versus Locally-unique identifiers.
  12. Value codes versus Labels.
  13. Specialization versus Subclass.
  14. Semantic versus Opaque identifiers.
  15. Long versus Short Names.
  16. Part-of versus Subclass.
  17. Explicit versus implicit qualifiers.
  18. Context-dependent versus independent metadata.
  19. Explicit versus implicit scope.
  20. Binary versus N-ary relationships.
  21. Optional versus mandatory attributes.
  22. Composition versus aggregation.
  23. Permanent versus temporary relationships.
  24. Concept versus representation.
  25. Attribute versus category.
  26. Informal (subjective) versus formal categorization.
  27. Intensional versus extensional membership.
  28. Categories versus views.
  29. Upper ontology versus ontology mapping.
  30. Conceptual versus logical data models.
  31. Metadata versus data.
  32. External versus Internal code tables.
  33. Term versus Class.
  34. Set versus collection.
  35. Alternative code lists versus harmonized code list.
  36. Folksonomy versus taxonomy.
  37. Enumeration versus identifier.
  38. [What modeling issues are you facing? Email me to discuss at mdaconta at oberonassociates dot com!]

Issue #2: Collection Versus Class

Is it best to differentiate or integrate?  From a mathematical perspective, both classes and collections are just sets of things; however, we shall see that from a Knowledge Representation perspective they are different. 

The key difference between a collection and a class is that a collection is an optimized container for grouping values of other data types (including class instances), while a class is a mechanism for creating custom data types (also called user-defined types).  An example of a collection would be a group like "My Friends" or "Favorite Things", while an example of a class would be "Person" or "Vehicle".  In software development, examples of collections are arrays, lists, hash tables and trees.  Each of these collection structures is optimized for different types of manipulation, search and access.
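As a rough illustration of the paragraph above, here is a minimal Python sketch.  The `Person` class and the "My Friends" grouping come from the examples in the text; the field name `name`, the sample people, and the variable names are assumptions made only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """A class: a user-defined type describing what every person has."""
    name: str  # assumed field, chosen only for illustration


# Instances are created by the class itself.
alice = Person("Alice")
bob = Person("Bob")

# Collections: containers that group values that already exist.
my_friends = [alice, bob]                            # list: ordered, access by position
friends_by_name = {p.name: p for p in my_friends}    # hash table: access by key

first_friend = my_friends[0]          # positional access
looked_up = friends_by_name["Bob"]    # keyed lookup
```

The same two `Person` instances sit in both containers; only the access pattern changes, which is the sense in which each collection structure is optimized for a different kind of use.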

From a Knowledge Representation perspective, the difference between a collection and a class is one of membership.  A collection, by definition, allows heterogeneous members of any kind, at any time.  A class has only homogeneous members (called instances), each of which is instantiated as a member of the class.  In other words, a collection accepts existing members of any type, while a class creates members of its own custom type.
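The membership difference can be sketched in Python, where a plain list accepts members of any type.  The "Favorite Things" grouping and the `Vehicle` class are taken from the examples in this article; the `model` attribute and the sample values are assumptions for illustration only.

```python
class Vehicle:
    """A class: every member is created by the class and shares its type."""

    def __init__(self, model: str) -> None:
        self.model = model  # assumed attribute, for illustration only


# A collection accepts existing members of any kind, at any time.
favorite_things = []
favorite_things.append(Vehicle("roadster"))
favorite_things.append("raindrops on roses")
favorite_things.append(42)

# The collection's members have three different types.
member_types = {type(member).__name__ for member in favorite_things}

# The class's members (its instances) all have the same type.
vehicles = [Vehicle("sedan"), Vehicle("truck")]
all_vehicles = all(isinstance(v, Vehicle) for v in vehicles)
```

Nothing becomes a `Vehicle` by being added somewhere; it is a `Vehicle` because the class created it.  By contrast, anything at all can join `favorite_things` simply by being appended.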

A final example distinguishing these two concepts can be seen in modeling an Automobile, where the auto's "Interior" and "Exterior" are collections while "Wheel" and "Engine" are classes.  This differentiation between collections and classes becomes important in Taxonomy design.
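One possible way to sketch the Automobile example in Python follows.  `Wheel`, `Engine`, "Interior" and "Exterior" come from the text; the `Seat` class, every field name, and the "roof rack" entry are hypothetical and appear only to make the sketch runnable.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Wheel:
    position: str  # assumed field


@dataclass
class Engine:
    cylinders: int  # assumed field


@dataclass
class Seat:  # hypothetical class, not named in the article
    row: int


@dataclass
class Automobile:
    engine: Engine
    # "Exterior" and "Interior" are modeled as collections: plain groupings of parts.
    exterior: List[Any] = field(default_factory=list)
    interior: List[Any] = field(default_factory=list)


car = Automobile(engine=Engine(cylinders=4))
for position in ("front-left", "front-right", "rear-left", "rear-right"):
    car.exterior.append(Wheel(position))
car.exterior.append("roof rack")   # a collection can also hold a non-Wheel member
car.interior.append(Seat(row=1))

wheel_count = sum(isinstance(part, Wheel) for part in car.exterior)
```

In this sketch "Exterior" defines no type of its own; it only groups parts, whereas `Wheel` and `Engine` each define what their instances look like.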

 

[Comments on this article are welcome.]