Hybrid data modeling techniques for XML schema definition with data semantics preservation


Student thesis: Doctoral Thesis


  • San Kuen CHEUNG

Supervisors
  • Shi Piu Joseph FONG (Supervisor)
  • Wenyin LIU (Co-supervisor)
Award date: 14 Jul 2006


XML Schema Definition (XSD) has gained currency in most web applications and is becoming a standard logical schema for exchanging information. XSD is complex, however, and defining data semantics in it requires considerable knowledge. The XML model also lacks data modeling techniques for the efficient development of business applications; users therefore need data modeling techniques for generating an XSD from different sources.

The first method, “Relational-to-XSD Data Modeling”, translates a relational schema into an XSD through an Extended Entity Relationship (EER) Model. The EER model is mapped into an XSD using forward engineering, and the data semantics of participation, cardinality, generalization, aggregation, categorization, and n-ary and u-ary relationships are preserved. The benefit is a general modeling technique for re-engineering from a Relational Model to an XML Model. The method can be validated by using information capacity recovery after schema translation.

The second method, “XBL-to-XSD Data Modeling”, uses a higher-level language tool (XSD-Builder) to generate an XSD automatically. The XSD-Builder consists of an XML Tree Model, an XSD-Builder Language (XBL), an XSD-Translator, and an XSD. An XML Tree Model consists of hierarchical nodes representing all elements within an XSD. The XBL is constructed using the conceptual information extracted from the XML Tree Model; it is specified by the users and processed by the XSD-Translator to generate the XSD. It is a user-friendly and well-formed language for specifying the data semantics of an XSD. The data modeling techniques can be validated in prototyping by recovering the relational schema, EER Model, and XML Tree Model from the generated XSD.
The contributions of the proposed methodologies for creating an XSD can be summarized by their two sources. Source 1: in an existing relational database, a relational schema can be mapped into an XSD through an EER model by recovering its data semantics. Source 2: users design an XML Tree Model, which can be implemented in the XSD-Builder Language to generate the required XSD automatically. Both sources can also specify 14 major types of data semantics, such as cardinality, specialization, generalization, participation, aggregation, categorization, and u-ary and n-ary relationships. The results are two feasible data modeling techniques that can be applied as CASE tools for designing an XSD. Additionally, the XML Tree Model, XSD, EER Model, and Relational Schema can be used interchangeably upon the users’ confirmation of data semantic constraints.
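
As a rough illustration of what preserving participation and cardinality in an XSD can look like, the sketch below nests a child element under a parent and records the constraints as minOccurs and maxOccurs. It is not the mapping procedure defined in the thesis. The function name relationship_to_xsd and the Department and Employee names are hypothetical, and the convention used (total participation as minOccurs of 1, a one-to-many relationship as maxOccurs of unbounded) is a common one assumed here for the example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def relationship_to_xsd(parent, child, min_occurs, max_occurs):
    """Build a small XSD tree nesting `child` under `parent`.

    Hypothetical helper: min_occurs stands for participation
    (0 = partial, 1 = total) and max_occurs for cardinality
    (1, another integer, or "unbounded" for a one-to-many relationship).
    """
    schema = ET.Element(f"{{{XS}}}schema")
    parent_el = ET.SubElement(schema, f"{{{XS}}}element", name=parent)
    complex_type = ET.SubElement(parent_el, f"{{{XS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    # The child element carries the participation and cardinality constraints.
    ET.SubElement(
        sequence,
        f"{{{XS}}}element",
        name=child,
        minOccurs=str(min_occurs),
        maxOccurs=str(max_occurs),
    )
    return schema


# Assumed example: every Department has at least one Employee (1:n, total).
schema = relationship_to_xsd("Department", "Employee", 1, "unbounded")
print(ET.tostring(schema, encoding="unicode"))
```

Changing min_occurs to 0 would express partial participation under the same assumed convention.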

    Research areas

  • XML (Document markup language), Data structures (Computer science), Database design