Data Modeling in Robotics – 1
This post discusses data modeling, a branch of model-based engineering (MBE) that has gained traction in the last few years. A data model formalizes the semantics (i.e. the meaning) of the information to be processed and exchanged by a system. As with all branches of MBE, the initial investment in the model supports tool-based development, software product lines, and interoperability.
Semantics is concerned with the structural relationship between information elements, which may be captured in a data model, and the definition of those elements, which may be captured in a data dictionary. In Part 2 we will discuss the design of mathematically coherent data dictionaries; in Part 1 we will be focusing on the basics of data models for robotic systems. Therefore, before proceeding with a general introduction to data models, we shall say a few words about the information processed and exchanged in robotic systems. In particular, we are interested in their information qualities and how these will drive the structure of the data model.
Robotic systems
Robotic systems are designed to perform a complex series of actions automatically, based on their assigned objectives, their local environment, and available capabilities and information. In general, therefore, we can say that robots are task focused, adaptive to the dynamic situation, share information in message exchanges, and operate in a spatial-temporal environment.
Data modeling arose out of the design of databases, where the ‘data is at rest’. Here, we are mainly concerned with ‘data in motion’ in the form of real-time message exchanges that share information about the dynamic global state of the system in the context of a given task. Therefore, the relationship between message exchanges and a global state model is an important aspect of data modeling for robotic systems.
Information ontology
Before attempting a data model, it is important to sketch out the underlying ideas that will pervade the data that is to be processed and exchanged in a robotic system. The formalization of these common ideas is often called an ontology, which is a powerful way of achieving conceptual interoperability. If another system requires a different ontology, it will most likely require a different data model structure. A robotic system will often interact with different systems and ontological realms via the production and consumption of global information records, which will be discussed shortly. The ontology (or ontologies) will drive not only the data model but also other views of the system design, such as the system use cases and the functional architecture.
A suitable ontology for robotic systems is shown in the informal model below. We will call this the Objective-Mission-Resource (OMR) ontology, because it is structured around the principal ideas of: operational objective, resource mission, and taskable resource.
In the OMR ontology, a taskable resource can be any system, component, service, or actor that can be tasked with a mission or action. In a robotic system, a taskable resource could be an individual robot or controller, a system of robots, or a system component or service, such as a mission planner, sensor, manipulator, or propulsion system.
The operational objective, resource mission, and resource action collectively represent the larger idea of an activity, which appears in many ontologies. However, for our purpose additional structure is needed. A taskable resource is assigned a mission within its capabilities in order to satisfy an operational objective. The mission may be supported by resource actions based on cued behaviors that must occur during execution of the mission. For example, during a surveillance mission, a robot might perform cued actions related to its sensor. The objective is agnostic to the identity of the tasked resource, and there could be multiple ways to satisfy an objective with different missions assigned to the available resources at the time. The ability to assign a mission to a resource will depend on the mission conditions under which the mission is to be performed and on the corresponding capability of the resource. The operational objective, itself, may be constrained by pre-established limitations such as regulations, operational doctrine, and system policies.
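The assignment rule described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class names follow the OMR terms, but encoding conditions and capabilities as numeric levels keyed by name (for example `wind_speed_mps`) is a hypothetical simplification, not something taken from the UCS Architecture.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OperationalObjective:
    """What must be achieved; it says nothing about which resource does it."""
    name: str


@dataclass
class ResourceMission:
    """One way of satisfying an objective, with the conditions it is performed under."""
    name: str
    objective: OperationalObjective
    # Hypothetical encoding: condition name -> level the resource must be able to handle.
    conditions: dict = field(default_factory=dict)


@dataclass
class TaskableResource:
    name: str
    # Hypothetical encoding: capability name -> highest level the resource can handle.
    capabilities: dict = field(default_factory=dict)

    def can_perform(self, mission: ResourceMission) -> bool:
        """True when every mission condition is covered by a matching capability."""
        return all(
            self.capabilities.get(condition, 0.0) >= required
            for condition, required in mission.conditions.items()
        )


def candidate_resources(mission, resources):
    """Resources whose capabilities allow them to be assigned the mission."""
    return [resource for resource in resources if resource.can_perform(mission)]


objective = OperationalObjective("survey harbor entrance")
mission = ResourceMission(
    "night surveillance", objective, {"wind_speed_mps": 12.0, "endurance_h": 2.0}
)
uav = TaskableResource("uav-1", {"wind_speed_mps": 15.0, "endurance_h": 1.5})
usv = TaskableResource("usv-1", {"wind_speed_mps": 20.0, "endurance_h": 8.0})
eligible = [resource.name for resource in candidate_resources(mission, [uav, usv])]
```

Note that the objective never names a resource. The same objective could be satisfied by a different mission, with different conditions, assigned to whichever resource is eligible at the time.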
Objectives, missions, and resources may be decomposed. For example, a system of heterogeneous robots could have a group objective and assigned mission, which is then decomposed into individual objectives and missions for particular robots, and decomposed again into objectives and missions for components or services within a given robot.
In the course of performing its assigned mission, the resource might be required to consume or produce global information records of shared interest outside of the system. It is important to note that this global information record could conform to a different ontological realm. It is in this manner that a robotic system is integrated with systems outside its domain of governance. For example, if a robotic system is evacuating a combat casualty it might send vital signs information to the destination medical treatment facility. This information would conform to the ontology of clinical data, which has a structure oriented to clinical forensics. Clinical data records are concerned with recording related sequences of clinical acts and all the human, institutional, and technical entities that participated in each act via their role.
While it might be possible to force everything into the same ontological realm, and many have attempted to do this, it would ignore the pragmatics of how data is intended to be structured and used in different systems. The notion of a single ontology for all information has, so far, resulted in architecture overreach. So, in summary, it is important to understand the core ontology and peripheral ontological realms of the information that is being processed and exchanged in a system before attempting to design a data model architecture.
When expressing the ontology of data, it is sometimes useful to first capture any information qualifiers (or modalities) that will be needed. These can include alethic modalities (concerning possibility and necessity), temporal modalities (concerning past and future), deontic modalities (concerning obligation and permission), and epistemic modalities (concerning the quality of knowledge). In the OMR ontology, the information modalities must address deontic and alethic concerns. In a clinical data ontology, as another example, there will be a focus on temporal, deontic, and epistemic modalities.
Data model realization
A data model represents identifiable and enduring things in the real world (called entities) and their properties. Of particular interest is the identification of each entity of interest, its relationship to other entities, and the attributes of each entity. In the OMR ontology, entities include operational objectives, resource missions, taskable resources, and global information records. Conditions and capabilities are modeled as various complex attributes.
The structure of these complex attributes will be discussed in more detail later. However, as a simple example, consider a robot (a taskable resource) that is assigned a surveillance mission to satisfy an operational surveillance objective. In this example we are interested in a generalized attribute that we shall name ‘quantity’ and its associated complex attribute types (data types). An actual quantity in the model could be time, position, force, velocity, or orientation, for example.
In the figure below, which is derived from the SAE Unmanned Systems (UxS) Control Segment (UCS) Architecture, the three entities are modeled as UML classes (in green) and the conditions and capabilities concerning the quantity of interest are modeled as complex UML attributes. To show the structure more clearly, we are using the association-like notation for attributes, in which the attribute type (in blue) is an association end and the attribute name is denoted by the name of the association.
Defining messages
There are many benefits to defining messages from a data model. First, all message exchanges relate to a single understanding of an entity and its properties. There is no ‘impedance mismatch’ between different messages or between different end systems in a message exchange. In addition, the full context of a message exchange is documented in the data model. Second, there is an effective separation of concerns between the design of the data model (the domain) and the design of the messages (system architecture). This focusses the expertise where needed. Third, the data model allows different systems to share information about the ‘global state’ of the system of systems and its environment. Fourth, a simulation model may be constructed directly from the data model and derived message model to support system test and evaluation.
In a data model, a message type represents a ‘view’ of the overall model. For example, in the UML figure below, we have two resources: a robot type with a motor unit. Here, we are interested in defining a simple robot status message. Each attribute in the message class is a projection of an owned attribute in the data model. It will be seen that the name of the attribute can be different at each end of the projection, which might be necessary to avoid duplication of attribute names in the derived message and to make the semantics of the names more obvious or familiar to users in the context of the message exchange. A derived message might conform to a prior third-party message specification, for example, which is ‘reverse engineered’ into the data model to support interoperability. The handling of different data types in ‘reverse engineered’ messages will be addressed later.
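As a rough sketch of the projection idea, the Python below derives a status message from a small data model. The attribute names (`identifier`, `speed`, `temperature`) and the message field names are hypothetical; the only point is that each message attribute maps to exactly one owned attribute in the model, and that the name may differ at each end of the projection.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class MotorUnit:
    temperature: float  # hypothetical owned attribute


@dataclass
class Robot:
    identifier: str
    speed: float
    motor: MotorUnit


# Message attribute name -> path to the owned attribute in the data model.
ROBOT_STATUS_PROJECTION = {
    "robot_id": "identifier",
    "ground_speed": "speed",
    "motor_temperature": "motor.temperature",
}


def resolve(entity, path):
    """Follow a dotted attribute path starting from an entity."""
    return reduce(getattr, path.split("."), entity)


def build_message(entity, projection):
    """Build a message as a view of the entity: one value per projected attribute."""
    return {name: resolve(entity, path) for name, path in projection.items()}


robot = Robot("r-7", 1.2, MotorUnit(41.5))
status = build_message(robot, ROBOT_STATUS_PROJECTION)
```

Because the message holds no definitions of its own, two messages that project the same owned attribute cannot disagree about what that attribute means.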
The construction of a large data model will require a division of labor and delegation of governance based on different concerns. The figure below shows four layers based on a suitable profile of UML to be supported by formally-specified modeling constraints and suitable development tools. This figure is similar to the organization of the data model in the SAE UCS Architecture.
The OMR ontology is expressed by a core entity-relationship model (ERM) in the foundation classes layer and by the structure of supporting data types, which together form the data architecture framework. The foundational entities related to objectives, missions, actions, resources, and information records will be defined in the architecture framework, as will be the complex attribute types that express the important information modalities.
Common entity types that inform the general subject matter of robotic systems are defined by one or more ERMs in the interoperability layer. These are then extended for particular robotic domains in the domain-specific layer. For example, all the higher robotic domains must share the same concept of a vehicle, a payload, a communication system, and so on, which will allow the separate domains to form a cross-domain system of systems. The particular entity-relationship model (ERM) for a given domain is constructed in the domain-specific layer. For example, the vehicle type can be extended in the Air Domain into an air vehicle type.
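One way to picture the layering is ordinary type extension, sketched below in Python. The attribute `service_ceiling_m` and the helper `describe` are hypothetical and appear here only to show that code written against the shared vehicle concept keeps working for a domain-specific extension.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    """Interoperability layer: the concept of a vehicle shared by all domains."""
    identifier: str


@dataclass
class AirVehicle(Vehicle):
    """Domain-specific layer (Air Domain): extends the shared vehicle concept."""
    service_ceiling_m: float  # hypothetical domain-specific attribute


def describe(vehicle: Vehicle) -> str:
    """Cross-domain code depends only on the interoperability-layer type."""
    return f"{type(vehicle).__name__}:{vehicle.identifier}"


air_vehicle = AirVehicle("a-1", 5000.0)
label = describe(air_vehicle)
```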
Underneath the ERM layers are the supporting data types and an associated data dictionary. We will discuss the structure of the data types next.
Data type abstraction
It is common practice to refer to data models at three levels of abstraction. These are the conceptual data model (CDM), the logical data model (LDM), and the physical data model (PDM). In some cases, the CDM can be simply an entity-relationship model without any attributes, but here it is the attribute type that defines the CDM, LDM, and PDM; the attribute name is unchanged in the conceptual, logical, and physical data models. From the semantic point of view, we are only interested in the CDM and LDM. The PDM is simply concerned with how the LDM is realized in physical memory or in a physical interconnection. Therefore, here we are interested primarily in the difference between the conceptual data types and logical data types.
In the CDM, the attribute type is stereotyped as an observable type (or a complex type based on an observable type). In the LDM, the attribute type is stereotyped as a measurement type (or a complex type based on a measurement type). At its simplest level, an observable type is defined by an ordinary dictionary in non-technical language, while a measurement type is a valid realization of the observable based on a formal technical definition. The observable type specification is insufficient for an instance (a given value) to have semantics, while the instance of a measurement type has precise semantics.
An example of this is provided in the figure below. In the CDM, a robot type owns the attribute ‘speed: SpeedType’. The transformation from CDM to LDM is one of refinement, where the observable type SpeedType is substituted with a valid measurement type. In the figure, three refinements are shown. SpeedType can be refined into the quantities Vehicle-carried NED velocity (m/s) or Earth-fixed ENU velocity (knots). For a nautical application it can also be refined into the enumerated nominal property engine order telegraph (EOT) with the values full ahead, half ahead, slow ahead, dead slow ahead, stop, and so on.
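The refinement step can be sketched as follows. This is a simplified illustration, assuming a measurement type can be described by a reference frame, a unit, and (for nominal properties) a list of allowed values. The `MeasurementType` class, the `refine` function, and the `POSITION` observable are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Observable(Enum):
    """Conceptual (CDM) types, defined only by an ordinary dictionary meaning."""
    SPEED = "speed"
    POSITION = "position"  # hypothetical, used only for the negative case


@dataclass(frozen=True)
class MeasurementType:
    """Logical (LDM) type: a formal realization of exactly one observable."""
    name: str
    refines: Observable
    frame: Optional[str] = None   # reference frame, for quantities
    unit: Optional[str] = None    # unit of measure, for quantities
    values: Tuple[str, ...] = ()  # allowed values, for enumerated nominal properties


NED_VELOCITY = MeasurementType(
    "VehicleCarriedNedVelocity", Observable.SPEED, "vehicle-carried NED", "m/s"
)
ENU_VELOCITY = MeasurementType(
    "EarthFixedEnuVelocity", Observable.SPEED, "Earth-fixed ENU", "knots"
)
ENGINE_ORDER = MeasurementType(
    "EngineOrderTelegraph",
    Observable.SPEED,
    values=("full ahead", "half ahead", "slow ahead", "dead slow ahead", "stop"),
)
GEODETIC_POSITION = MeasurementType(  # hypothetical
    "GeodeticPosition", Observable.POSITION, "WGS 84", "deg"
)


def refine(cdm_attributes, choices):
    """CDM -> LDM: keep each attribute name, substitute a valid measurement type."""
    ldm_attributes = {}
    for attribute_name, observable in cdm_attributes.items():
        measurement = choices[attribute_name]
        if measurement.refines is not observable:
            raise ValueError(f"{measurement.name} does not refine {observable.value}")
        ldm_attributes[attribute_name] = measurement
    return ldm_attributes


robot_cdm = {"speed": Observable.SPEED}
robot_ldm = refine(robot_cdm, {"speed": NED_VELOCITY})
```

Any of the three speed refinements is accepted for the `speed` attribute, whereas substituting `GEODETIC_POSITION` raises an error because it realizes a different observable.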
The structure is intended to support the reverse engineering of a legacy third-party message type into the data model, and thereby understand the possibility of performing message translation. This has a lot of practical utility, but great care should be taken in ensuring semantic equivalence. First, in the projection the third-party message attribute name must be directly substitutable with the name of a semantically identical class attribute in the data model. Second, a third-party measurement type can only be morphed into another measurement type refined from the same observable type if there exists for all possible instances (values) in the first measurement type a semantically identical instance (value) in the second measurement type.
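For enumerated measurement types, the second condition can at least be checked mechanically for coverage, as in the sketch below. The `LegacyThrottle` type and its values are a hypothetical third-party enumeration. Whether two values really are semantically identical is a judgment that the code cannot make; it only checks that the mapping table leaves no source value without a counterpart.

```python
from enum import Enum


class EngineOrder(Enum):
    """EOT values named in the text."""
    FULL_AHEAD = "full ahead"
    HALF_AHEAD = "half ahead"
    SLOW_AHEAD = "slow ahead"
    DEAD_SLOW_AHEAD = "dead slow ahead"
    STOP = "stop"


class LegacyThrottle(Enum):
    """Hypothetical third-party measurement type refined from the same observable."""
    AHEAD_FULL = 1
    AHEAD_HALF = 2
    AHEAD_SLOW = 3
    HALT = 4
    CRUISE = 5


# Pairs that a domain expert has judged to be semantically identical.
# CRUISE is deliberately absent: no EOT value means the same thing.
LEGACY_TO_EOT = {
    LegacyThrottle.AHEAD_FULL: EngineOrder.FULL_AHEAD,
    LegacyThrottle.AHEAD_HALF: EngineOrder.HALF_AHEAD,
    LegacyThrottle.AHEAD_SLOW: EngineOrder.SLOW_AHEAD,
    LegacyThrottle.HALT: EngineOrder.STOP,
}


def unmapped_values(source_type, mapping):
    """Source values that have no semantically identical target value."""
    return [value for value in source_type if value not in mapping]


def can_translate(source_type, mapping):
    """Translation is allowed only if every possible source value is covered."""
    return not unmapped_values(source_type, mapping)


missing = unmapped_values(LegacyThrottle, LEGACY_TO_EOT)
allowed = can_translate(LegacyThrottle, LEGACY_TO_EOT)
```

Here translation from the legacy type is rejected because one value has no counterpart. The check is also directional: translating from EOT to the legacy type would have to be assessed separately.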
To address these concerns it is important to supplement the data model with a data dictionary. This is the subject of Part 2.