Ph. D. Project
Formal methods for knowledge extraction and reuse from heterogeneous sources for semantic interoperability of distributed
architectures
Dates:
2022/02/28 - 2025/02/27
Student:
Supervisor(s):
Description:
Tourism is a strategic issue for France. The country has been the world's leading tourist destination for several years, with 89 million
visitors per year, and the sector accounts for more than 7% of France's GDP. A new regional tourism policy has emerged in the Grand Est
region. It is a major challenge, both for tourist and economic attractiveness and for helping to define a true image of the Grand Est.
Developed through a convergence process, the regional strategy is accompanied by a Regional Tourism Development Plan (SRT). Voted on
in March 2018, the plan meets the requirements of attractiveness, excellence and proximity in order to promote a sector which alone
represents 4% of regional employment. The transformation of the tourism market began in the early 2000s with the democratisation of
the internet; Booking and Expedia appeared in 1996 and TripAdvisor in 2000, for example. The appearance of OTAs (Online Travel
Agencies) and online comparators had a direct impact on the existing value chain, which included a large number of intermediaries.
Numerous platforms have thus developed with the aim of centralising internet traffic and capturing and monetising customer data, in
order to position themselves as an obligatory point of passage for tourism players. Through commissions on sales, sometimes up to 30%,
these platforms have become both a strength and a weakness of the current model, since sales prices rise to take the commissions into
account. Travellers want to be able to manage the booking of their next holiday simply, from start to finish, via a single point of
contact. As an actor strongly involved in the economic development of the massifs, and in our region in the development of the Massif
des Vosges, the Syndicat National des Moniteurs du Ski de France (SNMSF), which gathers 17,000 instructors within the French Ski
Schools (ESF) throughout the country, has been participating in the development of tourism since 1945.
In 2016, the SNMSF launched a marketplace called "Mon Séjour en Montagne" (My Mountain Stay), which is based on the strong
values of fairness for tourism players and transparency and fluidity for the customer. This collaborative platform carries the SNMSF's
vision of sustainable, fair and inclusive tourism.
The project aims to make the platform more equitable and to lay the groundwork for a global, intelligent and connected system
(Smart Montagne) that collects and processes data in real time to extract new knowledge for tourism purposes. These data and
information will then be aggregated and transformed by automatic inference systems to formalise existing implicit knowledge.
To make precise and concrete scientific contributions to this project, which is already underway, an interoperable systems
engineering approach will be used, relying on models of different types and levels of abstraction. These models should express and
formalise not only the "structural" aspect of the system components, but also their behaviour (Maier, 1998), which may be constrained
by the specific requirements of the system domain (business rules). Another type of constraint can be induced by the interoperability
protocol(s), which can impose strict rules to endow interoperable systems with properties such as autonomy, confidentiality and
transparency.
The objective of this research project is twofold: on the one hand, to model data from heterogeneous sources and, on the other hand,
to study the problems posed by model-driven engineering in cooperative systems, that is, systems involving cooperation among
"systems of actors" willing to interoperate. Collaborative systems are now organised in networks, i.e. as complex systems
(Camarinha-Matos, 2014). The complex systems envisaged will be composed of networks of Cyber-Physical Systems (intelligent sensors),
which will retrieve data together with their context and thus form information networks (Cardin, 2016).
Faced with this challenge, the scientific obstacles concern:
1. The lack of formalisation (in other words, in mathematical terms) of the aggregation of information in system models and in the
information systems that emerge from them, as well as of the definition of the semantics of the concepts and relationships they
implement, which is needed to ensure their common understanding and to facilitate their interoperation by minimising semantic losses;
2. The adaptation (or even extension) of algebraic and/or geometric tools (lattice theory, category theory, homological algebra) in
the context of formal concept analysis, for the treatment of constantly evolving heterogeneous data. This is a recent approach that
has not yet been fully developed (even from a mathematical point of view) for this type of data.
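As a purely illustrative sketch of the lattice-theoretic setting mentioned in the second obstacle, the Python fragment below enumerates the formal concepts of a small binary context by brute force. The context (a handful of ski-resort offers and their attributes) and all function names are hypothetical and chosen only for illustration; they are not taken from the project, and the exhaustive enumeration is not meant to represent the method the thesis will develop.

```python
from itertools import combinations

# Hypothetical toy context: objects (resort offers) mapped to their attributes.
CONTEXT = {
    "ski_lesson": {"bookable_online", "needs_snow"},
    "lift_pass": {"bookable_online", "needs_snow"},
    "chalet": {"bookable_online", "has_lodging"},
    "snowshoe_hike": {"needs_snow"},
}


def common_attributes(objects, context):
    """Derivation operator: attributes shared by every object in the set."""
    shared = set().union(*context.values())  # start from all attributes
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def common_objects(attributes, context):
    """Derivation operator: objects that carry every attribute in the set."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Brute force, exponential in the number of objects: fine for a toy
    context, not for real data.
    """
    objects = list(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((extent, intent))
    return concepts


if __name__ == "__main__":
    # Print concepts from the largest extent to the smallest.
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
```

Ordered by inclusion of their extents, the resulting concepts form the concept lattice of the context. Handling several such contexts linked by relations, and data that change over time, is where the extensions discussed above would come in.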
The proposed thesis topic is a direct continuation of the thesis carried out by Mickael Wajnberg from 2017 to 2020 at CRAN, which
laid the foundations of a versatile knowledge extraction methodology and focused more specifically on the management of association
rules in relational contexts. The proposed thesis falls within the framework of the Strategic Expertise Area "Industry 4.0".
Previous work has demonstrated the value of a holistic approach to all information resources and has led to the development of a
methodology focused on the optimisation of knowledge management. The present proposal therefore aims to continue this work by
developing a formal method for the extraction and reuse of knowledge from heterogeneous sources for the semantic interoperability of
distributed architectures. This method will be integrated as a methodological building block in the management process of
information resources suitable for decision support. The application case will be the information and data feedback systems of at
least one of the Vosges ski resorts.
Keywords:
Knowledge formalisation, Multi-relational data mining, Relational concept analysis, CPS
Department(s):
Modeling and Control of Industrial Systems
Publications: