Our approach aims to bootstrap the development of a conversational agent by generating a dataset with an arbitrary number of dialogues, which can then be used to train any neural network of one’s choice. We build a Machine-to-Machine (M2M) system with three main components: a prompt generator, a user simulator, and a task-oriented dialogue system (TODS). Using semantic technologies, the domain knowledge is modeled as an ontology and the dialogue context is represented as a local knowledge graph, while pre-defined rules transform text templates into natural language responses. The resulting metrics highlight the benefits of this framework.
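To make the interplay of the three components more concrete, here is a minimal sketch of one M2M generation loop. It is not the code from this repository: the templates, slot values, predicate names (`hasIntent`, `mentions`) and every function name are hypothetical, and the ontology and rule layers are reduced to plain template filling over a set of triples.

```python
import random
from dataclasses import dataclass, field

# Hypothetical templates and slot values, for illustration only.
USER_TEMPLATES = {
    "insert": "Add a {entity} named {name}.",
    "update": "Rename the {entity} {name} to {new_name}.",
}
SYSTEM_TEMPLATES = {
    "insert": "I have added the {entity} {name}.",
    "update": "The {entity} {name} is now called {new_name}.",
}


@dataclass
class DialogueContext:
    """Dialogue context kept as a set of (subject, predicate, object) triples."""
    triples: set = field(default_factory=set)


def generate_prompt(rng):
    """Prompt generator: choose an intent and slot values for the next turn."""
    intent = rng.choice(sorted(USER_TEMPLATES))
    slots = {"entity": "project", "name": "Alpha", "new_name": "Beta"}
    return intent, slots


def simulate_user(intent, slots):
    """User simulator: fill the user template for the chosen intent."""
    return USER_TEMPLATES[intent].format(**slots)


def system_respond(turn_index, intent, slots, context):
    """Dialogue system: record the turn in the context, then fill a template."""
    turn_id = f"turn{turn_index}"
    context.triples.add((turn_id, "hasIntent", intent))
    context.triples.add((turn_id, "mentions", slots["name"]))
    return SYSTEM_TEMPLATES[intent].format(**slots)


def generate_dialogue(n_turns, seed=0):
    """Run the loop for n_turns and return the turns plus the final context."""
    rng = random.Random(seed)
    context = DialogueContext()
    turns = []
    for i in range(n_turns):
        intent, slots = generate_prompt(rng)
        user_text = simulate_user(intent, slots)
        system_text = system_respond(i, intent, slots, context)
        turns.append({"intent": intent, "user": user_text, "system": system_text})
    return turns, context
```

Calling `generate_dialogue(3, seed=1)` yields three labeled user/system turns and a context with two triples per turn. Fixing the seed keeps the generated dataset reproducible.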
More details about the work can be found in the attached paper. The work is the result of a Master's thesis by V.I. Iga, coordinated by Prof. G.C. Silaghi, at FSEGA, UBB, Cluj-Napoca, Romania.
The standalone TOD system is available here: https://github.com/IonutIga/TOD-System
A BERT-based NLU model fine-tuned on datasets generated by the Dialogue Simulator is available here: https://github.com/IonutIga/Domain-Specific-NLU-BERT
Version 1.0 is the standard build of the system. It is described in the attached paper.
Version 1.1 modifies some probabilities so that more update-type utterances are generated and utterances carry more parameters.
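The effect of such a probability change can be sketched as follows. All numbers, type names other than "update", parameter names, and function names below are placeholders invented for illustration. They are not the values used by either version of the system.

```python
import random

# Placeholder settings: the real probabilities of versions 1.0 and 1.1
# are not reproduced here.
VERSION_SETTINGS = {
    "1.0": {"type_weights": {"insert": 0.4, "select": 0.4, "update": 0.2},
            "param_prob": 0.3},
    "1.1": {"type_weights": {"insert": 0.3, "select": 0.3, "update": 0.4},
            "param_prob": 0.6},
}
OPTIONAL_PARAMS = ["name", "date", "status"]  # hypothetical parameter names


def sample_utterance_plan(version, rng):
    """Sample an utterance type and the optional parameters it will carry."""
    settings = VERSION_SETTINGS[version]
    types = list(settings["type_weights"])
    weights = list(settings["type_weights"].values())
    utterance_type = rng.choices(types, weights=weights, k=1)[0]
    # Each optional parameter is included independently with param_prob.
    params = [p for p in OPTIONAL_PARAMS if rng.random() < settings["param_prob"]]
    return utterance_type, params


def update_share(version, n=2000, seed=0):
    """Fraction of sampled utterances whose type is 'update'."""
    rng = random.Random(seed)
    plans = [sample_utterance_plan(version, rng) for _ in range(n)]
    return sum(1 for utterance_type, _ in plans if utterance_type == "update") / n
```

Under these placeholder settings, `update_share("1.1")` comes out higher than `update_share("1.0")`, which mirrors the intent of the change: raising one weight shifts the mix of generated utterances without touching the templates.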