Introduction
Air traffic control (ATC) is no easy task: directing multiple aircraft simultaneously through a large, three-dimensional airspace is demanding work that leaves no room for error. For this reason, more advanced technology is being developed to help controllers do their jobs more efficiently and safely. One example is the arrival management software 4D-CARMA from the German Aerospace Center. An arrival manager (AMAN) is a system used by controllers to plan aircraft arrivals and to monitor their progress against an arrival plan already determined for the day. In this particular case, natural-language understanding is being developed for this software so that it can understand and process the commands controllers give to aircraft on approach, allowing many tasks once performed manually by the controller to be done automatically.
Objective
Unfortunately, at the time of the project, very little data was available for building an automatic speech recognition (ASR) system that performs well for ATC: not only is there a significant amount of signal noise to compensate for, but pronunciation also differs from that of standard English. For these reasons, the output of an ASR system trained on “normal” data was instead improved using a logical constraint program encoding static and dynamic knowledge about the airspace being controlled. Because the actions a pilot can take are very restricted, and a pilot cannot take any action without being explicitly commanded to do so by a controller, an ATC scenario can be modeled with relatively few constraints.
Methods
A number of simple constraints were defined that each recognized ATC command must satisfy in order to be semantically and pragmatically valid given the state of the airspace at the time the command was given.
For example, given that flight DLH645 is on an approach vector (i.e. is preparing to land) and is currently flying at flight level 220 (i.e. at 22,000 feet), a controller would never issue a command to descend to flight level 250 (25,000 feet), because an aircraft cannot descend to a higher altitude: such a command makes no pragmatic sense given the current state of the ATC scenario.
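The DLH645 example can be sketched as a small plausibility check. This is only an illustration, not the constraint program used in the project: the names `AircraftState`, `Command`, and `is_plausible`, as well as the specific rules, are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AircraftState:
    """Dynamic knowledge about one aircraft (hypothetical structure)."""
    callsign: str
    flight_level: int  # in hundreds of feet, so 220 means 22,000 ft
    on_approach: bool


@dataclass(frozen=True)
class Command:
    """A recognized controller command (hypothetical structure)."""
    callsign: str
    action: str        # "descend" or "climb" in this sketch
    target_level: int  # flight level, in hundreds of feet


def is_plausible(cmd: Command, airspace: Dict[str, AircraftState]) -> bool:
    """Return True if the command makes sense given the airspace state."""
    state = airspace.get(cmd.callsign)
    if state is None:
        # The command addresses an aircraft that is not in the airspace.
        return False
    if cmd.action == "descend":
        # A descent must end below the current flight level.
        return cmd.target_level < state.flight_level
    if cmd.action == "climb":
        # Assumed rule: an aircraft on approach is not told to climb,
        # and a climb must end above the current flight level.
        return (not state.on_approach
                and cmd.target_level > state.flight_level)
    # Unknown actions are rejected in this sketch.
    return False


airspace = {"DLH645": AircraftState("DLH645", flight_level=220, on_approach=True)}
invalid = Command("DLH645", "descend", 250)  # "descend" to a higher level
valid = Command("DLH645", "descend", 180)
```

Here `is_plausible(invalid, airspace)` rejects the command from the example, while `is_plausible(valid, airspace)` accepts a descent to flight level 180.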
Although this method of rescoring was used here for speech recognition, knowledge-based scoring can be incorporated into any linear classifier while still maintaining linearity: the score from the original classifier is simply summed with that of the knowledge-based scorer.
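The summing of scores can be illustrated with a minimal sketch. It assumes the recognizer returns an n-best list of hypotheses with log-scores; the function names, the `weight` parameter, the penalty value, and the toy scorer are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

# (recognized text, score from the original classifier, e.g. a log-score)
Hypothesis = Tuple[str, float]


def rescore(
    hypotheses: List[Hypothesis],
    knowledge_scorer: Callable[[str], float],
    weight: float = 1.0,
) -> List[Hypothesis]:
    """Add the knowledge-based score to each original score and re-rank.

    The combined score is a plain sum, so it remains linear in both
    component scores.
    """
    combined = [
        (text, score + weight * knowledge_scorer(text))
        for text, score in hypotheses
    ]
    return sorted(combined, key=lambda pair: pair[1], reverse=True)


def toy_knowledge_scorer(text: str) -> float:
    """Toy scorer: the aircraft is assumed to be at flight level 220, so a
    descent to flight level 250 is penalized (the penalty is arbitrary)."""
    return -5.0 if text.endswith("two five zero") else 0.0


n_best = [
    ("DLH645 descend flight level two five zero", -1.0),  # best ASR score
    ("DLH645 descend flight level one five zero", -1.5),
]
best_text, best_score = rescore(n_best, toy_knowledge_scorer)[0]
```

In this toy case the hypothesis preferred by the recognizer alone is implausible, so after rescoring the second hypothesis ranks first.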
Results
A corpus of 1,107 ATC commands was recorded from sixteen participants, who took part in a simulated ATC scenario and gave commands to the aircraft in it; the scenario had been pre-planned, just as would be done in real life. Using this corpus, annotated with the gold standard for each command, the ASR system was then run with the rescoring method described above. Doing so resulted in roughly an 80% reduction in the word error rate of the ASR system. For more information, see my M.Sc. thesis and/or the subsequent publication.
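For readers unfamiliar with the metric, word error rate is commonly computed as the word-level edit distance between the gold-standard transcript and the recognizer output, divided by the number of words in the gold standard. The following is a minimal sketch of that common definition, not the evaluation code used in the project. The `relative_reduction` helper assumes the reported reduction is relative to the baseline, which is an assumption of this sketch.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which the error rate dropped relative to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - improved) / baseline
```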