Simplish The team and the objectives
The Simplish team is part of the group working on Artificial Intelligence at The Goodwill Company Limited, a venture technology and innovation company dedicated to serving the Defense, Electronics, Power and Aerospace industries from its headquarters in Guildford, England, close to Farnborough and Aldershot, the home of the British Army.
The team's main objective is to reduce the number of words needed to convey knowledge while substantially preserving the information content. Currently, the Simplish wizard is able to translate text drawn from a vocabulary of more than 100,000 words into a representation that uses fewer than 1,000 words.
By making this tool available to the general public through the site, we provide a useful service that our team is very happy to offer to those whose mother tongue is not English, particularly in the global scientific community, as well as to those who want to use it for AI or Big Data, for instance.
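The reduction described above can be pictured with a minimal sketch. It is not the Simplish wizard itself: the core word list, the rewrite table and the function name `simplify` are all hypothetical and chosen only for illustration. The sketch assumes that each word outside the core vocabulary is replaced by a short phrase made of core words, and that words with no known rewrite are reported rather than dropped.

```python
import re

# Hypothetical toy data: a tiny "core" vocabulary and a few rewrite rules.
# A real system would hold roughly 1,000 core words and a far larger table.
CORE_WORDS = {"a", "the", "is", "very", "big", "make", "use", "of", "dog"}
REWRITES = {
    "enormous": "very big",
    "utilize": "make use of",
    "canine": "dog",
}

def simplify(text):
    """Rewrite text so that, where possible, it uses only core words.

    Returns the rewritten text and a list of words that had no rewrite.
    """
    unknown = []

    def rewrite(match):
        token = match.group(0)
        word = token.lower()
        if word in CORE_WORDS:
            return token            # already in the reduced vocabulary
        if word in REWRITES:
            return REWRITES[word]   # swap in the core-word phrase
        unknown.append(word)        # no rule available: keep and report
        return token

    # Only alphabetic runs are treated as words; punctuation is left as-is.
    return re.sub(r"[A-Za-z]+", rewrite, text), unknown

simplified, unmapped = simplify("The canine is enormous.")
```

A simple lookup table like this ignores word sense and grammar, which is exactly where the ambiguity problem discussed later arises.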
Reduced-vocabulary representation From 100,000 to 1,000 words
Possibly the most important aspect of cognition has to do with memory. We know that the processes of acquisition, storage and retrieval of knowledge lie at the heart of human cognition. Furthermore, it has long been known that organization and memorization are inseparable and that memory is aided by meaning. Therefore, working out a way to establish the meaning of words in an artificial cognition system helps organization and is a crucial step in developing these systems. One way to achieve this objective is to assign meaning in terms of the other words in a vocabulary. However, attempting to relate the 100,000 words of Standard English to each other is an almost impossible task.
On the other hand, relating the core 1,000 words of Basic English to each other has already been done by another team within our group using multivariate methods, which also provide a means to create a multi-dimensional graphical representation of a sentence, in a space pre-conditioned by the inter-relationship of all words in this reduced lexicon.
In a multidimensional space representation of this reduced-vocabulary language, the central low-dimensionality points describe words and their relationships, while increasingly complex phrases are represented by higher-dimensionality points or trajectories (that look very much like Chinese symbols or ideograms). Core relationships have been derived from a single interchangeable individual and then refined. This is unlike the majority of current efforts, which rely on a large corpus of text and have the disadvantage that ambiguity is a major problem.
Ambiguity The main challenge
A descriptive phrase may either be represented by a complex word (i.e. one that is not in the Basic English lexicon) or represent an abstract concept. In this way, our method is able to cope with both data-driven requirements and the concept-driven information needed for problem-solving. This representation is conceptually similar to Chinese writing with ideograms, with the added advantage that the source data can be extended both to other languages and to other alphabets. In this pre-conditioned space, two phrases that use different words to convey a similar meaning will be represented by similar ideograms, which can be compared to existing data, analyzed for the concepts being searched for and/or extrapolated.
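The idea that phrases with similar meaning land close together can be illustrated with a small sketch. Everything in it is an assumption made for the example: the three-dimensional coordinates in `WORD_POINTS` are invented, a phrase is reduced to the average of its word points rather than to a full trajectory or ideogram, and cosine similarity stands in for whatever comparison the actual system uses.

```python
import math

# Hypothetical coordinates for a few core words. In the real system these
# would come from the inter-relationships of the whole reduced lexicon.
WORD_POINTS = {
    "dog": (0.9, 0.1, 0.0),
    "animal": (0.8, 0.2, 0.1),
    "big": (0.1, 0.9, 0.0),
    "great": (0.2, 0.8, 0.1),
    "water": (0.0, 0.1, 0.9),
}

def phrase_vector(words):
    """Average the points of the known words (a simple stand-in for an ideogram)."""
    points = [WORD_POINTS[w] for w in words if w in WORD_POINTS]
    if not points:
        raise ValueError("no known words in phrase")
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; values near 1 mean 'similar'."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

big_dog = phrase_vector(["big", "dog"])
great_animal = phrase_vector(["great", "animal"])
water = phrase_vector(["water"])

# Different words, similar meaning: expected to score higher than an
# unrelated phrase under these invented coordinates.
close = cosine_similarity(big_dog, great_animal)
far = cosine_similarity(big_dog, water)
```

Averaging discards word order, so a trajectory-based representation like the one described above would carry more information than this sketch does.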
The applications Cognitive systems and Simplish
Once a reduced-vocabulary representation of a text is produced, it is substantially easier both for a foreign reader to assimilate it and for developers to devise artificial cognition systems that process this knowledge. Cognitive systems are natural or artificial information processing systems, including those responsible for perception, learning, reasoning, decision-making, communication and action. Many of the new application opportunities lie at the interface between life sciences, social sciences, engineering and physical sciences, such as bioinformatics, data mining, the semantic web, and human-machine interaction for command and control of autonomous vehicles and robots.
Another service we are currently working on and expect to be able to offer soon is a semantic search engine, based on the Simplish reduced-vocabulary wizard and a meaning-assignment module. Our group is concentrating on the following research areas, in which we are keen to make contact with potential academic collaboration partners, particularly in the EU:
Multidimensional intelligent agents – computing started as sequential processing, then moved on to arrays, distributed nodes, parallel processing and, lately, intelligent agents in distributed networks. We are interested in extending these to multiple dimensions to develop analogues for memory executives, environment updating, basic function monitoring, low-level assistants and deep processing for long-term memory representation, among others.
Learning strategies for the cognition model – a fundamental aspect of our strategy lies in generating initial schemata of concept associations, which removes the ambiguity associated with corpus-based schemes. However, to be generally useful, the model must be able to evolve through learning and recalculation of the initial spatial relationships. The developer can choose material at will, thereby biasing this learning process in the desired direction.
Expert systems logic engine – although the memory acquisition-storage-retrieval process is key, a method like ours, which replaces the pattern-matching paradigm with a concept-matching approach, requires an efficient algorithm for associating concepts and making logical deductions in this modified conceptual space. Our initial ideas are oriented towards extending existing algorithms that use a combination of neural nets and fuzzy logic.
Code execution in multidimensional memory spaces – this is a heavily mathematical area with considerable emphasis on computer system architecture, aimed at developing code capable of running within the multi-dimensional space itself, so that in the long term multidimensional lattices could be produced in which the cognition system runs without the aid of an outside processor, as in the case of a human brain.
Loebner Prize competition – This is a yearly competition for conversational agents aimed at passing the Turing test. The intention is to supplement a standard conversational engine with our concept-matching module and a small knowledge base, and enter this competition.