Role: Contextualised Synonyms and Similarity for Natural Language querying of databases
CountOpen is a six-person startup based in London.
Their first goal is to level the playing field in the understanding of data.
They see the value that some people derive from interrogating data to provide a basis for their own fact-based decisions, but they also see the barrier of expertise put up by overly complex and technical products, which denies others access to that value. They want to break this cycle by making software which, through innovative technology and design, allows everyone to interact with data, whatever their ability.
They see natural language as playing a key part: it removes the need for users to gain the expertise to translate their intent into some technical pseudo-language (SQL, or a series of menu selections and shortcuts, for example), instead allowing them to express it as they would to themselves or a colleague.
Contextualised Synonyms and Similarity for Natural Language querying of databases
Getting a computer to provide synonyms for a given word, or to assess the similarity of a pair of words, is a difficult and unsolved problem. Part of the difficulty lies in providing a good metric of distance between a pair of words (many metrics exist, based on semantic networks, word embeddings and the like), and part lies in trying to replicate the human sense of context.
For example, for the word ‘turnout’: for data on footfall in a venue, ‘attendance’ is a good synonym; for data on road maintenance, ‘lay-by’ is; and what about in the context of a ballet venue, where ‘turnout’ is also a ballet technique?
You will work on the provision of context-dependent synonyms as part of real-time autocomplete on a person’s natural language input when querying a database.
This will involve work on the distance metric, forming a contextualised distance metric and then assessing the goodness of this metric (and determining how to assess its goodness) in both focused tests and in the integrated product.
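To give a flavour of what a contextualised metric might look like, here is a minimal sketch in Python. It is an illustration only, not CountOpen's method: the three-dimensional word vectors are hand-made toy values (not real embeddings), and the names `contextual_similarity`, `rank_synonyms`, `top1_accuracy` and the blending weight `alpha` are all hypothetical. The idea shown is to blend the plain similarity between a word and a candidate synonym with the similarity between that candidate and the words describing the data.

```python
import math

# Toy, hand-made vectors (assumption: real work would use learned embeddings
# or a semantic network). Axes loosely mean: events, roads, dance.
EMB = {
    "turnout": (0.6, 0.6, 0.5),
    "attendance": (0.9, 0.1, 0.1),
    "lay-by": (0.1, 0.9, 0.0),
    "footfall": (1.0, 0.0, 0.0),
    "venue": (0.9, 0.0, 0.2),
    "road": (0.0, 1.0, 0.0),
    "maintenance": (0.1, 0.9, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def context_vector(context_words):
    """Average the vectors of the words describing the data set."""
    vectors = [EMB[w] for w in context_words]
    return tuple(sum(axis) / len(vectors) for axis in zip(*vectors))


def contextual_similarity(word, candidate, context_words, alpha=0.5):
    """Blend word-to-candidate similarity with candidate-to-context similarity.

    alpha is a hypothetical tuning weight: 1.0 ignores context entirely.
    """
    base = cosine(EMB[word], EMB[candidate])
    in_context = cosine(EMB[candidate], context_vector(context_words))
    return alpha * base + (1 - alpha) * in_context


def rank_synonyms(word, candidates, context_words, alpha=0.5):
    """Return (candidate, score) pairs, best first."""
    scored = [
        (c, contextual_similarity(word, c, context_words, alpha))
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def top1_accuracy(cases, candidates, alpha=0.5):
    """Fraction of labelled (word, context, expected) cases ranked correctly."""
    hits = sum(
        rank_synonyms(word, candidates, context, alpha)[0][0] == expected
        for word, context, expected in cases
    )
    return hits / len(cases)


CANDIDATES = ["attendance", "lay-by"]
CASES = [
    ("turnout", ["footfall", "venue"], "attendance"),
    ("turnout", ["road", "maintenance"], "lay-by"),
]

if __name__ == "__main__":
    for word, context, _ in CASES:
        print(context, rank_synonyms(word, CANDIDATES, context))
    print("top-1 accuracy:", top1_accuracy(CASES, CANDIDATES))
```

With these toy values, the same word ‘turnout’ ranks ‘attendance’ first for the footfall data and ‘lay-by’ first for the road maintenance data. The `top1_accuracy` helper hints at one simple way to assess goodness in a focused test; deciding what a good assessment really looks like is part of the role.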
You will also get to see what it is like working in a tech start-up, from design, testing and commercial discussions to the day-to-day work of the development team.
How to apply: