RESEARCH

Our areas of research and product development are focused on the following hard problems faced by organizations today.

TRAINING WITH LESS DATA

While organizations today have large amounts of data, their datasets tend to be noisy, incomplete, and imbalanced. This results in data scientists and engineers spending most of their time pre-processing, cleaning, and featurizing the data. These efforts are often insufficient, and deep learning techniques routinely fail on sparse datasets. Organizations are then forced to use classical machine learning techniques that require enormous amounts of manual feature engineering. At RealityEngines.AI, we are pursuing the following three research areas that will enable training on less data.

Generative Models for Dataset Augmentation - Dataset augmentation is a technique for synthetically expanding a training dataset by applying a wide array of domain-specific transformations. It is particularly useful for small datasets, and it has even been shown to be effective on large datasets like ImageNet. While it is a standard tool in training supervised deep learning models, it requires extensive domain knowledge, and the transformations must be designed and tested carefully. Over the last two years, Generative Adversarial Networks (GANs) have been used successfully for dataset augmentation in various domains, including computer vision, anomaly detection, and forecasting. GANs make dataset augmentation possible even with little or no domain-specific knowledge. Fundamentally, a GAN learns to generate synthetic data that is indistinguishable from the original data. However, there are practical issues with using GANs, and training a GAN is notoriously difficult. GANs are an active area of research, and several newer variants, including Wasserstein GANs and MMD GANs, address some of these issues. Recently, there has also been some work on domain-agnostic GAN implementations for dataset augmentation. At RealityEngines.AI, we will implement and innovate on state-of-the-art GAN algorithms that can perform well on noisy and incomplete datasets.
Combining Neural Nets with Logic Rules/Specifications - Human cognition suggests that people learn not only from concrete examples (as deep neural nets do), but also from different forms of general knowledge and rich experience. In fact, most enterprise systems today are rule-based systems in which experts have encoded rules based on tribal knowledge from their domain. ML models that are built to replace these rule-based systems often struggle to beat them on accuracy, especially when data is sparse. At RealityEngines.AI, we want to preserve expert knowledge by developing hybrid systems that combine logic rules with neural networks. While there is some recent research in this area, including a recent paper by DeepMind, the groundwork for general-purpose, constraint-driven AI is still being laid. Most research papers do not address building these hybrid models at scale or incorporating multiple rules into the models. RealityEngines.AI is working on a service that allows developers and data scientists to specify multiple knowledge rules, in addition to training data, to develop accurate models. One example might be a rule that "dog owners tend to like buying dog toys" in a recommender system, or a constraint that a learned dynamic system must be consistent with the laws of physics.
Transfer Learning - Transfer learning is a machine learning technique that allows us to reuse models or policies learned on one domain or dataset on a related domain or dataset. By using transfer learning, we will enable organizations to train models in a simulated environment and apply the models in the real world. State-of-the-art language and vision modeling techniques typically pre-train on a large dataset, and then use either fine-tuning or transfer learning to train a custom model on the target dataset. RealityEngines.AI will offer a cloud service that packages and extends the state-of-the-art transfer learning techniques that have proven to result in the most performant models. As part of our service, we will package pre-trained language and vision models and make it possible to easily fine-tune those models or apply transfer learning to adapt them to a custom task.
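
The pre-train-then-adapt workflow described above can be illustrated with a minimal sketch. It assumes a tiny frozen "backbone" whose weights stand in for a model pre-trained elsewhere, and it trains only a small logistic head on a handful of target examples. The names BACKBONE_W, backbone, train_head, and the toy dataset are hypothetical and chosen only for illustration; this is not a description of the RealityEngines.AI service.

```python
import math

# Hypothetical "pre-trained" backbone: fixed weights standing in for a model
# trained on a large source dataset. The values are illustrative only.
BACKBONE_W = [[1.0, -1.0], [0.5, 0.5]]


def backbone(x):
    """Frozen feature extractor: one tanh layer whose weights are never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in BACKBONE_W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=500, lr=0.5):
    """Fit a logistic-regression head on top of the frozen backbone features."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            features = backbone(x)
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, features)) + b)
            grad = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * grad * fi for wi, fi in zip(w, features)]
            b -= lr * grad
    return w, b


def predict(x, w, b):
    features = backbone(x)
    return 1 if sigmoid(sum(wi * fi for wi, fi in zip(w, features)) + b) >= 0.5 else 0


# Tiny target dataset (assumed for the sketch): label is 1 when x[0] > x[1].
target_data = [([2.0, 0.0], 1), ([1.0, -1.0], 1), ([0.0, 2.0], 0), ([-1.0, 1.0], 0)]
head_w, head_b = train_head(target_data)
predictions = [predict(x, head_w, head_b) for x, _ in target_data]
```

Only head_w and head_b change during training, which is why four labeled examples are enough here. Fine-tuning would additionally update the backbone weights, usually with a smaller learning rate.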

AI-Assisted ML

Deep learning has seen great success across a wide variety of domains. The best neural architectures are often carefully constructed by seasoned deep learning experts in each domain. For example, years of experimentation have shown how to arrange bidirectional transformers to work well for language tasks and dilated separable convolutions for image tasks. Neural architecture search (NAS) is a rapidly developing area of research in which the process of choosing the best architecture is automated. Reinforcement learning produced one of the field’s first successes, and more recently continuous optimization and weight-sharing techniques have made the computational cost of NAS tractable. At RealityEngines.AI, we will use NAS both to fine-tune proven deep network paradigms and to learn novel architectures for new domains.
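
To make the idea of automated architecture selection concrete, here is a minimal sketch of random search over a small search space. Random search is swapped in as the simplest baseline; it is not the reinforcement learning, continuous optimization, or weight-sharing methods mentioned above. The search space, the names sample_architecture, evaluate, and random_search, and the scoring formula are all hypothetical. In a real system, evaluate would train each candidate and return its validation accuracy.

```python
import random

# Hypothetical search space; the option values are illustrative only.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "hidden_units": [64, 128, 256],
    "activation": ["relu", "tanh"],
}


def sample_architecture(rng):
    """Draw one candidate by picking an option for every design choice."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def evaluate(arch):
    """Stand-in for training the candidate and measuring validation accuracy.

    This toy formula only makes the sketch runnable and carries no
    empirical meaning.
    """
    score = 0.70
    score += 0.02 * arch["num_layers"] - 0.002 * arch["num_layers"] ** 2
    score += 0.0002 * arch["hidden_units"]
    score += 0.01 if arch["activation"] == "relu" else 0.0
    return score


def random_search(num_trials, seed=0):
    """Evaluate randomly sampled candidates and keep the best one."""
    rng = random.Random(seed)
    history = []
    for _ in range(num_trials):
        arch = sample_architecture(rng)
        history.append((evaluate(arch), arch))
    best_score, best_arch = max(history, key=lambda item: item[0])
    return best_arch, best_score, history


best_arch, best_score, history = random_search(num_trials=20)
```

More sample-efficient NAS methods replace the uniform sampling in sample_architecture with a learned controller or a differentiable relaxation, but the outer loop of propose, evaluate, and keep the best has the same shape.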

At RealityEngines.AI, our goal is to empower data scientists and developers to create custom production-grade models in days, not months. We are actively pursuing research in compute-efficient neural architecture search (NAS) techniques, automated feature engineering, and unsupervised learning. Our AI-powered tools increase the productivity of data scientists and developers tenfold and help them focus on the custom aspects of any particular task.

Explainability in Neural Nets

Business analysts and subject matter experts within organizations are often frustrated when dealing with deep learning models. These models can appear to be black boxes that generate predictions that humans can’t explain. Over the last two years, there has been considerable research on explainability in AI. LIME measures the responsiveness of a model's outputs to perturbations in its inputs, while Google has introduced Testing with Concept Activation Vectors (TCAV), a technique that may be used to generate insights and hypotheses. Google Brain’s scientists explored attribution of predictions to input features in their 2017 paper, Axiomatic Attribution for Deep Networks. Our efforts in this area will build on these techniques to create a cloud service that will explain model predictions and determine if models exhibit bias.
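
The attribution paper mentioned above proposes integrated gradients, which attributes a prediction to each input feature by accumulating gradients along a straight path from a baseline input to the actual input. The sketch below applies it to a small logistic model. The weights, the input, the all-zeros baseline, and the function names are hypothetical and chosen only to keep the example self-contained.

```python
import math

# Hypothetical model for illustration: logistic regression with fixed weights.
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = -0.5


def model(x):
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x):
    """Analytic gradient of the logistic model with respect to its inputs."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Approximate the path integral of gradients with a midpoint Riemann sum."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            totals[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]  # assumed "no signal" reference input
attributions = integrated_gradients(x, baseline)

# Completeness check: attributions should sum to model(x) - model(baseline).
completeness_gap = abs(sum(attributions) - (model(x) - model(baseline)))
```

The completeness check is one of the axioms the method is designed to satisfy: the attributions add up to the change in the model's output between the baseline and the input, so each value can be read as that feature's share of the prediction.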

Sign up to get an invite and take our science for a spin

We are giving a select few organizations invites to test drive our services and validate our research. If you are working on an interesting ML/AI problem, sign up here and we will invite you to try early versions of our services. We will use this information for no other purpose but to contact you about our service. You may delete your information from our records by contacting us here.

Copyright © 2019 RealityEngines.AI. All Rights Reserved