Y Meadows Technical Overview

Audience

This document is intended for the Technology teams of prospective Y Meadows customers and partners.

Purpose

This document will explain the basics of the Y Meadows product and its technology. We will be focusing on how Y Meadows integrates with customer and partner systems and how Y Meadows keeps our customer data secure.

Introduction

Y Meadows endeavors to deliver a product that integrates with the IT systems of our customers with the least possible amount of time and effort from the customer’s IT department. The nature of our product requires us to handle sensitive data from our customers. This data is often subject to regulatory and compliance requirements. We take the security of our systems and data very seriously. We strive to be able to accommodate any specific requirements that a customer may have.

Certifications

Y Meadows is SOC 2 certified and ISO 27001 certified.

Product Descriptions

Y Meadows is an application that builds Artificial Intelligence (AI) models to determine the intent of text-based communication and then, based on that determination, executes a specific workflow for that intent. Typically, the incoming message is a ticket / issue / case from a service / support desk system, but it can also be an email, a webform submission, or a message in a shared inbox email system. The workflows usually try to either:

a.) resolve the case and reply to the sender or

b.) direct the case to the correct queue / team / person, or

c.) enrich the case with additional information that can help resolve it more efficiently.

The Y Meadows workflow is made up of a series of steps that can execute API or Web UI automations in sequence to accomplish these goals. A workflow can include a human review step where messages are put into a queue for a human to confirm our AI model’s determination before Y Meadows acts on it.

Example

An e-commerce company called ACME Corp may use a Zendesk ticketing system that receives all emails sent to customerservice@acme.com. The most common email that ACME receives is from customers asking about the status of an order. These emails may include examples like:

  • “Where is my order?”

  • “What’s the status of my order?”

  • “When can I expect my shipment to arrive?”

ACME Corp can connect Zendesk to Y Meadows so that Y Meadows can act on every new message. In this case, the new Intent will be called “Order Status Request”. ACME will give Y Meadows examples of previous messages they received that fall under the “Order Status Request” intent. Y Meadows will create an AI model based on those examples. ACME will create a workflow (called a “Journey” in our application) that will take steps for every new message classified as an “Order Status Request”. In this example, the steps would do the following:

  1. Make an API call to the CRM system to look up the user’s ID from their email

  2. Make an API call to another system to get the user’s latest order

  3. Make an API call to a third system to get the status of that order

  4. Apply that data to a template to write a response

  5. Add the response to a queue for human review

  6. If approved, respond to the customer

  7. Close the case
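The journey above can be sketched in Python. This is purely illustrative: the service clients, field names, and function signatures are hypothetical, not Y Meadows APIs, and a real Journey is configured in the application rather than hand-written.

```python
# Hypothetical sketch of the "Order Status Request" journey.
# All clients (crm, orders_api, shipping_api, review_queue) and
# field names are illustrative assumptions, not Y Meadows APIs.

def order_status_journey(message, crm, orders_api, shipping_api, review_queue):
    # 1. Look up the user's ID in the CRM from their email address
    user_id = crm.lookup_user_id(email=message["sender"])

    # 2. Fetch the user's latest order
    latest_order = orders_api.get_latest_order(user_id)

    # 3. Get the shipping status of that order
    status = shipping_api.get_status(latest_order["order_id"])

    # 4. Apply the data to a response template
    reply = (
        f"Hi, your order {latest_order['order_id']} is currently "
        f"{status}. Thanks for your patience!"
    )

    # 5-7. Queue for human review; if approved, reply and close the case
    if review_queue.approve(message, reply):
        message["replies"].append(reply)
        message["state"] = "closed"
    return message
```

In practice each numbered step runs as its own service, and the human review step (5) simply parks the message in a queue until an agent approves or rejects the drafted reply.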

Deployment Options

Generally, Y Meadows is offered as a SaaS product. The application is hosted in our cloud (currently Google Cloud Platform - GCP). However, we can accommodate other deployment options including on-prem deployment.

For on-prem deployments, we will require Linux machines (bare metal or virtual) with CPU, RAM, and disk requirements that we will specify. GPUs are also highly recommended.

On-prem installs increase the cost and complexity of delivering and maintaining the product. However, they may be necessary for companies whose security, privacy, or compliance requirements preclude them from using our cloud product. We are also able to accommodate requests about the nature of our own cloud deployment: if needed, we can deploy to specific clouds (e.g. AWS GovCloud) or to specific geographies (e.g. Europe) to meet regulatory requirements.

Integration Points

Y Meadows integrates with our customers’ IT systems in a couple of ways.

We integrate with the source of messages (e.g. Office 365, Gmail, Front, Salesforce, Zendesk, etc.). We prefer to integrate with the same system that customer service agents use. So, for example, if a customer has an email system that forwards to a ticketing system, we would prefer to integrate with the ticketing system, not the underlying email system.

These integrations differ based on the source system, but typically we need an API key or its equivalent created for us. The API key should be scoped to the minimum access that we need: at a minimum, reading the messages that we are supposed to monitor, editing the ticket (e.g. assigning it, commenting on it, etc.), and replying to it. The exact scope depends on the use cases we implement.

We may also require an integration to be configured to notify us of new messages. For some systems this means configuring a webhook or rule. In other systems, the API access is all that is necessary. For example, with Salesforce, we can use the API to create a PushTopic which will notify us of all new Cases.
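As a sketch of the Salesforce case: a PushTopic is a Salesforce record, created through its REST API, whose SOQL query defines which record changes are streamed out. The payload below uses real PushTopic field names, but the topic name and query are only examples.

```python
import json

def build_case_pushtopic(name="NewCases", api_version="58.0"):
    """Build the JSON body for creating a Salesforce PushTopic that
    streams a notification for every newly created Case.
    The name and query here are illustrative examples."""
    return {
        "Name": name,
        "Query": "SELECT Id, Subject, Description FROM Case",
        "ApiVersion": api_version,
        "NotifyForOperationCreate": True,
        "NotifyForOperationUpdate": False,
    }

# This body would be POSTed to the PushTopic sobject endpoint
# with an OAuth bearer token; we then subscribe to the topic
# over Salesforce's Streaming API.
payload = json.dumps(build_case_pushtopic())
```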

In our workflow system we have steps that effectively mimic the actions that a human agent would have taken. These actions are usually API calls. We can also execute web UI automation. The steps are typically Python scripts - we can execute anything that can be done from a Linux based system. We store credentials securely for these systems (e.g. API keys, tokens, passwords, etc.).

We have pre-written steps for common systems, but we can write custom steps that are tailored for each customer. These steps can talk to proprietary customer specific systems.

In general, these steps have the same level of access as the human who is currently performing them. We always prefer to have limited access restricted to what the step needs to do and no more - under the “least privilege” security principle.

Our steps need to be able to access the 3rd party systems. If the system is reachable on the public Internet, this is simple. However if the system is behind a firewall, we need to be allowed access. We can provide an IP address for whitelisting and we can authenticate using any authentication scheme required (e.g. OAuth 2.0, API Key, Bearer Token, Basic Auth, mTLS, etc.). If the application cannot be made accessible to the public Internet, we have other options available to connect to the system. Please speak to us for more details.
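As a minimal illustration of how a step authenticates an outbound call, here is a dependency-free sketch using a bearer token. The URL and token are hypothetical; real steps use whatever scheme the target system requires (OAuth 2.0, API key, Basic auth, mTLS, etc.).

```python
import urllib.request

def build_authed_request(url, token, scheme="Bearer"):
    """Build an HTTP request for a step's outbound API call using
    token authentication. Illustrative only: the URL, token, and
    scheme are placeholders, and real credentials come from our
    secrets store rather than code."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"{scheme} {token}")
    return req

# Hypothetical target system
req = build_authed_request("https://api.example.com/v1/orders", "s3cr3t")
```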

Data Security and Life Cycle

We have a few different types of customer data that will be discussed here:

  • Messages (i.e. emails, attachments, cases, tickets, etc.)

  • Intermediary Data - Data that steps create during a workflow

  • External System Credentials

Messages are kept in our database by default. If permitted by the customer, messages may also be used to help improve the customer’s AI model.

Messages may optionally be automatically deleted after a customer-specified period of time. For example, our system could delete message data 90 days after the completion of the journey. However, if messages are deleted, we can no longer use those messages to re-train and improve the AI model. Machine learning works best with periodic retraining. We suggest that messages be kept for as long as possible, striking a balance between the AI accuracy and data security.
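The retention check itself is simple; a sketch of the logic (the 90-day figure is just the example above, and the function name is illustrative):

```python
from datetime import datetime, timedelta, timezone

def is_expired(journey_completed_at, retention_days=90, now=None):
    """Return True once a message's customer-specified retention
    window has elapsed after journey completion. 90 days is only
    an example default; customers choose the period."""
    now = now or datetime.now(timezone.utc)
    return now - journey_completed_at > timedelta(days=retention_days)
```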

Intermediary data is the data retrieved by various steps in an automation. One step can produce data that a later step consumes, so the data must be retained until all of the steps are completed. Once the workflow execution is complete, the data can be deleted. Intermediary data is usually deleted shortly after a journey is completed, but may be kept for a longer period for diagnostic purposes if the journey fails. Customers can determine how long those periods are.

Messages and intermediary data are stored in our SQL database. They may also be stored temporarily in other systems like our messaging system and our workflow orchestration system. All of our data stored on disk is encrypted by our cloud provider at the disk level, and all data is encrypted in transit.

As noted above, external system credentials such as API keys and passwords are needed to access messaging systems and the other systems that steps interact with. We take additional precautions to secure this data. It is stored using a 3rd party product called HashiCorp Vault and is encrypted at the application level in addition to disk-level encryption. This data can never be seen via the UI; it is available only to the message retrieval system to get messages and to the specific steps that need it.
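For context, Vault's KV version 2 secrets engine exposes secrets over an HTTP API, with "data" inserted between the mount point and the secret path. The mount and secret names below are hypothetical:

```python
def vault_kv2_read_path(mount, secret_path):
    """Path for reading a secret from HashiCorp Vault's KV v2 engine.
    KV v2 places 'data' between the mount point and the secret path."""
    return f"/v1/{mount}/data/{secret_path}"

# Hypothetical credentials for a CRM step
path = vault_kv2_read_path("secret", "steps/crm-api-key")
```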

Any files accessed during message processing, such as email attachments and downloads, are normally deleted as soon as related automations complete. In the event of errors during execution, files may be retained for up to 30 days to aid in troubleshooting.

General Application Architecture

Our application is made up of a few main components:

  • The Backend - a Java application that is the core of our application. It hosts all of the application APIs and runs several scheduled jobs and event driven tasks.

  • The Frontend - the UI is a single page application (SPA) that is written in JavaScript using React. It is purely API driven. We do not perform any server side HTML rendering. The frontend is served via an Nginx web server.

  • The Workflow Orchestration System - this manages our workflows

  • The Step Services - we use a “serverless” type model for our steps. Each step runs as a Docker container hosting a single HTTP API. The step service is auto-scalable.

  • Database - we use CockroachDB, an open source database largely compatible with Postgres. It is optimized to run and scale well in the cloud.

  • Vault - HashiCorp Vault, used to store secrets like external system credentials.

  • Keycloak - our authentication system. Keycloak manages access to our UI and API. It can support many different authentication schemes, including single sign-on systems that use SAML or OIDC. It can also federate identity with LDAP. Keycloak is a Red Hat supported project.

  • Machine Learning Training System - a Python application that trains new AI models.

  • NLP Inference System - a Python system that hosts the AI model and makes predictions.

All of our applications are containerized with Docker. We use Kubernetes as a container orchestration system.

We use a single tenant architecture. Each customer gets a separate instance of the application including all of its components. This means that each customer's data is in a separate database from all of the other customers.

Authentication

Access to the UI and APIs is controlled using a role-based access control (RBAC) system. We use an open source system from Red Hat (IBM) called Keycloak. This allows us to support many different authentication schemes like username and password or single sign-on. We can integrate with an LDAP system or anything that supports SAML or OIDC. For example, we can use Okta or Google Workspace for logins. If you choose username and password authentication, you can customize the password policies to meet your needs.

Each of the end users can be assigned a role in Keycloak which will determine their access level in the application.

Internally, authentication works by having the user perform an OIDC login with Keycloak. Keycloak grants the user a JWT (JSON Web Token) signed by Keycloak. Our backend APIs validate the token and read it to determine the user’s access level. We use Spring Security in our Java application to control access.
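The shape of that token validation can be sketched without any dependencies. Note the hedge: Keycloak normally signs tokens with asymmetric RS256 keys (verified against its published public key); the HS256 variant below is used only to keep the sketch self-contained, and the claims are illustrative.

```python
import base64, hashlib, hmac, json

def _b64url_decode(s):
    """Decode base64url, restoring the stripped '=' padding."""
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def verify_hs256(token, secret):
    """Validate a JWT's signature and return its claims.
    Keycloak typically uses RS256 (asymmetric) signing; HS256 is
    used here only so the sketch needs no external libraries."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(payload_b64))
```

In our stack this check happens inside Spring Security: the backend verifies the Keycloak signature, then reads role claims from the payload to decide what the user may do.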

Operational Security

Security is very important to us and we strive to follow all industry best practices and to constantly improve our security.

Our application is hosted in Google Kubernetes Engine. This means we have a shared security model where Google is responsible for infrastructure security and we are responsible for managing workloads, securing data in transit and at rest, developing secure applications and managing access and roles. Google maintains the Operating Systems on all of our Kubernetes nodes (we use Ubuntu).

Our cluster nodes use only Internal IPs, which are not accessible to the public Internet. Our internal systems can make outbound calls to the Internet, but inbound traffic cannot reach them directly. Our application opens an outbound tunnel using a product called Cloudflare Tunnel. Cloudflare provides our DNS, WAF, and CDN services. Inbound Internet traffic flows into the cluster only through Cloudflare via the tunnel. Cloudflare provides DDoS protection and its firewall prevents malicious traffic.

For our operators, access is restricted based on IP whitelisting and Google authentication. We use mandatory MFA for all of our employees where the 2nd factor is a hardware security key.

Cloudflare ensures all of the traffic flowing to our cluster on the public Internet is encrypted. SSL Labs rates our HTTPS setup as A grade.

We undergo a penetration test annually.

Application Security

All application code is peer reviewed. We design new features with security in mind at all points during the development and deployment process. Our engineers look for security issues in every code change.

Code is scanned with SAST and IAST scanners looking for potential vulnerabilities.

GitHub’s Dependabot notifies us of dependency updates for our open source dependencies and specifically calls out security updates.

We scan all of our Docker images at build time and re-scan them daily. If a vulnerability is detected, we remediate the issue within the timeframe described in our patch management policy, based on the severity of the vulnerability.

Our code is maintained in GitHub for source control. We require MFA to be enabled for all of our GitHub accounts. We use GPG commit signing to ensure that code changes can only be made by our engineers.

We use CSP and other security HTTP headers to enforce the latest browser security features.
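As an illustration, the kind of headers we mean (example values only, not our production policy):

```python
# Illustrative security response headers; the values shown are
# examples, not Y Meadows' actual header policy.
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
}
```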

Corporate Security

All Y Meadows employees undergo security training. This includes anti-phishing and password hygiene training.

Most of our systems use Google Workspace as a single sign-on provider. We require high quality passwords, the use of password managers, and MFA everywhere possible. Every employee is issued a physical security key (Yubikey or Google Titan) to use as the 2nd factor.

We evaluate the security practices of all of our vendors as outlined in our SOC 2 and ISO 27001 certified Third Party Management Policy.
