Rita Nunes Posted April 22, 2024

We understand the importance of data security, and to ensure the utmost protection of your information, AllAi has implemented a comprehensive set of security measures. To give you a clear understanding of our commitment to data security, we have compiled a list of frequently asked questions that address the concerns you may have.

➡️ Q: Where does Customer Data reside? Who has access to the data? What security controls are applied?

Answer: Data about your customers, which resides in Salesforce, remains in Salesforce and is never transferred to or through the LLM. Only metadata, such as code, object structures, and similar elements, is stored.

For metadata stored in a code repository (GitHub), standard access restrictions apply: only the team members allocated to the project have access to it.

For metadata stored in a private embedding index (Milvus), "private" means the index is accessible solely by the subscriber who created it. The subscriber is protected through Auth0 for both creation and retrieval, and retrieval is safeguarded before the data reaches the LLM.

As for the data processor, the relevant segments of metadata, combined with the developer's prompt, are sent directly to one of two OpenAI endpoints (/v1/chat/completions or /v1/completions), both of which adhere to OpenAI's zero-retention policies.

➡️ Q: Are user requests and AI answers or completions stored and shared?

Answer: Customer queries and the associated outputs are considered confidential information, and OSF agrees to protect such information under our agreement. We securely store user requests and responses to provide users with access to their conversation history in certain services, such as Chat. The queries entered by users and the answers provided by AllAi are not used by OSF to further train the AI, except for the customer's benefit when agreed upon, nor will they be shared or become public information.
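The request assembly described above, in which relevant metadata segments are combined with the developer's prompt before being sent to an OpenAI endpoint, can be illustrated with a minimal sketch. The function name, the system-message framing, and the model identifier are assumptions for illustration, not AllAi's actual implementation.

```python
import json

def build_completion_payload(metadata_segments, developer_prompt, model="gpt-4"):
    """Combine retrieved metadata (code, object structures) with the
    developer's prompt into a /v1/chat/completions request body.
    The model name and message framing here are illustrative assumptions."""
    context = "\n\n".join(metadata_segments)
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": f"Relevant project metadata:\n{context}"},
            {"role": "user", "content": developer_prompt},
        ],
    }

# Hypothetical metadata segment and prompt for demonstration only.
payload = build_completion_payload(
    ["class Order { id: String; items: List<Item>; }"],
    "Generate a unit test for the Order class.",
)
print(json.dumps(payload, indent=2))
```

Note that only metadata and the prompt appear in the payload; no Salesforce customer records are involved at any point.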
Note that if the user utilizes the code indexing feature (available in the Pro and Enterprise plans), we host the indexed data. The servers hosting this data can only be accessed through the application's IPs within a restricted Virtual Private Cloud (VPC). Even for this feature, where we host the indexed data, we do not use it to train the AI, and it is not shared.

➡️ Q: What security controls are in place to protect data and any IP imported into, or processed by, AllAi?

Answer: "Private" embeddings are secured via Auth0 authentication. We ensure secure user authentication through integration with Auth0, an identity management system.

We apply zero-retention policies to the LLM itself. As an enterprise customer of OpenAI, our agreement prohibits OpenAI from using inputs or outputs for any other purposes, including training their AI. We adhere to the policy outlined in OpenAI's Enterprise privacy commitments, and we have explicitly opted out of allowing API data to be used for training or improving OpenAI's models.

We have implemented a protected data retrieval system to ensure the verification of private customer data within the workspace. We categorize information by sensitivity into two main types: public data, such as Salesforce documentation and GitHub repositories, and company-specific data. Public data sources are accessible to everyone, whereas company-specific data is private and limited to that company's database. When a customer decides to leave, we delete their database after a period defined in the terms of use.

We use encryption to secure data and make it unreadable to unauthorized users. Our database is encrypted using industry standards (we use Atlas DB for managing databases), and prompts are protected in transit with TLS over HTTPS. Additionally, we offer a toxicity detection system that can be tailored to specific use cases; by default, we use OpenAI's abuse monitoring system.
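The per-subscriber scoping of the private embedding index described above can be sketched with a simple in-memory model. This is an illustration of the access rule, not AllAi's code: the class and method names are assumptions, and in production the same restriction would be enforced by Auth0-backed authentication in front of the Milvus index.

```python
class PrivateIndex:
    """Toy model of a per-subscriber private index: retrieval is scoped
    to the authenticated subscriber's own documents, so one subscriber
    can never search another subscriber's data."""

    def __init__(self):
        self._docs = {}  # subscriber_id -> list of indexed documents

    def index(self, subscriber_id, document):
        self._docs.setdefault(subscriber_id, []).append(document)

    def retrieve(self, subscriber_id, query):
        # Only the caller's own documents are searched; other
        # subscribers' partitions are never touched.
        own_docs = self._docs.get(subscriber_id, [])
        return [d for d in own_docs if query.lower() in d.lower()]

index = PrivateIndex()
index.index("acme", "OrderService handles checkout flow")
index.index("globex", "PaymentGateway wraps Stripe API")

print(index.retrieve("acme", "checkout"))    # finds acme's own document
print(index.retrieve("globex", "checkout"))  # empty: no cross-subscriber leakage
```

The key property is that the subscriber identity is a mandatory input to every retrieval, so isolation holds by construction rather than by convention.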
➡️ Q: What specific data privacy laws are covered by the security implementations within the AllAi Productivity Platform, and how are these applied and fulfilled (in a partner model)?

Answer: OSF has implemented compliance controls so that our teams meet their compliance obligations and adhere to industry regulations, depending on the nature of the product and the data it handles. (PO-OSF-09_A4_Security Statements for OSF IT Systems processing PII.docx)

We have abbreviated technical and privacy documentation for our customers, detailing our compliance with GDPR, and guides to help enable secure and compliant use of our products and services. Company emails are stored by OSF middleware for the duration of the usage contract to collect behavioral analytics; after contract termination, OSF can delete these emails upon customer request.

➡️ Q: How are developers/team members trained in security best practices when using AllAi capabilities in their work? How is malicious use prevented?

Answer: The same guidelines that typically apply to maintaining intellectual property in a code repository (GitHub) are enforced, ranging from code isolation between teams to best practices for not storing credentials and secrets within the code.

There is a human in the loop when code is produced, which reduces the risk of unsupervised malicious code generation. The standard delivery process includes a code review by a senior developer, thereby reducing the risk of unintentionally deploying malicious code. We also maintain internal policies to assess outputs for accuracy, bias, appropriateness, and usefulness before relying on and utilizing them. (MC-OSF-03_P08_AI Code of Conduct.docx)

➡️ Q: Do other OSF customers have access to the customer code/data touched by AllAi?

Answer: The codebase is not stored in the shared LLM; rather, it is kept in a "private" knowledge base. Access to these knowledge bases is restricted to the team assigned to the project.
The retrieval of the codebase occurs before it reaches the LLM for generation. We have some default data sources (Salesforce Docs, B2C Commerce, etc.) that are shared with everyone, but custom data that is specific to a company remains private and scoped exclusively to that company. Additionally, only a single "private" knowledge base can be active at any given time, which prevents developers assigned to multiple projects from unintentionally merging information during generation.

Queries that users enter and the responses provided by AllAi are not used by OSF to further train the AI, nor will they be shared or become public information. There are some initiatives, agreed upon with the customer, that use their code to improve AllAi's specialization, but this is done through a separate agreement and is not associated with the AllAi productivity enhancements.

➡️ Q: How does OSF protect against a threat actor in the context of AllAi usage on the customer side?

Answer: The main and most effective safeguard remains the human in the loop and the traditional deployment processes requiring code review. A malicious attempt to modify or inject into the generated codebase would require bypassing the vigilance of the developer (human in the loop) and the code review process. Malicious actors with access to the LLM still do not have access to the codebase, as it is stored in a "private" knowledge base. A malicious attempt to reverse-engineer the codebase would require access to the embedding index (Milvus), which is shielded by single-user authentication. There is also per-client authentication: retrieval operations from RAG are restricted to the content indexed by the user, preventing information leakage. Retrieval happens before the data ever reaches the LLM, making it less subject to typical injections. For protection against prompt injection, we primarily rely on OpenAI's security measures.
Nevertheless, it is important to acknowledge that there is no method to completely safeguard against potential prompt injections in certain services, such as Chat. Non-chat applications, such as inline code completion and DevOps, are significantly less susceptible to these risks, because users cannot iteratively prompt the system, which effectively reduces the likelihood of most prompt-attack strategies.

➡️ Q: Do you have a SOC 2 certification for the AllAi Productivity Platform?

Answer: SOC 2 certification is in progress but has not yet been completed.
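The rule mentioned earlier, that only a single "private" knowledge base can be active at any given time, can be sketched as a minimal session model. The class and variable names are assumptions for illustration, not AllAi's implementation; the point is that activating one knowledge base implicitly deactivates the previous one, so retrieval can never mix two projects' private data.

```python
class WorkspaceSession:
    """Toy model of the one-active-private-knowledge-base rule."""

    def __init__(self, knowledge_bases):
        self._kbs = knowledge_bases   # kb_name -> list of documents
        self._active = None           # at most one private KB is active

    def activate(self, kb_name):
        if kb_name not in self._kbs:
            raise KeyError(f"unknown knowledge base: {kb_name}")
        self._active = kb_name        # implicitly deactivates the previous KB

    def retrieve(self, query):
        # Retrieval only ever sees the single active knowledge base.
        if self._active is None:
            return []
        docs = self._kbs[self._active]
        return [d for d in docs if query.lower() in d.lower()]

session = WorkspaceSession({
    "project-a": ["Project A uses a custom cart model"],
    "project-b": ["Project B integrates a loyalty API"],
})
session.activate("project-a")
print(session.retrieve("cart"))   # only project-a data is searchable
session.activate("project-b")     # project-a is no longer searchable
print(session.retrieve("cart"))   # empty: no cross-project merging
```

A developer assigned to both projects therefore cannot accidentally blend the two codebases in a single generation, because the retrieval step never has access to more than one private knowledge base at once.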