Bosch Case Study
MarbleRun Enables Bosch’s Confidential AI Pipeline in the Public Cloud
Automotive, Machine Learning
Bosch is a global technology and engineering company with over 420,000 employees and more than €80bn in annual revenue. One of the major areas of focus is the development of automated driving systems.
Bosch engineers trained an AI model on video footage filmed with a front-facing car camera to improve their automated driving capabilities. As they wanted to leverage public cloud services but still needed to abide by GDPR, they decided to keep the data encrypted at all times using confidential computing technology (Intel SGX).
Bosch selected MarbleRun to scale and verify the confidential-computing-based AI pipeline. MarbleRun helped specifically with:
- ensuring a simpler and better verification process,
- providing a secure service-to-service communication layer to enable independent scaling of pipeline steps, as necessary, and
- defining future access policies to guarantee the secure access needed for Day-2 operations, such as maintaining and updating the pipeline.
Bosch can now restrict access to the raw personally identifiable information (PII) and prove to others, for example regulators, that none of its employees (nor anybody else) can access it, further abiding by GDPR requirements.
Trusted execution environments play a crucial role in implementing future-proof data strategies. In our collaboration with Edgeless Systems, their versatile tools turned out to be the missing link between confidential computing and scalable infrastructure.
Bosch is a global technology and engineering company, with a concentration on automotive and industrial applications, consumer products, as well as energy and construction solutions. They employ over 420,000 people and have over €80 billion in annual revenue. A significant part of their work involves pioneering automated driving systems via the Bosch mobility sector.
To improve their advanced driving assistance systems (ADAS), Bosch managed a project to collect street data. They filmed streets with a front-facing car camera, labelled the video, and used it to train their neural network.
When processing publicly collected video data, Bosch needed to distinguish how they worked with two kinds of information: personally identifiable information (PII), such as faces and license plates, and non-PII data. Both types of data were passed between many tools in a complex AI pipeline, and Bosch needed dedicated technology to keep this process secure and the shared information confidential, and to adhere to stringent European GDPR requirements.
GDPR greatly limits the use of personal data. Bosch engineers knew that this highly useful data could also be a liability if improperly stored and processed. In the past, keeping to the strict privacy requirements would have required the information to be handled ‘on prem’, potentially air gapped from intruders. However, the cost of such infrastructure for each individual project would have been exorbitant, and thus not realistic over the long term.
Bosch’s security architects understood that confidential computing facilitates compliant, ethical, and cost-effective processing of the data, even on Microsoft Azure. However, the implementation of a scalable and truly confidential AI pipeline was challenging.
While individual services throughout this pipeline were confidential, the transmission of information between them in a secure way was not straightforward. Several challenges needed to be addressed to ensure privacy throughout.
Ensuring the confidentiality, integrity, and authenticity of the pipeline is crucial for all stakeholders involved. Third parties, such as regulators, may also need to verify the pipeline as a whole.
The verification process should cover the entire deployment rather than individual services and should be applicable at any given time, including scaling events.
Secure service-to-service communication
The distributed nature of the pipeline necessitates secure inter-service communication through encryption and authentication.
Implementing these measures at the application layer would blur the separation of concerns and require modifications throughout the application.
To address this challenge, a service mesh architecture can be employed.
The confidentiality of the pipeline requires that the encryption and authentication provided by the service mesh can be verified during remote attestation.
The pipeline should not be treated as a "fire and forget" system but rather as a cloud-native application that requires ongoing maintenance. This includes activities such as updates, access management, configuration parameter changes, secrets management, and migration.
All of these operations need to be properly authenticated and verifiable. Additionally, the verification process of the system should determine which operations can be performed and by whom.
Through a stack composed of TensorFlow, KubeFlow, Gramine, MarbleRun and Intel SGX on Azure cloud infrastructure, Bosch engineers were able to create an end-to-end confidential AI pipeline.
MarbleRun played a key role in resolving all three challenges.
MarbleRun simplifies the verification process by shifting it from individual trusted execution environments (TEEs) to entire pipeline deployments.
It achieves this through a deployment manifest that outlines the identity of each service, their metadata, the desired connections between them, and role-based access controls.
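To make the idea of deployment-wide verification more concrete, here is a minimal sketch in Python. It is a toy model, not MarbleRun's manifest format or API: the `MANIFEST` layout, the `Instance` type, the measurement strings, and `verify_deployment` are all hypothetical names chosen for illustration. It only shows the principle that every running instance, including ones added during scaling, is checked against a single declared description of the deployment.

```python
from dataclasses import dataclass

# Hypothetical manifest: each service is declared with the enclave
# measurement it is expected to report and the peers it may talk to.
MANIFEST = {
    "services": {
        "deidentify": {"measurement": "aa11", "connects_to": ["storage"]},
        "storage": {"measurement": "bb22", "connects_to": ["training"]},
        "training": {"measurement": "cc33", "connects_to": []},
    }
}


@dataclass(frozen=True)
class Instance:
    service: str      # name the instance claims to run as
    measurement: str  # value taken from its (simulated) attestation report


def verify_deployment(manifest, instances):
    """Return a list of problems; an empty list means every instance matches."""
    problems = []
    services = manifest["services"]
    for inst in instances:
        expected = services.get(inst.service)
        if expected is None:
            problems.append(f"{inst.service}: not declared in manifest")
        elif inst.measurement != expected["measurement"]:
            problems.append(
                f"{inst.service}: unexpected measurement {inst.measurement}"
            )
    return problems


# A scaled-out deployment: two training instances, both matching the manifest.
scaled = [
    Instance("deidentify", "aa11"),
    Instance("training", "cc33"),
    Instance("training", "cc33"),
]
print(verify_deployment(MANIFEST, scaled))
```

In this model a verifier only needs to trust the manifest; adding another `training` instance does not change what has to be reviewed.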
Secure service-to-service communication
- MarbleRun leverages the deployment manifest to authenticate individual pipeline services through remote attestation.
- It configures these services with their deployment metadata and issues each of them an individual identity certificate from MarbleRun’s certificate authority.
- These certificates are then used to establish encrypted and authenticated connections between the services.
- By providing this secure service-to-service communication (service mesh layer), MarbleRun enables independent scaling of pipeline steps without the need for explicit authentication and encryption of service-to-service communication on the application layer.
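The steps above amount to mutually authenticated TLS between services. The following Python sketch shows what that looks like when written by hand with the standard `ssl` module; it is meant only to illustrate what the mesh layer takes care of on the services' behalf, not how MarbleRun itself is implemented. The function names and the file paths in the usage note are assumptions for illustration.

```python
import ssl


def server_context():
    """TLS settings for a service that accepts connections from peers."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Require every client to present a certificate that chains to the
    # trusted CA; connections without one are rejected in the handshake.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def client_context():
    """TLS settings for a service that connects to a peer."""
    # PROTOCOL_TLS_CLIENT verifies the server certificate and hostname
    # by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def load_identity(ctx, cert_file, key_file, ca_file):
    """Attach this service's certificate and the CA used to verify peers."""
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

A service would call, for example, `load_identity(server_context(), "service.crt", "service.key", "ca.crt")` with the identity material it was provisioned with (the file names here are placeholders). Moving this logic out of each service and into a common layer is what keeps the application code unchanged when pipeline steps are scaled.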
The role-based access control defined in the MarbleRun manifest allows for granular specification of authorized entities and their operations on the pipeline deployment.
For example, a DevOps engineer may be granted permission to change the API for database access, while a release engineer may have the authority to update service versions but only after confirmation from two additional engineers.
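The example can be sketched as a small policy check in Python. This is a toy model under stated assumptions, not MarbleRun's manifest syntax: the `POLICY` structure, the user names, the action names, and `is_authorized` are hypothetical and only illustrate how a role can combine permitted actions with a required number of confirmations.

```python
# Hypothetical policy: roles list the actions they permit and how many
# additional engineers must confirm before the action is allowed.
POLICY = {
    "roles": {
        "devops": {"actions": {"change_db_api"}, "approvals_required": 0},
        "release": {"actions": {"update_service_version"}, "approvals_required": 2},
    },
    "users": {
        "alice": "devops",
        "bob": "release",
        "carol": "release",
        "dave": "release",
    },
}


def is_authorized(policy, user, action, approvers=()):
    """Check whether `user` may perform `action` given the listed approvers."""
    role_name = policy["users"].get(user)
    if role_name is None:
        return False  # unknown users get no access
    role = policy["roles"][role_name]
    if action not in role["actions"]:
        return False  # the role does not cover this operation
    # Count distinct, known approvers other than the requester.
    valid = {a for a in approvers if a in policy["users"] and a != user}
    return len(valid) >= role["approvals_required"]


print(is_authorized(POLICY, "alice", "change_db_api"))
print(is_authorized(POLICY, "bob", "update_service_version"))
print(is_authorized(POLICY, "bob", "update_service_version", ["carol", "dave"]))
```

Because the requester is excluded from the approver count, a release engineer cannot satisfy the two-confirmation rule alone.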
MarbleRun ensures the continuous operation and maintainability of a production system by enforcing these access control policies without compromising the security benefits offered by confidential computing.
With all the services running inside SGX enclaves, the PII data is encrypted during processing at all times.
MarbleRun ensures that the data is also encrypted during transit between the cars, the de-identification, and the training services. When data is stored in a database between the de-identification and the training, MarbleRun provides encryption keys that are only present inside the SGX context.
At no time, therefore, would anybody have access to the PII data other than the processing services inside SGX. Day-2 operations, separate from the PII data, can be performed securely by the DevOps team, with the manifest precisely controlling the access, and providing a record of this, as needed, after the fact.
As a result of the effective encryption in key parts of the video and image processing mechanism, Bosch has a highly scalable AI pipeline and the ability to handle data on Azure.
Everything is done respecting European regulations and privacy, with no loss to analysis capabilities, and at a reasonable cost, thanks to the flexibility of a public cloud.
With Confidential Computing tooling, Bosch's collected data is never a liability and always an asset.