Future of CRDs: Structural Schemas
CustomResourceDefinitions were introduced roughly two years ago as the primary way to extend the Kubernetes API with custom resources. From the beginning they stored arbitrary JSON data, with the exception that kind, apiVersion and metadata had to follow the Kubernetes API conventions. In Kubernetes 1.8, CRDs gained the ability to define an optional OpenAPI v3-based validation schema.
An OpenAPI specification only describes what must be there, not what must not be there, and it may be incomplete. Because of this, the Kubernetes API server never knew the complete structure of CustomResource instances. As a consequence, kube-apiserver has, until today, stored all JSON data received in an API request, as long as it validates against the OpenAPI spec. This explicitly includes anything that is not specified in the OpenAPI schema.
To understand this, consider a CRD for maintenance jobs, owned by the operations team, whose jobs run each night as a service user. Suppose someone submits a job whose spec also sets a privileged field.
The privileged field is not specified by the operations team. Their controller does not know it, and their validating admission webhook does not know about it either.
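A minimal sketch of this behavior, in Python rather than the API server's actual code, may help. Everything here is an assumption made for illustration: the resource name MaintenanceNightlyJob, the group operations/v1, the field names shell and machines, and the helpers validate and persist are hypothetical. The sketch only models the idea that validation looks at specified fields and ignores everything else, while the whole object is stored.

```python
import json

# Hypothetical schema: only the fields the operations team specified.
SCHEMA = {
    "shell": str,
    "machines": list,
}


def validate(spec, schema):
    """Check only the fields named in the schema; unknown fields are ignored."""
    errors = []
    for field, expected in schema.items():
        if field in spec and not isinstance(spec[field], expected):
            errors.append(f"spec.{field}: expected {expected.__name__}")
    return errors


def persist(obj, store):
    """Store the whole object if it validates, unknown fields included."""
    errors = validate(obj.get("spec", {}), SCHEMA)
    if errors:
        raise ValueError("; ".join(errors))
    key = obj["metadata"]["name"]
    store[key] = json.dumps(obj)  # nothing is dropped before storage
    return key


# Hypothetical job carrying a field the schema does not mention.
job = {
    "apiVersion": "operations/v1",
    "kind": "MaintenanceNightlyJob",
    "metadata": {"name": "nightly-cleanup"},
    "spec": {
        "shell": "echo backdoor >> /etc/passwd || true",
        "machines": ["az1-master1", "az1-master2"],
        "privileged": True,  # unknown to schema, controller and webhook
    },
}

etcd = {}  # stand-in for the real storage backend
key = persist(job, etcd)
stored = json.loads(etcd[key])
print("privileged" in stored["spec"])
```

The object passes validation because every specified field has the right type, and the unspecified privileged field travels into storage untouched.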
Nevertheless, kube-apiserver persists this suspicious, but unknown field without ever validating it. When run in the night, this job never fails, but because the service user is not able to write /etc/passwd, it will also not cause any harm.
Some time later, the maintenance team needs support for privileged jobs. It adds that support and is super careful to implement authorization, allowing only very few people in the company to create privileged jobs. The malicious job, though, has long been persisted to etcd.
The next night arrives and the malicious job is executed.
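The sequence of events can be replayed in a small Python simulation. This is a sketch under stated assumptions, not how any real controller is written: the user names, the ALLOWED_PRIVILEGED_USERS set, and the functions admit, create and nightly_run are all hypothetical. It only illustrates that an authorization check applied at creation time does nothing for objects that were stored before the check existed.

```python
# Hypothetical allow-list introduced together with privileged support.
ALLOWED_PRIVILEGED_USERS = {"alice"}


def admit(user, spec):
    """Creation-time authorization for privileged jobs."""
    if spec.get("privileged") and user not in ALLOWED_PRIVILEGED_USERS:
        raise PermissionError(f"{user} may not create privileged jobs")


def create(store, user, name, spec, check_privileged):
    """Persist a job; the privileged check only exists in later releases."""
    if check_privileged:
        admit(user, spec)
    store[name] = dict(spec)


def nightly_run(store, controller_supports_privileged):
    """Map each stored job name to whether it ran privileged."""
    return {
        name: bool(controller_supports_privileged and spec.get("privileged"))
        for name, spec in store.items()
    }


store = {}

# Before privileged support: the unknown field is persisted unchecked.
create(store, "mallory", "backdoor",
       {"shell": "true", "privileged": True}, check_privileged=False)
before = nightly_run(store, controller_supports_privileged=False)

# After privileged support ships, new privileged jobs are authorized.
try:
    create(store, "mallory", "backdoor-2",
           {"shell": "true", "privileged": True}, check_privileged=True)
    blocked = False
except PermissionError:
    blocked = True

# The next night: the old, already stored job now runs privileged.
after = nightly_run(store, controller_supports_privileged=True)
print(before, blocked, after)
```

In this model the new creation attempt is rejected, yet the job stored earlier runs privileged as soon as the controller starts honoring the field.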