# Zero-Trust Kubernetes Networking with Network Policies

Kubernetes clusters ship with a flat, all-to-all networking model by default. Every pod can reach every other pod across every namespace without restriction. For a development cluster running on a laptop, that permissiveness is convenient. For a production cluster hosting customer data, payment processing, and internal services, it is an incident waiting to happen. When a single compromised pod can scan the entire cluster network, enumerate internal APIs, and exfiltrate data from services it should never have been able to reach, the absence of network segmentation is not a missing feature; it is an architectural vulnerability. Network Policies close this gap by declaring L3/L4 firewall rules through the Kubernetes API, which the cluster's CNI plugin then enforces on pod-to-pod traffic. This guide walks through designing, deploying, and testing Network Policies that enforce a zero-trust networking model in production clusters.

## Why Default-Allow Is a Security Gap

The Kubernetes networking model is deliberately simple: pods receive IP addresses from a flat address space, and all pods can communicate with all other pods without Network Address Translation. This design removes the network as a deployment bottleneck, which is why Kubernetes networking works so smoothly out of the box. However, the same flat network that makes service discovery effortless also makes lateral movement effortless for an attacker. If a workload in the frontend namespace can reach the database namespace directly, a remote code execution or server-side request forgery vulnerability in the frontend becomes a direct path to the database and its data. If a CI runner pod can reach the kube-system namespace, a compromised build pipeline can attack core cluster components.

The security implication is structural, not theoretical. In a cluster with no Network Policies, any pod can attempt connections to the Kubernetes API server, to etcd if exposed, to cloud metadata endpoints on the node, and to every other workload in the cluster. Rate limiting and authentication protect the API server, but internal services rarely have the same defenses. An attacker who gains code execution in one container inherits the full network reachability of the entire cluster. Network Policies transform this open network into a least-privilege model where pods can only communicate with explicitly authorized destinations.

## Building a Default-Deny Baseline

The first Network Policy every production namespace should contain is a deny-all rule. This policy selects all pods in the namespace and blocks all ingress and egress traffic. Once applied, no pod in the namespace can receive or initiate any network connection until additional policies explicitly allow specific traffic. This flips the default from allow-all to deny-all, which is the foundational principle of zero-trust networking: no communication is permitted unless explicitly authorized by policy.

### The Universal Deny-All Policy

A deny-all Network Policy is simple and applies to every namespace that requires segmentation. The policy uses an empty pod selector, which matches every pod in the namespace, and defines no ingress or egress rules, meaning all traffic is blocked. Here is the canonical deny-all policy that platform teams should apply as a starting point for every production namespace:

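```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: production
spec:
  # An empty selector matches every pod in the namespace.
  podSelector: {}
  # Both directions are listed with no rules, so all traffic is denied.
  policyTypes:
    - Ingress
    - Egress
```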

Applying this policy to a namespace immediately breaks all pod communication, including DNS lookups and connections to the Kubernetes API. That is the intended effect: it forces platform teams to explicitly authorize every required communication path. The next step is to layer allow policies on top of this baseline that permit the traffic each workload genuinely needs.

### Allowing DNS and Infrastructure Traffic

After applying the deny-all baseline, the first allow policy should restore DNS resolution. Pods need to reach the cluster DNS service to resolve internal service names and external hostnames. A DNS egress policy selects the pods that need DNS access and permits UDP and TCP traffic to the kube-dns service on port 53. The policy uses a namespace selector to identify the kube-system namespace and a pod selector to target the DNS pods specifically:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  # Pods that are allowed to perform DNS lookups.
  podSelector:
    matchLabels:
      app: my-app
  policyTypes:
    - Egress
  egress:
    - to:
        # Both selectors sit in one list item, so they are ANDed:
        # DNS pods inside the kube-system namespace only.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

With DNS working, the next policies can permit application-specific traffic. A frontend pod might need egress to a backend service, and a backend pod might need egress to a database. Each allow policy adds a narrow, labeled communication path on top of the deny-all baseline, and no unexpected traffic passes through.
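As an illustration of what such a path can look like, the sketch below allows a hypothetical frontend workload to call a backend workload on TCP 8080. The labels `app: frontend` and `app: backend`, the policy names, and the port are assumptions chosen for the example, not values from a real cluster. Because the deny-all baseline isolates both directions, the path needs an egress rule on the caller and an ingress rule on the receiver.

```yaml
# Egress side: frontend pods may open connections to backend pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend-egress   # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend                        # assumed label
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: backend                 # assumed label
      ports:
        - protocol: TCP
          port: 8080                       # assumed port
---
# Ingress side: backend pods accept connections from frontend pods only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-from-frontend-ingress  # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Applying only one of the two policies leaves the connection blocked, which is a common source of confusion when a deny-all baseline covers both ingress and egress.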

## Namespace Isolation and Microsegmentation

Namespace-level isolation enforces boundaries between environments and between teams sharing a cluster. In a multi-tenant cluster, the platform team can apply a Network Policy that blocks ingress from untrusted namespaces, ensuring that pods in the development namespace cannot reach pods in production. The policy uses a namespace selector to restrict ingress to traffic originating from namespaces that carry a designated label:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-namespace
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Only namespaces carrying this label may send traffic in.
        - namespaceSelector:
            matchLabels:
              environment: production
```

This policy allows ingress only from pods in namespaces labeled `environment: production`. Pods in any other namespace, including staging, development, or an unlabeled namespace, cannot send traffic into the production namespace. Because the selector is evaluated against namespace labels, the production namespace itself must carry that label, or traffic between its own pods is blocked as well. For finer-grained control within a namespace, microsegmentation policies label individual workloads and restrict traffic to specific pod selectors. A payment-processing service, for example, might only accept ingress from the API gateway pods, and a database might only accept ingress from the application pods that own its schema.
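The payment-processing case can be sketched as a single ingress policy. The labels `app: payment-processing` and `app: api-gateway`, the policy name, and port 8443 are assumptions for illustration; substitute whatever labels and ports the workloads actually use.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payment-processing-from-gateway   # hypothetical name
  namespace: production
spec:
  # The workload being protected.
  podSelector:
    matchLabels:
      app: payment-processing             # assumed label
  policyTypes:
    - Ingress
  ingress:
    - from:
        # A bare podSelector matches pods in the policy's own namespace.
        - podSelector:
            matchLabels:
              app: api-gateway            # assumed label
      ports:
        - protocol: TCP
          port: 8443                      # assumed port
```

If the namespace also denies egress by default, the gateway pods need a matching egress rule before the connection succeeds.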

The Kubernetes Network Policy API does not natively support L7 filtering; it operates at L3 (IP address) and L4 (port and protocol). Teams that need HTTP path-based routing, TLS termination policies, or application-layer filtering typically adopt a CNI plugin that extends Network Policy capabilities. Cilium, Calico, and other CNI providers offer custom resource definitions that add L7 filtering, DNS-based egress policies, and cluster-wide default deny without requiring a per-namespace policy. Platform teams evaluating Network Policy strategy should assess whether the native Kubernetes API meets their segmentation requirements or whether a CNI with richer policy primitives is warranted.
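To give a sense of what a CNI-specific extension looks like, here is a minimal sketch of a Cilium L7 rule. It only works on clusters running Cilium with its L7 proxy available, and the workload labels, port, and path are hypothetical. The policy lets frontend pods issue `GET` requests under an assumed `/api/orders` path and nothing else on that port.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-allow-get-orders   # hypothetical name
  namespace: production
spec:
  # Cilium's equivalent of podSelector.
  endpointSelector:
    matchLabels:
      app: backend                 # assumed label
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend          # assumed label
      toPorts:
        - ports:
            - port: "8080"         # assumed port; Cilium expects a string
              protocol: TCP
          rules:
            http:
              # The path is matched as a regular expression.
              - method: GET
                path: "/api/orders.*"
```

Policies like this are not portable across CNI plugins, which is part of the trade-off to weigh against the native API.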

## Testing Network Policies Before Production

Network Policies are firewall rules, and misconfigured firewall rules cause outages. A policy that accidentally blocks health checks can cause pods to restart in a loop. A policy that blocks kubelet liveness probes can make deployments appear unhealthy. The operational risk of deploying Network Policies into a running production namespace is real, and the safest approach is to test policies in a staging cluster that mirrors production namespace structure and pod labels.

The simplest testing workflow deploys the deny-all policy first, observes which services break, and iteratively layers allow policies until all legitimate traffic is restored. The kubectl command for debugging blocked connections is straightforward: exec into a pod and attempt the connection that should succeed. If it times out, a Network Policy is the most likely cause, provided DNS resolves and the target service is healthy. If it succeeds, the allow rule is correctly configured:

```sh
kubectl exec -it deploy/frontend -- curl -s --connect-timeout 3 http://backend.production.svc.cluster.local:8080
```

A more systematic approach uses the <a href="/blog/kubernetes-runtime-security-ebpf-falco/">Kubernetes runtime security monitoring</a> that Falco provides. Falco can detect unexpected network connections at the kernel level using eBPF, flagging traffic that violates the intended segmentation model even before Network Policies are enforced. This gives platform teams a safe observation window, and the same eBPF-based monitoring provides ongoing assurance that policies are working as intended after deployment.

Network Policy testing also belongs in CI/CD pipelines. Before deploying a new policy, a pipeline job can apply the policy in a test namespace, run a suite of connectivity assertions, and roll back the policy if any required path is broken. Tools like Cyclonus and the Kubernetes Network Policy recipes repository provide pre-built test cases that validate common policy patterns. Integrating these tests into the deployment pipeline ensures that policy drift, such as an engineer removing a rule to unblock a release, is caught before it reaches production.
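One way to express such connectivity assertions is a Kubernetes Job that the pipeline applies after the policies and then waits on. The sketch below is an assumption-heavy illustration: the `netpol-test` namespace, the `backend` and `payments` service names, the ports, and the container image are all hypothetical, and both targets are assumed to speak HTTP. The Job's pod carries the `app: frontend` label so that the policies under test select it.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: netpol-connectivity-check   # hypothetical name
  namespace: netpol-test            # hypothetical sandbox namespace
spec:
  backoffLimit: 0                   # fail fast, do not retry
  template:
    metadata:
      labels:
        app: frontend               # label the policies under test select on
    spec:
      restartPolicy: Never
      containers:
        - name: check
          image: curlimages/curl:latest   # assumed image; pin a version in practice
          command: ["sh", "-c"]
          args:
            - |
              # Allowed path: must connect (this also proves DNS egress works).
              if ! curl -s -o /dev/null --connect-timeout 3 http://backend:8080; then
                echo "FAIL: backend should be reachable"; exit 1
              fi
              # Denied path: must not connect.
              if curl -s -o /dev/null --connect-timeout 3 http://payments:8080; then
                echo "FAIL: payments should be blocked"; exit 1
              fi
              echo "OK: connectivity matches policy intent"
```

The pipeline can treat a failed Job as the signal to roll the policy change back. Checking the allowed path first matters: if DNS were blocked, the denied-path check would pass for the wrong reason.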

## Practical Implementation Checklist

Network Policy adoption does not require a big-bang rollout. The most successful production deployments follow a phased approach that builds confidence at each stage. The following checklist captures the progression from zero Network Policies to a fully segmented production cluster:

- Verify that the cluster CNI plugin supports the Kubernetes NetworkPolicy API. Calico, Cilium, Weave Net, and Antrea all support it. Some managed Kubernetes offerings require explicit enablement.
- Start in a staging namespace that mirrors production labels and pod topology. Apply the deny-all policy and restore DNS first, then application traffic.
- Use namespace labels to enforce environment boundaries. Apply a cross-namespace deny policy to production namespaces before adding intra-namespace rules.
- Add allow policies incrementally, one workload pair at a time. Test each policy by verifying that allowed traffic succeeds and denied traffic fails.
- Monitor dropped connections at the CNI or node level during the rollout. Sudden connection failures in production monitoring dashboards are the fastest signal that a policy is too restrictive.
- Integrate Network Policy validation into the CI pipeline. Run connectivity tests in a sandbox namespace on every policy change before merging.
- Document every allow rule with a comment or annotation explaining which team owns it and which business requirement justifies the permitted path. This ensures future auditors and on-call engineers understand the policy intent.
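The documentation item in the checklist above can live directly in the policy's metadata. In the sketch below, the annotation keys under `example.com/`, the team name, the workload labels, and the port are all made up for illustration; Kubernetes does not define a standard annotation for policy ownership, so teams pick their own convention.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-orders-to-database   # hypothetical name
  namespace: production
  annotations:
    # Hypothetical keys: who owns the rule and why the path exists.
    example.com/owner: "team-orders"
    example.com/justification: "Orders service reads and writes its own schema"
spec:
  podSelector:
    matchLabels:
      app: database                # assumed label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: orders          # assumed label
      ports:
        - protocol: TCP
          port: 5432               # assumed port
```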

## Securing the Rest of the Stack

Network Policies address the network layer of a Kubernetes security posture, but they are one control among many. Pod security standards restrict container privileges and prevent workloads from bypassing network restrictions through host networking. <a href="/blog/kubernetes-secrets-management-beyond-base64/">Kubernetes secrets management</a> ensures that even if an attacker reaches a pod, the credentials they find are encrypted, rotated, and scoped to the minimum required permissions. Runtime security monitoring with Falco and eBPF, mentioned above, detects anomalous network connections that bypass or precede policy enforcement. Admission controllers built on OPA Gatekeeper or Kyverno can reject workloads that deploy without a matching Network Policy in their namespace, closing the gap between policy-as-code and network policy enforcement at admission time.
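A related and simpler pattern than rejecting workloads is to have the admission layer create the baseline automatically. The sketch below follows the shape of Kyverno's `generate` rules and assumes Kyverno is installed in the cluster; it produces a default-deny policy in each newly created namespace rather than validating workloads, so treat it as a complement to a rejection rule, not a replacement. The policy and rule names are hypothetical.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-deny           # hypothetical name
spec:
  rules:
    - name: generate-default-deny
      match:
        any:
          - resources:
              kinds:
                - Namespace
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: deny-all
        # Create the policy inside the namespace that triggered the rule.
        namespace: "{{request.object.metadata.name}}"
        # Keep the generated policy in sync if someone edits or deletes it.
        synchronize: true
        data:
          spec:
            podSelector: {}
            policyTypes:
              - Ingress
              - Egress
```

Field names can shift between Kyverno releases, so check the version in use before relying on this.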

The operational reality is that most production Kubernetes clusters run for months or years with no Network Policies at all, relying on cloud firewalls, security groups, and the assumption that internal pod traffic is trusted by default. That assumption is increasingly difficult to defend as clusters grow, as more teams deploy to shared namespaces, and as supply-chain attacks make compromised containers more common. Implementing a deny-all baseline with explicit allow rules is a weekend project for a platform engineer. The cost is low, the blast radius reduction is immediate, and the security posture improvement is measurable in every penetration test and compliance audit that follows. If your team wants a second pair of eyes on Network Policy design, or a broader review of Kubernetes security controls, Secpros can audit your cluster segmentation, CI/CD pipeline, and runtime defenses and return a prioritized hardening plan.

## Sources

The Network Policy model and API reference are documented in the [Kubernetes Network Policies documentation](https://kubernetes.io/docs/concepts/services-networking/network-policies/). The zero-trust architectural principles applied in this guide align with the [CISA Zero Trust Maturity Model](https://www.cisa.gov/zero-trust-maturity-model), which recommends network segmentation as a foundational control for modern infrastructure.

/ author

Pawel Bedynski

DevOps Engineer & Kubernetes Consultant. Building cloud-native infrastructure on GCP since 2019. 80+ production clusters deployed.
