Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LAPI registration by agents don't support TLS skip verify #213

Open
erwanval opened this issue Dec 10, 2024 · 9 comments
Open

LAPI registration by agents don't support TLS skip verify #213

erwanval opened this issue Dec 10, 2024 · 9 comments
Assignees
Labels
kind/bug Something isn't working needs/triage Needs triage

Comments

@erwanval
Copy link
Contributor

Following a recent update, crowdsec agent are now registering to the LAPI using an API call performed through the cscli.

cscli lapi register --machine "$USERNAME" -u $LAPI_URL --token "$REGISTRATION_TOKEN"

However, when TLS is enabled for LAPI, this registration fails with the following error, because the init container has the LAPI url hardcoded:

level=fatal msg="api client register: api register (http://crowdsec-service.crowdsec:8080/) http 400 Bad Request: API error: http code 400, response: Client sent an HTTP request to an HTTPS server.\n"

But even if we change the LAPI_URL to https, it still fails with another error, because the certificate cannot be verified:

level=fatal msg="api client register: api register (https://crowdsec-service.crowdsec:8080/): Post \"https://crowdsec-service.crowdsec:8080/v1/watchers\": tls: failed to verify certificate: x509: certificate signed by unknown authority"

I have tried to set an env variable named INSECURE_SKIP_VERIFY for the init container, similar to how it's done for the agent container, but it doesn't work.
As far as I can tell, there is no option for cscli lapi register to skip certificate verify. Thus, the agent cannot start when LAPI is deployed using TLS with a certificate signed by a custom CA or self signed (which is what the chart install by default if tls.enabled: true).

@github-actions github-actions bot added the needs/triage Needs triage label Dec 10, 2024
Copy link

@erwanval: Thanks for opening an issue, it is currently awaiting triage.

If you haven't already, please provide the following information:

  • kind : bug, enhancementor documentation
  • area : agent, appsec, configuration, cscli, local-api

In the meantime, you can:

  1. Check Crowdsec Documentation to see if your issue can be self resolved.
  2. You can also join our Discord.
  3. Check Releases to make sure your agent is on the latest version.
Details

I am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the forked project rr404/oss-governance-bot repository.

@github-actions github-actions bot added the needs/kind Kind label required label Dec 10, 2024
Copy link

@erwanval: There are no 'kind' label on this issue. You need a 'kind' label to start the triage process.

  • /kind bug
  • /kind documentation
  • /kind enhancement
Details

I am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the forked project rr404/oss-governance-bot repository.

@erwanval
Copy link
Contributor Author

/kind bug
/area configuration

@github-actions github-actions bot added kind/bug Something isn't working and removed needs/kind Kind label required labels Dec 10, 2024
@srkoster
Copy link
Contributor

Hi @erwanval ,
Could you share you values.yaml ? The current code for the agent expects that both tls.enabled and tls.agent.tlsClientAuth are true in order to successfully use TLS in combination with an agent.

initContainers:
  {{- if or (not .Values.tls.enabled) (not .Values.tls.agent.tlsClientAuth) }}
  - name: wait-for-lapi-and-register
    image: "{{ .Values.image.repository | default "crowdsecurity/crowdsec" }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
    imagePullPolicy: {{ .Values.image.pullPolicy }}
    command: ['sh', '-c', 'until nc "$LAPI_HOST" "$LAPI_PORT" -z; do echo waiting for lapi to start; sleep 5; done; ln -s /staging/etc/crowdsec /etc/crowdsec && cscli lapi register --machine "$USERNAME" -u $LAPI_URL --token "$REGISTRATION_TOKEN" && cp /etc/crowdsec/local_api_credentials.yaml /tmp_config/local_api_credentials.yaml']
  {{- else }}
  - name: wait-for-lapi
    image: "{{ .Values.agent.wait_for_lapi.image.repository }}:{{ .Values.agent.wait_for_lapi.image.tag }}"
    imagePullPolicy: {{ .Values.agent.wait_for_lapi.image.pullPolicy }}
    command: ['sh', '-c', "until nc {{ .Release.Name }}-service.{{ .Release.Namespace }} 8080 -z; do echo waiting for lapi to start; sleep 5; done"]
  {{- end }}

@erwanval
Copy link
Contributor Author

I'm not sure I understand why this condition exists, but it looks like the init containers register the agent only if tls.enabled or tls.agent.tlsClientAuth is true (not AND, but OR, so either or both).
But then, if both are false, it doesn't even register the agents? Is that normal?

Anyway, that's not my case here. I have only TLS enabled, not mTLS (so tls.enabled: trueand tls.agent.tlsClientAuth: false).
In that case, it goes to the first condition, the init container named "wait-for-lapi-and-register". It also add the block a little below, where the LAPI_URL has "http" hardcoded:

       {{- if or (not .Values.tls.enabled) (not .Values.tls.agent.tlsClientAuth) }}
        volumeMounts:
          - name: crowdsec-config
            mountPath: /tmp_config
        env:
          - name: REGISTRATION_TOKEN
            valueFrom:
              secretKeyRef:
                name: crowdsec-lapi-secrets
                key: registrationToken
          - name: USERNAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: LAPI_URL
            value: http://{{ .Release.Name }}-service.{{ .Release.Namespace }}:8080
          - name: LAPI_HOST
            value: "{{ .Release.Name }}-service.{{ .Release.Namespace }}"
          - name: LAPI_PORT
            value: "8080"
        {{- end }}

If tls.enabled: true, then the LAPI_URL should be https, not http, that's the first issue, which is easy to solve, with something like this:

          - name: LAPI_URL
            value: {{ ternary "https" "http" .Values.tls.enabled  }}://{{ .Release.Name }}-service.{{ .Release.Namespace }}:8080

But my main issue is that it doesn't account for tls.insecureSkipVerify: true anyway. The initContainer doesn't add args or env to skip certificate verification.

Here are the values.yaml I tested with (simplified & anonymized) anyway:

container_runtime: containerd
config:
  config.yaml.local: |
    db_config:
      type:     postgresql
      user:     crowdsecuser
      password: ${DB_PASSWORD}
      db_name:  crowdsecdb
      host:     192.168.0.1
      port:     1234
      sslmode:  require

secrets:
  username: crowdsec-agent
  password: password

tls:
  enabled: true
  caBundle: false
  insecureSkipVerify: true
  certManager:
    duration: 2160h0m0s # 90d
    renewBefore: 360h0m0s # 15d
    issuerRef:
      name: cluster-internal-ca
      kind: ClusterIssuer
    secretTemplate:
      annotations:
        reloader.stakater.com/match: "true"
  agent:
    tlsClientAuth: false

agent:
  resources:
    requests:
      cpu: 10m
      memory: 100Mi
    limits:
      cpu: null
      memory: 100Mi
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
  acquisition:
    - namespace: ingress-nginx
      podName: ingress-nginx-controller-*
      program: nginx
  env:
    - name: PARSERS
      value: "crowdsecurity/cri-logs"
    - name: COLLECTIONS
      value: "crowdsecurity/nginx crowdsecurity/whitelist-good-actors"
    - name: LEVEL_DEBUG
      value: "false"

lapi:
  resources:
    requests:
      cpu: 10m
      memory: 100Mi
    limits:
      cpu: null
      memory: 100Mi
  strategy:
    type: RollingUpdate
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
  env:
    - name: BOUNCER_KEY_CLUSTER_NGINX_INGRESS
      valueFrom:
        secretKeyRef:
          name: crowdsec-lapi-secrets
          key: bouncerApiKey
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: crowdsec-lapi-secrets
          key: dbPassword
  secrets:
    csLapiSecret: cslapisecret
  extraSecrets:
    dbPassword: dbpass
    bouncerApiKey: bouncerkey
  persistentVolume:
    config:
      enabled: false
    data:
      enabled: false
  replicas: 2
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              k8s-app: crowdsec
              type: lapi
          topologyKey: topology.kubernetes.io/zone
        weight: 100
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            k8s-app: crowdsec
            type: lapi
        topologyKey: kubernetes.io/hostname
  topologySpreadConstraints:
  - labelSelector:
      matchLabels:
        k8s-app: crowdsec
        type: lapi
    maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway

@srkoster
Copy link
Contributor

@erwanval thanks for the detailed reply, I'll add a detailed one a bit later this evening.

For now, why not set tls.agent.tlsClientAuth to true to use TLS authentication if you've already set tls.enabled to true ?

@srkoster
Copy link
Contributor

Currently, the helmchart supports two authentication methods between the agent and the LAPI:

  1. TLS authentication
  2. Auto-registration (the only option when using appsec as it doesn't support https)

Password authentication is removed in favor of auto-registration, as both auto-registration and TLS authentication allow for easy scaling of the agents and can be used to create agents with meaningful names in the LAPI.

I'm not sure I understand why this condition exists, but it looks like the init containers register the agent only if tls.enabled or tls.agent.tlsClientAuth is true (not AND, but OR, so either or both).
But then, if both are false, it doesn't even register the agents? Is that normal?

Registering of the agent is only done when tls.enabled is false and/or tls.agent.tlsClientAuth is false. In that case the the auto-registration method is followed, which needs the init-container to register the agent before it can start. If TLS authentication is used (so both tls settings are true), no registration is needed as it uses pre-generated SSL certificates.

Anyway, that's not my case here. I have only TLS enabled, not mTLS (so tls.enabled: trueand tls.agent.tlsClientAuth: false).

This is not a supported scenario, as this leads to the LAPI only supporting https, while (as you've identified correctly), the cscli lapi register is not working. Passing environment variables like USE_TLS, INSECURE_SKIP_VERIFY don't resolve that, which is why they are not passed to the init container.

My main conclusion is that the documentation should be adjusted so that both authentication methods are explained better.

@erwanval
Copy link
Contributor Author

For now, why not set tls.agent.tlsClientAuth to true to use TLS authentication if you've already set tls.enabled to true ?

I only use TLS to cipher cluster internal traffic. Certificate validity itself is not really a concern in my case, and allows to ignore other issues that would arise otherwise, such as cert renewal (which is not handled by this chart, updating certs requires a restart of crowdsec pods).

By enabling client auth through TLS, certificate validity becomes a requirement. It would makes no sense to have both client auth and skip verify enabled, as it's the same as having no auth at all, not even a registration token. And short lived self signed certificate will cause issues during renewal in that case (which could be solved by relying on other tools, but still add complexity).

This is not a supported scenario, as this leads to the LAPI only supporting https, while (as you've identified correctly), the cscli lapi register is not working. Passing environment variables like USE_TLS, INSECURE_SKIP_VERIFY don't resolve that, which is why they are not passed to the init container.

In my opinion, TLS only (no client auth) should be supported by cscli lapi register, the same way it's already supported for other cscli commands. And in that case, rely on the auto-registration to authenticate the agents, just like non-tls setup.

@srkoster
Copy link
Contributor

Makes sense. I think your requirement of using TLS without TLS client auth can be accommodated with either:

  • Making sure that cscli lapi register is able to cope with TLS on the LAPI
  • Re-introducing password authentication for the agents, next to auto-registration and TLS auth

I don't know if the first is feasible and it's probably something for the crowdsec component itself, not this chart. Maybe @LaurenceJJones can comment on that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working needs/triage Needs triage
Projects
None yet
Development

No branches or pull requests

3 participants