
All recent elasticsearch-tagged builds are broken #164

Closed
teotwaki opened this issue Jul 3, 2018 · 15 comments

Comments

@teotwaki

teotwaki commented Jul 3, 2018

Hi,

I might be doing something wrong, but I wanted to double-check with the project first. I was using fluent/fluentd-kubernetes-daemonset:elasticsearch in my cluster very happily, but when the cluster autoscaled today (and only today), fluentd failed to start. The logs showed only this one line:

standard_init_linux.go:178: exec user process caused "no such file or directory"

I switched to a few other tags to try and circumvent the issue, but encountered the same problem on all of the following tags:

  • v0.12-alpine-elasticsearch
  • v0.12.33-elasticsearch
  • v0.12-elasticsearch
  • stable-elasticsearch
  • elasticsearch

One tag that I found did work was v1.2-debian-elasticsearch, my definition of "worked" in this case being "booted and sent logs to Elasticsearch".
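For anyone pinning around the breakage, here is a minimal sketch of hard-pinning the DaemonSet to that working tag. The DaemonSet name, namespace, and labels below are illustrative placeholders, not taken from any official manifest:

```yaml
# Sketch: pin the image to an explicit tag that is known to boot,
# instead of a moving tag like "elasticsearch" or "stable-elasticsearch",
# so a node autoscale event cannot pull a broken build.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd            # placeholder name
  namespace: kube-system   # placeholder namespace
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1.2-debian-elasticsearch
```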

@carlosjgp
Contributor

carlosjgp commented Jul 3, 2018

I think this is the reason:

1be6199#diff-9523e3b7e53f948647db88acc163b05b

Is there any particular reason why this project is not using releases? (GitHub help)

@carlosjgp
Contributor

I built that commit manually and deployed it on my K8s cluster, but it keeps failing... I had to go back to 30cc62d to make it work.

@SchoIsles

SchoIsles commented Jul 4, 2018

head /fluentd/entrypoint.sh
#!/usr/bin/dumb-init /bin/sh

# /usr/bin/dumb-init bash: /usr/bin/dumb-init: No such file or directory

@SchoIsles

The latest builds have a major mistake: the container fails to run with the error standard_init_linux.go:178: exec user process caused "no such file or directory".

The entrypoint.sh was changed from /bin/bash to /usr/bin/dumb-init, but that file does not exist in the image.

@m15o

m15o commented Jul 4, 2018

It is because the base image fluent/fluentd:v0.12.33 is old, and dumb-init is not included in those images.
This is not limited to the elasticsearch-tagged builds; I am affected by the same problem with the cloudwatch-tagged builds.

@repeatedly
Member

repeatedly commented Jul 4, 2018

I see. So updating the base image to the latest v0.12 should resolve the problem.
Would anyone like to write a patch?

@teotwaki
Author

teotwaki commented Jul 5, 2018

@carlosjgp @repeatedly Thanks for the quick turnaround. Much appreciated.

However, I'm now getting the classic user-permissions issue:

2018-07-05 09:22:54 +0000 [info]: reading config file path="/fluentd/etc/fluent.conf"
2018-07-05 09:22:54 +0000 [info]: starting fluentd-0.12.43
2018-07-05 09:22:54 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '1.17.0'
2018-07-05 09:22:54 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '1.1.0'
2018-07-05 09:22:54 +0000 [info]: gem 'fluent-plugin-record-reformer' version '0.9.1'
2018-07-05 09:22:54 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.6.0'
2018-07-05 09:22:54 +0000 [info]: gem 'fluent-plugin-secure-forward' version '0.4.5'
2018-07-05 09:22:54 +0000 [info]: gem 'fluentd' version '0.12.43'
2018-07-05 09:22:54 +0000 [info]: adding match pattern="fluent.**" type="null"
2018-07-05 09:22:54 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2018-07-05 09:22:54 +0000 [info]: adding match pattern="**" type="elasticsearch"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: adding source type="tail"
2018-07-05 09:22:54 +0000 [info]: using configuration file: <ROOT>
  <match fluent.**>
    type null
  </match>
  <source>
    type tail
    path /var/log/containers/*.log
    pos_file /var/log/fluentd-containers.log.pos
    time_format %Y-%m-%dT%H:%M:%S.%NZ
    tag kubernetes.*
    format json
    read_from_head true
  </source>
  <source>
    type tail
    format /^(?<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?<severity>[^ \]]*) *\] (?<message>.*)$/
    time_format %Y-%m-%d %H:%M:%S
    path /var/log/salt/minion
    pos_file /var/log/fluentd-salt.pos
    tag salt
  </source>
  <source>
    type tail
    format syslog
    path /var/log/startupscript.log
    pos_file /var/log/fluentd-startupscript.log.pos
    tag startupscript
  </source>
  <source>
    type tail
    format /^time="(?<time>[^)]*)" level=(?<severity>[^ ]*) msg="(?<message>[^"]*)"( err="(?<error>[^"]*)")?( statusCode=($<status_code>\d+))?/
    path /var/log/docker.log
    pos_file /var/log/fluentd-docker.log.pos
    tag docker
  </source>
  <source>
    type tail
    format none
    path /var/log/etcd.log
    pos_file /var/log/fluentd-etcd.log.pos
    tag etcd
  </source>
  <source>
    type tail
    format kubernetes
    multiline_flush_interval 5s
    path /var/log/kubelet.log
    pos_file /var/log/fluentd-kubelet.log.pos
    tag kubelet
    format_firstline /^\w\d{4}/
    format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
    time_format %m%d %H:%M:%S.%N
  </source>
  <source>
    type tail
    format kubernetes
    multiline_flush_interval 5s
    path /var/log/kube-proxy.log
    pos_file /var/log/fluentd-kube-proxy.log.pos
    tag kube-proxy
    format_firstline /^\w\d{4}/
    format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
    time_format %m%d %H:%M:%S.%N
  </source>
  <source>
    type tail
    format kubernetes
    multiline_flush_interval 5s
    path /var/log/kube-apiserver.log
    pos_file /var/log/fluentd-kube-apiserver.log.pos
    tag kube-apiserver
    format_firstline /^\w\d{4}/
    format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
    time_format %m%d %H:%M:%S.%N
  </source>
  <source>
    type tail
    format kubernetes
    multiline_flush_interval 5s
    path /var/log/kube-controller-manager.log
    pos_file /var/log/fluentd-kube-controller-manager.log.pos
    tag kube-controller-manager
    format_firstline /^\w\d{4}/
    format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
    time_format %m%d %H:%M:%S.%N
  </source>
  <source>
    type tail
    format kubernetes
    multiline_flush_interval 5s
    path /var/log/kube-scheduler.log
    pos_file /var/log/fluentd-kube-scheduler.log.pos
    tag kube-scheduler
    format_firstline /^\w\d{4}/
    format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
    time_format %m%d %H:%M:%S.%N
  </source>
  <source>
    type tail
    format kubernetes
    multiline_flush_interval 5s
    path /var/log/rescheduler.log
    pos_file /var/log/fluentd-rescheduler.log.pos
    tag rescheduler
    format_firstline /^\w\d{4}/
    format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
    time_format %m%d %H:%M:%S.%N
  </source>
  <source>
    type tail
    format kubernetes
    multiline_flush_interval 5s
    path /var/log/glbc.log
    pos_file /var/log/fluentd-glbc.log.pos
    tag glbc
    format_firstline /^\w\d{4}/
    format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
    time_format %m%d %H:%M:%S.%N
  </source>
  <source>
    type tail
    format kubernetes
    multiline_flush_interval 5s
    path /var/log/cluster-autoscaler.log
    pos_file /var/log/fluentd-cluster-autoscaler.log.pos
    tag cluster-autoscaler
    format_firstline /^\w\d{4}/
    format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
    time_format %m%d %H:%M:%S.%N
  </source>
  <source>
    type tail
    format multiline
    multiline_flush_interval 5s
    format_firstline /^\S+\s+AUDIT:/
    format1 /^(?<time>\S+) AUDIT:(?: (?:id="(?<id>(?:[^"\\]|\\.)*)"|ip="(?<ip>(?:[^"\\]|\\.)*)"|method="(?<method>(?:[^"\\]|\\.)*)"|user="(?<user>(?:[^"\\]|\\.)*)"|groups="(?<groups>(?:[^"\\]|\\.)*)"|as="(?<as>(?:[^"\\]|\\.)*)"|asgroups="(?<asgroups>(?:[^"\\]|\\.)*)"|namespace="(?<namespace>(?:[^"\\]|\\.)*)"|uri="(?<uri>(?:[^"\\]|\\.)*)"|response="(?<response>(?:[^"\\]|\\.)*)"|\w+="(?:[^"\\]|\\.)*"))*/
    time_format %FT%T.%L%Z
    path /var/log/kubernetes/kube-apiserver-audit.log
    pos_file /var/log/kube-apiserver-audit.log.pos
    tag kube-apiserver-audit
  </source>
  <filter kubernetes.**>
    type kubernetes_metadata
  </filter>
  <match **>
    type elasticsearch
    log_level info
    include_tag_key true
    host <snip>
    port 443
    scheme https
    reload_connections false
    logstash_prefix logstash
    logstash_format true
    buffer_chunk_limit 10M
    buffer_queue_limit 64
    flush_interval 1s
    max_retry_wait 30
    disable_retry_limit 
    num_threads 8
  </match>
</ROOT>
2018-07-05 09:22:54 +0000 [error]: unexpected error error_class=Errno::EACCES error=#<Errno::EACCES: Permission denied @ rb_sysopen - /var/log/fluentd-containers.log.pos>
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/plugin/in_tail.rb:145:in `initialize'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/plugin/in_tail.rb:145:in `open'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/plugin/in_tail.rb:145:in `start'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/root_agent.rb:115:in `block in start'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/root_agent.rb:114:in `each'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/root_agent.rb:114:in `start'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/engine.rb:237:in `start'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/engine.rb:187:in `run'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/supervisor.rb:570:in `run_engine'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/supervisor.rb:162:in `block in start'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/supervisor.rb:366:in `main_process'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/supervisor.rb:339:in `block in supervise'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/supervisor.rb:338:in `fork'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/supervisor.rb:338:in `supervise'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/supervisor.rb:156:in `start'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/lib/fluent/command/fluentd.rb:173:in `<top (required)>'
  2018-07-05 09:22:54 +0000 [error]: /usr/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2018-07-05 09:22:54 +0000 [error]: /usr/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/gems/fluentd-0.12.43/bin/fluentd:8:in `<top (required)>'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/bin/fluentd:23:in `load'
  2018-07-05 09:22:54 +0000 [error]: /fluentd/vendor/bundle/ruby/2.4.0/bin/fluentd:23:in `<main>'
2018-07-05 09:22:54 +0000 [info]: shutting down fluentd
2018-07-05 09:22:54 +0000 [info]: shutting down filter type="kubernetes_metadata" plugin_id="object:2abcb73fa504"
2018-07-05 09:22:54 +0000 [info]: shutting down output type="null" plugin_id="object:2abcb737f9bc"
2018-07-05 09:22:54 +0000 [info]: shutting down output type="elasticsearch" plugin_id="object:2abcb752efc4"
2018-07-05 09:22:54 +0000 [info]: process finished code=0
2018-07-05 09:22:54 +0000 [warn]: process died within 1 second. exit.

@jeffutter

I am getting this error as well. I am using the Helm chart for fluentd-cloudwatch and was getting the standard_init_linux.go:178: exec user process caused "no such file or directory" error. I found that I could switch the image it uses, however all of the 0.12.43 images and all of the 1.2 images give the permission error above. The only image I have found that works at all is v1.2.2-debian-cloudwatch, but it crashes frequently, as described in this issue: #150

Any suggestions to get this stable and working would be appreciated.

@repeatedly
Member

The issue is resolved by the patch, so I'm closing it.

@teotwaki
Author

@repeatedly I appreciate that you're trying to close things that are fixed, but did you see #164 (comment)?

@marulm

marulm commented Jul 16, 2018

@teotwaki set the env variable FLUENT_UID to "0", e.g.:

- name: FLUENT_UID
  value: "0"
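For context, a sketch of where that variable sits in the DaemonSet container spec; the container name and image tag here are illustrative:

```yaml
# Sketch: FLUENT_UID=0 on the fluentd container, so the process can
# open the .pos files under /var/log (see the EACCES trace above).
containers:
  - name: fluentd
    image: fluent/fluentd-kubernetes-daemonset:v1.2-debian-elasticsearch
    env:
      - name: FLUENT_UID
        value: "0"
```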

@deitch

deitch commented Jul 17, 2018

@marulm that does work, but the container has to start as root in the first place. A non-root user (fluent) cannot elevate its privileges to root (unless it has sudo rights, of course).
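In other words, the pod must be allowed to start as root for FLUENT_UID to have any effect. A minimal sketch, assuming nothing else in the pod spec (or a pod security policy) otherwise forces a non-root user:

```yaml
# Sketch: let the container start as root so FLUENT_UID=0 can take effect;
# a container already forced to run as the non-root "fluent" user cannot
# elevate itself back to UID 0.
securityContext:
  runAsUser: 0
```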

@carlosjgp
Contributor

Hi @teotwaki, is this still a problem?

If it is, can you create a gist with your K8s deployment?

Make sure you are mounting '/var/log' with write privileges.
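For reference, a sketch of that mount in the pod template; the volume and container names are illustrative. The key point is that /var/log is not mounted readOnly, because in_tail writes its .pos files there (e.g. /var/log/fluentd-containers.log.pos from the trace above):

```yaml
# Sketch: host /var/log mounted read-write so fluentd can create its .pos files.
spec:
  containers:
    - name: fluentd
      volumeMounts:
        - name: varlog
          mountPath: /var/log   # note: no "readOnly: true" here
  volumes:
    - name: varlog
      hostPath:
        path: /var/log
```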

@Tikam02

Tikam02 commented Jun 11, 2020

Getting the same error while using FLUENT_UID:

env:
  - name: FLUENT_UID
    value: "0"

```
2020-06-11 19:20:25 +0000 [error]: #0 unexpected error error_class=Errno::EACCES error="Permission denied @ rb_sysopen - /var/log/fluentd-containers.log.pos"
```

@ggrames

ggrames commented Apr 15, 2024

@Tikam02
I have the same problem in my logging-operator installation with fluentd.
The env value FLUENT_UID is set to "0", but I still get Permission denied.
Did you ever find a solution back then?
Thanks
