A rule to view all user actions in a container (and bonus, host) - missing audit trail #224

jonny-wg2 · 2024-02-09T13:22:31Z

Motivation

We are missing logs of what a user does inside a container. We have alerts when someone runs "dangerous" commands such as nc, but I want to use Falco to generate a history of all actions performed by a user in a container. This is similar to the Terminal shell in container rule (shown below):

- rule: Terminal shell in container
  desc: >
    A shell was used as the entrypoint/exec point into a container with an attached terminal. Parent process may have
    legitimately already exited and be null (read container_entrypoint macro). Common when using "kubectl exec" in Kubernetes.
    Correlate with k8saudit exec logs if possible to find user or serviceaccount token used (fuzzy correlation by namespace and pod name).
    Rather than considering it a standalone rule, it may be best used as generic auditing rule while examining other triggered
    rules in this container/tty.
  condition: >
    spawned_process
    and container
    and shell_procs
    and proc.tty != 0
    and container_entrypoint
    and not user_expected_terminal_shell_in_container_conditions
  output: A shell was spawned in a container with an attached terminal (evt_type=%evt.type user=%user.name user_uid=%user.uid user_loginuid=%user.loginuid process=%proc.name proc_exepath=%proc.exepath parent=%proc.pname command=%proc.cmdline terminal=%proc.tty exe_flags=%evt.arg.flags %container.info)
  priority: NOTICE
  tags: [maturity_stable, container, shell, mitre_execution, T1059]

Feature

Create a rule that will log all user traffic in a container. It would also be nice to have a second rule to log all user traffic on the host.


incertum · 2024-02-10T07:07:04Z

Thanks @jonny-wg2 - I can confirm that it is one of the top desires among adopters. Given that we now have the concept of Sandbox rules, perhaps we can derive a generic rule that can serve as a template (disabled by default).

Terminology:

First, "logs for all actions performed" is very broad. Adopters often start by logging all spawned processes, which is a great starting point, but there are more subtleties to consider. Logging all file opens and other syscalls "performed by a user" as well could quickly result in very noisy rules.
Second, "user" is also a tricky and broad term. In this context, "user" typically refers to an actual human operator who launched some commands. More on that below.

Host:

On the host, we would mostly be referring to commands run over ssh, either typed manually or sent as remote ssh commands. A condition of spawned_process and interactive and proc.tty!=0 could be a first step. Adding user.loginuid=%user.loginuid user.loginname=%user.loginname to the output would give you the Linux audit user. proc.is_vpgid_leader can help you reverse engineer what might have been "typed into the terminal" and run directly in the foreground, as opposed to processes that Linux spawned in the background in addition to, or as a consequence of, that command. Remember that shell built-ins do not spawn a new process, so "commands" like "echo" or "unset" on an ENV variable are not logged this way. That's why I think it is important to clarify what "all activity" means. I am probably biased, but I believe "anomaly detection" is the future: focusing on logging all activity that is clearly not normal (also covering more syscalls beyond spawned_process).
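As a rough illustration of the host idea above, here is a minimal sketch of such a rule. It assumes the spawned_process, container and interactive macros are available from the rule files you already load; if interactive is not defined in your ruleset, drop that line or define it yourself. The rule name, description, priority, tags and the choice to ship it disabled are illustrative assumptions, not an existing upstream rule.

```yaml
# Hypothetical audit-trail sketch, not part of any official ruleset.
# Relies on the spawned_process, container and interactive macros being
# defined in the rule files loaded before this one.
- rule: Interactive process spawned on host (audit sketch)
  desc: >
    Log every process spawned on the host from an interactive session with an
    attached terminal. Shell built-ins (echo, unset, ...) do not spawn a
    process and are therefore not captured by this rule.
  condition: >
    spawned_process
    and not container
    and interactive
    and proc.tty != 0
  output: Interactive command on host (evt_type=%evt.type user=%user.name user_uid=%user.uid user_loginuid=%user.loginuid user_loginname=%user.loginname process=%proc.name proc_exepath=%proc.exepath parent=%proc.pname command=%proc.cmdline terminal=%proc.tty is_vpgid_leader=%proc.is_vpgid_leader)
  priority: INFORMATIONAL
  # Disabled by default so it only serves as a template.
  enabled: false
  tags: [host, process]
```

Emitting proc.is_vpgid_leader in the output, rather than filtering on it in the condition, keeps the full trail while still letting you separate foreground commands from their background children downstream.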

Container:

In a container, we mostly associate interactivity with exec'ing into the container. However, spawned_process and container and proc.tty!=0 is very broad and noisy, and we currently cannot directly attribute activity to a human operator. Please follow the discussion in falcosecurity/falco#2895 (we are still working on it). In Kubernetes, serviceaccounts further complicate attribution (even IPs don't tend to help much). Everything else I mentioned above also applies here.
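For reference, the broad container condition mentioned above could be written as a rule roughly like the sketch below. It assumes the spawned_process and container macros from the loaded rule files, as used by the Terminal shell in container rule quoted in this issue. The rule name, description, priority, tags and disabled-by-default setting are illustrative assumptions. Expect it to be noisy, and note that it does not solve attribution to a human operator.

```yaml
# Hypothetical audit-trail sketch, not part of any official ruleset.
# Relies on the spawned_process and container macros being defined in the
# rule files loaded before this one.
- rule: Interactive process spawned in container (audit sketch)
  desc: >
    Log every process spawned in a container with an attached terminal.
    Very broad and noisy; user fields reflect the in-container process, not
    the human operator or serviceaccount that exec'ed into the container.
  condition: >
    spawned_process
    and container
    and proc.tty != 0
  output: Interactive command in a container (evt_type=%evt.type user=%user.name user_uid=%user.uid user_loginuid=%user.loginuid process=%proc.name proc_exepath=%proc.exepath parent=%proc.pname command=%proc.cmdline terminal=%proc.tty is_vpgid_leader=%proc.is_vpgid_leader %container.info)
  priority: INFORMATIONAL
  # Disabled by default so it only serves as a template.
  enabled: false
  tags: [container, process]
```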


max-frank · 2024-04-24T00:41:31Z

Sorry to chime into the conversation here 🙇

I feel like a feature like this would be better served via a separate processing engine rather than Falco as a rule engine.
To clarify: as @incertum already mentioned, defining what counts as "user" activity is not trivial. Also, anything accurate enough and not full of false positives cannot really be done within the confines of the Falco rule engine, since it only supports a static view of single events (even with all the data augmentation the underlying libraries already support). For comparison, see how Sysdig OSS tries to provide this kind of user-action stream by tracking everything it deems an interactive user session and logging the child commands.

https://github.com/draios/sysdig/blob/e7fe148f81476edcf83414e5421ca8385fee97b5/userspace/sysdig/chisels/spy_users.lua

So, long story short, it would probably be easier to implement this kind of use case by further separating the Falco libs (i.e., libscap and libsinsp) from the Falco rule engine. By running the libs in a specialized collector process that simply feeds the event stream somewhere, it would be possible to have the Falco rule engine and other processes subscribe to that stream. Effectively, you could run Falco plus whatever extra processing you want off the same syscall feed.

poiana · 2024-07-23T04:08:49Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana · 2024-08-22T04:09:41Z

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

poiana · 2024-09-21T04:10:51Z

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community.
/close

poiana · 2024-09-21T04:10:54Z

@poiana: Closing this issue.


jonny-wg2 added the kind/feature New feature or request label Feb 9, 2024

incertum self-assigned this Feb 10, 2024

poiana added the lifecycle/stale label Jul 23, 2024

poiana added lifecycle/rotten and removed lifecycle/stale labels Aug 22, 2024

poiana closed this as completed Sep 21, 2024
