docs: Add docs on how to use custom Python processors #753

Merged
merged 11 commits into from
Feb 26, 2025
@@ -1,4 +1,5 @@
= Loading custom components
= Custom `.nar` files

:description: Load custom NiFi components by using custom Docker images or mounting external volumes with nar files for enhanced functionality.
:nifi-docs-custom-components: https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#introduction

@@ -36,6 +37,7 @@ spec:
Also read the xref:guides:custom-images.adoc[Using customized product images] guide for additional information.

== Using the official image

If you don't want to create a custom image or don't have access to an image registry, you can use the extra volume mount functionality to mount a volume containing your custom components and configure NiFi to read these from the mounted volumes.

For this to work you'll need to prepare a PersistentVolumeClaim (PVC) containing your components.
@@ -0,0 +1,148 @@
= Custom Python processors

NiFi 2.0 added support for custom processors written in Python.
The Stackable images already contain the required tooling, including a supported Python version.

== General configuration

[source,yaml]
----
spec:
  nodes:
    configOverrides:
      nifi.properties:
        nifi.python.command: python3
        # This property needs to be specified (otherwise a NullPointerException occurs)
        nifi.python.working.directory: /nifi-python-working-directory
        # This is needed to detect the Controller.py location (internally used by NiFi)
        nifi.python.framework.source.directory: /stackable/nifi/python/framework/
        # This is the folder where the Python scripts are sourced from
        # We need to get the Python files in here
        nifi.python.extensions.source.directory.custom: /nifi-python-extensions
----

== Getting Python scripts into NiFi

TIP: NiFi should hot-reload the Python scripts. You might need to refresh your browser window to see the new processor.

[#configmap]
=== 1. Mount as ConfigMap

The easiest way is to define a ConfigMap as follows and mount it.
This way the Python processors are stored and versioned alongside the NifiCluster definition itself.

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: nifi-python-extensions
data:
  HelloWorldProcessor.py: |
    from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult

    class WriteHelloWorld(FlowFileTransform):
        class Java:
            implements = ['org.apache.nifi.python.processor.FlowFileTransform']
        class ProcessorDetails:
            version = '0.0.1-SNAPSHOT'

        def __init__(self, **kwargs):
            pass

        def transform(self, context, flowfile):
            return FlowFileTransformResult(relationship = "success", contents = "Hello World", attributes = {"greeting": "hello"})
----
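
A processor usually reads the incoming FlowFile instead of returning a constant. The following is a minimal sketch of such a variant, not an official example: the class name `UppercaseContent` and the attribute `original.length` are made up for illustration. The `except ImportError` stubs and the `FakeFlowFile` test double are hypothetical stand-ins that only exist so the sketch can be run outside NiFi; leave them out of the file you deploy.

```python
try:
    # Inside NiFi the real API is on the import path.
    from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult
except ImportError:
    # Hypothetical stand-ins (assumption) so the sketch runs outside NiFi.
    class FlowFileTransform:
        def __init__(self, **kwargs):
            pass

    class FlowFileTransformResult:
        def __init__(self, relationship, attributes=None, contents=None):
            self.relationship = relationship
            self.attributes = attributes
            self.contents = contents


class UppercaseContent(FlowFileTransform):
    class Java:
        implements = ['org.apache.nifi.python.processor.FlowFileTransform']

    class ProcessorDetails:
        version = '0.0.1-SNAPSHOT'
        description = 'Upper-cases the FlowFile content (illustrative sketch).'

    def __init__(self, **kwargs):
        pass

    def transform(self, context, flowfile):
        # Read the incoming content, transform it, and route to "success".
        text = flowfile.getContentsAsBytes().decode('utf-8')
        return FlowFileTransformResult(
            relationship='success',
            contents=text.upper(),
            attributes={'original.length': str(len(text))},
        )


class FakeFlowFile:
    """Hypothetical test double exposing the one method used above."""

    def __init__(self, data):
        self._data = data

    def getContentsAsBytes(self):
        return self._data
```

Calling `UppercaseContent().transform(None, FakeFlowFile(b"hello"))` locally is a quick way to check the transformation logic before mounting the script into NiFi.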

You can add multiple Python scripts to the ConfigMap.
Afterwards, mount the ConfigMap into `/nifi-python-extensions`:

[source,yaml]
----
spec:
  nodes:
    podOverrides:
      spec:
        containers:
          - name: nifi
            volumeMounts:
              - name: nifi-python-extensions
                mountPath: /nifi-python-extensions
              - name: nifi-python-working-directory
                mountPath: /nifi-python-working-directory
        volumes:
          - name: nifi-python-extensions
            configMap:
              name: nifi-python-extensions
          - name: nifi-python-working-directory
            emptyDir: {}
----
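
When the ConfigMap holds several scripts, writing the manifest by hand becomes tedious. The sketch below shows one possible way to generate it from a local directory of `.py` files. The helper name `build_configmap` is hypothetical, and the result is meant to be emitted as JSON, which `kubectl apply -f` accepts in place of YAML.

```python
import json
from pathlib import Path


def build_configmap(source_dir, name="nifi-python-extensions"):
    """Return a ConfigMap manifest with one key per *.py file in source_dir."""
    data = {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(source_dir).glob("*.py"))
    }
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": data,
    }


def render_configmap(source_dir, name="nifi-python-extensions"):
    """Serialize the manifest as JSON text."""
    return json.dumps(build_configmap(source_dir, name), indent=2)
```

Writing the output of `render_configmap("processors")` to a file and applying it should produce the same kind of object as the hand-written ConfigMap above. `kubectl create configmap nifi-python-extensions --from-file=<directory>` is a built-in alternative. Keep in mind that a ConfigMap is limited to 1 MiB of data.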

[#git-sync]
=== 2. Use git-sync

As an alternative, you can use `git-sync` to keep your Python processors up to date.
To do so, use podOverrides to add a `git-sync` sidecar container that syncs the repository into a volume shared between the `nifi` and `git-sync` containers.

The following snippet can serve as a starting point (the Git repository contains a folder `processors` with the Python scripts inside).

[source,yaml]
----
spec:
  nodes:
    podOverrides:
      spec:
        containers:
          - name: nifi
            volumeMounts:
              - name: nifi-python-extensions
                mountPath: /nifi-python-extensions
              - name: nifi-python-working-directory
                mountPath: /nifi-python-working-directory
          - name: git-sync
            image: registry.k8s.io/git-sync/git-sync:v4.2.3
            args:
              - --repo=https://github.com/stackabletech/nifi-talk
              - --root=/nifi-python-extensions
              - --period=10s
            volumeMounts:
              - name: nifi-python-extensions
                mountPath: /nifi-python-extensions
        volumes:
          - name: nifi-python-extensions
            emptyDir: {}
          - name: nifi-python-working-directory
            emptyDir: {}
----

Afterwards, update the source directory property you configured previously so that it points to the subfolder of the Git repository that contains the Python scripts.

[source,yaml]
----
spec:
  nodes:
    configOverrides:
      nifi.properties:
        # Replace the property from the previous step
        # Format is /nifi-python-extensions/<git-repo-name>/<git-folder>/
        nifi.python.extensions.source.directory.custom: /nifi-python-extensions/nifi-talk/processors/
----
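
The path format from the comment above can be derived mechanically from the repository URL and the folder name. The helper below is a small illustrative sketch (the function name `extensions_source_directory` is hypothetical). It assumes the repository is checked out under a directory named after the last component of the repository URL, as the format `/nifi-python-extensions/<git-repo-name>/<git-folder>/` suggests.

```python
import posixpath
from urllib.parse import urlparse


def extensions_source_directory(repo_url, folder, root="/nifi-python-extensions"):
    """Build <root>/<git-repo-name>/<git-folder>/ from a repository URL."""
    # Last path component of the URL, without a trailing slash or ".git".
    repo_name = urlparse(repo_url).path.rstrip("/").rsplit("/", 1)[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[: -len(".git")]
    # Skip an empty folder so the result never contains a double slash.
    parts = [part for part in (repo_name, folder.strip("/")) if part]
    return posixpath.join(root, *parts) + "/"
```

For the repository used in this guide, `extensions_source_directory("https://github.com/stackabletech/nifi-talk", "processors")` yields the value configured above.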

=== 3. Use PersistentVolume

You can also mount a PVC below `/nifi-python-extensions` using podOverrides and shell into the NiFi Pod to make changes.
However, the <<configmap>> or <<git-sync>> approach is recommended.

== Check processors have been loaded

NiFi logs every Python processor it discovers.
You can use these log lines to check whether your processors have been loaded.

[source,console]
----
$ kubectl logs nifi-2-0-0-node-default-0 -c nifi | grep -P 'Discovered Python Processor|Discovered or updated [0-9]+ Python Processors'
2025-02-14 14:40:20,694 INFO [main] o.a.n.n.StandardExtensionDiscoveringManager Discovered Python Processor PythonZgrepProcessor
2025-02-14 14:40:20,697 INFO [main] o.a.n.n.StandardExtensionDiscoveringManager Discovered Python Processor TransformOpenskyStates
2025-02-14 14:40:20,700 INFO [main] o.a.n.n.StandardExtensionDiscoveringManager Discovered Python Processor UpdateAttributeFileLookup
2025-02-14 14:40:20,700 INFO [main] o.a.n.n.StandardExtensionDiscoveringManager Discovered or updated 3 Python Processors in 60 millis
----
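
If you want to automate this check, for example in a test pipeline, the same log lines can be parsed with a short script. This is a sketch that relies only on the two message formats shown above; the function name `discovered_processors` is hypothetical, and the message wording may differ in other NiFi versions.

```python
import re

# Patterns taken from the log messages shown above.
PROCESSOR_RE = re.compile(r"Discovered Python Processor (\S+)")
SUMMARY_RE = re.compile(r"Discovered or updated (\d+) Python Processors")


def discovered_processors(log_text):
    """Return (processor names, count from the last summary line or None)."""
    names = []
    reported = None
    for line in log_text.splitlines():
        match = PROCESSOR_RE.search(line)
        if match:
            names.append(match.group(1))
            continue
        match = SUMMARY_RE.search(line)
        if match:
            reported = int(match.group(1))
    return names, reported
```

Feeding the output of `kubectl logs <pod> -c nifi` into `discovered_processors` and checking that an expected processor name is in the returned list gives a simple pass/fail signal.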
@@ -0,0 +1,9 @@
= Loading custom components
:description: Load custom NiFi components for enhanced functionality.

You can develop or use custom components for Apache NiFi, typically custom processors, to extend its functionality.

There are currently two types of custom components:

1. xref:nifi:usage_guide/custom-components/custom-nars.adoc[]
2. Starting with NiFi 2.0 you can also use xref:nifi:usage_guide/custom-components/custom-python-processors.adoc[]
4 changes: 3 additions & 1 deletion docs/modules/nifi/partials/nav.adoc
@@ -5,7 +5,6 @@
** xref:nifi:usage_guide/listenerclass.adoc[]
** xref:nifi:usage_guide/zookeeper-connection.adoc[]
** xref:nifi:usage_guide/extra-volumes.adoc[]
** xref:nifi:usage_guide/custom_processors.adoc[]
** xref:nifi:usage_guide/external_ports.adoc[]
** xref:nifi:usage_guide/security.adoc[]
** xref:nifi:usage_guide/resource-configuration.adoc[]
@@ -14,6 +13,9 @@
** xref:nifi:usage_guide/updating.adoc[]
** xref:nifi:usage_guide/overrides.adoc[]
** xref:nifi:usage_guide/writing-to-iceberg-tables.adoc[]
** xref:nifi:usage_guide/custom-components/index.adoc[]
*** xref:nifi:usage_guide/custom-components/custom-nars.adoc[]
*** xref:nifi:usage_guide/custom-components/custom-python-processors.adoc[]
** xref:nifi:usage_guide/operations/index.adoc[]
*** xref:nifi:usage_guide/operations/cluster-operations.adoc[]
*** xref:nifi:usage_guide/operations/pod-placement.adoc[]