Structure of Sinks #1314

flomickl · 2023-02-19T14:59:27Z

flomickl
Feb 19, 2023
Collaborator

In StreamPipes there are different sink types:
https://github.com/apache/streampipes/blob/dev/streampipes-model/src/main/java/org/apache/streampipes/model/DataSinkType.java

Should we discuss if it is time to recreate these types?

I noticed that there is also a one-class variant of sinks.
So is it time to refactor the "old" three-class sinks?
There is a category called database flink with an Elasticsearch sink.
Would it be better to rename this and call it distributed sinks,
and add other storage like Spark or Hadoop in the future?
Another topic on sinks?

For documentation there is a new dev section https://cwiki.apache.org/confluence/x/NppbDg

bossenti · 2023-02-19T15:24:17Z

bossenti
Feb 19, 2023
Collaborator

Why do you want to recreate them? Do you think they are outdated?
Yes, this is the same as for processors. The three-class approach is deprecated and we need to migrate all pipeline elements to the one-class approach. This is just some work to do 😀 Maybe these are suitable tasks as good-first-issues if we provide an example of how to do it?
I think this naming is tied to the actual runtime environment, as it is for some processor elements as well.
If we want to support further runtime types like Spark, then we should create dedicated modules for them.
As far as I know, we already supported Spark here and there at one point, but there was close to zero usage.
Since this increases complexity and maintenance effort significantly, we should think carefully about integrating additional runtime environments.
@tenthe @dominikriemer please correct me if I'm wrong here

2 replies

bossenti Feb 19, 2023
Collaborator

Yes, this is the same as for processors. The three-class approach is deprecated and we need to migrate all pipeline elements to the one-class approach. This is just some work to do 😀 Maybe these are suitable tasks as good-first-issues if we provide an example of how to do it?

I've created the corresponding issue: #1315
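
To make the difference between the two styles concrete, here is a rough, self-contained sketch. All class and method names in it (PrintSinkController, PrintSinkParameters, PrintSinkRuntime, PrintSink, describe, start, output) are hypothetical and only illustrate the shape of such a migration; they are not the StreamPipes SDK classes, and the assumption that the three classes are a controller, a parameters holder and a runtime class is mine.

```java
import java.util.ArrayList;
import java.util.List;

// --- Three-class style (hypothetical names, not the StreamPipes API) ---

// 1) Holds the user-provided configuration.
final class PrintSinkParameters {
  final String prefix;

  PrintSinkParameters(String prefix) {
    this.prefix = prefix;
  }
}

// 2) Contains the runtime logic that handles events.
final class PrintSinkRuntime {
  private final List<String> out = new ArrayList<>();
  private String prefix;

  void onInvocation(PrintSinkParameters params) {
    this.prefix = params.prefix;
  }

  void onEvent(String event) {
    out.add(prefix + event);
  }

  List<String> output() {
    return out;
  }
}

// 3) Describes the sink and wires parameters and runtime together.
final class PrintSinkController {
  String describe() {
    return "print-sink";
  }

  PrintSinkRuntime start(String prefix) {
    PrintSinkRuntime runtime = new PrintSinkRuntime();
    runtime.onInvocation(new PrintSinkParameters(prefix));
    return runtime;
  }
}

// --- One-class style: description, configuration and logic in one place ---
final class PrintSink {
  private final List<String> out = new ArrayList<>();
  private String prefix;

  String describe() {
    return "print-sink";
  }

  void onInvocation(String prefix) {
    this.prefix = prefix;
  }

  void onEvent(String event) {
    out.add(prefix + event);
  }

  List<String> output() {
    return out;
  }
}
```

Both variants behave the same for the same input; the one-class variant simply has fewer files to keep in sync, which is probably what makes such migrations manageable as small, separate tasks.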

dominikriemer Feb 20, 2023
Collaborator

I think the question has two parts:

The DataSinkType refers to the category of the data sink (there is a similar class for processors as well). It is only used in the pipeline editor to make it easier to find pipeline elements of a certain group (select a group in the pipeline element selection panel). We can extend the list, but we need to be careful when removing items because of backwards compatibility.
We have some modules with a flink suffix which can only be used with the Flink wrapper. However, we haven't maintained the Flink wrapper for quite some time, and most recent pipeline elements use the default JVM wrapper. We can rework the distributed wrappers in the future, and then I'd be in favor of reusing the same set of processors for distributed and standalone wrappers instead of having separate implementations.
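
The backwards-compatibility point about categories could look roughly like the following sketch. SinkCategory and its entries (OLD_GROUP, NEW_GROUP), the retired flag and the two helper methods are all hypothetical and are not the actual contents of DataSinkType. The idea is to keep an old constant so that stored names still resolve, while hiding it from the selection list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical category enum; the names below are illustrative,
// not the actual entries of DataSinkType.
enum SinkCategory {
  STORAGE("Storage", false),
  NOTIFICATION("Notification", false),
  // Retired from the selection list, but kept so that stored
  // pipeline descriptions referring to it can still be read.
  OLD_GROUP("Old group", true),
  // Newly added entry: extending the list is the safe direction.
  NEW_GROUP("New group", false);

  private final String label;
  private final boolean retired;

  SinkCategory(String label, boolean retired) {
    this.label = label;
    this.retired = retired;
  }

  String label() {
    return label;
  }

  boolean isRetired() {
    return retired;
  }

  // Categories offered in the (hypothetical) selection list.
  static List<SinkCategory> selectable() {
    List<SinkCategory> result = new ArrayList<>();
    for (SinkCategory category : values()) {
      if (!category.retired) {
        result.add(category);
      }
    }
    return result;
  }

  // Resolves a name persisted by an older version; throws
  // IllegalArgumentException if the constant no longer exists.
  static SinkCategory fromStoredName(String name) {
    return valueOf(name);
  }
}
```

If OLD_GROUP were deleted outright instead, fromStoredName("OLD_GROUP") would throw, which is the kind of breakage to watch for when removing items.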
