How to use seen.txt and updated.txt to generate graphs for comparison #22

Open

HassebJ opened this issue Nov 19, 2016 · 9 comments

@HassebJ
Contributor

HassebJ commented Nov 19, 2016

@rmetzger I bumped the versions for Spark, Storm and Flink (also had to modify some pom files) to successfully run benchmarks for all of them but I am having a hard time making sense of the data produced by the benchmarks. Can you please give me some pointers as to how I can use the seen.txt and updated.txt data to generate graphs for the latency characteristics of the frameworks as shown in this post.

@revans2
Collaborator

revans2 commented Nov 21, 2016

Yes, we are all having the same issue. There appears to be something odd happening with Scala 2.11. If you upgrade all of the versions in the pom.xml, not just the shell script, and run mvn dependency:tree, you can see that most things were upgraded to 2.11, but some dependencies are still stuck on 2.10. The Flink Kafka integration is an example of this. Looking around I was able to find a 2.11 version, but it looks like it is for a much newer version of Flink. Then I ran out of time.

If you want to try playing around with upgrading Flink to 1.x (and hopefully the Kafka version along with it), and then look at the Spark side as well, that would be great.

In the meantime, I suggest you try to find or build the older releases of Flink and Spark and place them in the download cache instead.

@HassebJ
Contributor Author

HassebJ commented Nov 22, 2016

@revans2 I have updated Spark to 1.6 and Flink to 1.1 in this PR. I am now able to run the benchmarks successfully on a multi-node cluster as well as locally.

Can you please help me make sense of the seen.txt and updated.txt data? We have the following data in these files:

seen   updated
23     6020
32     9996
29     10962
42     10926
44     9879
33     9842

What is the relation between the seen and updated values here? Specifically, how can I use this data to generate a latency graph like the one below (source)? I would really appreciate it if you could help me with this.

[latency graph image from the referenced blog post]

@mmozum

mmozum commented Nov 29, 2016

Ditto, it would be great to have the methodology and possibly necessary scripts posted to the repo.

@HassebJ
Contributor Author

HassebJ commented Dec 8, 2016

@mmozum How about we try to make sense of these benchmarks together? Have you been able to make any progress?

@raj2261992

I am also not able to understand the results in seen.txt and updated.txt and how we can use them to generate a report.
I ran the Storm benchmark with a load of 10000 for 100 seconds and monitored it through the Storm UI. Below are some points I have observed:

1) As I understand it, seen.txt only shows the number of events per campaign window after the filter phase, e.g. if the spout emits 1,000,000 records then only 30-33% of the events are emitted after the filter phase and the rest of the irrelevant events are ignored. So if there are 1,000,000 records in total, the total count of events across all campaign windows in seen.txt would be about 30% of that (i.e. 300,000).

2) The number of values in seen.txt and updated.txt should be benchmark duration * (number of campaigns / window duration), e.g. 100 s * (100 / 10 s) = 1000 in our case (a quick sanity check for this is sketched below).

3) So the way the given graph of processing time vs. percent of tuples processed might be plotted is by taking the percentage values (like 1, 20, 25, 50 in your case) of the total records emitted after the filter phase from seen.txt and their corresponding latencies from updated.txt.

Please let me know whether my understanding is correct.
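If that reading is right, a quick way to check point 2 is to count the entries in seen.txt against the formula. A minimal sketch, assuming one value per non-empty line and the run parameters from this example (the file name and numbers are just the ones used above):

def count_entries(path="seen.txt"):
    # Count non-empty lines; each line is assumed to be one campaign window.
    with open(path) as f:
        return sum(1 for line in f if line.strip())

duration_s = 100      # benchmark duration from the example above
num_campaigns = 100   # number of campaigns from the example above
window_s = 10         # campaign window length in seconds

expected = duration_s * (num_campaigns / window_s)   # 100 * (100 / 10) = 1000
print(f"expected ~{expected:.0f} window entries, found {count_entries()}")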

@supunkamburugamuve

I am also trying to make sense of updated.txt.

According to the code, it gives the following value, which is for the last event of a time window:

window.last_updated_at - window.timestamp

The documentation says that in order to calculate the latency we need to subtract window.duration from the above value in updated.txt. The problem is that this gives negative values for latency.

window.final_event_latency = (window.last_updated_at - window.timestamp) - window.duration

I'm not even sure the way they calculate latency is correct, because they use the currentMillis in the tasks against the timestamp generated by the client.

According to the code, window.last_updated_at is also written by a thread that runs every 1 second, so I'm not convinced this is correct.
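For what it's worth, here is a minimal sketch of that calculation, assuming updated.txt holds one value per line equal to window.last_updated_at - window.timestamp in milliseconds and the window duration is 10000 ms; it just shows how often the documented formula goes negative:

WINDOW_MS = 10_000  # assumed window duration in milliseconds

def final_event_latencies(path="updated.txt"):
    # window.final_event_latency = (last_updated_at - timestamp) - duration,
    # where each line is assumed to already be (last_updated_at - timestamp).
    with open(path) as f:
        return [int(line) - WINDOW_MS for line in f if line.strip()]

latencies = final_event_latencies()
negative = sum(1 for x in latencies if x < 0)
print(f"{negative} of {len(latencies)} windows come out with negative latency")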

@apsankar

Gentle bump, has anyone figured out what seen.txt and updated.txt are supposed to mean, and how to draw the graphs from them?

@Yitian-Zhang

I am also confused about the meaning of the values in seen.txt and updated.txt. Any answer is helpful.

@jm9e

jm9e commented Mar 8, 2019

I had the same trouble, but I think I can shed some light on this.

We used the Streaming Benchmark to see how our solution performs and wanted to compare the results with those of Apache Spark and Apache Storm. For that, of course we needed to interpret the results correctly.

How it works

  • All events are pushed to Kafka and read from there by the processing engine (Storm, Spark, Flink and in our case Slang)
  • All events have an event_timestamp
  • The events are filtered (for type view) and assigned to a campaign window of 10 seconds (see fig. below)
  • These campaign windows count all events belonging to them
  • Each time a new event is added to a campaign window, this window will be updated the next time windows are pushed to Redis (this happens every second, as @supunkamburugamuve pointed out)
  • When windows are pushed to Redis, the updated value of the window is set to the current timestamp, which slightly distorts the results (see the toy sketch after this list)
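To make the mechanics above concrete, here is a toy sketch of the windowing and flushing behaviour as described in this list. This is not the actual benchmark code; the names are made up for illustration:

import time

WINDOW_MS = 10_000  # 10-second campaign windows, as described above

class Window:
    def __init__(self, start_ms):
        self.timestamp = start_ms    # start of the campaign window
        self.seen = 0                # events counted into this window
        self.last_updated_at = None  # stamped when the window is flushed
        self.dirty = False           # changed since the last flush?

windows = {}

def on_event(campaign_id, event_time_ms):
    # Assign an (already filtered) event to its 10-second campaign window.
    start = (event_time_ms // WINDOW_MS) * WINDOW_MS
    w = windows.setdefault((campaign_id, start), Window(start))
    w.seen += 1
    w.dirty = True

def flush():
    # Runs roughly once per second; "updated" is set to the wall clock here,
    # which is why the reported value is distorted by up to about a second.
    now_ms = int(time.time() * 1000)
    for w in windows.values():
        if w.dirty:
            w.last_updated_at = now_ms
            w.dirty = False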

How we approximated the actual latency

The illustration below shows how the values in updated.txt and seen.txt should be interpreted.

We approximated the actual latency by this formula:

actual_latency = (updated_txt_value - 10000) + (10000 / seen_txt_value)

The problem here is that the time span between the last event and the window end gets lost (gray). Of course, the above formula only works if you assume events are evenly distributed over time. Also, the fact that the measurement is distorted by windows only being updated every second remains an issue. However, this applies to all tested frameworks, so comparing results still reveals differences in latency.
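For reference, here is a small sketch of how one might apply this approximation to the two files and get the kind of percentile curve shown in the graphs discussed earlier, again assuming one value per line in each file and 10-second windows:

WINDOW_MS = 10_000  # assumed campaign window duration in milliseconds

def read_values(path):
    with open(path) as f:
        return [float(line) for line in f if line.strip()]

seen = read_values("seen.txt")
updated = read_values("updated.txt")

# actual_latency = (updated_txt_value - 10000) + (10000 / seen_txt_value)
latencies = sorted(
    (u - WINDOW_MS) + (WINDOW_MS / s)
    for u, s in zip(updated, seen)
    if s > 0
)

for pct in (1, 25, 50, 75, 90, 99, 100):
    idx = min(len(latencies) - 1, int(len(latencies) * pct / 100))
    print(f"{pct:3d}% of windows completed within {latencies[idx]:.0f} ms")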

I hope this helps a bit.

If you are interested in how we used this benchmark check our blog post about it: https://bitspark.de/blog/benchmarking-slang
