I am trying to read two tables from Kudu and join them in a single query.
I followed the example steps for reading a table into a DataFrame and registering it as a temp table. I repeated the same steps for a second table, and then queried both.
I then used the dbGetQuery() method to run a query that joins the two tables and returns the result as a data frame.
I get the following error:
Failed to fetch data: org.apache.spark.SparkException: Job aborted due to stage failure: Task 19 in stage 8.0 failed 1 times, most recent failure: Lost task 19.0 in stage 8.0 (TID 163, localhost, executor driver): org.apache.kudu.client.NonRecoverableException: Scanner not found
at org.apache.kudu.client.KuduException.transformException(KuduException.java:110)
at org.apache.kudu.client.KuduClient.joinAndHandleException(KuduClient.java:352)
at org.apache.kudu.client.KuduScanner.nextRows(KuduScanner.java:58)
at org.apache.kudu.spark.kudu.RowIterator.hasNext(KuduRDD.scala:120)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:148)
at org.apache.spark.schedule
The sample query is:
`test_query <- paste("SELECT * FROM tbl1 n0 FULL OUTER JOIN tbl2 n1 on n0.id = n1.id WHERE n0.id LIKE CONCAT(cast(default.getJulianFromDate('yyyy-MM-dd hh:mm:ss', '", Sys.getenv("START"), "') AS STRING),'%') AND n1.id LIKE CONCAT(cast(default.getJulianFromDate('yyyy-MM-dd hh:mm:ss', '", Sys.getenv("START"), "') AS STRING),'%') LIMIT 100",sep="")
table_df <- dbGetQuery(sc, test_query)`