MEMO: Upgrade embulk output orc to 0.4.0

Yukihiro Okada edited this page Jun 1, 2020 · 8 revisions

Upgrade embulk-output-orc to 0.4.0 by yuokada · Pull Request #22 · yuokada/embulk-output-orc

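Notes taken while getting the dependencies to resolve.

The only way forward is to work through the dependency graph and fix conflicts one at a time.

  • The embulk-output-orc code itself will probably need changes as well.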

2020/05/30

Hmm...

~/w/I/embulk-output-orc ❯❯❯ ./gradlew test

> Task :compileScala
Pruning sources from previous analysis, due to incompatible CompileSetup.
/Users/yuokada/works/IdeaProjects/embulk-output-orc/src/main/scala/org/embulk/output/orc/OrcColumnVisitor.scala:8: Class org.apache.hadoop.io.Writable not found - continuing with a stub.
class OrcColumnVisitor(val reader: PageReader, val batch: VectorizedRowBatch, val i: Integer) extends ColumnVisitor {
                                                          ^
one error found

> Task :compileScala FAILED

FAILURE: Build failed with an exception.

hive-storage-api appears to be the cause of the error, so check what depends on it.

~/w/I/embulk-output-orc ❯❯❯ ./gradlew dependencyInsight --configuration compile --dependency hive-storage-api 
Starting a Gradle Daemon, 1 busy and 1 incompatible Daemons could not be reused, use --status for details

> Task :dependencyInsight
org.apache.hive:hive-storage-api:2.7.1
   variant "runtime" [
      org.gradle.status          = release (not requested)
      org.gradle.usage           = java-runtime (not requested)
      org.gradle.libraryelements = jar (not requested)
      org.gradle.category        = library (not requested)
   ]

org.apache.hive:hive-storage-api:2.7.1
\--- org.apache.orc:orc-core:1.5.10
     \--- compile

A web-based, searchable dependency report is available by adding the --scan option.

BUILD SUCCESSFUL in 38s
1 actionable task: 1 executed

Perhaps because it is a provided dependency, a newer version looks usable.

~/w/I/embulk-output-orc ❯❯❯ ./gradlew dependencyInsight --configuration compile --dependency hive-storage-api

> Task :dependencyInsight
org.apache.hive:hive-storage-api:2.7.2
   variant "runtime" [
      org.gradle.status          = release (not requested)
      org.gradle.usage           = java-runtime (not requested)
      org.gradle.libraryelements = jar (not requested)
      org.gradle.category        = library (not requested)
   ]
   Selection reasons:
      - By conflict resolution : between versions 2.7.2 and 2.7.1

org.apache.hive:hive-storage-api:2.7.2
\--- compile

org.apache.hive:hive-storage-api:2.7.1 -> 2.7.2
\--- org.apache.orc:orc-core:1.6.3
     \--- compile

A web-based, searchable dependency report is available by adding the --scan option.

BUILD SUCCESSFUL in 30s
1 actionable task: 1 executed
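
The report above shows hive-storage-api 2.7.2 requested directly from the compile configuration, with orc-core at 1.6.3 still asking for 2.7.1 and losing the conflict resolution. A minimal sketch of a dependencies block that would give that shape follows. The plugin's actual build.gradle is not reproduced in these notes, so the layout here is an assumption; only the coordinates and versions come from the output above.

```groovy
// Sketch only: assumed layout, coordinates and versions taken from the dependencyInsight output.
dependencies {
    // orc-core 1.6.3 transitively requests hive-storage-api 2.7.1
    compile 'org.apache.orc:orc-core:1.6.3'

    // Declaring 2.7.2 directly makes Gradle pick the newer version
    // ("By conflict resolution : between versions 2.7.2 and 2.7.1")
    compile 'org.apache.hive:hive-storage-api:2.7.2'
}
```

The `compile` configuration is used here because the commands above pass `--configuration compile`; newer Gradle versions replace it with `implementation`.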

Upgrading to Hadoop 3.x looks difficult, with ORC being the bottleneck...

Is upgrading only hadoop-aws an option?
