Description
We are currently testing Zingg 0.5.0 with Databricks. However, when using DBR 14.3 or 15.4 (both with Spark 3.5.0), we encounter the following error:
Py4JError: zingg.common.client.util.ColName does not exist in the JVM
File <command-6215136635607487>, line 7
4 import time
5 import uuid
----> 7 from zingg.client import Arguments, ClientOptions, ZinggWithSpark
8 from zingg.pipes import Pipe, FieldDefinition, MatchType
10 from ipywidgets import widgets, interact
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/zingg/client.py:214
211 else:
212 setupJVMAndSpark()
--> 214 setupJVMBaseObjects()
217 def getDfFromDs(data):
218 """Method to convert spark dataset to dataframe
219
220 :param data: provide spark dataset
(...)
223 :rtype: DataFrame
224 """
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/zingg/client.py:202, in setupJVMBaseObjects()
200 global ZinggOptions
201 global LabelMatchType
--> 202 ColName = getJVM().zingg.common.client.util.ColName
203 MatchType = getJVM().zingg.common.client.MatchTypes
204 ZinggOptions = getJVM().zingg.common.client.ZinggOptions
The following error also appears in the driver logs:
ERROR DatabricksMain$DBUncaughtExceptionHandler: Uncaught exception in thread Thread-127!
java.lang.UnsupportedClassVersionError: zingg/common/client/util/ColName has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:473)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:152)
at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
at com.databricks.backend.daemon.driver.ClassLoaders$ReplWrappingClassLoader.loadClass(ClassLoaders.scala:65)
at java.lang.ClassLoader.loadClass(ClassLoader.java:406)
at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at py4j.reflection.CurrentThreadClassLoadingStrategy.classForName(CurrentThreadClassLoadingStrategy.java:40)
at py4j.reflection.ReflectionUtil.classForName(ReflectionUtil.java:51)
at py4j.reflection.TypeUtil.forName(TypeUtil.java:243)
at py4j.commands.ReflectionCommand.getUnknownMember(ReflectionCommand.java:175)
at py4j.commands.ReflectionCommand.execute(ReflectionCommand.java:87)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
at java.lang.Thread.run(Thread.java:750)
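For reference, the class file versions in this message can be translated into Java releases. The snippet below is a minimal sketch, not part of Zingg: the helper name `explain_class_version_error` is hypothetical, and it relies only on the general JVM rule that, for Java 5 and later, the release number equals the class file major version minus 44.

```python
import re
from typing import Tuple


def explain_class_version_error(message: str) -> Tuple[int, int]:
    """Return (java_release_needed, java_release_running) parsed from an
    UnsupportedClassVersionError message. Hypothetical helper for illustration."""
    compiled = re.search(r"class file version (\d+)\.\d+", message)
    supported = re.search(r"versions up to (\d+)\.\d+", message)
    if compiled is None or supported is None:
        raise ValueError("not an UnsupportedClassVersionError message")
    # For Java 5 and later: release = class file major version - 44.
    return int(compiled.group(1)) - 44, int(supported.group(1)) - 44


message = (
    "zingg/common/client/util/ColName has been compiled by a more recent "
    "version of the Java Runtime (class file version 55.0), this version of "
    "the Java Runtime only recognizes class file versions up to 52.0"
)
needed, running = explain_class_version_error(message)
print(f"class compiled for Java {needed}, cluster JVM is Java {running}")
```

Applied to the log line above, this reports that the class was compiled for Java 11 while the cluster JVM is Java 8.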
When using DBR 16.0, 16.2, and 16.4 (Spark 3.5.2, Scala 2.12), we instead receive a serialization error:
Py4JJavaError: An error occurred while calling o549.execute.
: zingg.common.client.ZinggClientException: org.apache.spark.sql.types.StringType$; local class incompatible: stream classdesc serialVersionUID = 3796071416192072411, local class serialVersionUID = 7529903822443873529
at zingg.common.core.executor.Matcher.execute(Matcher.java:206)
at zingg.common.client.Client.execute(Client.java:281)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
at py4j.Gateway.invoke(Gateway.java:306)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:197)
at py4j.ClientServerConnection.run(ClientServerConnection.java:117)
at java.base/java.lang.Thread.run(Thread.java:840)
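A `serialVersionUID` mismatch of this kind usually means that the class version which wrote the serialized data differs from the one loaded on the cluster, so it may help to record the exact Spark and Java versions of each runtime. The snippet below is a minimal sketch of one way to do that from a notebook cell. The function name `describe_runtime` is hypothetical, it assumes the usual `spark` session variable, and it goes through the private `_jvm` attribute, which may not be accessible on every cluster access mode.

```python
def describe_runtime(spark):
    """Return the Spark and Java versions seen by the driver.

    Hypothetical helper for illustration; `spark` is an active SparkSession.
    """
    # `_jvm` is PySpark's private py4j view of the driver JVM (assumption:
    # it is reachable on this cluster).
    jvm = spark.sparkContext._jvm
    return {
        "spark": spark.version,
        "java": jvm.java.lang.System.getProperty("java.version"),
    }


# In a Databricks notebook cell:
# print(describe_runtime(spark))
```

Comparing this output across DBR versions would show which runtimes differ from the Java and Spark versions that Zingg 0.5.0 was built against.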
Could you please advise which DBR version we should use?
Thanks.