Eric Rodriguez
03/12/2024, 4:32 PM
spark-submit just stalls forever:
24/03/12 12:09:26 INFO StandaloneSchedulerBackend: Granted executor ID app-20240312110926-0002/1 on hostPort 172.20.0.4:42795 with 1 core(s), 1024.0 MiB RAM
24/03/12 12:09:26 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.55:58381 with 434.4 MiB RAM, BlockManagerId(driver, 192.168.1.55, 58381, None)
24/03/12 12:09:26 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.1.55, 58381, None)
24/03/12 12:09:26 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.1.55, 58381, None)
24/03/12 12:09:26 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
Eric Rodriguez
03/12/2024, 4:34 PM
$SPARK_HOME/bin/spark-submit \
  --class "com.gearsofleo.platform.jobs.GeneratorApplication" \
  --master spark://localhost:7077 \
  --packages org.jetbrains.kotlin:kotlin-reflect:1.8.20,org.jetbrains.kotlinx.spark:kotlin-spark-api_3.3.2_2.12:1.2.4,org.apache.hadoop:hadoop-aws:3.2.4,com.amazonaws:aws-java-sdk:1.12.595 \
  target/ReportingLakehouse-1.0-SNAPSHOT.jar
Eric Rodriguez
03/12/2024, 4:34 PM
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.5.2</version>
  <configuration>
    <minimizeJar>true</minimizeJar>
    <filters>
      <filter>
        <artifact>*:*</artifact>
        <excludes>
          <exclude>module-info.class</exclude>
          <exclude>META-INF/*.SF</exclude>
          <exclude>META-INF/*.DSA</exclude>
          <exclude>META-INF/*.RSA</exclude>
          <exclude>META-INF/*.md</exclude>
          <exclude>META-INF/*.markdown</exclude>
          <exclude>**/*.txt</exclude>
          <exclude>**/*.proto</exclude>
          <exclude>**/pom.properties</exclude>
          <exclude>**/pom.xml</exclude>
        </excludes>
      </filter>
    </filters>
    <transformers>
      <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
        <mainClass>com.somecomp.platform.jobs.GeneratorApplication</mainClass>
      </transformer>
    </transformers>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <excludes>
            <exclude>org.apache.spark:*</exclude>
          </excludes>
        </artifactSet>
      </configuration>
    </execution>
  </executions>
</plugin>
Eric Rodriguez
03/12/2024, 4:34 PM
Jolan Rensen [JB]
03/12/2024, 4:51 PM
Jolan Rensen [JB]
03/12/2024, 4:51 PM
Eric Rodriguez
03/12/2024, 4:52 PM
Eric Rodriguez
03/12/2024, 4:52 PM
Jolan Rensen [JB]
03/12/2024, 4:53 PM
Eric Rodriguez
03/12/2024, 4:53 PM
Jolan Rensen [JB]
03/12/2024, 4:54 PM
Eric Rodriguez
03/12/2024, 4:54 PM
Jolan Rensen [JB]
03/12/2024, 4:58 PM
Jolan Rensen [JB]
03/12/2024, 4:58 PM
asm0dey
03/12/2024, 5:31 PM
Eric Rodriguez
03/12/2024, 5:57 PM
asm0dey
03/12/2024, 5:58 PM
Eric Rodriguez
03/12/2024, 6:01 PM
Eric Rodriguez
03/12/2024, 6:01 PM
Spark context Web UI available at http://7cc13da6fbf1:4040
Spark context available as 'sc' (master = local[*], app id = local-1710266430566).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.2
      /_/
Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 17.0.8)
asm0dey
03/12/2024, 6:01 PM
Eric Rodriguez
03/12/2024, 6:03 PM
spark-submit that stalls
Eric Rodriguez
03/12/2024, 6:05 PM
minRegisteredResourcesRatio: 0.0
2024-03-11 15:52:38 INFO GeneratorApplication:25 - Starting spark job
2024-03-11 15:52:39 INFO StandaloneAppClient$ClientEndpoint:61 - Executor updated: app-20240311145238-0014/0 is now RUNNING
2024-03-11 15:52:39 INFO StandaloneAppClient$ClientEndpoint:61 - Executor updated: app-20240311145238-0014/1 is now RUNNING
2024-03-11 15:52:40 INFO SharedState:61 - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir
Eric Rodriguez
03/12/2024, 6:05 PM
Eric Rodriguez
03/12/2024, 6:05 PM
Jolan Rensen [JB]
03/12/2024, 6:58 PM
Eric Rodriguez
03/12/2024, 7:00 PM
Jolan Rensen [JB]
03/12/2024, 7:00 PM
Jolan Rensen [JB]
03/12/2024, 7:01 PM
Eric Rodriguez
03/12/2024, 10:52 PM
Eric Rodriguez
03/12/2024, 11:41 PM
Executor app-20240312233908-0011/82 finished with state EXITED message Command exited with code 1 exitStatus 1
Eric Rodriguez
03/12/2024, 11:41 PM
Eric Rodriguez
03/12/2024, 11:50 PM
NOTE: Picked up JDK_JAVA_OPTIONS: '--add-opens=java.base/java.lang=ALL-UNNAMED, --add-opens=java.base/java.lang.invoke=ALL-UNNAMED, --add-opens=java.base/java.lang.reflect=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.net=ALL-UNNAMED, --add-opens=java.base/java.nio=ALL-UNNAMED, --add-opens=java.base/java.util=ALL-UNNAMED, --add-opens=java.base/java.util.concurrent=ALL-UNNAMED, --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.base/sun.nio.cs=ALL-UNNAMED, --add-opens=java.base/sun.security.action=ALL-UNNAMED, --add-opens=java.base/sun.util.calendar=ALL-UNNAMED, --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED'
WARNING: Unknown module: --add-opens=java.base/java.lang.invoke=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/java.lang.reflect=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/java.io=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/java.net=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/java.nio=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/java.util=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/java.util.concurrent=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/sun.nio.ch=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/sun.nio.cs=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/sun.security.action=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.base/sun.util.calendar=ALL-UNNAMED specified to --add-opens
WARNING: Unknown module: --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED specified to --add-opens
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
Exception in thread "main" java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x3081f72c) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x3081f72c
at org.apache.spark.storage.StorageUtils$.<init>(StorageUtils.scala:213)
at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:222)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:376)
at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:210)
at
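A minimal sketch of one way the --add-opens flags could be passed to both the driver and the executor JVMs, since the stack trace above goes through SparkEnv$.createExecutorEnv. It rests on one assumption worth checking: the JVM reads a comma inside an --add-opens value as a separator between target modules, which would be consistent with the "Unknown module" warnings in the log above, so the flags here are joined with spaces only. build_add_opens is a hypothetical helper, the package list is only a subset of the one in the log, and --class and --packages are left out for brevity. spark.driver.extraJavaOptions and spark.executor.extraJavaOptions are standard Spark configuration keys. The script prints the command rather than running it.

```shell
#!/bin/sh
# Build a space-separated list of --add-opens flags for the given java.base packages.
# build_add_opens is a hypothetical helper, not part of Spark or the JDK.
build_add_opens() {
  opens=""
  for pkg in "$@"; do
    # One flag per package, joined by single spaces (no commas, no wrapping quotes).
    opens="${opens:+$opens }--add-opens=java.base/$pkg=ALL-UNNAMED"
  done
  printf '%s\n' "$opens"
}

# Assumption: a subset of the packages from the log above; extend as needed.
JAVA_OPENS="$(build_add_opens java.lang java.nio sun.nio.ch)"

# Print the command instead of running it, so the sketch works without Spark installed.
cat <<EOF
\$SPARK_HOME/bin/spark-submit \\
  --master spark://localhost:7077 \\
  --conf "spark.driver.extraJavaOptions=$JAVA_OPENS" \\
  --conf "spark.executor.extraJavaOptions=$JAVA_OPENS" \\
  target/ReportingLakehouse-1.0-SNAPSHOT.jar
EOF
```

If the flags are supplied through JDK_JAVA_OPTIONS instead, the same space-separated form would apply there.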
Eric Rodriguez
03/13/2024, 1:07 PM
24/03/13 14:05:55 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
+--------------------+--------------------+--------------------+------------------+----------+------------+---------------+------------------+-------------+---------------+----------------+------------+----------+----------------------------------------------------------------------------------------------------+-----------+--------------------+-------------------+-------------------+--------------------+---------------------+--------------------+-------------------+---------------------+----------------+---------------------+
|playerUid |idPremio |idAposta |estadoOrigemAposta|tipoAposta|statusAposta|motivoSuspensao|motivoCancelamento|codModalidade|competicao |eventoModalidade|statusEvento|qtdMercado|idMercado |tipoMercado|quotaFixaMercado |inicioEvento |fimEvento |quotaFixaTotal |valorAposta |valorIRPFRetido |dataHoraAposta |ganhoApostador |indicadorCashOut|valorCashOut |
+--------------------+--------------------+--------------------+------------------+----------+------------+---------------+------------------+-------------+---------------+----------------+------------+----------+----------------------------------------------------------------------------------------------------+-----------+--------------------+-------------------+-------------------+--------------------+---------------------+--------------------+-------------------+---------------------+----------------+---------------------+
|01006538959909174408|lm3YsqZ7PtETsxRjmcij|YPDtXmmlnAWw0LVQKsRy|74 |MULTIPLE |CANCELED |null |EVENT_CANCELLATION|315 |UEFA |Masters Cup |IN_PROGRESS |200 |eLxCSAx0crX0zaKI29DulLXakx4RA38ZyyltIHWpxXBxnWbKzmifTLBcwZaOMc9hka8WTJPMIBEypfEB0tdlZsQitjizuAYLcKvc|false |5.300000000000000000|2024-03-13 11:58:20|2024-03-13 13:58:10|5.300000000000000000|46.000000000000000000|3.220000000000000000|2024-03-13 13:28:17|null |false |null |
|03575822208816394959|ys9brAv1mikZkZ5rx1dl|F8myePRImL5j1bkSdKbA|56 |MULTIPLE |ONGOING |null |null |259 |NHL |NHL |POSTPONED |706 |eLxCSAx0crX0zaKI29DulLXakx4RA38ZyyltIHWpxXBxnWbKzmifTLBcwZaOMc9hka8WTJPMIBEypfEB0tdlZsQitjizuAYLcKvc|true |1.240000000000000000|2024-03-13 11:24:11|2024-03-13 14:06:47|1.240000000000000000|76.000000000000000000|5.320000000000000000|2024-03-13 12:26:18|null |true |76.000000000000000000|
|54630212552887067444|xw0Q2Vlf5ytQJpknpZo6|jSks02LyRSgRIow5jnAC|53 |SIMPLE |NOT_AWARDED |null |null |958 |NBA |NHL |FINISHED |null |eLxCSAx0crX0zaKI29DulLXakx4RA38ZyyltIHWpxXBxnWbKzmifTLBcwZaOMc9hka8WTJPMIBEypfEB0tdlZsQitjizuAYLcKvc|false |5.300000000000000000|2024-03-13 11:51:56|2024-03-13 13:57:22|null |68.000000000000000000|4.760000000000000000|2024-03-13 12:01:44|null |false |null |
|84150253222367488864|IYnWIdndntRfPjLoegk5|pPJHSUI4pKvFPtnIpQ8O|19 |MULTIPLE |ONGOING |null |null |838 |UEFA |Euro League Cup |LATE |930 |eLxCSAx0crX0zaKI29DulLXakx4RA38ZyyltIHWpxXBxnWbKzmifTLBcwZaOMc9hka8WTJPMIBEypfEB0tdlZsQitjizuAYLcKvc|true |3.450000000000000000|2024-03-13 11:20:27|2024-03-13 14:54:36|3.450000000000000000|55.000000000000000000|3.850000000000000000|2024-03-13 13:48:28|null |false |null |
|94856581322015931314|IR7BaHNgrSpcXuF4p4vD|QvNd0UGnw5EIUdoLURnv|42 |MULTIPLE |SUSPENDED |OTHER |null |960 |Masters Cup |NHL |CANCELLED |936 |J5o0FtwHulKD62mQLpz73Y
Eric Rodriguez
03/13/2024, 1:07 PM
Jolan Rensen [JB]
03/13/2024, 1:14 PM
Exception in thread "main" java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x3081f72c) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x3081f72c
at org.apache.spark.storage.StorageUtils$.<init>(StorageUtils.scala:213)
at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:222)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:376)
at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:210)
at
is interesting. Here on Stack Overflow, someone solves it by adding even more --add-opens: https://stackoverflow.com/questions/73465937/apache-spark-3-3-0-breaks-on-java-17-with-cannot-access-class-sun-nio-ch-direct
Eric Rodriguez
03/13/2024, 1:17 PM
Eric Rodriguez
03/13/2024, 1:17 PM
fun <T> writeToParquet(ds: Dataset<T>, parquetName: String, partitionColumnName: String): Unit =
    ds.repartition(col(partitionColumnName))
        .write()
        .mode(SaveMode.Append)
        .partitionBy(partitionColumnName)
        .parquet("$parquetOutputPath/domains/$parquetName")
Eric Rodriguez
03/13/2024, 1:17 PM
Eric Rodriguez
03/13/2024, 1:18 PM
sportDS.repartition(col("sports"))
    .write()
    .mode(SaveMode.Append)
    .partitionBy("playerUid")
    .parquet("$parquetOutputPath/domains/sports")
Eric Rodriguez
03/13/2024, 1:18 PM
Eric Rodriguez
03/13/2024, 1:19 PM
writeToParquet(sports.toDS(), "sports", "playerUid")
Eric Rodriguez
03/13/2024, 1:19 PM
Eric Rodriguez
03/13/2024, 1:19 PM
Jolan Rensen [JB]
03/13/2024, 1:22 PM
Eric Rodriguez
03/13/2024, 1:22 PM
Eric Rodriguez
03/13/2024, 1:23 PM
Eric Rodriguez
03/13/2024, 1:24 PM
show before, executes fine
Jolan Rensen [JB]
03/13/2024, 1:25 PM
Jolan Rensen [JB]
03/13/2024, 1:26 PM
Eric Rodriguez
03/13/2024, 1:27 PM
Eric Rodriguez
03/15/2024, 2:19 PM
Eric Rodriguez
03/15/2024, 2:20 PM
asm0dey
03/15/2024, 3:57 PM
Eric Rodriguez
03/15/2024, 3:58 PM
Eric Rodriguez
03/15/2024, 3:59 PM
Eric Rodriguez
03/15/2024, 4:00 PM
Eric Rodriguez
03/15/2024, 4:01 PM
Eric Rodriguez
03/15/2024, 4:01 PM
Jolan Rensen [JB]
03/15/2024, 4:43 PM