Upgrading Multivac Hadoop Cluster


(Maziyar Panahi) #1

Hi @multivac-dsl,

All the services related to Multivac DSL (Spark, HDFS, Hive, etc.) will be unreachable today due to a major upgrade.

Hope everything goes well without any loss :slight_smile:

(Maziyar Panahi) #2

I have successfully upgraded Cloudera to 6.1, which is based on Hadoop 3.x and brings many changes. This was a major upgrade, so some parts of your pipeline might not work as they should.

The full list of incompatible changes in 6.0.0:

Big problem: Spark 2.4 is not supported by the latest release of Zeppelin. I am looking into whether I can build Zeppelin manually to fix this.

Please let me know if you have any problems with your workflow; we'll find a way to make it compatible again.

(Maziyar Panahi) #3

The issue with Zeppelin has been resolved. It now supports Spark 2.4 and Hadoop 3.0!

(Maziyar Panahi) #6

There is a problem with spark.read.csv and spark.read.json in the new Spark 2.4 and Zeppelin setup. I am trying to fix this issue.

(Noe Gaumont) #7


Thanks for the update.
I was using spark.read.csv and spark.read.json.
Is there another way to read files from Hadoop in the meantime?
Or is a workaround unnecessary because a fix should be available soon?
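
One possible stopgap, sketched under assumptions and not tested on this cluster: read the files as plain text through the RDD API (`sc.textFile`) and parse each line with Python's standard `csv` and `json` modules. The thread does not say what causes the problem, so it is only an assumption that the RDD path is unaffected. The helper names and the example paths below are hypothetical, and `sc` is assumed to be the SparkContext that Zeppelin exposes in a `%pyspark` paragraph.

```python
import csv
import json


def parse_csv_line(line, delimiter=","):
    """Parse one CSV line into a list of string fields (quotes are honoured)."""
    return next(csv.reader([line], delimiter=delimiter))


def parse_json_line(line):
    """Parse one line of a JSON Lines file (one JSON document per line)."""
    return json.loads(line)


def read_csv_as_rdd(sc, path, delimiter=","):
    """Hypothetical helper: read a CSV file as an RDD of field lists.

    Limitations of this sketch: the header row is not skipped, and quoted
    fields that span several lines are not supported, because parsing is
    done one line at a time.
    """
    return sc.textFile(path).map(lambda line: parse_csv_line(line, delimiter))


def read_json_lines_as_rdd(sc, path):
    """Hypothetical helper: read a JSON Lines file as an RDD of parsed objects."""
    return sc.textFile(path).map(parse_json_line)


# Usage inside a Zeppelin %pyspark paragraph (paths are placeholders):
#   rows = read_csv_as_rdd(sc, "hdfs:///path/to/data.csv")
#   docs = read_json_lines_as_rdd(sc, "hdfs:///path/to/data.json")
#   rows.take(5)
```

The result is an RDD of lists or dicts, not a DataFrame, so column names and types have to be handled by hand. If a fix for spark.read.csv and spark.read.json arrives soon, waiting for it is likely the simpler option.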