Hi all,
 
Thanks for the input, but it looks like I'll have to build a custom notebook image after all.
 
Capitalizing the bucket name gives a "Bucket not found" error, and running something along the lines of

    !PYSPARK_HADOOP_VERSION=3.2 pip install --upgrade --force-reinstall pyspark==2.4.5

doesn't seem to change the Hadoop version, even after restarting the kernel in the notebook.
 
Setting

    import os
    os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:<VERSION> pyspark-shell'

was also suggested, but it doesn't help either.
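
For reference, a minimal sketch of how that suggestion is usually applied. The helper names build_submit_args and set_submit_args are hypothetical, the hadoop-aws version is left as a parameter because the thread does not state it, and the sketch assumes the variable is set before the first SparkContext is created in the kernel:

```python
import os


def build_submit_args(hadoop_aws_version, aws_sdk_version="1.7.4"):
    """Return a PYSPARK_SUBMIT_ARGS value that requests the S3A connector jars.

    hadoop_aws_version is deliberately a parameter: the thread leaves it as
    <VERSION>, so no concrete value is assumed here.
    """
    packages = ",".join([
        f"com.amazonaws:aws-java-sdk:{aws_sdk_version}",
        f"org.apache.hadoop:hadoop-aws:{hadoop_aws_version}",
    ])
    # "pyspark-shell" must stay at the end of the argument string.
    return f"--packages {packages} pyspark-shell"


def set_submit_args(hadoop_aws_version):
    """Export the variable for this process and return the value that was set."""
    value = build_submit_args(hadoop_aws_version)
    os.environ["PYSPARK_SUBMIT_ARGS"] = value
    return value
```

This would be called in the first notebook cell, before anything starts Spark. As far as I understand, --packages only adds the connector jars and does not replace the Hadoop libraries already bundled with the image's Spark build, which may be why it made no difference here.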
 
Kind regards,
Petar Ivankovic Milosevic
Mob: +385996806080 | E-mail: pimilosevic@croz.net
                                                                                                                   
CROZ d.o.o. | Lastovska 23 | 10000 Zagreb | Hrvatska | www.croz.net
 
 
----- Original message -----
From: "Sherard Griffin" <shgriffi@redhat.com>
To: pimilosevic@croz.net
Cc: users@lists.opendatahub.io
Subject: [Users] Re: Upgrading Hadoop version on the s2i Spark notebook
Date: Fri, May 7, 2021 18:12
 
Hi Petar,
 
Yes this is a known issue with the version of Hadoop.  A simple workaround is to rename your bucket with all capital letters, such as "MYBUCKET".  That will force the URL to resolve properly.  Others on the mailing list may know if there is a notebook image available with a more recent version of Hadoop.  I thought there was something done in this area but will check to see.
 
Thanks,
Sherard
 
On Fri, May 7, 2021 at 11:11 AM <pimilosevic@croz.net> wrote:
Hi all,

I'm fairly new to OpenShift and I need to deploy a whole ODH stack. We have OCS set up to provide S3 storage, but the s2i-spark-notebook that comes with the Operator uses a Hadoop version that refuses to change the URL style with hadoopConf.set("fs.s3a.path.style.access", "true").
I get a big error log saying the URL of my bucket is unreachable, and the URL that gets used is bucket.s3.storage.whatever where it should be s3.storage.whatever/bucket.
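
To make the two URL styles concrete, here is a small illustrative sketch in plain Python. The function name s3_object_url and the https scheme are assumptions for illustration only; the host is the placeholder used above, and this is not how Hadoop builds its URLs internally.

```python
def s3_object_url(endpoint, bucket, key="", path_style=True):
    """Build an object URL in path style or virtual-hosted style."""
    if path_style:
        # Path style: the bucket is the first path segment.
        base = f"https://{endpoint}/{bucket}"
    else:
        # Virtual-hosted style: the bucket becomes a subdomain of the endpoint.
        base = f"https://{bucket}.{endpoint}"
    return f"{base}/{key}" if key else base


# The style the notebook ends up using, then the style it should use.
print(s3_object_url("s3.storage.whatever", "bucket", path_style=False))
print(s3_object_url("s3.storage.whatever", "bucket", path_style=True))
```

The virtual-hosted form only works when the storage endpoint resolves bucket subdomains in DNS, which is why path style is needed for this setup.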

Upon looking around online I found that it could be a bug that was fixed in Hadoop version 2.8, so I'd like to upgrade to that if at all possible, but I don't really understand how to do it. I'd appreciate any advice.

Stay good,
Petar
_______________________________________________
Users mailing list -- users@lists.opendatahub.io
To unsubscribe send an email to users-leave@lists.opendatahub.io
 
 
--
Thanks,
Sherard Griffin
 

