[Open Data Hub Contributors] Re: Using custom spark images in opendatahub operator

Monday, 19 August 2019

One side note about the spark image: because the Spark SQL Thrift server has
an open issue in Spark 2.2 that breaks it, we need to use Spark 2.4. I
already have the image built and can push it once I have the proper
permissions.

As for Landon's suggestion, I think it's a good idea. I can create a
quay.io/opendatahub/spark-cluster-image:2.4 tag in advance if needed, so we
can reuse the same name with a different tag.
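
As an aside, reusing one repository name with different tags can be sketched
as below. This is a minimal Python illustration under stated assumptions, not
code from the ODH operator: the helper names split_image_ref and with_tag are
hypothetical, and digest references (name@sha256:...) are deliberately not
handled. The only convention relied on is the usual container-tooling default
of "latest" when no tag is given.

```python
from typing import Tuple


def split_image_ref(ref: str) -> Tuple[str, str]:
    """Split 'registry/org/name:tag' into (repository, tag).

    Hypothetical helper for illustration only. A colon counts as the tag
    separator only when it comes after the last slash; an earlier colon
    may be a registry port (for example 'localhost:5000/spark').
    """
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        return ref[:colon], ref[colon + 1:]
    # No tag present: container tooling conventionally assumes 'latest'.
    return ref, "latest"


def with_tag(ref: str, tag: str) -> str:
    """Return the same repository name with a different tag."""
    repository, _ = split_image_ref(ref)
    return f"{repository}:{tag}"


if __name__ == "__main__":
    existing = "quay.io/opendatahub/spark-cluster-image"
    print(with_tag(existing, "2.4"))
```

The repository name stays the same, and only the tag selects the Spark 2.4
build.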

On Mon, Aug 19, 2019 at 11:10 AM Landon LaSmith <llasmith(a)redhat.com> wrote:

...
 Do we want to store all of the ODH spark images in the same repository?
 We already have quay.io/opendatahub/spark-cluster-image
 <https://quay.io/repository/opendatahub/spark-cluster-image?tab=tags>.
 Should we deprecate that repo and create a new one to store both?

 On Mon, Aug 19, 2019 at 10:06 AM Ricardo Martinelli de Oliveira <
 rmartine(a)redhat.com> wrote:

> Thanks everyone for the feedback!
>
> As the winner is #2 (push the custom image into the quay.io opendatahub
> organization), who should I ask for permission to push my image to this
> org?
>
> My quay.io username is the same as my kerberos user.
>
> On Mon, Aug 19, 2019 at 10:57 AM Landon LaSmith <llasmith(a)redhat.com>
> wrote:
>
>> Agree with #2
>>
>> On Mon, Aug 19, 2019 at 9:53 AM Václav Pavlín <vasek(a)redhat.com> wrote:
>>
>>> I agree with #2 - ODH should work out of the box, so we need to provide
>>> the image (which rules out #1), and #3 sounds like overkill
>>>
>>> Thanks,
>>> V.
>>>
>>> On Mon, Aug 19, 2019 at 3:43 PM Alex Corvin <acorvin(a)redhat.com>
>>> wrote:
>>>
>>>> I think my vote is for #2. Option #1 will continue to be supported for
>>>> groups that need it, but we can make it easier for people to get up and
>>>> running by curating an official image.
>>>>
>>>>
>>>> On August 19, 2019 at 9:33:44 AM, Ricardo Martinelli de Oliveira (
>>>> rmartine(a)redhat.com) wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm integrating Spark SQL Thrift server into the ODH operator and I need
>>>> to use a custom spark image (other than the RADAnalytics image) with
>>>> additional jars to access Ceph/S3 buckets. Actually, both the thrift
>>>> server and the spark cluster will need this custom spark image in order
>>>> to access the buckets.
>>>>
>>>> With that being said, I'd like to discuss some options to get this
>>>> done. I am thinking about these options:
>>>>
>>>> 1) Let the customer specify the custom image in the yaml file (this is
>>>> already possible)
>>>> 2) Create that custom spark image and publish it to the quay.io
>>>> opendatahub organization
>>>> 3) Add a buildconfig object and make the operator create the custom build
>>>> and set the image location in the deploymentconfig objects
>>>>
>>>> Although the third option automates everything and delivers the whole
>>>> set with the custom image, there is the question of supporting custom
>>>> images within operators. We'd need to add a spark_version variable so
>>>> the build could download the spark distribution corresponding to that
>>>> version, along with the related artifacts, and run the build. With the
>>>> first option, we simply don't create the build objects and document that,
>>>> in order to use Thrift server in the ODH operator, both the spark cluster
>>>> and thrift must use a custom spark image containing the jars needed to
>>>> access Ceph/S3. Finally, option two is the middle ground between the two,
>>>> so we don't need to worry about delegating this task to the user or the
>>>> operator.
>>>>
>>>> What do you think? What could be the best option for this scenario?
>>>>
>>>> --
>>>>
>>>> Ricardo Martinelli De Oliveira
>>>>
>>>> Data Engineer, AI CoE
>>>>
>>>> Red Hat Brazil <https://www.redhat.com/>
>>>>
>>>> Av. Brigadeiro Faria Lima, 3900
>>>>
>>>> 8th floor
>>>>
>>>> rmartine(a)redhat.com    T: +551135426125
>>>> M: +5511970696531
>>>> @redhatjobs <https://twitter.com/redhatjobs>   redhatjobs
>>>> <https://www.facebook.com/redhatjobs> @redhatjobs
>>>> <https://instagram.com/redhatjobs>
>>>> <https://www.redhat.com/>
>>>> _______________________________________________
>>>> Contributors mailing list -- contributors(a)lists.opendatahub.io
>>>> To unsubscribe send an email to
>>>> contributors-leave(a)lists.opendatahub.io
>>>>
>>>
>>>
>>> --
>>> Open Data Hub, AI CoE, Office of CTO, Red Hat
>>> Brno, Czech Republic
>>> Phone: +420 739 666 824
>>>
>>
>>
>> --
>> Landon LaSmith
>> Sr.Software Engineer
>> Red Hat, AI CoE - Data Hub
>>
>
>


-- 

Ricardo Martinelli De Oliveira

Data Engineer, AI CoE

Red Hat Brazil <https://www.redhat.com/>

