This is great news, Ricardo! Can we get volunteers to review this in the
next few days?
On Mon, Aug 19, 2019 at 3:39 PM Ricardo Martinelli de Oliveira <
rmartine(a)redhat.com> wrote:
Excellent! Thanks!
I went ahead and created a merge request[1] that aggregates Hive
Metastore, Spark SQL Thrift Server and Hue into the ODH operator. If possible,
please review it.
Meanwhile, I'll make sure the image hosted in the opendatahub org is enough to
deploy these components.
[1] https://gitlab.com/opendatahub/opendatahub-operator/merge_requests/53
On Mon, Aug 19, 2019 at 4:00 PM Sherard Griffin <shgriffi(a)redhat.com>
wrote:
> Ok awesome. You should have the permissions to upload to that repo now.
>
> On Mon, Aug 19, 2019 at 2:57 PM Ricardo Martinelli de Oliveira <
> rmartine(a)redhat.com> wrote:
>
>> The customization is basically to install a Spark 2.4.3 distribution
>> in the image and add the required jars to the $SPARK_HOME/jars directory to
>> access Ceph/S3 buckets. The fix for the issue you mention has been in the
>> code since Spark 2.3, iirc.
>>
>> If the current image already has these things done, I can use the image
>> as is.
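
One way to check whether an existing image "already has these things done" is to look for the jars under $SPARK_HOME/jars. The following is a minimal sketch only: the jar name prefixes (hadoop-aws-, aws-java-sdk-) are an assumption, since the thread does not list the exact jars, and the /opt/spark fallback is a guess.

```python
import os
from pathlib import Path

# Assumption: illustrative jar name prefixes; the thread does not name
# the exact jars that were added to the image.
ASSUMED_S3_JAR_PREFIXES = ("hadoop-aws-", "aws-java-sdk-")


def missing_jar_prefixes(jars_dir, prefixes=ASSUMED_S3_JAR_PREFIXES):
    """Return the prefixes that match no .jar file in jars_dir."""
    jar_names = [p.name for p in Path(jars_dir).glob("*.jar")]
    return [
        prefix
        for prefix in prefixes
        if not any(name.startswith(prefix) for name in jar_names)
    ]


def default_jars_dir():
    """Resolve $SPARK_HOME/jars; /opt/spark is only a fallback guess."""
    return Path(os.environ.get("SPARK_HOME", "/opt/spark")) / "jars"
```

Running missing_jar_prefixes(default_jars_dir()) inside the container would return an empty list if a jar matching each assumed prefix is present.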
>>
>>
>> On Mon, Aug 19, 2019 at 3:31 PM Sherard Griffin <shgriffi(a)redhat.com>
>> wrote:
>>
>>> Ricardo,
>>>
>>> What are the customizations? Is this to fix retrieving the tables and
>>> databases from Hive metastore? I've fixed the permissions on the
>>> spark-cluster-image repo, although I'm not a fan of the name of it...
>>> Should just be "spark".
>>>
>>> Thanks,
>>> Sherard
>>>
>>> On Mon, Aug 19, 2019 at 12:02 PM Ricardo Martinelli de Oliveira <
>>> rmartine(a)redhat.com> wrote:
>>>
>>>> One side note about the Spark image: because Spark SQL Thrift server
>>>> has an open issue in Spark 2.2 that breaks it, we need to use
>>>> Spark 2.4. I already have the image built and can push it once I have
>>>> the proper permissions.
>>>>
>>>> As for Landon's suggestion, I think it's a good idea, and I can create
>>>> a quay.io/opendatahub/spark-cluster-image:2.4 tag in advance if
>>>> needed so we can reuse the same name with a different tag.
>>>>
>>>> On Mon, Aug 19, 2019 at 11:10 AM Landon LaSmith <llasmith(a)redhat.com>
>>>> wrote:
>>>>
>>>>> Do we want to store all of the ODH spark images in the same
>>>>> repository? We already have quay.io/opendatahub/spark-cluster-image
>>>>> <https://quay.io/repository/opendatahub/spark-cluster-image?tab=tags>.
>>>>> Should we deprecate that repo and create a new one to store both?
>>>>>
>>>>> On Mon, Aug 19, 2019 at 10:06 AM Ricardo Martinelli de Oliveira <
>>>>> rmartine(a)redhat.com> wrote:
>>>>>
>>>>>> Thanks everyone for the feedback!
>>>>>>
>>>>>> As the winner is #2 (push the custom image into the quay.io
>>>>>> opendatahub organization), who should I ask for permission to push my
>>>>>> image to this org?
>>>>>>
>>>>>> My quay.io username is the same as my kerberos user.
>>>>>>
>>>>>> On Mon, Aug 19, 2019 at 10:57 AM Landon LaSmith <llasmith(a)redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Agree with #2
>>>>>>>
>>>>>>> On Mon, Aug 19, 2019 at 9:53 AM Václav Pavlín <vasek(a)redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I agree with #2 - ODH should work out of the box, so we need to
>>>>>>>> provide the image (which is a no for #1), and #3 sounds like overkill.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> V.
>>>>>>>>
>>>>>>>> On Mon, Aug 19, 2019 at 3:43 PM Alex Corvin <acorvin(a)redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I think my vote is for #2. Option #1 will continue to be
>>>>>>>>> supported for groups that need it, but we can make it easier for people to
>>>>>>>>> get up and running by curating an official image.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On August 19, 2019 at 9:33:44 AM, Ricardo Martinelli de Oliveira (
>>>>>>>>> rmartine(a)redhat.com) wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm integrating Spark SQL Thrift server into the ODH operator and I
>>>>>>>>> need to use a custom Spark image (other than the RADAnalytics image) with
>>>>>>>>> additional jars to access Ceph/S3 buckets. In fact, both the Thrift server and
>>>>>>>>> the Spark cluster will need this custom image in order to access the
>>>>>>>>> buckets.
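
For context on what "access the buckets" involves once the jars are in the image: Spark is usually pointed at an S3-compatible endpoint such as Ceph through Hadoop S3A properties. The sketch below only builds those properties; the endpoint and credential values are placeholders, and the thread does not say how the ODH deployment actually passes them, so treat this as an assumption.

```python
def s3a_spark_conf(endpoint, access_key, secret_key):
    """Build Spark properties for S3A access to an S3-compatible endpoint."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Assumption: the Ceph gateway is addressed path-style rather than
        # through per-bucket subdomains.
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }


def as_submit_args(conf):
    """Render the properties as --conf arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

The same dictionary could be applied to the Thrift server and to the cluster, which is why both need an image that ships the S3A jars.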
>>>>>>>>>
>>>>>>>>> With that said, I'd like to discuss some options to get
>>>>>>>>> this done. I am thinking about these options:
>>>>>>>>>
>>>>>>>>> 1) Let the customer specify the custom image in the yaml file
>>>>>>>>> (this is already possible)
>>>>>>>>> 2) Create that custom Spark image and publish it in the quay.io
>>>>>>>>> opendatahub organization
>>>>>>>>> 3) Add a buildconfig object and make the operator create the custom
>>>>>>>>> build and set the image location in the deploymentconfig objects
>>>>>>>>>
>>>>>>>>> Although the third option automates everything and delivers the
>>>>>>>>> whole set with the custom image, there is the question of supporting custom
>>>>>>>>> images within operators. We'd need to add a spark_version variable so that
>>>>>>>>> the build could download the Spark distribution corresponding to that
>>>>>>>>> version, along with the related artifacts, and run the build. With the first
>>>>>>>>> option, we simply don't create the build objects and document that, in order
>>>>>>>>> to use Thrift server in the ODH operator, both the Spark cluster and Thrift
>>>>>>>>> server must use a custom Spark image containing the jars needed to access
>>>>>>>>> Ceph/S3. Finally, option two is the middle ground between the two, so we
>>>>>>>>> don't need to delegate this task to either the user or the operator.
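
To illustrate what a spark_version variable would drive in option 3, here is a minimal sketch that maps a version to a distribution tarball URL. The Apache archive layout is real, but the default Hadoop profile and the choice of mirror are assumptions; the thread does not specify either.

```python
def spark_download_url(
    spark_version,
    hadoop_profile="hadoop2.7",  # assumption: not stated in the thread
    mirror="https://archive.apache.org/dist/spark",
):
    """Return the tarball URL for a prebuilt Spark distribution."""
    name = f"spark-{spark_version}-bin-{hadoop_profile}"
    return f"{mirror}/spark-{spark_version}/{name}.tgz"
```

A build step could fetch the result of spark_download_url("2.4.3") and then add the extra jars, which is the part the operator would have to maintain under option 3.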
>>>>>>>>>
>>>>>>>>> What do you think? What could be the best option for this
>>>>>>>>> scenario?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Ricardo Martinelli De Oliveira
>>>>>>>>>
>>>>>>>>> Data Engineer, AI CoE
>>>>>>>>>
>>>>>>>>> Red Hat Brazil <https://www.redhat.com/>
>>>>>>>>>
>>>>>>>>> Av. Brigadeiro Faria Lima, 3900
>>>>>>>>>
>>>>>>>>> 8th floor
>>>>>>>>>
>>>>>>>>> rmartine(a)redhat.com T: +551135426125
>>>>>>>>> M: +5511970696531
>>>>>>>>> @redhatjobs <https://twitter.com/redhatjobs> redhatjobs
>>>>>>>>> <https://www.facebook.com/redhatjobs> @redhatjobs
>>>>>>>>> <https://instagram.com/redhatjobs>
>>>>>>>>> <https://www.redhat.com/>
>>>>>>>>> _______________________________________________
>>>>>>>>> Contributors mailing list -- contributors(a)lists.opendatahub.io
>>>>>>>>> To unsubscribe send an email to
>>>>>>>>> contributors-leave(a)lists.opendatahub.io
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Open Data Hub, AI CoE, Office of CTO, Red Hat
>>>>>>>> Brno, Czech Republic
>>>>>>>> Phone: +420 739 666 824
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Landon LaSmith
>>>>>>> Sr.Software Engineer
>>>>>>> Red Hat, AI CoE - Data Hub
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Sherard Griffin
>>>
>>
>>
>>
>
>
>