Hello,

Regarding your question about why the Spark Thrift Server runs on a Spark cluster: initially, we used the Spark operator created by the RADAnalytics team (https://radanalytics.io/) to implement the Data Catalog architecture. RADAnalytics takes the approach of long-running Spark clusters, as that was thought to be something customers would potentially look for. The Spark operator created by Google, on the other hand, takes an app-centric approach: you only need to manage Spark apps, and they use the k8s scheduler to provision the Spark cluster. The drawback of this latter approach is that it is ephemeral by design, so it needs other components, such as the Spark History Server, to track all Spark app executions.

Switching the Spark Thrift Server to the k8s scheduler was on our radar, but we decided to focus our development efforts on Trino because of some performance issues we found with the Spark Thrift Server, as well as its lack of a Data Security implementation and other gaps that block our path to a full Data Governance story. Finally, we have not yet made a final decision on whether we still want to maintain a Spark operator in ODH, since Kubeflow already does that and RADAnalytics is in maintenance status.

On Tue, Oct 12, 2021 at 10:35 AM Ki Dong Lee <kidlee@redhat.com> wrote:
Hi everyone,

I am new to ODH.  

It seems that all the components in ODH are deployed with the Kubeflow operator and kustomize manifests.
Could you explain in detail how to deploy these components on OCP in ODH?

Another question is about Spark on Kubernetes. I have noticed that in ODH, if you want to deploy the Spark Thrift Server as HiveServer2, a Spark cluster needs to be deployed on OCP in ODH beforehand.
I think there is a way to submit the Spark Thrift Server to Kubernetes/OCP directly, without having a Spark cluster deployed on OCP.
Is there any reason for requiring the Spark cluster?

Cheers,

- Kidong Lee.

_______________________________________________
Users mailing list -- users@lists.opendatahub.io
To unsubscribe send an email to users-leave@lists.opendatahub.io


--

Ricardo Martinelli De Oliveira

Senior Software Engineer, AI Managed Services

Red Hat Brazil

Av. Brigadeiro Faria Lima, 3900

8th floor

rmartine@redhat.com    T: +551135426125    
M: +5511970696531