Hello,
Regarding the question about the Spark Thrift Server requiring a Spark
cluster: initially, we used the Spark operator created by the RADAnalytics
team (https://radanalytics.io/) to implement the Data Catalog architecture.
RADAnalytics takes the approach of long-running Spark clusters, since that
was thought to be something customers would potentially look for. On the
other hand, there is the Spark operator created by Google, whose approach
is app-centric: you only need to manage Spark applications, and they use
the Kubernetes scheduler to provision the Spark cluster. The drawback of
this approach is that the clusters are ephemeral by design, so it needs
other components, such as the Spark History Server, to track all Spark
application executions.
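As a rough illustration of the app-centric model, here is a minimal Python sketch that only assembles a spark-submit command line for Spark's native Kubernetes support, following the pattern in the "Running on Kubernetes" documentation. It is plain spark-submit, not the operator's own API, and it does not run anything. The helper name build_spark_submit_args, the API server URL, and the container image are hypothetical placeholders; SparkPi and the jar path are the documentation's sample application.

```python
from typing import Dict, List


def build_spark_submit_args(
    api_server: str,
    app_name: str,
    main_class: str,
    app_jar: str,
    conf: Dict[str, str],
) -> List[str]:
    """Assemble a spark-submit argument list that targets a Kubernetes master.

    Hypothetical helper for illustration: it only builds the command,
    it does not submit anything.
    """
    args = [
        "spark-submit",
        "--master", f"k8s://{api_server}",  # k8s:// prefix selects the Kubernetes scheduler
        "--deploy-mode", "cluster",         # the driver runs in a pod on the cluster
        "--name", app_name,
        "--class", main_class,
    ]
    # Sort the settings so the generated command is deterministic.
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app_jar)
    return args


if __name__ == "__main__":
    command = build_spark_submit_args(
        api_server="https://kubernetes.example.com:6443",  # placeholder API server
        app_name="spark-pi",
        main_class="org.apache.spark.examples.SparkPi",
        app_jar="local:///path/to/examples.jar",
        conf={
            "spark.executor.instances": "2",
            "spark.kubernetes.container.image": "example/spark:3.0.3",  # placeholder image
        },
    )
    print(" ".join(command))
```

The driver and executor pods created this way exist only for the lifetime of the application, which is why a separate component is needed to keep the execution history.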
Switching the Spark Thrift Server to the Kubernetes scheduler was on our
radar, but we decided to focus our development efforts on Trino because of
some performance issues we found with the Spark Thrift Server, as well as
its lack of a data security implementation and other gaps that block our
path to a full Data Governance story. Finally, we have not yet decided
whether we still want to maintain a Spark operator in ODH, since Kubeflow
already does that and RADAnalytics is in maintenance status.
On Tue, Oct 12, 2021 at 10:35 AM Ki Dong Lee <kidlee(a)redhat.com> wrote:
Hi everyone,
I am new to ODH.
It seems that all the components in ODH will be deployed with kubeflow
operator and kustomize manifests.
Could you tell me how to deploy such components in detail on OCP in ODH?
Another question is about Spark on Kubernetes. I have noticed that in ODH,
if you want to deploy the Spark Thrift Server as HiveServer2, a Spark
cluster needs to be deployed on OCP beforehand.
I think there is a way to submit the Spark Thrift Server directly on
Kubernetes/OCP
<https://spark.apache.org/docs/3.0.3/running-on-kubernetes.html> without
having a Spark cluster deployed on OCP.
Is there any reason for requiring the cluster?
Cheers,
- Kidong Lee.
--
Ricardo Martinelli De Oliveira
Senior Software Engineer, AI Managed Services
Red Hat Brazil <https://www.redhat.com/>
Av. Brigadeiro Faria Lima, 3900
8th floor
rmartine(a)redhat.com T: +551135426125
M: +5511970696531