Spark Installation

Part of a three-article Spark cluster series; the other installments cover Spark installation and Spark cluster setup.
  • Spark: v3.1
  • OS: RHEL8 (Alma / Rocky Linux)

I. Prerequisites

1. R-lang

$ sudo dnf -y install epel-release
$ sudo dnf --enablerepo=powertools install R
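
A quick way to confirm the install succeeded:

$ R --version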

Note: if Java is not already installed, OpenJDK will be pulled in as a dependency.

2. Scala

$ wget https://downloads.lightbend.com/scala/2.12.17/scala-2.12.17.tgz
$ tar xvf scala-2.12.17.tgz
$ sudo mv scala-2.12.17 /usr/lib
$ sudo ln -s /usr/lib/scala-2.12.17 /usr/lib/scala

1) Add to PATH

$ vi ~/.bashrc
export PATH=$PATH:/usr/lib/scala/bin
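
Reload the shell configuration and confirm the interpreter is found:

$ source ~/.bashrc
$ scala -version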

II. Spark

1. download & unpack

v3.1.3
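
If the tarball is not already present, it can be fetched from the Apache archive (the URL below assumes the standard dist layout for 3.1.3):

$ wget https://archive.apache.org/dist/spark/spark-3.1.3/spark-3.1.3-bin-hadoop2.7.tgz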

$ mkdir -p ~/app/spark
$ tar zxvf spark-3.1.3-bin-hadoop2.7.tgz -C ~/app/spark
$ mv ~/app/spark/spark-3.1.3-bin-hadoop2.7 ~/app/spark/3.1.3

2. test

$ cd ~/app/spark/3.1.3
$ ./bin/spark-shell --master local[2]

At the scala> prompt, enter:

for (i <- 1 to 3; j <- 1 to 3 if i != j) print(10 * i + j + "\t")
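
Since the guard skips i == j, this should print 12 13 21 23 31 32, tab-separated.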

To exit the shell, enter:

:q

III. Start the Service

Run (this requires SPARK_HOME, which is set in step 1 below):

$ $SPARK_HOME/sbin/start-master.sh
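
Once the master is up, its web UI listens on port 8080 by default. As a quick sanity check, a worker can be attached on the same host (the spark:// URL below assumes the default master port 7077):

$ curl -s http://localhost:8080 | head
$ $SPARK_HOME/sbin/start-worker.sh spark://$(hostname -f):7077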

1. Add environment variables

$ echo 'export SPARK_HOME=$HOME/app/spark/3.1.3' >> ~/.bash_profile
$ echo 'export PATH=$PATH:$SPARK_HOME/bin' >> ~/.bash_profile
$ source ~/.bash_profile
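
With the variables in place, a quick end-to-end check is to submit the bundled SparkPi example (the jar name below matches the Spark 3.1.3 / Scala 2.12 prebuilt distribution):

$ spark-submit --class org.apache.spark.examples.SparkPi \
    --master local[2] \
    $SPARK_HOME/examples/jars/spark-examples_2.12-3.1.3.jar 10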

2. open ports

$ sudo firewall-cmd --permanent --zone=public --add-port=6066/tcp
$ sudo firewall-cmd --permanent --zone=public --add-port=7077/tcp
$ sudo firewall-cmd --permanent --zone=public --add-port=8080-8088/tcp
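
(Port 6066 is the standalone master's REST submission endpoint, 7077 is its RPC port, and 8080-8088 cover the master and worker web UIs.)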

Apply the changes:

$ sudo firewall-cmd --reload
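
To confirm the ports are now open:

$ sudo firewall-cmd --list-ports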
