Using SQream in an HDFS Environment
Configuring an HDFS Environment for the User sqream
This section describes how to configure an HDFS environment for the user sqream and is only relevant for users with an HDFS environment.
To configure an HDFS environment for the user sqream:
Open your bash_profile configuration file for editing:
$ vim /home/sqream/.bash_profile
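The edits to make are the environment variable definitions shown in the sqream_env.sh listing later in this section. If /etc/sqream/sqream_env.sh already exists, a minimal sketch of the addition, assuming that path, is to source it from the profile:
# Load the SQream HDFS environment (assumes /etc/sqream/sqream_env.sh exists)
source /etc/sqream/sqream_env.sh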
Apply the edits by sourcing the file:
$ source /home/sqream/.bash_profile
Check if you can access Hadoop from your machine:
$ hadoop fs -ls hdfs://<hadoop server name or ip>:8020/
Verify that an HDFS environment exists for SQream services:
$ ls -l /etc/sqream/sqream_env.sh
If an HDFS environment does not exist for SQream services, create one (sqream_env.sh):
#!/bin/bash
SQREAM_HOME=/usr/local/sqream
export SQREAM_HOME
export JAVA_HOME=${SQREAM_HOME}/hdfs/jdk
export HADOOP_INSTALL=${SQREAM_HOME}/hdfs/hadoop
export CLASSPATH=`${HADOOP_INSTALL}/bin/hadoop classpath --glob`
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${SQREAM_HOME}/lib:$HADOOP_COMMON_LIB_NATIVE_DIR
PATH=$PATH:$HOME/.local/bin:$HOME/bin:${SQREAM_HOME}/bin/:${JAVA_HOME}/bin:$HADOOP_INSTALL/bin
export PATH
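A hedged follow-up, assuming the path above: set the script's ownership and permissions so that the user sqream and the SQream services can source it, then confirm that it loads cleanly:
$ sudo chown sqream:sqream /etc/sqream/sqream_env.sh
$ sudo chmod 755 /etc/sqream/sqream_env.sh
$ source /etc/sqream/sqream_env.sh && echo $HADOOP_INSTALL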
Authenticating Hadoop Servers that Require Kerberos
If your Hadoop server requires Kerberos authentication, do the following:
Create a principal for the user sqream:
$ kadmin -p root/admin@SQ.COM
$ addprinc sqream@SQ.COM
If you do not know your Kerberos root credentials, connect to the Kerberos server as a root user with ssh and run kadmin.local:
$ kadmin.local
Running kadmin.local does not require a password.
If a password is not required, change the password for the principal sqream@SQ.COM:
$ change_password sqream@SQ.COM
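For reference, a sketch of the full kadmin.local session under the assumptions above (realm SQ.COM, principal sqream@SQ.COM); the prompts are illustrative:
$ sudo kadmin.local
kadmin.local:  addprinc sqream@SQ.COM
kadmin.local:  change_password sqream@SQ.COM
kadmin.local:  quit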
Connect to the Hadoop name node using ssh and navigate to the Cloudera Manager agent's process directory:
$ cd /var/run/cloudera-scm-agent/process
List the contents of the directory, sorted by modification time (most recent last):
$ ls -lrt
Look for a recently updated folder containing the text hdfs.
Navigate into that folder. The following is an example of the correct folder name:
$ cd <number>-hdfs-<something>
This folder should contain a file named hdfs.keytab or another similar .keytab file.
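A hedged one-liner to locate the newest hdfs process folder and its keytab, assuming the Cloudera layout above:
$ d=$(ls -td /var/run/cloudera-scm-agent/process/*-hdfs-* | head -1)
$ ls "$d"/*.keytab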
Copy the .keytab file to the home directory of the user sqream on each remote machine that you plan to use Hadoop from.
Copy the following files to the <sqream folder>/hdfs/hadoop/etc/hadoop/ directory on the SQream server (see the scp sketch after this list):
core-site.xml
hdfs-site.xml
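A sketch of both copy steps using scp, with the placeholder names used in this guide; run it from the name node folder that contains the files:
$ scp hdfs.keytab sqream@<sqream server>:/home/sqream/
$ scp core-site.xml hdfs-site.xml sqream@<sqream server>:<sqream folder>/hdfs/hadoop/etc/hadoop/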
Connect to the SQream server and make the user sqream the owner of the .keytab file, with the correct permissions:
$ sudo chown sqream:sqream /home/sqream/hdfs.keytab
$ sudo chmod 600 /home/sqream/hdfs.keytab
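To confirm the result, list the file; the owner, group, and mode should look like the following (the size and date are illustrative):
$ ls -l /home/sqream/hdfs.keytab
-rw------- 1 sqream sqream 2342 Sep 15 18:03 /home/sqream/hdfs.keytab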
Log in to the SQream server as the user sqream.
Navigate to the home directory and check the name of the Kerberos principal represented by the .keytab file:
$ klist -kt hdfs.keytab
The following is an example of the correct output:
sqream@Host-121 ~ $ klist -kt hdfs.keytab
Keytab name: FILE:hdfs.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
   5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
   5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
   5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
   5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
   5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
   5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
   5 09/15/2020 18:03:05 HTTP/nn1@SQ.COM
   5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
   5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
   5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
   5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
   5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
   5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
   5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
   5 09/15/2020 18:03:05 hdfs/nn1@SQ.COM
The same principal appears once per encryption type stored in the keytab.
Verify that the hdfs service principal hdfs/nn1@SQ.COM is shown in the output above.
Obtain a Kerberos ticket using the keytab:
$ kinit -kt hdfs.keytab hdfs/nn1@SQ.COM
Verify that the ticket was obtained:
$ klist
The following is an example of the correct output:
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: hdfs/nn1@SQ.COM

Valid starting       Expires              Service principal
09/16/2020 13:44:18  09/17/2020 13:44:18  krbtgt/SQ.COM@SQ.COM
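The ticket in the sample output above is valid for 24 hours. A hedged sketch of a cron entry that refreshes it from the keytab; the schedule and file path are assumptions, not part of the original procedure:
# /etc/cron.d/sqream-kinit (hypothetical): refresh the Kerberos ticket every 12 hours
0 */12 * * * sqream /usr/bin/kinit -kt /home/sqream/hdfs.keytab hdfs/nn1@SQ.COM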
List the files located at the defined server name or IP address:
$ hadoop fs -ls hdfs://<hadoop server name or ip>:8020/
Do one of the following:
If the file list is output, continue with Step 18.
If the file list is not output, verify that your environment has been set up correctly.
If any of the following are empty, verify that you followed Step 6 in the Configuring an HDFS Environment for the User sqream section above correctly:
$ echo $JAVA_HOME
$ echo $SQREAM_HOME
$ echo $CLASSPATH
$ echo $HADOOP_COMMON_LIB_NATIVE_DIR
$ echo $LD_LIBRARY_PATH
$ echo $PATH
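If any of the variables are empty, a minimal fix, assuming the profile edits above, is to re-source the profile in the current shell and re-check:
$ source /home/sqream/.bash_profile
$ echo $CLASSPATH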
Verify that you copied the correct keytab file.
Review this procedure to verify that you have followed each step.