Remote access to HDFS on Kubernetes

By : minh
Date : November 29 2020, 01:01 AM
I know this question is only about getting HDFS to run in a dev environment, but HDFS is very much a work in progress on K8s, so I wouldn't by any means run it in production (as of this writing). It's quite tricky to get it working on a container orchestration system because:
- You are talking about a lot of data and a lot of nodes (namenodes/datanodes) that are not meant to start and stop in different places in your cluster.
- You risk a constantly unbalanced cluster if you don't pin your namenodes/datanodes to specific K8s nodes (which defeats the purpose of having a container orchestration system); a rough sketch of the pinning idea follows below.
- If you run your namenodes in HA mode and they die and restart for any reason, you risk corrupting the namenode metadata, which would mean losing all your data. This is also risky with a single namenode that isn't pinned to a K8s node.
- You can't scale up and down easily without ending up with an unbalanced cluster, and an unbalanced cluster defeats one of the main purposes of HDFS.
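A minimal sketch of the pinning idea, assuming a hypothetical datanode StatefulSet named hdfs-datanode-1 and a worker node named worker-1 (neither name comes from the question): label the node, then restrict the workload to it with a nodeSelector.
code :
# Label the K8s node that should permanently host this datanode (names are hypothetical)
kubectl label node worker-1 hdfs/datanode=dn-1

# Pin the datanode workload to that node via a nodeSelector
kubectl patch statefulset hdfs-datanode-1 --type merge \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"hdfs/datanode":"dn-1"}}}}}'
The same constraint can also be written declaratively as a nodeSelector or nodeAffinity block in the manifest itself.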
HDFS access from remote host through Java API, user authentication

By : AS.DEV
Date : March 29 2020, 07:55 AM
After some studying I came to the following solution:
I don't actually need the full Kerberos setup; for now it is enough that clients can run HDFS requests as any user, and the environment itself is considered secure. That gives me a solution based on the Hadoop UserGroupInformation class, which I can later extend to support Kerberos.
code :
package org.myorg;

import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileStatus;

public class HdfsTest {

    public static void main(String args[]) {

        try {
            // Act as the "hbase" user without Kerberos credentials
            UserGroupInformation ugi
                = UserGroupInformation.createRemoteUser("hbase");

            // Run all HDFS calls under that user
            ugi.doAs(new PrivilegedExceptionAction<Void>() {

                public Void run() throws Exception {

                    Configuration conf = new Configuration();
                    conf.set("fs.defaultFS", "hdfs://1.2.3.4:8020/user/hbase");
                    conf.set("hadoop.job.ugi", "hbase");

                    FileSystem fs = FileSystem.get(conf);

                    // Create a file, then list the directory to verify access
                    fs.createNewFile(new Path("/user/hbase/test"));

                    FileStatus[] status = fs.listStatus(new Path("/user/hbase"));
                    for (int i = 0; i < status.length; i++) {
                        System.out.println(status[i].getPath());
                    }
                    return null;
                }
            });
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
# Alternatively, override the client-side user from the shell with HADOOP_USER_NAME
HADOOP_USER_NAME=hdfs hdfs dfs -put /root/MyHadoop/file1.txt /
How to copy files from HDFS to remote HDFS

By : Rene Dohan
Date : March 29 2020, 07:55 AM
I want to copy files from my Hadoop cluster to a remote cluster. The typical way to copy between clusters is distcp:
code :
$ hadoop distcp hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo
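If the copy has to be re-run later, distcp's -update flag transfers only the files that differ from what is already on the destination (the paths below are the same placeholders as above):
code :
$ hadoop distcp -update hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo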
kubernetes remote access dashboard

By : user3432061
Date : March 29 2020, 07:55 AM
I built a UI tool to help you forward any service, including the dashboard, to your local machine.
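Absent the tool, the same kind of forwarding can be done with plain kubectl port-forward (the namespace and service names below assume a standard dashboard install and may differ in your cluster):
code :
# Forward the dashboard Service to https://localhost:8443 (names assume the stock dashboard install)
kubectl -n kubernetes-dashboard port-forward service/kubernetes-dashboard 8443:443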
Hdfs with Kerberos cannot access from remote server

By : suresh
Date : March 29 2020, 07:55 AM
Finally, I found the reason: when Kerberos uses AES-256 encryption, you have to install the JCE unlimited strength policy files. I had installed JCE on the machines inside the HDFS cluster, but I didn't realize that the client machine outside the cluster needs it as well. That is why I could access HDFS from machines inside the cluster but not from the machine outside it.
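A quick generic check (not from the original answer) for whether a given JRE already allows 256-bit AES is to query the maximum allowed key length:
code :
# Prints 2147483647 when unlimited strength crypto is enabled, typically 128 when the JCE policy files are missing
jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'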
Externally Access Hadoop HDFS deployed in Kubernetes

By : Many Hong
Date : March 29 2020, 07:55 AM
I would suggest trying externalIPs.
Suppose your datanode is listening on port 50000. You can create a separate Service for every datanode and use the IP of the K8s node it runs on as the externalIP, something like this:
code :
apiVersion: v1
kind: Service
metadata:
  name: datanode-1
spec:
  externalIPs:
  - node1-ip
  ports:
  - name: datanode
    port: 50000
  selector:
    app: datanode
    id: "1"
---
apiVersion: v1
kind: Service
metadata:
  name: datanode-2
spec:
  externalIPs:
  - node2-ip
  ports:
  - name: datanode
    port: 50000
  selector:
    app: datanode
    id: "2"
---
apiVersion: v1
kind: Service
metadata:
  name: datanode-3
spec:
  externalIPs:
  - node3-ip
  ports:
  - name: datanode
    port: 50000
  selector:
    app: datanode
    id: "3"