A Look at Hadoop and the SASL Security Framework


1 Starting from a Hadoop SASL exception in a data synchronization job

A data synchronization job that used DataX to copy data from an RDBMS into an HDFS file system with Kerberos authentication enabled failed during execution. The core error is that the client hit "Exception in createBlockOutputStream javax.security.sasl.SaslException: DIGEST-MD5: No common protection layer between client and server" when creating a BlockOutputStream against each DataNode; as a result every DataNode was excluded, the file write failed, and the job exited with an error. The detailed error log is as follows:

## Error log: error when creating a BlockOutputStream against one DataNode
Jul 06, 2023 2:45:59 PM org.apache.hadoop.security.UserGroupInformation loginUserFromKeytab
## Kerberos authentication itself succeeded, as shown here
INFO: Login successful for user hs_dap@TDH using keytab file /etc/hs_dap.keytab
Jul 06, 2023 2:46:00 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer createBlockOutputStream
INFO: Exception in createBlockOutputStream
javax.security.sasl.SaslException: DIGEST-MD5: No common protection layer between client and server
 at com.sun.security.sasl.digest.DigestMD5Client.checkQopSupport(DigestMD5Client.java:418)
 at com.sun.security.sasl.digest.DigestMD5Client.evaluateChallenge(DigestMD5Client.java:221)
 at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslParticipant.evaluateChallengeOrResponse(SaslParticipant.java:113)
 at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:452)
 at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getSaslStreams(SaslDataTransferClient.java:391)
 at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:263)
 at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
 at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1289)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
Jul 06, 2023 2:46:00 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Abandoning BP-26777269-10.2.43.201-1686811609596:blk_1073808907_68083
Jul 06, 2023 2:46:00 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Excluding datanode DatanodeInfoWithStorage[10.2.43.203:50010,DS-be4ebd27-0a08-4240-8a71-4355b25aa04b,DISK]
## Error log: creating a BlockOutputStream failed against every DataNode and every DataNode was excluded, so the client ultimately failed to write the file
Jul 06, 2023 2:46:00 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Abandoning BP-26777269-10.2.43.201-1686811609596:blk_1073808909_68085
Jul 06, 2023 2:46:00 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Excluding datanode DatanodeInfoWithStorage[10.2.43.201:50010,DS-cff0b0d1-88da-4a64-b9f1-ca5696d8d2c5,DISK]
Jul 06, 2023 2:46:00 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer run
WARNING: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hundsun/dap_pxqs/hive/hs_ods/xxx/part_date=20230704__f6414321_134d_41a4_8fa2_baaa2f89a6be/node1__1dcd5a30_7b67_4c3f_a269_583ba389945c could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
 at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1621)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3224)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3148)
 at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:722)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:493)
 at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2225)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2221)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2197)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2219)
 at org.apache.hadoop.ipc.Client.call(Client.java:1476)
 at org.apache.hadoop.ipc.Client.call(Client.java:1407)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
 at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
 at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1430)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1226)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)

2 The Hadoop SASL error: root cause analysis and solution

  • The root cause of the error "Exception in createBlockOutputStream javax.security.sasl.SaslException: DIGEST-MD5: No common protection layer between client and server at com.sun.security.sasl.digest.DigestMD5Client.checkQopSupport(DigestMD5Client.java:418)" is actually stated in the message itself: "No common protection layer between client and server". In the initial phase of establishing a connection, the SASL client and the SASL server negotiate a QoP (quality of protection); because the QoP values supported by the client and the server have no common element, the QoP negotiation fails, the connection cannot be created, and the job fails;
  • The Hadoop client's (SASL client's) settings for SASL-related parameters such as hadoop.rpc.protection/dfs.data.transfer.protection should follow the namenode/datanode (the SASL server); the two sides must have at least one value in common in order to negotiate a QoP successfully and establish a connection;
  • The Hadoop client (SASL client) can set hadoop.rpc.protection/dfs.data.transfer.protection to exactly the same values as the server;
  • The Hadoop client (SASL client) can also set hadoop.rpc.protection/dfs.data.transfer.protection to a comma-separated list of multiple values, in which case the client and the server negotiate the concrete quality of protection (QoP); a minimal sketch of this intersection-based negotiation is shown after the commands below;
  • Some Hadoop servers may configure only hadoop.rpc.protection and not dfs.data.transfer.protection; in that case the client's setting for the former should still follow the server, while the latter can be set to any valid value or list of valid values;
  • If the server-side parameters hadoop.rpc.protection/dfs.data.transfer.protection need to be changed, the new configuration must be deployed and HDFS restarted afterwards;
  • For a successful SASL authentication, both the client and server need to negotiate and come to agreement on a quality of protection. The Hadoop configuration properties hadoop.rpc.protection/dfs.data.transfer.protection support a comma-separated list of values that map to the SASL QoP values. Both the client end and the server end must have at least one value in common in that list;
  • The SASL QoP related configuration values on the Hadoop server side can be queried with the following commands:
- grep -C 2 'dfs.data.transfer.protection' /etc/hdfs1/conf/hdfs-site.xml;
- grep -C 2 'hadoop.rpc.protection' /etc/hdfs1/conf/core-site.xml;
- hdfs getconf -confKey "dfs.data.transfer.protection" ;
- hdfs getconf -confKey "hadoop.rpc.protection" ;
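To make the negotiation requirement concrete, the following is a minimal, self-contained Java sketch. It is not Hadoop's internal implementation; it only mimics the mapping from the hadoop.rpc.protection/dfs.data.transfer.protection values to the SASL QoP tokens (auth, auth-int, auth-conf) and the intersection check that DIGEST-MD5 performs, which is exactly what fails with "No common protection layer between client and server" when the client and server lists do not overlap:

## Sketch: mapping protection values to SASL QoP tokens and checking for a common value (illustrative only)
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class QopNegotiationSketch {

    // Hadoop maps: authentication -> auth, integrity -> auth-int, privacy -> auth-conf
    static String toSaslQop(String protection) {
        switch (protection.trim()) {
            case "authentication": return "auth";
            case "integrity":      return "auth-int";
            case "privacy":        return "auth-conf";
            default: throw new IllegalArgumentException("unknown protection value: " + protection);
        }
    }

    static List<String> toSaslQopList(String commaSeparated) {
        List<String> qops = new ArrayList<>();
        for (String p : commaSeparated.split(",")) {
            qops.add(toSaslQop(p));
        }
        return qops;
    }

    // Client preferences and server offers must share at least one QoP value, otherwise
    // DIGEST-MD5 fails with "No common protection layer between client and server".
    static String negotiate(String clientProtection, String serverProtection) {
        Set<String> serverQops = new LinkedHashSet<>(toSaslQopList(serverProtection));
        for (String clientQop : toSaslQopList(clientProtection)) {
            if (serverQops.contains(clientQop)) {
                return clientQop; // first common value wins
            }
        }
        throw new IllegalStateException("No common protection layer between client and server");
    }

    public static void main(String[] args) {
        // Client offers all three values, server requires privacy: succeeds with auth-conf.
        System.out.println(negotiate("authentication,integrity,privacy", "privacy"));
        // Client only allows authentication, server requires privacy: fails like the job in section 1.
        System.out.println(negotiate("authentication", "privacy"));
    }
}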



For a DataX job specifically, the job's hadoopConfig parameter needs to be configured; under the hood, DataX parses this parameter and applies it to the org.apache.hadoop.conf.Configuration object:

## Option 1: configure single values, matching the server side:
"hadoopConfig":{
    "dfs.data.transfer.protection":"integrity",
    "hadoop.rpc.protection":"authentication"
}
## Option 2: configure comma-separated lists:
"hadoopConfig" : {
  "hadoop.rpc.protection" : "authentication,integrity,privacy",
  "dfs.data.transfer.protection" : "authentication,integrity,privacy"
}
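For reference, the snippet below is a minimal sketch of how such hadoopConfig entries take effect on the client side: they are applied to an org.apache.hadoop.conf.Configuration object, the Kerberos login is performed from the keytab, and a file write then triggers the RPC and data transfer SASL handshakes. It is not the actual DataX HDFS writer code; the NameNode URI and output path are placeholders (the principal and keytab path are the ones shown in the log above):

## Sketch: applying the SASL settings to a Configuration and writing to HDFS (URI and output path are placeholders)
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class HdfsSaslClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The same keys a hadoopConfig block like the one above would set:
        conf.set("hadoop.security.authentication", "kerberos");
        conf.set("hadoop.rpc.protection", "authentication,integrity,privacy");
        conf.set("dfs.data.transfer.protection", "authentication,integrity,privacy");

        // Kerberos login from keytab, as in the job log ("Login successful for user ...").
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("hs_dap@TDH", "/etc/hs_dap.keytab");

        // Writing a file performs the RPC SASL handshake with the NameNode and the
        // data transfer SASL handshake with the DataNodes that failed in section 1.
        FileSystem fs = FileSystem.get(new URI("hdfs://namenode-host:8020"), conf);
        try (FSDataOutputStream out = fs.create(new Path("/tmp/sasl_test.txt"))) {
            out.writeBytes("hello sasl\n");
        }
        fs.close();
    }
}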




3 The Hadoop SASL error: related configuration in datago

  • The company's data synchronization tool datago is essentially an enhancement of open-source DataX in terms of functionality and usability;
  • On the see installation UI, datago exposes the Hadoop native SASL parameters dfs.data.transfer.protection/hadoop.rpc.protection; in addition to the values natively supported by Hadoop (authentication/integrity/privacy), both parameters can also be set to nosasl;



  • Reading the datago source code shows that when dfs.data.transfer.protection/hadoop.rpc.protection is set to nosasl, datago does not pass the corresponding parameter through to DataX's hadoopConfig parameter, so it is not passed on to the org.apache.hadoop.conf.Configuration object;
  • Reading the datago source code also shows that when dfs.data.transfer.protection/hadoop.rpc.protection is set to a value other than nosasl, datago does pass the corresponding parameter through to DataX's hadoopConfig parameter, and hence on to the org.apache.hadoop.conf.Configuration object (a hypothetical sketch of this pass-through logic follows):
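The datago source is not reproduced in this article, so the following is only a hypothetical sketch of the pass-through behavior described above; the class and method names are invented for illustration:

## Sketch: hypothetical pass-through logic (names invented for illustration)
import java.util.HashMap;
import java.util.Map;

public class DatagoSaslPassThroughSketch {

    // Only values other than "nosasl" are forwarded into DataX's hadoopConfig map.
    static Map<String, String> buildHadoopConfig(String rpcProtection, String dataTransferProtection) {
        Map<String, String> hadoopConfig = new HashMap<>();
        if (!"nosasl".equalsIgnoreCase(rpcProtection)) {
            hadoopConfig.put("hadoop.rpc.protection", rpcProtection);
        }
        if (!"nosasl".equalsIgnoreCase(dataTransferProtection)) {
            hadoopConfig.put("dfs.data.transfer.protection", dataTransferProtection);
        }
        // The resulting map would later be applied to org.apache.hadoop.conf.Configuration by the writer plugin.
        return hadoopConfig;
    }

    public static void main(String[] args) {
        System.out.println(buildHadoopConfig("authentication", "integrity"));
        System.out.println(buildHadoopConfig("nosasl", "nosasl")); // empty map: nothing is passed through
    }
}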



4 Background on Hadoop and SASL

  • Hadoop RPC communication is built on the SASL framework; the parameter hadoop.rpc.protection configures the quality of protection (QoP) level of RPC communication;
  • The DataNode data transfer protocol does not use the Hadoop RPC framework;
  • Before Hadoop 2.6.0, the security of DataNode data transfer relied on root privileges and privileged ports (Hadoop assumed that an attacker could not obtain root on the DataNode host): the DataNode was started through the jsvc utility, which first starts as root and binds the privileged ports, and only then switches to the unprivileged user specified by HDFS_DATANODE_SECURE_USER to run the DataNode;
  • Since Hadoop 2.6.0, the DataNode data transfer protocol supports SASL with multiple SASL QoP modes: the parameter dfs.data.transfer.protection specifies the concrete QoP (when configuring the client's QoP, follow the DataNode server side; the two must have an intersection); when dfs.data.transfer.protection is not configured, DataNode data transfer security still uses the mechanism based on root privileges, privileged ports and jsvc;
  • Since Hadoop 2.6.0, the parameter dfs.encrypt.data.transfer can also be set to true to enable encrypted transfer when reading/writing HDFS block data; this parameter takes precedence over dfs.data.transfer.protection;
  • The configurable SASL QoP values are as follows (applicable to hadoop.rpc.protection/dfs.data.transfer.protection):
  • authentication: authentication only;
  • integrity: integrity check in addition to authentication;
  • privacy: data encryption in addition to integrity.



  • In Hadoop, the principles and methods for configuring the SASL client and the SASL server can be summarized as follows:
  • The Hadoop client's settings for hadoop.rpc.protection/dfs.data.transfer.protection should follow the server side; the client can simply be configured identically to the server;
  • The Hadoop client can also set hadoop.rpc.protection/dfs.data.transfer.protection to a comma-separated list of multiple values, in which case the client and the server negotiate the concrete quality of protection (QoP);
  • The parameter dfs.encrypt.data.transfer can also be set to true to enable encrypted transfer when reading/writing HDFS block data; this parameter takes precedence over dfs.data.transfer.protection;
  • Hadoop RPC framework: to encrypt data that is transferred between Hadoop services and clients, set hadoop.rpc.protection to privacy in core-site.xml;
  • The DataNode data transfer protocol does not use the Hadoop RPC framework; to activate data encryption for the data transfer protocol of DataNode, set dfs.encrypt.data.transfer to true in hdfs-site.xml (you can optionally set dfs.encrypt.data.transfer.algorithm / dfs.encrypt.data.transfer.cipher.suites / dfs.encrypt.data.transfer.cipher.key.bitlength; a short configuration sketch follows this list);
  • Data encryption on HTTP: data transfers between the web console and clients, such as HttpFS and WebHDFS, are protected using SSL (HTTPS);
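As a minimal sketch of the encryption-related settings above (the concrete values are examples only and should be verified against your cluster and Hadoop version), the entries can be expressed programmatically on an org.apache.hadoop.conf.Configuration object; on a real cluster they live in core-site.xml and hdfs-site.xml on the server side:

## Sketch: wire-encryption related settings (example values only)
import org.apache.hadoop.conf.Configuration;

public class WireEncryptionConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // core-site.xml: encrypt Hadoop RPC traffic between services and clients.
        conf.set("hadoop.rpc.protection", "privacy");

        // hdfs-site.xml: encrypt block data on the DataNode data transfer protocol;
        // when true, this supersedes dfs.data.transfer.protection.
        conf.setBoolean("dfs.encrypt.data.transfer", true);

        // Optional tuning keys mentioned above (values here are examples only).
        conf.set("dfs.encrypt.data.transfer.algorithm", "3des");
        conf.set("dfs.encrypt.data.transfer.cipher.suites", "AES/CTR/NoPadding");
        conf.setInt("dfs.encrypt.data.transfer.cipher.key.bitlength", 128);

        System.out.println("dfs.encrypt.data.transfer = " + conf.getBoolean("dfs.encrypt.data.transfer", false));
    }
}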



5 Summary of Hadoop SASL related parameters

## Related parameters:
- hadoop.security.authentication: defaults to simple. Possible values are simple (no authentication) and kerberos;
- hadoop.rpc.protection: defaults to authentication. A comma-separated list of protection values for secured SASL connections. Possible values are authentication, integrity and privacy. authentication means authentication only and no integrity or privacy; integrity implies that authentication and integrity are enabled; and privacy implies that all of authentication, integrity and privacy are enabled. hadoop.security.saslproperties.resolver.class can be used to override hadoop.rpc.protection for a connection at the server side. The data transferred between Hadoop services and clients can be encrypted on the wire; setting hadoop.rpc.protection to privacy in core-site.xml activates data encryption;
- dfs.encrypt.data.transfer: defaults to false. Only needs to be set on the NameNode and DataNodes; clients will deduce it. Controls whether actual block data read from/written to HDFS should be encrypted on the wire. It is possible to override this setting per connection by specifying custom logic via dfs.trustedchannel.resolver.class. You need to set dfs.encrypt.data.transfer to true in hdfs-site.xml in order to activate data encryption for the data transfer protocol of DataNode;
- dfs.data.transfer.protection: unspecified by default. Setting this property enables SASL for authentication of the data transfer protocol. If it is enabled, then dfs.datanode.address must use a non-privileged port, dfs.http.policy must be set to HTTPS_ONLY, and the HDFS_DATANODE_SECURE_USER environment variable must be undefined when starting the DataNode process. A comma-separated list of SASL protection values used for secured connections to the DataNode when reading or writing block data; possible values are authentication, integrity and privacy. If dfs.encrypt.data.transfer is set to true, it supersedes dfs.data.transfer.protection and enforces that all connections must use a specialized encrypted SASL handshake. This property is ignored for connections to a DataNode listening on a privileged port; in that case, it is assumed that the use of a privileged port establishes sufficient trust;
## References:
https://docs.oracle.com/javase/8/docs/technotes/guides/security/sasl/sasl-refguide.html#DEBUG
https://issues.apache.org/jira/browse/HADOOP-10211
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SecureMode.html
https://commons.apache.org/proper/commons-daemon/jsvc.html