While installing Hadoop on a cluster, I ran into the following problem.

All the nodes came up and worked fine, except that the secondary namenode failed in doCheckpoint with a puzzling HTTP 403 error.

    // secondary namenode log
    2011-10-24 17:09:12,255 INFO org.apache.hadoop.security.UserGroupInformation: Initiating re-login for hadoop/hz169-92.i.site.com@I.SITE.COM
    2011-10-24 17:09:22,917 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
    2011-10-24 17:09:22,918 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.io.IOException: Server returned HTTP response code: 403 for URL: https://hz169-91.i.site.com:50475/getimage?getimage=1
    ...

My first suspicion was a Kerberos authentication problem, but the secondary namenode had already passed Kerberos authentication.

Next I suspected that the namenode was refusing the secondary namenode's service request, but the namenode log showed the request was authenticated and authorized. (hadoop/hz169-92.i.site.com@I.SITE.COM is the secondary namenode's Kerberos principal; hadoop/hz169-91.i.site.com@I.SITE.COM is the namenode's.)

    // namenode log
    2011-10-25 11:24:33,927 WARN org.apache.hadoop.hdfs.server.namenode.GetImageServlet: Received non-NN/SNN request for image or edits from 123.58.169.92
    2011-10-25 11:27:40,033 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successfull for hadoop/hz169-92.i.site.com@I.SITE.COM
    2011-10-25 11:27:40,100 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successfull for hadoop/hz169-92.i.site.com@I.SITE.COM for protocol=interface org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol
    2011-10-25 11:27:40,101 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 123.58.169.92
    ...

To make testing easier, shorten the checkpoint period (in seconds):

    // hdfs-site.xml
    <property>
      <name>fs.checkpoint.period</name>
      <value>5</value>
    </property>

Then came all sorts of suspicions, guesses, and attempts, none of which panned out.

There is plenty of Hadoop material online, but very little on securing Hadoop with Kerberos.

So I decided to be self-sufficient: find the failing code, add logging, and start the analysis from doGet:

The code was downloaded from here. A gripe about Cloudera: the cdh3u1 source is really hard to find; I only got it by guessing from the directory structure.

      // org.apache.hadoop.hdfs.server.namenode.GetImageServlet.java L52
      public void doGet(final HttpServletRequest request,
                        final HttpServletResponse response
                        ) throws ServletException, IOException {
        Map<String, String[]> pmap = request.getParameterMap();
        try {
          ServletContext context = getServletContext();
          final FSImage nnImage = (FSImage)context.getAttribute("name.system.image");
          final TransferFsImage ff = new TransferFsImage(pmap, request, response);
          final Configuration conf = (Configuration)getServletContext().getAttribute(JspHelper.CURRENT_CONF);

          if(UserGroupInformation.isSecurityEnabled() &&
              !isValidRequestor(request.getRemoteUser(), conf)) {
            // This WARN line is the one that showed up in the namenode log, which means
            // both conditions in the if above are true:
            // isSecurityEnabled() == true and isValidRequestor() == false.
            // The former is probably fine (log its value to confirm); focus on the latter.
            response.sendError(HttpServletResponse.SC_FORBIDDEN,
                "Only Namenode and Secondary Namenode may access this servlet");
            LOG.warn("Received non-NN/SNN request for image or edits from "
                + request.getRemoteHost());
            return;
          }

      // org.apache.hadoop.hdfs.server.namenode.GetImageServlet.java L126
      private boolean isValidRequestor(String remoteUser, Configuration conf)
          throws IOException {
        if(remoteUser == null) { // This really shouldn't happen...
          // This log line was not printed, so execution never entered this if block
          LOG.warn("Received null remoteUser while authorizing access to getImage servlet");
          return false;
        }

        String[] validRequestors = {
            SecurityUtil.getServerPrincipal(conf
                .get(DFS_NAMENODE_KRB_HTTPS_USER_NAME_KEY), NameNode.getAddress(
                conf).getHostName()),
            SecurityUtil.getServerPrincipal(conf.get(DFS_NAMENODE_USER_NAME_KEY),
                NameNode.getAddress(conf).getHostName()),
            SecurityUtil.getServerPrincipal(conf
                .get(DFS_SECONDARY_NAMENODE_KRB_HTTPS_USER_NAME_KEY),
                SecondaryNameNode.getHttpAddress(conf).getHostName()),
            SecurityUtil.getServerPrincipal(conf
                .get(DFS_SECONDARY_NAMENODE_USER_NAME_KEY), SecondaryNameNode
                .getHttpAddress(conf).getHostName()) };

        // Add a log line here to print the value of remoteUser
        for(String v : validRequestors) {
          // Add a log line here to print each value of v
          if(v != null && v.equals(remoteUser)) {
            // When the failure occurs this method must be returning false, so execution
            // never enters this if block; i.e. v.equals(remoteUser) is false
            // (v == null is unlikely)
            if(LOG.isDebugEnabled()) LOG.debug("isValidRequestor is allowing: " + remoteUser);
            return true;
          }
        }
        if(LOG.isDebugEnabled()) LOG.debug("isValidRequestor is rejecting: " + remoteUser);
        return false;
      }

After compiling, simply replace the .class files inside hadoop-core-0.20.2-cdh3u1.jar. The three files to replace:

    org/apache/hadoop/hdfs/server/namenode/GetImageServlet$1$1.class
    org/apache/hadoop/hdfs/server/namenode/GetImageServlet$1.class
    org/apache/hadoop/hdfs/server/namenode/GetImageServlet.class

The log output:

    // namenode log
    2011-10-25 15:53:33,927 WARN org.apache.hadoop.hdfs.server.namenode.GetImageServlet: Received non-NN/SNN request for image or edits from 123.58.169.92
    2011-10-25 15:53:38,969 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successfull for hadoop/hz169-92.i.site.com@I.SITE.COM
    2011-10-25 15:53:39,067 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successfull for hadoop/hz169-92.i.site.com@I.SITE.COM for protocol=interface org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol
    2011-10-25 15:53:39,068 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 123.58.169.92
    2011-10-25 15:53:39,083 WARN org.apache.hadoop.hdfs.server.namenode.GetImageServlet: *********** RemoteUser is hadoop/hz169-92.i.site.com@I.SITE.COM
    2011-10-25 15:53:49,296 WARN org.apache.hadoop.hdfs.server.namenode.GetImageServlet: ******** validRequestors = hadoop/hz169-91.i.site.com@I.SITE.COM
    2011-10-25 15:53:49,297 WARN org.apache.hadoop.hdfs.server.namenode.GetImageServlet: ******** validRequestors = hadoop/hz169-91.i.site.com@I.SITE.COM
    2011-10-25 15:53:49,297 WARN org.apache.hadoop.hdfs.server.namenode.GetImageServlet: ******** validRequestors = host/hz169-91.i.site.com@I.SITE.COM
    2011-10-25 15:53:49,297 WARN org.apache.hadoop.hdfs.server.namenode.GetImageServlet: ******** validRequestors = hadoop/hz169-91.i.site.com@I.SITE.COM
    2011-10-25 15:53:49,298 WARN org.apache.hadoop.hdfs.server.namenode.GetImageServlet: Received non-NN/SNN request for image or edits from 123.58.169.92

Clearly, remoteUser (hadoop/hz169-92.i.site.com@I.SITE.COM) does not match any validRequestor (hadoop/hz169-91.i.site.com@I.SITE.COM).
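The mismatch can be reproduced in isolation with the same matching logic. Below is a minimal standalone sketch (class and method names are mine; the principal strings are copied from the namenode log above):

```java
public class RequestorCheckDemo {
    // Same matching logic as GetImageServlet.isValidRequestor(): the remote
    // user must exactly equal one of the precomputed valid principals.
    static boolean isValidRequestor(String remoteUser, String[] validRequestors) {
        if (remoteUser == null) {
            return false;
        }
        for (String v : validRequestors) {
            if (v != null && v.equals(remoteUser)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The SNN authenticates as hz169-92, but every validRequestor entry
        // carries the *namenode's* hostname (hz169-91), so nothing matches.
        String remoteUser = "hadoop/hz169-92.i.site.com@I.SITE.COM";
        String[] validRequestors = {
            "hadoop/hz169-91.i.site.com@I.SITE.COM",
            "hadoop/hz169-91.i.site.com@I.SITE.COM",
            "host/hz169-91.i.site.com@I.SITE.COM",
            "hadoop/hz169-91.i.site.com@I.SITE.COM",
        };
        System.out.println(isValidRequestor(remoteUser, validRequestors)); // prints false
    }
}
```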

That's when I remembered how the principals are set in hdfs-site.xml:

    // hdfs-site.xml
    <property>
      <name>dfs.secondary.namenode.kerberos.principal</name>
      <value>hadoop/_HOST@I.SITE.COM</value>
    </property>
    <property>
      <name>dfs.secondary.namenode.kerberos.https.principal</name>
      <value>host/_HOST@I.SITE.COM</value>
    </property>

The _HOST expansion must be going wrong. I changed _HOST to hz169-92.i.site.com, restarted, and the problem was gone!

The problem was fixed, but why did the expansion go wrong? After all, when the secondary namenode started, Kerberos authentication succeeded with login user hadoop/hz169-92.i.site.com@I.SITE.COM, so at that point _HOST must have been expanded correctly.

Back to the code:

      // org.apache.hadoop.hdfs.server.namenode.GetImageServlet.java L142
      // The hostname value comes from getHttpAddress()
      SecurityUtil.getServerPrincipal(conf
          .get(DFS_SECONDARY_NAMENODE_USER_NAME_KEY), SecondaryNameNode
          .getHttpAddress(conf).getHostName()) };

      // org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.java L154
      /*
       * Handle the transition from pairs of attributes specifying a host and port
       * to a single colon separated one.
       */
      // This code builds a host:port string from the configuration, i.e. an
      // InetSocketAddress instance. With none of these properties set, the log
      // I added around the call above shows
      // getHttpAddress().getHostName() == "0.0.0.0",
      // i.e. this method returns 0.0.0.0:port.
      public static InetSocketAddress getHttpAddress(Configuration conf) {
        String infoAddr = NetUtils.getServerAddress(conf,
            "dfs.secondary.info.bindAddress", "dfs.secondary.info.port",
            // This one looks familiar...
            "dfs.secondary.http.address");
        return NetUtils.createSocketAddr(infoAddr);
      }
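This part is easy to confirm without Hadoop: java.net.InetSocketAddress (which, per the code above, is what createSocketAddr produces) keeps a literal IP like "0.0.0.0" as its host name. A minimal sketch, assuming the default secondary HTTP address of 0.0.0.0:50090:

```java
import java.net.InetSocketAddress;

public class BindAddrDemo {
    public static void main(String[] args) {
        // With dfs.secondary.http.address unset, the address defaults to the
        // wildcard 0.0.0.0:50090; getHostName() returns the literal string,
        // no reverse DNS lookup happens for an IP-literal host.
        InetSocketAddress addr = new InetSocketAddress("0.0.0.0", 50090);
        System.out.println(addr.getHostName()); // prints 0.0.0.0
    }
}
```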

So what does SecurityUtil.getServerPrincipal() do when handed 0.0.0.0?

      // org.apache.hadoop.security.SecurityUtil.java L128
      public static String getServerPrincipal(String principalConfig,
          String hostname) throws IOException {
        String[] components = getComponents(principalConfig);
        if (components == null || components.length != 3
            || !components[1].equals(HOSTNAME_PATTERN)) {
          return principalConfig;
        } else {
          // Execution goes into this branch
          return replacePattern(components, hostname);
        }
      }

      // org.apache.hadoop.security.SecurityUtil.java L174
      private static String replacePattern(String[] components, String hostname)
          throws IOException {
        String fqdn = hostname;
        if (fqdn == null || fqdn.equals("") || fqdn.equals("0.0.0.0")) {
          // Magic happens here!
          // If the hostname is 0.0.0.0, it is replaced here
          fqdn = getLocalHostName();
        }
        return components[0] + "/" + fqdn + "@" + components[2];
      }

      // This method returns the local hostname of whatever machine it runs on.
      // When the failure occurred, this code was running on the namenode, so the
      // secondary namenode's hostname got replaced with the namenode's hostname,
      // which matches the log output. It also explains why the hostname was correct
      // when the secondary namenode started: that code ran on the secondary
      // namenode itself.
      static String getLocalHostName() throws UnknownHostException {
        return InetAddress.getLocalHost().getCanonicalHostName();
      }

OK. Configure dfs.secondary.http.address, restore the principal instance (hostname) to _HOST, restart, and the problem is solved!

    // hdfs-site.xml
    <property>
      <name>dfs.secondary.http.address</name>
      <value>hz169-92.i.site.com:50090</value>
      <description>
        The secondary namenode http server address and port.
        If the port is 0 then the server will start on a free port.
      </description>
    </property>