Ranger / RANGER-4745

Enhance handling of subAccess authorization in Ranger HDFS plugin

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Ranger
    • Labels: None

    Description

      Some HDFS commands require access to the entire hierarchy of files and directories rooted at the argument passed to the command. Ranger currently authorizes such commands as described below. Examples of such commands are:

      hdfs dfs -count -q -h -v <directory>; hdfs dfs -R <directory>

      HDFS Authorization Interface

      When these commands are invoked, the HDFS Namenode builds a tree of inodes corresponding to <directory> and passes it to the authorizer with a flag indicating that subAccess (access to the directory hierarchy rooted at <directory>) is to be checked.

      Ranger implementation

      For each directory in the hierarchy rooted at <directory>, Ranger checks whether the requested permissions (typically read and execute) are allowed, using only Ranger policies. If any directory on the top-down traversal starting from <directory> does not allow access, the authorization steps performed up to that point are discarded, and the HDFS default authorizer is called to check the access with the same arguments. The default authorizer checks only the HDFS ACLs (and not any Ranger policies) on each directory in the hierarchy to determine the access.
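
The current behavior can be sketched roughly as follows. This is an illustrative model only, not the Ranger plugin code: the function name, the three decision constants, and the `ranger_decision` / `acl_allows` callbacks are hypothetical stand-ins for the Ranger policy engine and the HDFS default authorizer.

```python
from typing import Callable, List

# Hypothetical decision values for a Ranger policy evaluation.
ALLOWED = "ALLOWED"
DENIED = "DENIED"
NOT_DETERMINED = "NOT_DETERMINED"


def check_sub_access_existing(
    directories: List[str],
    ranger_decision: Callable[[str], str],
    acl_allows: Callable[[str], bool],
) -> bool:
    """Model of the existing subAccess check.

    directories: the hierarchy rooted at the command argument, top-down.
    ranger_decision: evaluates Ranger policies only, for one directory.
    acl_allows: stands in for the default authorizer's HDFS ACL check.
    """
    for directory in directories:
        if ranger_decision(directory) != ALLOWED:
            # Ranger work so far is discarded; the default authorizer
            # re-checks the whole hierarchy using HDFS ACLs only.
            return all(acl_allows(d) for d in directories)
    # Every directory was allowed by Ranger policies alone.
    return True
```

In this model, if Ranger policies allow only a deeper directory such as `/data/a/b` while the HDFS ACLs deny it, the check fails as soon as the root is not allowed by Ranger, because the fallback ignores Ranger policies entirely.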

      Design of new Ranger implementation

      For each directory in the hierarchy rooted at <directory>, the new Ranger design:
      1. Checks whether the requested permissions are allowed, using only Ranger policies.
      2. If the access is denied, the authorization steps performed up to this point are discarded, the HDFS default authorizer is called to check the access with the original set of arguments, and the result of the default authorizer is returned to the Namenode.
      3. Otherwise, if the access is not determined, a new set of arguments is constructed for the directory being processed, and the HDFS default authorizer is called to check the access with the modified set of arguments.
      4. If the default authorizer does not allow the access, that result is returned to the Namenode.
      5. Otherwise, processing continues with the next directory.
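
The five steps above can be sketched as follows. This is a simplified model under stated assumptions, not the actual plugin code: the function name, the decision constants, and the callbacks are hypothetical, and the "modified set of arguments" in step 3 is assumed here to scope the default authorizer's check to the single directory being processed.

```python
from typing import Callable, List

# Hypothetical decision values for a Ranger policy evaluation.
ALLOWED = "ALLOWED"
DENIED = "DENIED"
NOT_DETERMINED = "NOT_DETERMINED"


def check_sub_access_new(
    directories: List[str],
    ranger_decision: Callable[[str], str],
    acl_allows: Callable[[str], bool],
) -> bool:
    """Model of the new subAccess design.

    directories: the hierarchy rooted at the command argument, top-down.
    ranger_decision: evaluates Ranger policies only, for one directory.
    acl_allows: stands in for the default authorizer's HDFS ACL check.
    """
    for directory in directories:
        decision = ranger_decision(directory)  # step 1
        if decision == DENIED:
            # Step 2: discard prior work and defer to the default
            # authorizer with the original arguments (whole hierarchy).
            return all(acl_allows(d) for d in directories)
        if decision == NOT_DETERMINED:
            # Step 3: default authorizer, scoped to this directory
            # (assumed meaning of the modified arguments).
            if not acl_allows(directory):
                return False  # step 4: denial goes back to the Namenode
        # Step 5: allowed by Ranger or by ACLs; continue with the next one.
    return True
```

Under this model, a hierarchy in which some directories are allowed only by Ranger policies and the others only by HDFS ACLs can be authorized, because each directory is decided individually.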

      Performance considerations

      The new implementation may have some impact on performance. A few cases are as follows.
      1. There is a Ranger policy that allows the requested permissions recursively on some directory in the hierarchy. The number of directories for which access evaluation is requested depends on how deep this directory is in the hierarchy: the higher the directory, the fewer the evaluations. In the existing implementation, short-circuiting of Ranger policy evaluation will, in general, happen earlier, and the default authorizer will be called to handle the authorization.
      2. In the worst case, if there is no Ranger policy for any directory in the hierarchy, then each directory in the hierarchy will be a target of access evaluation by Ranger and by the default authorizer (if the HDFS ACLs for each directory allow the requested accesses).
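
To make the worst case concrete, the sketch below counts Ranger policy evaluations and default-authorizer invocations for both designs. It is a hypothetical counting model, not a measurement: the function and constant names are illustrative, and it assumes the HDFS ACLs allow access everywhere so that traversal never stops on a default-authorizer denial.

```python
from typing import Callable, Dict, List

# Hypothetical decision values for a Ranger policy evaluation.
ALLOWED = "ALLOWED"
DENIED = "DENIED"
NOT_DETERMINED = "NOT_DETERMINED"


def count_calls(
    directories: List[str],
    ranger_decision: Callable[[str], str],
) -> Dict[str, Dict[str, int]]:
    """Count calls made by each design, assuming ACLs always allow."""
    existing = {"ranger_evaluations": 0, "default_authorizer_calls": 0}
    for directory in directories:
        existing["ranger_evaluations"] += 1
        if ranger_decision(directory) != ALLOWED:
            # One fallback call covering the whole hierarchy, then stop.
            existing["default_authorizer_calls"] += 1
            break

    new = {"ranger_evaluations": 0, "default_authorizer_calls": 0}
    for directory in directories:
        new["ranger_evaluations"] += 1
        decision = ranger_decision(directory)
        if decision == DENIED:
            # One fallback call with the original arguments, then stop.
            new["default_authorizer_calls"] += 1
            break
        if decision == NOT_DETERMINED:
            # One per-directory call to the default authorizer.
            new["default_authorizer_calls"] += 1

    return {"existing": existing, "new": new}
```

With no Ranger policy on any of N directories (every decision is `NOT_DETERMINED`), this model gives one Ranger evaluation and one default-authorizer call for the existing design, versus N of each for the new design.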

          People

            abhayk Abhay Kulkarni