  1. Hive
  2. HIVE-27713 Iceberg: metadata location overrides can cause data breach
  3. HIVE-27714

Iceberg: metadata location overrides can cause data breach - handling default locations

Details

    Description

      With the current Iceberg location authorization, one explicit Ranger policy is required for every table to prevent the metadata_location cross-reference exploit: any wildcard-based policy covering a set of tables would allow cross-referencing locations between the tables covered by that wildcard.

      Maintaining one policy per table is nearly impossible in a production environment.

      The proposal is to handle the Iceberg table RWStorage authorization differently when the table is created or altered with its default location, since in that case there is no attempt to cross-reference another table. The conditions for this handling and the two options for implementing it are as follows:

      When?

      • If no custom metadata_location is set/given in the CREATE/ALTER calls
      • If the given metadata_location's path (i.e. without the metadata JSON file name) is the same as the current metadata_location's path in the ALTER calls
      • If the given metadata_location's path set/given in CREATE/ALTER calls is the same as the default location for the table, as derived from the warehouse and/or database locations
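The three conditions above can be sketched as a single check. This is a minimal illustration in Python, not Hive's actual implementation: the function names, the parameters, and the idea of passing in the already-computed default metadata directory are all assumptions made for the example.

```python
from posixpath import dirname, normpath
from typing import Optional, Tuple
from urllib.parse import urlparse

Location = Tuple[str, str, str]  # (scheme, authority, normalized path)


def _norm(location: str) -> Location:
    """Split a location into scheme, authority and a normalized path."""
    parts = urlparse(location)
    return parts.scheme, parts.netloc, normpath(parts.path)


def metadata_dir(metadata_location: str) -> Location:
    """The location of a metadata JSON file without the file name."""
    scheme, authority, path = _norm(metadata_location)
    return scheme, authority, dirname(path)


def uses_default_location(
    requested: Optional[str],   # metadata_location given in CREATE/ALTER, if any
    current: Optional[str],     # current metadata_location (ALTER only)
    default_metadata_dir: str,  # hypothetical: derived from warehouse/database
) -> bool:
    # Condition 1: no custom metadata_location was given.
    if requested is None:
        return True
    requested_dir = metadata_dir(requested)
    # Condition 2: ALTER keeps the directory of the current metadata_location.
    if current is not None and requested_dir == metadata_dir(current):
        return True
    # Condition 3: the directory equals the table's default location.
    return requested_dir == _norm(default_metadata_dir)
```

Comparing scheme and authority along with the path keeps two identical paths on different file systems or buckets from being treated as the same location.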

      What?

      1. Either do not call the RWStorage Authorizer for this case
      2. Or set the location to a constant value that can be easily handled with one single access policy on the Authorizer side

      Pros/Cons:

      • Option-1 would not call the Authorizer, so it would not generate an audit event for these operations at the level of RWStorage policies. Because it omits the authorization step, it would also be more performant.
      • Option-2 would end up in the Authorizer, so it would also generate an audit event. It needs a pre-agreed constant for such cases that can be distinguished from normal custom-location-based authorizations.

      If Option-2 is chosen:

      • The following policy syntax could be used for custom locations: 
        iceberg://mydatabase/mytable/snapshot=/my/custom/location/whatever/* 
      • While the pre-agreed default location constant based policy format could be:
        iceberg://*/*/snapshot=default_location 
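To illustrate how Option-2 lets one wildcard policy cover every default-location table, here is a minimal Python sketch that builds the resource string in the two formats above. The function names are hypothetical, and `fnmatchcase` is only a rough stand-in for the Authorizer's wildcard matching, whose real semantics may differ.

```python
from fnmatch import fnmatchcase

# Pre-agreed constant from the proposal; it must never collide with a real path.
DEFAULT_LOCATION_MARKER = "default_location"


def rwstorage_resource(database: str, table: str,
                       metadata_location: str, is_default: bool) -> str:
    """Build the resource string that is sent to the Authorizer."""
    snapshot = DEFAULT_LOCATION_MARKER if is_default else metadata_location
    return f"iceberg://{database}/{table}/snapshot={snapshot}"


def policy_matches(policy: str, resource: str) -> bool:
    """Stand-in for the Authorizer's wildcard evaluation (assumption)."""
    return fnmatchcase(resource, policy)


# One policy for all tables that use their default location.
DEFAULT_POLICY = "iceberg://*/*/snapshot=default_location"
# A per-table policy for a custom location.
CUSTOM_POLICY = "iceberg://mydatabase/mytable/snapshot=/my/custom/location/whatever/*"
```

With this shape, a table at its default location never exposes a real path to the wildcard policy, so the wildcard cannot be used to authorize a cross-reference into another table's location.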

       

      A new property could also be introduced to decide whether authorization for default locations should be skipped altogether or performed using a constant such as snapshot=default_location. This way each deployment can choose between audit events and the performance gain of skipping the authorization step.
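The switch between the two behaviours could look roughly like the Python sketch below. The property name, its default value of "false", and the function name are all hypothetical. The issue does not define them.

```python
from typing import Mapping, Optional

# Hypothetical property name; the issue only proposes that such a property exist.
SKIP_PROPERTY = "iceberg.authorization.skip.default.location"


def authorization_target(conf: Mapping[str, str], database: str, table: str,
                         metadata_location: str, is_default: bool) -> Optional[str]:
    """Return the resource to authorize, or None when the call is skipped."""
    skip = conf.get(SKIP_PROPERTY, "false").lower() == "true"  # assumed default
    if is_default and skip:
        # Option-1: no Authorizer call, hence no audit event.
        return None
    # Option-2 for default locations; the regular path for custom locations.
    snapshot = "default_location" if is_default else metadata_location
    return f"iceberg://{database}/{table}/snapshot={snapshot}"
```

Custom locations are always authorized regardless of the property, since those are the cases where a cross-reference is possible.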

          People

            ayushtkn Ayush Saxena
            jkovacs@HW Janos Kovacs