  1. Apache IoTDB
  2. IOTDB-41 Refactor FileNode module
  3. IOTDB-90

[discuss] take FileNodeProcessorStore away


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Sprint: File Node Refactor Sprint

    Description

      While each Storage Group (SG) has a FileNodeProcessor in an IoTDB instance, FileNodeProcessorStore is for saving some information about the FileNodeProcessor. 

      I conjecture that using FileNodeProcessorStore is for accelerating the startup process of IoTDB, because it stores:

      private boolean isOverflowed;
      private Map<String, Long> lastUpdateTimeMap;
      private TsFileResource emptyTsFileResource;
      private List<TsFileResource> newFileNodes;
      private int numOfMergeFile;
      private FileNodeProcessorStatus fileNodeProcessorStatus;
      

       Using the above info, we know whether an SG has overflow files, the last update time of each device*, all TsFiles**, and whether the FileNode was in a merge process when the last IoTDB instance was shut down.

       * Last update time != last flush time: the last update time is the max timestamp for a device, and the corresponding data point may still be in the memtable, while the last flush time is the max timestamp for a device on disk (in TsFiles).

       ** The most useful info in each TsFileResource is that it stores the start time and the end time of each device. 
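
To make the distinction in the first note concrete, here is a minimal sketch. The class `DeviceTimeTracker` and its methods are hypothetical and are not IoTDB code. The sketch only illustrates the assumption that the last update time advances on every write, while the last flush time advances only when in-memory data is flushed to disk.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not part of IoTDB.
public class DeviceTimeTracker {
    // Max timestamp written per device (the point may still be in the memtable).
    private final Map<String, Long> lastUpdateTimeMap = new HashMap<>();
    // Max timestamp per device that has reached disk.
    private final Map<String, Long> lastFlushTimeMap = new HashMap<>();

    // A write only moves the last update time forward.
    public void write(String device, long timestamp) {
        lastUpdateTimeMap.merge(device, timestamp, Long::max);
    }

    // A flush persists everything written so far, so the flush times catch up.
    public void flush() {
        lastUpdateTimeMap.forEach(
            (device, time) -> lastFlushTimeMap.merge(device, time, Long::max));
    }

    public long lastUpdateTime(String device) {
        return lastUpdateTimeMap.getOrDefault(device, Long.MIN_VALUE);
    }

    public long lastFlushTime(String device) {
        return lastFlushTimeMap.getOrDefault(device, Long.MIN_VALUE);
    }
}
```

After writing timestamp 10, flushing, and then writing timestamp 20 for the same device, the last update time is 20 while the last flush time is still 10.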

      If we have a ProcessorStore file like the above, we can quickly restore all of this info. However, we can also obtain it cheaply even without such a Store file:

      • isOverflowed: it is true if the corresponding overflow folder contains files.
      • lastUpdateTimeMap: if we must recover all data in the WAL before serving CRUD requests, then this field is unnecessary.
      • emptyTsFileResource and newFileNodes: the most important info (the start time and the end time of each device) can be obtained from each TsFile by just reading its fileMetadata.
      • fileNodeProcessorStatus: this also seems unnecessary if we discard all unfinished processes.
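
The first and third points above could be sketched roughly as follows. Everything here is an assumption for illustration: `StoreFreeRecoverySketch` and `DeviceRange` are hypothetical names, and `DeviceRange` stands in for the per-device start and end times that would be read from each TsFile's fileMetadata (the actual reading of a TsFile is not shown). Taking the per-device maximum end time across files is one reading of how the on-disk times could be rebuilt; replaying the WAL would then be needed to reach the last update time.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical sketch, not IoTDB code.
public class StoreFreeRecoverySketch {

    // Stand-in for one device's time range taken from a single TsFile's metadata.
    public static final class DeviceRange {
        final String device;
        final long startTime;
        final long endTime;

        public DeviceRange(String device, long startTime, long endTime) {
            this.device = device;
            this.startTime = startTime;
            this.endTime = endTime;
        }
    }

    // isOverflowed: true if the overflow folder exists and contains at least one entry.
    public static boolean isOverflowed(Path overflowDir) throws IOException {
        if (!Files.isDirectory(overflowDir)) {
            return false;
        }
        try (Stream<Path> entries = Files.list(overflowDir)) {
            return entries.findAny().isPresent();
        }
    }

    // Rebuild the max on-disk timestamp per device from the ranges of all files.
    public static Map<String, Long> latestEndTimes(List<DeviceRange> ranges) {
        Map<String, Long> result = new HashMap<>();
        for (DeviceRange range : ranges) {
            result.merge(range.device, range.endTime, Long::max);
        }
        return result;
    }
}
```

The directory check costs one directory listing per SG at startup, and the time ranges cost one metadata read per TsFile, which is the trade the description proposes in place of maintaining the Store file on every write.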

      So I think we can try removing this class. I expect this would improve write speed and reduce disk IOPS. But there is no hurry to do so.

      Maybe the class plays other roles that I do not know, so I leave this discussion here.

       


          People

            Assignee: Unassigned
            Reporter: xiangdong Huang (jixuan1989)