Apache Drill / DRILL-7001

Documentation - renaming of column names in CSV headers


Details

    • Wish
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • 1.15.0
    • None
    • None

    Description

      I don't know if this is the best place for this request, but:

      When querying a csvh file (CSV with header), some operations are applied that can change the column names.
      These operations are not documented.
      Although it is possible to read HeaderBuilder.java, it would be useful to add a section to the documentation that explains at least the principle behind these cases, to avoid needless problems and difficulties.

      List of operations (possibly not exhaustive):

      • Column names are trimmed (leading and trailing whitespace is removed)
         Name , Age,PoB  , Info
        =>
        `Name`, `Age`, `PoB` and `Info`
      • Characters other than [a-zA-Z0-9_] are replaced by '_' (underscore)
        Name,Sum$,em@il
        =>
        `Name`, `Sum_`, `em_il`
      • Field names starting with '_' (underscore) are prefixed with 'col'
        _name,_age_,pob_,_col_
        =>
        `col_name`, `col_age_`, `pob_`, `col_col_`
      • Field names starting with a character outside [a-zA-Z] are prefixed with 'col_'
        0_name, 1_age,@pob,#other1,'other2'
        =>
        `col_0_name`, `col_1_age`, `col_pob`, `col_other1`, `col_other2_`
      • Quotation marks are removed
      • If the name is a single character:
        • if it matches [a-zA-Z], it is left unchanged
        • else if it matches [0-9], it is prefixed with 'col_'
        • otherwise, it is renamed to column_[0-9]+, where [0-9]+ is the position of the column
      • Duplicate column names (case-insensitive) are suffixed with _[0-9]+ (starting from "_2")
        0_name,col_0_name,colx,COLX,colx,colx_2
        =>
        `col_0_name`, `col_0_name_2`, `colx`, `COLX_2`, `colx_3`, `colx_2_2`
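
      To make the rules above easier to follow, here is a minimal Python sketch that applies them in the order listed. It is only an illustration of the behaviour described in this issue, not a port of HeaderBuilder.java. The function names `clean_name` and `rename_headers` are hypothetical. It also relies on a few assumptions that the list above does not settle: only double quotes are removed (single quotes are replaced by '_', which matches the `col_other2_` example), column positions are counted from 1, and an empty name is treated like a single invalid character.

```python
import re


def clean_name(raw, position):
    """Apply the per-column rules listed above to one raw header field.

    `position` is the column index, assumed here to start at 1.
    """
    name = raw.strip()                          # trim surrounding whitespace
    name = name.replace('"', "")                # assumption: only double quotes are removed
    name = re.sub(r"[^a-zA-Z0-9_]", "_", name)  # any other character becomes '_'

    # Single-character names (and, by assumption, empty names).
    if len(name) <= 1:
        if re.fullmatch(r"[a-zA-Z]", name):
            return name
        if re.fullmatch(r"[0-9]", name):
            return "col_" + name
        return f"column_{position}"

    if name.startswith("_"):
        return "col" + name                     # '_name' -> 'col_name'
    if not re.match(r"[a-zA-Z]", name):
        return "col_" + name                    # '0_name' -> 'col_0_name'
    return name


def rename_headers(raw_names):
    """Clean every name, then de-duplicate case-insensitively with _2, _3, ..."""
    seen = set()
    result = []
    for position, raw in enumerate(raw_names, start=1):
        name = clean_name(raw, position)
        candidate, suffix = name, 2
        while candidate.lower() in seen:
            candidate = f"{name}_{suffix}"
            suffix += 1
        seen.add(candidate.lower())
        result.append(candidate)
    return result


print(rename_headers("0_name,col_0_name,colx,COLX,colx,colx_2".split(",")))
# ['col_0_name', 'col_0_name_2', 'colx', 'COLX_2', 'colx_3', 'colx_2_2']
```

      Under these assumptions the sketch reproduces every example given in the list, which suggests the rules are applied per column first (trim, replace, prefix) and duplicates are resolved afterwards.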

       


          People

            Assignee: Unassigned
            Reporter: benj
