Description
Here are some examples:
create table test_comment (id1 string comment 'full_\tname1', id2 string comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile;
When executing `show create table test_comment`, the console shows the following content:
createtab_stmt
CREATE TABLE `test_comment`(
`id1` string COMMENT 'full_
`id2` string COMMENT 'full_
`id3` string COMMENT 'full_
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment'
TBLPROPERTIES (
'transient_lastDdlTime'='1513095570')
The output of `desc formatted table` is similar:
col_name data_type comment
# col_name data_type comment
id1 string full_
id2 string full_
id3 string full_
# Detailed Table Information
(ignore)...
When executing `desc extended test_comment`, the problem is more obvious:
col_name data_type comment
id1 string full_
id2 string full_
id3 string full_
Detailed Table Information Table(tableName:test_comment, dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, comment:full_ name1), FieldSchema(name:id2, type:string, comment:full_
The rest of the content appears to be lost.
The content is not really lost; it just cannot be displayed correctly. Hive stores the result in a LazyStruct, and LazyStruct uses '\t' as the field separator, so a tab inside a comment is treated as a field boundary:
// LazyStruct.java#parse()
// Go through all bytes in the byte[]
while (fieldByteEnd <= structByteEnd) {
  if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) {
    // Reached the end of a field?
    if (lastColumnTakesRest && fieldId == fields.length - 1) {
      fieldByteEnd = structByteEnd;
    }
    startPosition[fieldId] = fieldByteBegin;
    fieldId++;
    if (fieldId == fields.length || fieldByteEnd == structByteEnd) {
      // All fields have been parsed, or bytes have been parsed.
      // We need to set the startPosition of fields.length to ensure we
      // can use the same formula to calculate the length of each field.
      // For missing fields, their starting positions will all be the same,
      // which will make their lengths to be -1 and uncheckedGetField will
      // return these fields as NULLs.
      for (int i = fieldId; i <= fields.length; i++) {
        startPosition[i] = fieldByteEnd + 1;
      }
      break;
    }
    fieldByteBegin = fieldByteEnd + 1;
    fieldByteEnd++;
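The truncation can be reproduced outside Hive with a much simplified sketch. The class `TabSplitDemo` and the method `splitFields` below are hypothetical names used only for illustration; they are not part of Hive and only imitate the separator scan, assuming a row of three fields (col_name, data_type, comment) joined by '\t'.

```java
import java.util.Arrays;

// Hypothetical stand-alone demo, not Hive code: it only imitates the idea of
// scanning for a separator and stopping once the expected fields are filled.
class TabSplitDemo {

    // Splits row on separator into exactly fieldCount slots.
    // Text after the last expected field is ignored; missing fields stay null.
    static String[] splitFields(String row, char separator, int fieldCount) {
        String[] fields = new String[fieldCount];
        int begin = 0;
        int fieldId = 0;
        for (int end = 0; end <= row.length() && fieldId < fieldCount; end++) {
            if (end == row.length() || row.charAt(end) == separator) {
                // Reached the end of a field
                fields[fieldId++] = row.substring(begin, end);
                begin = end + 1;
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // The comment 'full_\tname1' contains the same character that
        // separates the three columns of the result row.
        String row = "id1" + "\t" + "string" + "\t" + "full_\tname1";
        System.out.println(Arrays.toString(splitFields(row, '\t', 3)));
        // prints [id1, string, full_]
    }
}
```

In this sketch the comment column ends at the embedded tab, so `name1` never reaches the third field, which matches the `full_` shown in the console output above.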