Let’s discuss on enabling Hive Metastore NULL result for an empty string.
In Hive, while inserting values, if some columns have empty strings and you want to display it as NULL when queried the table.
From the below image, you can see the empty string are displayed just like that. And this property applies to only String column type.
To display it as a NULL value, like the below image, you have to set 2 properties.
- Hive Metastore Server (Hive Metastore Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml) from Cloudera manager
- Table properties
Hive Metastore Property
Cloudera manager -> Hive -> Configuration -> Hive Metastore Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml
<property>
<name>hive.metastore.orm.retrieveMapNullsAsEmptyStrings</name>
<value>true</value>
</property>
In Cloudera Manager 5.10, you have good UI to set those values, see below image.
If you set those value to False, this will not work.
After setting NULLAsEmptyStringsMetastore, property, you have to restart Hive service to take effect.
Once you have enabled property in Hive Metastore. Let’s create a table.
Table creation:
CREATE TABLE display_null (first_name string, second_name string);
Here we are creating the table with 2 string column.
Insert Values:
INSERT INTO TABLE display_null VALUES (“Boopathi”, “”), (“”, “Boopathi”);
we have insterted 2 rows with empty string on each.
Display as an empty string:
TABLE PROPERTIES:
ALTER TABLE display_null SET TBLPROPERTIES(‘serialization.null.format’=”);
setting “serialization.null.format“
Display result as NULL :
Table description:
DESCRIBE FORMATTED display_null;
you will see the properties in the below image.
Serialization property to set null for the empty string is working only for the hive table with TEXTFILE format or RCFILE format. Do we have any configuration to set empty sting as Null for the Avro/Parquet or ORC file format?
Please suggest..