Hive stores a list of partitions for each table in its metastore. If, however, new partitions are directly added to HDFS (manually by hadoop fs -put command), the metastore will not be aware of these partitions.
you need to add partition for every partition
ALTER TABLE <tablename> ADD PARTITION
Or in short you can run
MSCK REPAIR TABLE <tablename>; available since Hive 0.11
It will add any partitions that exist on HDFS but not in metastore to the metastore.
- Copy the partition folders and data to a table folder.
- Create a table with dynamic partitions on the table folder.
- Run MCSK Repair on the new table.
For example,
CREATE TABLE IF NOT EXISTS new_table ( host STRING, agent STRING ) PARTITIONED BY ( year INT, month INT ) ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat" OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat" LOCATION '/user/me/path/table_folder'; MSCK REPAIR TABLE new_table;
MSCK REPAIR is
Reference:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RecoverPartitions(MSCKREPAIRTABLE)
http://www-01.ibm.com/support/knowledgecenter/#!/SSPT3X_3.0.0/
com.ibm.swg.im.infosphere.biginsights.commsql.doc/doc/biga_msckrep.html
http://stackoverflow.com/questions/26753602/hive-0-13-msck-repair-table-only-lists-partitions-not-in-metastore
No comments:
Post a Comment