Starting with Pig 0.10, you can pass PigStorage the ‘-schema’ argument when storing data. This creates a ‘.pig_schema’ file in the output directory, a JSON file containing the schema of the stored relation.
store B into 'output' using PigStorage('\t', '-schema');
The next time you load ‘output’, you only need to give LOAD its location; no AS clause is required, because PigStorage reads the schema from the ‘.pig_schema’ file.
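As a minimal sketch of the reload step, assuming ‘output’ was written earlier by a separate script using the store statement above, the load might look like this. The field name `name` is hypothetical and stands in for whatever fields the stored relation actually had.

```pig
-- Assumes 'output' was written earlier with PigStorage('\t', '-schema').
-- No AS clause: PigStorage picks up the schema from output/.pig_schema.
C = LOAD 'output' USING PigStorage('\t');

-- Should report the stored field names and types.
DESCRIBE C;

-- Fields can then be referenced by name ('name' is a hypothetical field).
D = FOREACH C GENERATE name;
```

The load sits in a separate script from the store because the schema file has to exist already when the LOAD statement is compiled.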
- PigStorage always tries to load the .pig_schema file, unless you explicitly say -noschema.
- If you don’t specify either option, PigStorage will try to load a schema and, if it is missing or unreadable, silently fall back to loading without one (the pre-0.10 behavior).
- If you specify -schema during loading, PigStorage will fail if a schema is not present.
- If you specify -noschema during loading, PigStorage will ignore the .pig_schema file.
- PigStorage will only *store* the schema if you specify -schema.
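The three loading modes in the list above can be sketched side by side. The aliases and the field names in the AS clause are hypothetical, chosen only for illustration.

```pig
-- Default: use .pig_schema if it can be read, otherwise load without a schema.
with_default = LOAD 'output' USING PigStorage('\t');

-- Strict: fail if no schema is present.
with_schema = LOAD 'output' USING PigStorage('\t', '-schema');

-- Ignore .pig_schema and declare the schema by hand (hypothetical fields).
no_schema = LOAD 'output' USING PigStorage('\t', '-noschema')
            AS (name:chararray, age:int);
```

The strict form is probably the safer choice when a missing schema file should stop the job rather than pass unnoticed.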