Tuesday, 24 June 2014

Create a Sqoop Action in Oozie Using the Hue Web UI


1. Copy the DB driver into Oozie's share lib
sudo -u oozie mkdir -p /tmp
sudo -u oozie mv sqljdbc4-3.0.jar /tmp
sudo -u oozie hadoop fs -put /tmp/sqljdbc4-3.0.jar /user/oozie/share/lib/sqoop

Alternatively, put sqljdbc.jar in $AppPath/lib, next to the jars used by the Pig script.

2. Error: Error parsing arguments for import

This happens because Hue submits the script through an Oozie Sqoop action, which has its own way of specifying arguments.
The Sqoop command can be specified either with the command element or with multiple arg elements.
With the command element, Oozie splits the command on every space into multiple arguments.
With arg elements, Oozie passes each value to Sqoop as a single argument.
Use the arg variant when a single argument contains spaces,
for example --query "select * from A"
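
To see why a quoted query breaks under the command element, here is a minimal shell sketch that mimics splitting on every space. The command string is a made-up example, and the shell's word splitting only stands in for the behaviour described above; it is not Oozie's actual code.

```shell
#!/bin/sh
# Hypothetical command string, as it might appear inside a <command> element.
cmd='import --query "select * from A"'

# Split on whitespace only. The double quotes are ordinary characters here,
# so they do not keep the query together.
set -f          # disable globbing so the * in the query is not expanded
set -- $cmd     # unquoted expansion performs the split
set +f

split_count=$#
third=$3

# Print each resulting argument on its own line.
for a in "$@"; do
    printf '[%s]\n' "$a"
done
# Prints:
# [import]
# [--query]
# ["select]
# [*]
# [from]
# [A"]
```

The query ends up as four separate arguments, which is the kind of input that makes Sqoop report "Error parsing arguments for import".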

1. Argument way
Params Value:
import
--connect
jdbc:sqlserver://xxxxxx
--username
hadoopuser
--password
mypwd (no quotes)
--query
"select * from A where $CONDITIONS"
--fields-terminated-by
\t
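
If you would rather edit workflow.xml by hand than type the values into Hue, the same list maps to one arg element per value. The sketch below prints those elements from a shell argument list. emit_args is a hypothetical helper, not part of Oozie or Hue, and it assumes the double quotes around the query are only there to delimit the value, so they are left out of the arg element.

```shell
#!/bin/sh
# emit_args (hypothetical helper): print one <arg> element per shell
# argument, escaping the characters XML treats as markup.
emit_args() {
    for a in "$@"; do
        esc=$(printf '%s' "$a" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')
        printf '    <arg>%s</arg>\n' "$esc"
    done
}

# Single quotes keep $CONDITIONS and \t literal in the shell.
args_xml=$(emit_args import \
    --connect 'jdbc:sqlserver://xxxxxx' \
    --username hadoopuser \
    --password mypwd \
    --query 'select * from A where $CONDITIONS' \
    --fields-terminated-by '\t')

printf '%s\n' "$args_xml"
```

Each shell argument becomes exactly one arg element, so the query stays a single argument even though it contains spaces.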

2. Command way
<command>import --connect jdbc:sqlserver://xxxx --username hadoopuser --password mypwd</command>
<arg>--query</arg>
<arg>"select * from A where $CONDITIONS"</arg>

3. --options-file way
<arg>import</arg>
<arg>--options-file</arg>
<arg>${optionFile}</arg>
<file>${optionFile}</file>

In the job.properties file:
optionFile=option.par

option.par is in the same folder as workflow.xml.
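
As an illustration, option.par could be created like this. This is a sketch under assumptions: the connection values are the placeholders used earlier in this post, APP_DIR is a hypothetical variable pointing at the local copy of the workflow folder, and the layout follows the Sqoop options-file convention of one option or value per line.

```shell
#!/bin/sh
# APP_DIR is a hypothetical variable; it defaults to the current directory.
app_dir=${APP_DIR:-.}

# The quoted heredoc delimiter keeps $CONDITIONS literal.
cat > "$app_dir/option.par" <<'EOF'
# One option or value per line; lines starting with # are comments.
--connect
jdbc:sqlserver://xxxxxx
--username
hadoopuser
--password
mypwd
--query
"select * from A where $CONDITIONS"
EOF
```

The tool name (import) is not in the file because the workflow already passes it as the first arg. The file still has to be uploaded to HDFS next to workflow.xml before the job is submitted.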


4. Importing workflow.xml into Hue loses the argument tags
  • The "argument" tags are not imported
  • The "exec" and "file" tags are imported, but their strings are added incorrectly

This is a bug in Hue and will be fixed in Hue 2.5.


Reference:

https://oozie.apache.org/docs/4.0.1/DG_SqoopActionExtension.html
https://issues.cloudera.org/browse/HUE-1321

https://gist.github.com/tmusabbir/7013556
http://grokbase.com/t/cloudera/cdh-user/1334q57427/issue-with-sqoop-options-file-option-in-oozie
