Sqoop Java Client
Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
Sqoop successfully graduated from the Incubator in March of 2012 and is now a Top-Level Apache project: More information
Latest stable release is 1.4.4 (download, documentation). Latest cut of Sqoop2 is 1.99.2 (download, documentation).
Here is the Java Client for Apache Sqoop import data from MySql to Hadoop hdfs :)
//Here I am using a table Persons, with columns PersonID and LastName import org.apache.sqoop.client.SqoopClient; import org.apache.sqoop.model.MConnection; import org.apache.sqoop.model.MConnectionForms; import org.apache.sqoop.model.MJob; import org.apache.sqoop.model.MJobForms; import org.apache.sqoop.model.MSubmission; import org.apache.sqoop.validation.Status; /** * @author devan * @date 19-Sep-2013 * @mail msdevanms@gmail.com */ public class SqoopImport { public static void main(String[] args) { String connectionString = "jdbc:mysql://YourMysqlIP:3306/test"; String username = "YourMysqUserName"; String password = "YourMysqlPassword"; String schemaName = "YourMysqlDB"; String tableName = "Persons"; String columns = "PersonID,LastName"; //comma seperated column names String partitionColumn = "PersonID"; String outputDirectory = "/output/Persons"; String url = "http://YourSqoopIP:12000/sqoop/"; SqoopClient client = new SqoopClient(url); //client.setServerUrl(newUrl); //Dummy connection object MConnection newCon = client.newConnection(1); //Get connection and framework forms. Set name for connection MConnectionForms conForms = newCon.getConnectorPart(); MConnectionForms frameworkForms = newCon.getFrameworkPart(); newCon.setName("MyConnection"); //Set connection forms values conForms.getStringInput("connection.connectionString").setValue(connectionString); conForms.getStringInput("connection.jdbcDriver").setValue("com.mysql.jdbc.Driver"); conForms.getStringInput("connection.username").setValue(username); conForms.getStringInput("connection.password").setValue(password); frameworkForms.getIntegerInput("security.maxConnections").setValue(0); Status status = client.createConnection(newCon); if(status.canProceed()) { System.out.println("Created. New Connection ID : " +newCon.getPersistenceId()); } else { System.out.println("Check for status and forms error "); } //Creating dummy job object MJob newjob = client.newJob(newCon.getPersistenceId(), org.apache.sqoop.model.MJob.Type.IMPORT); MJobForms connectorForm = newjob.getConnectorPart(); MJobForms frameworkForm = newjob.getFrameworkPart(); newjob.setName("ImportJob"); //Database configuration connectorForm.getStringInput("table.schemaName").setValue(schemaName); //Input either table name or sql connectorForm.getStringInput("table.tableName").setValue(tableName); //connectorForm.getStringInput("table.sql").setValue("select id,name from table where ${CONDITIONS}"); connectorForm.getStringInput("table.columns").setValue(columns); connectorForm.getStringInput("table.partitionColumn").setValue(partitionColumn); //Set boundary value only if required //connectorForm.getStringInput("table.boundaryQuery").setValue(""); //Output configurations frameworkForm.getEnumInput("output.storageType").setValue("HDFS"); frameworkForm.getEnumInput("output.outputFormat").setValue("TEXT_FILE");//Other option: SEQUENCE_FILE / TEXT_FILE frameworkForm.getStringInput("output.outputDirectory").setValue(outputDirectory); //Job resources frameworkForm.getIntegerInput("throttling.extractors").setValue(1); frameworkForm.getIntegerInput("throttling.loaders").setValue(1); status = client.createJob(newjob); if(status.canProceed()) { System.out.println("New Job ID: "+ newjob.getPersistenceId()); } else { System.out.println("Check for status and forms error "); } //Now Submit the Job MSubmission submission = client.startSubmission(newjob.getPersistenceId()); System.out.println("Status : " + submission.getStatus()); } }
I tried to import data from MYSQL to HDFS through your program but program throws exception on the below line which looks irrelevant exception. Sqoop Server is running properly.
ReplyDeleteMConnection newCon = client.newConnection(1);
Exception: Exception in thread "main" java.lang.IllegalArgumentException: No enum const class org.apache.sqoop.model.MInputType.BOOLEAN
at java.lang.Enum.valueOf(Enum.java:214)
Added my project to github. Here is the link https://github.com/msdevanms/2013.git please check it
ReplyDeleteHave you tried setting the table.sql input instead of pulling the entire set of rows in the table? I tried something like this and it always gave 'unacceptable' status on createJob.
ReplyDeleteselect id,value from table where ${conditions} and groupid=123
Good tutorial. Is there a way to add --enclosed-by and --escape-by properties(or any kind of delimiters) to the above given code sample.
ReplyDeleteThanks in advance
ReplyDeleteMConnection newCon = client.newConnection(1);
Exception in thread "main" java.lang.NullPointerException
at org.apache.sqoop.json.ConnectorBean.restore(ConnectorBean.java:111)
at org.apache.sqoop.client.request.ConnectorRequest.read(ConnectorRequest.java:43)
at org.apache.sqoop.client.request.SqoopRequests.readConnector(SqoopRequests.java:94)
at org.apache.sqoop.client.SqoopClient.retrieveConnector(SqoopClient.java:188)
at org.apache.sqoop.client.SqoopClient.getConnector(SqoopClient.java:140)
at org.apache.sqoop.client.SqoopClient.newConnection(SqoopClient.java:273)
Please help me in this?
DeleteThanks in advance
found any solution? i am also stuck at same error
DeleteIam using url as "http://localhost:12000"
ReplyDeleteSqoopClient client = new SqoopClient(url);
MConnection newCon = client.newConnection(1);
and iam getting this below exception while getting client.newConnection(1)
Exception in thread "main" com.sun.jersey.api.client.UniformInterfaceException: GET http://localhost:12000/v1/connector/1 returned a response status of 404 Not Found
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:676)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:503)
at org.apache.sqoop.client.request.Request.get(Request.java:63)
at org.apache.sqoop.client.request.ConnectorRequest.read(ConnectorRequest.java:38)
at org.apache.sqoop.client.request.SqoopRequests.readConnector(SqoopRequests.java:94)
at org.apache.sqoop.client.SqoopClient.retrieveConnector(SqoopClient.java:188)
at org.apache.sqoop.client.SqoopClient.getConnector(SqoopClient.java:140)
at org.apache.sqoop.client.SqoopClient.newConnection(SqoopClient.java:273)
at com.zoho.mysqlbackup.Sqoop2Client.createConnection(Sqoop2Client.java:24)
at com.zoho.mysqlbackup.Sqoop2Client.main(Sqoop2Client.java:49)
Please try"http://localhost:12000/sqoop/" as url
ReplyDeleteSyed Akram, is it working now ?
ReplyDeleteIts also not working!!
DeleteIam getting same error even i tried with above given url.
DeleteException in thread "main" com.sun.jersey.api.client.UniformInterfaceException: GET http://localhost:12000/sqoop/v1/connector/1 returned a response status of 404 Not Found
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:676)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:503)
at org.apache.sqoop.client.request.Request.get(Request.java:63)
at org.apache.sqoop.client.request.ConnectorRequest.read(ConnectorRequest.java:38)
at org.apache.sqoop.client.request.SqoopRequests.readConnector(SqoopRequests.java:94)
at org.apache.sqoop.client.SqoopClient.retrieveConnector(SqoopClient.java:188)
at org.apache.sqoop.client.SqoopClient.getConnector(SqoopClient.java:140)
at org.apache.sqoop.client.SqoopClient.newConnection(SqoopClient.java:273)
at com.zoho.mysqlbackup.Sqoop2Client.createConnection(Sqoop2Client.java:24)
at com.zoho.mysqlbackup.Sqoop2Client.main(Sqoop2Client.java:49)
try wget http://localhost:12000/sqoop and post the resutl please. I think your sqoop server is not in running mode.
Deletewhich sqoop version you are using ??
Deleteresult of wget http://localhost:12000/sqoop
Deletewget http://localhost:12000/sqoop
--2015-01-05 10:40:26-- http://localhost:12000/sqoop
Resolving localhost... ::1, 127.0.0.1
Connecting to localhost|::1|:12000... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://localhost:12000/sqoop/ [following]
--2015-01-05 10:40:26-- http://localhost:12000/sqoop/
Connecting to localhost|::1|:12000... connected.
HTTP request sent, awaiting response... 200 OK
Length: 889 [text/html]
Saving to: `index.html'
100%[======================================================================================================>] 889 --.-K/s in 0s
2015-01-05 10:40:26 (272 MB/s) - `index.html' saved [889/889]
Iam using latest sqoop2 release 1.99.4
DeleteHi
ReplyDeleteI am getting null pointer at this line
MConnection newCon = client.newConnection(1);
Is there anything that I am missing
i am using this version Sqoop 1.4.6.2.6.0.3-8
ReplyDeletei am using Hearton Works (hdp cluster)
we dont have sqoop ui
how can i get url connection
any solution for this? I am also unable to connect to using sqoop client
ReplyDeleteYour posts are always insightful and encouraging. Thank you for motivating me to take positive actions in my life. Find extra info in this article. Feel the rhythm in Geometry Dash Free, where every level’s obstacles and jumps are perfectly timed to dynamic, energetic music.
ReplyDelete