Friday, November 29, 2013

About the book "Instant Apache Sqoop"

Instant Apache Sqoop 


Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. Sqoop successfully graduated from the Incubator in March of 2012 and is now a Top-Level Apache project.

Developers often need to import data from SQL databases into Hadoop HDFS, Hive, or HBase, and Sqoop is the best tool for the job. “Instant Apache Sqoop” describes how to use Sqoop; the title is accurate and self-describing. The introduction is good and informative. Even a layman can use Sqoop with this book, since the author, Ankit Jain, keeps things simple. The book covers almost every aspect of Sqoop: imports to HDFS, Hive, and HBase, as well as exports.

The book is well illustrated, and a “How it works” section is included in each and every part. This helped me a lot to understand what Sqoop does behind the scenes. I previously knew nothing about the various connectors Sqoop supports; this book helped me find out about them:

  • MySQL
  • Oracle
  • SQL Server
  • PostgreSQL
  • DB2
  • HSQLDB

The important thing is that I learned about “Incremental Import”. Incremental import means importing only the new versions of records, or the latest inserted records, from an RDBMS table into HDFS. I think that is a very good option in Sqoop.
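To give a flavour of what an incremental import looks like on the Sqoop command line (my own sketch, not an excerpt from the book; the connection details, table, and column names are made up):

sqoop import \
  --connect jdbc:mysql://dbhost:3306/test \
  --username myuser --password mypass \
  --table Persons \
  --incremental append \
  --check-column PersonID \
  --last-value 0 \
  --target-dir /output/Persons

On the next run, Sqoop imports only the rows whose PersonID is greater than the last value it saw.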

I couldn't find a Sqoop client (Sqoop Java client) in this book; I expected that as well. But, as the name implies, it is Instant.

So, my friends, if you need a quick start on Sqoop, “Instant Apache Sqoop” will help you.

All code used in this book is available at http://www.PacktPub.com .
You can buy this book from : 



Monday, October 21, 2013

HTML File input tag for specific file type


FILE UPLOAD

Sometimes we may need to specify the exact file type for upload/select.
This is one way to handle that.

Using a file extension (here .csv is used; you can use any extension, such as .jar or .ppt):
<input type="file" name="pic" id="pic" accept=".csv" />

Using a file (MIME) type:
<input type="file" name="pic" id="pic" accept="text/plain" />
You can use this link to find the full list of media types: http://www.iana.org/assignments/media-types

For selecting video files only
<input type="file" name="pic" id="pic" accept="video/*" />


For selecting audio files only
<input type="file" name="pic" id="pic" accept="audio/*" />

For Excel 2010 files (.xlsx)
<input type="file" accept="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" />


Here is the Working Example : http://jsfiddle.net/LzLcZ/371/

Thursday, October 17, 2013

Tomcat – java.lang.OutOfMemoryError

The Apache Tomcat server may throw this error. To fix it, you need to increase the JVM memory settings in the catalina.sh file: -Xmx controls the maximum heap size (for 'Java heap space' errors), and -XX:MaxPermSize controls the permanent generation (for 'PermGen space' errors).

1. Locate catalina.sh in the bin directory (e.g. apache-tomcat-7.0.22/bin/catalina.sh).
2. Edit it and add these options:


JAVA_OPTS="-Djava.awt.headless=true -Dfile.encoding=UTF-8 
-server -Xms1536m -Xmx1536m
-XX:NewSize=256m -XX:MaxNewSize=256m -XX:PermSize=256m 
-XX:MaxPermSize=256m -XX:+DisableExplicitGC"

3. Save file and restart the server.

That's it.
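To verify the options were picked up after the restart, you can inspect the running JVM's arguments (a quick sanity check; the grep pattern assumes the heap size set above):

ps -ef | grep java | grep Xmx1536m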




Partial example of the catalina.sh file
#   JSSE_HOME       (Optional) May point at your Java Secure Sockets Extension
#                   (JSSE) installation, whose JAR files will be added to the
#                   system class path used to start Tomcat.
#
#   CATALINA_PID    (Optional) Path of the file which should contains the pid
#                   of catalina startup java process, when start (fork) is used
#
# $Id: catalina.sh 609438 2008-01-06 22:14:28Z markt $
# ----------------------------------------------------------------------------
 
JAVA_OPTS="-Djava.awt.headless=true -Dfile.encoding=UTF-8 -server -Xms1536m 
-Xmx1536m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:PermSize=256m 
-XX:MaxPermSize=256m -XX:+DisableExplicitGC"
 
 
# OS specific support.  $var _must_ be set to either true or false.
cygwin=false
os400=false
darwin=false
case "`uname`" in
CYGWIN*) cygwin=true;;
OS400*) os400=true;;
Darwin*) darwin=true;;
esac
 
# resolve links - $0 may be a softlink
PRG="$0"


Wednesday, October 2, 2013

String array Intersect







import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/**
 * @author devan
 * @date 03-Oct-2013
 * @email devanms@am.amrita.edu
 */
public class ArrayIntersect {
 public static void main(String[] args) {

  String[] strings1 = { "a", "b", "c", "f" };
  String[] strings2 = { "b", "c", "d", "f" };
  // LinkedHashSet preserves the element order of strings1
  Set<String> set = new LinkedHashSet<String>(Arrays.asList(strings1));
  // retainAll() keeps only the elements that also appear in strings2
  set.retainAll(Arrays.asList(strings2));
  System.out.println(set);
 /*
  String[] stringIntersect = set.toArray(new String[0]); // For storing the result in another array
  System.out.println(Arrays.toString(stringIntersect));
 */
 }
}


Output
======
[b, c, f]

Monday, September 30, 2013

Get json object data without knowing the keys :)

Dynamic Json Object Keyset Finding



import java.util.Iterator;
import java.util.Set;

import org.json.simple.JSONObject;

public class Json {
 public static void main(String[] args) {
  JSONObject jsonObject = new JSONObject();
  jsonObject.put("a", "aaa");
  jsonObject.put("b", "bbb");

  // JSONObject is map-backed, so we can iterate over its key set
  // without knowing the keys in advance
  Set keys = jsonObject.keySet();
  Iterator a = keys.iterator();
  while (a.hasNext()) {
   String key = (String) a.next();
   String value = (String) jsonObject.get(key);
   System.out.print("key : " + key);
   System.out.println(" value :" + value);
  }
 }
}
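The output should look like the following; note that the key order is not guaranteed, since JSONObject is backed by a HashMap.

Output
======
key : a value :aaa
key : b value :bbb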

Wednesday, September 18, 2013

Sqoop Java Client


Apache Sqoop

Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
Sqoop successfully graduated from the Incubator in March of 2012 and is now a Top-Level Apache project.
The latest stable release is 1.4.4 (download, documentation). The latest cut of Sqoop2 is 1.99.2 (download, documentation).

Here is a Java client for Apache Sqoop (Sqoop2) that imports data from MySQL into Hadoop HDFS :)

//Here I am using a table Persons, with columns PersonID and LastName
import org.apache.sqoop.client.SqoopClient;
import org.apache.sqoop.model.MConnection;
import org.apache.sqoop.model.MConnectionForms;
import org.apache.sqoop.model.MJob;
import org.apache.sqoop.model.MJobForms;
import org.apache.sqoop.model.MSubmission;
import org.apache.sqoop.validation.Status;

/**
 * @author  devan
 * @date 19-Sep-2013
 * @mail msdevanms@gmail.com
 */

public class SqoopImport {
 public static void main(String[] args) {
  
  
  String connectionString = "jdbc:mysql://YourMysqlIP:3306/test";
  String username = "YourMysqUserName";
  String password = "YourMysqlPassword";
  String schemaName = "YourMysqlDB";
  String tableName = "Persons";
  String columns = "PersonID,LastName"; //comma separated column names
  String partitionColumn = "PersonID";
  String outputDirectory = "/output/Persons";
  String url = "http://YourSqoopIP:12000/sqoop/";

  
  SqoopClient client = new SqoopClient(url);
  //client.setServerUrl(newUrl);
  //Dummy connection object (1 = connector ID; on a default install this is the JDBC connector)
  MConnection newCon = client.newConnection(1);

  //Get connection and framework forms. Set name for connection
  MConnectionForms conForms = newCon.getConnectorPart();
  MConnectionForms frameworkForms = newCon.getFrameworkPart();
  newCon.setName("MyConnection");

  //Set connection forms values
  conForms.getStringInput("connection.connectionString").setValue(connectionString);
  conForms.getStringInput("connection.jdbcDriver").setValue("com.mysql.jdbc.Driver");
  conForms.getStringInput("connection.username").setValue(username);
  conForms.getStringInput("connection.password").setValue(password);

  frameworkForms.getIntegerInput("security.maxConnections").setValue(0);

  Status status  = client.createConnection(newCon);
  if(status.canProceed()) {
   System.out.println("Created. New Connection ID : " +newCon.getPersistenceId());
  } else {
   System.out.println("Check for status and forms error ");
  }

  //Creating dummy job object
  MJob newjob = client.newJob(newCon.getPersistenceId(), org.apache.sqoop.model.MJob.Type.IMPORT);
  MJobForms connectorForm = newjob.getConnectorPart();
  MJobForms frameworkForm = newjob.getFrameworkPart();

  newjob.setName("ImportJob");
  //Database configuration
  connectorForm.getStringInput("table.schemaName").setValue(schemaName);
  //Input either table name or sql
  connectorForm.getStringInput("table.tableName").setValue(tableName);
  //connectorForm.getStringInput("table.sql").setValue("select id,name from table where ${CONDITIONS}");
  
  
  connectorForm.getStringInput("table.columns").setValue(columns);
  connectorForm.getStringInput("table.partitionColumn").setValue(partitionColumn);
  
  //Set boundary value only if required
  //connectorForm.getStringInput("table.boundaryQuery").setValue("");

  //Output configurations
  frameworkForm.getEnumInput("output.storageType").setValue("HDFS");
  frameworkForm.getEnumInput("output.outputFormat").setValue("TEXT_FILE");//Other option: SEQUENCE_FILE / TEXT_FILE
  frameworkForm.getStringInput("output.outputDirectory").setValue(outputDirectory);
  //Job resources
  frameworkForm.getIntegerInput("throttling.extractors").setValue(1);
  frameworkForm.getIntegerInput("throttling.loaders").setValue(1);

  status = client.createJob(newjob);
  if(status.canProceed()) {
   System.out.println("New Job ID: "+ newjob.getPersistenceId());
  } else {
   System.out.println("Check for status and forms error ");
  }
  //Now Submit the Job
  MSubmission submission = client.startSubmission(newjob.getPersistenceId());
  System.out.println("Status : " + submission.getStatus());
 
 }

 
}
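To compile this client you need the Sqoop2 client library on the classpath. If you use Maven, a dependency along these lines should work (a sketch; I am assuming the artifact coordinates match the 1.99.2 release mentioned above):

<dependency>
    <groupId>org.apache.sqoop</groupId>
    <artifactId>sqoop-client</artifactId>
    <version>1.99.2</version>
</dependency>

Also make sure the Sqoop2 server is actually running at the URL you pass to SqoopClient (port 12000 in this example).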

Tuesday, September 10, 2013

Save a webpage using JAVA




import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DownloadWebpage {
 public static void main(String[] args) {
  Path path = Paths.get("PATH TO SAVE WEB PAGE"); // Eg: /home/devan/google
  URI u = URI.create("https://www.google.co.in/");
  // try-with-resources closes the stream automatically
  try (InputStream in = u.toURL().openStream()) {
   // Note: Files.copy() fails if the target file already exists;
   // pass StandardCopyOption.REPLACE_EXISTING to overwrite instead
   Files.copy(in, path);
  } catch (Exception e) {
   e.printStackTrace();
  }
 }
}

Thursday, August 22, 2013

/sbin/start-stop-daemon: unable to open pidfile '/var/run/hive/hive-metastore.pid' for writing (No such file or directory)

unable to open pidfile '/var/run/hive/hive-metastore.pid'


Hi friends,

This error ate three hours of my time. Finally I fixed it.

The problem on my side was that the post-install scripts didn't create the directories /var/run/hive and /var/lock/subsys (or they may have been removed :)).

So I just created these directories manually.

sudo mkdir /var/run/hive
sudo mkdir /var/lock/subsys
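If the Hive services run as a dedicated hive user (as in CDH packaging), the new directory must also be writable by that user. This chown is an assumption about your setup, not something the error message itself tells you:

sudo chown hive:hive /var/run/hive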

Saturday, July 27, 2013

Start a program using Linux Commands from Java & get its Process ID

Get Process ID of Linux Process that started using Java (Runtime.getRuntime().exec(command))


Hi my friends, we know how to start a sub-process from Java using Runtime.getRuntime().exec(command). But how do we get its process ID?

Here is the answer,


The java.lang.Process class is abstract, and you actually get a concrete subclass depending on your platform. On Linux you'll get a java.lang.UnixProcess, which has a private field int pid that we can read via reflection:



Process proc = Runtime.getRuntime().exec(command);
// Reflection hack: the "pid" field only exists on UnixProcess, so this is Linux/Unix-only
Field f = proc.getClass().getDeclaredField("pid");
f.setAccessible(true);
System.out.println("Process ID : " + f.get(proc));


Example :


import java.lang.reflect.Field;

public class TestCMD {
 public static void main(String[] args) {
  String cmd = "echo HelloAll";
  int processId = execute(cmd);
  System.out.println("Process ID : " + processId);
 }

 private static int execute(String command) {
  int pid = 0;
  try {
   Process proc = Runtime.getRuntime().exec(command);
   Field f = proc.getClass().getDeclaredField("pid");
   f.setAccessible(true);
   pid = (int) f.get(proc);

  } catch (Exception e) {

   e.printStackTrace();
  }
  return pid;
 }
}

Thursday, July 25, 2013

SERVER SENT EVENTS




Server-sent events (SSE) is a technology by which a browser gets automatic updates from a server over an HTTP connection. The Server-Sent Events EventSource API is standardized as part of HTML5 by the W3C.

Examples: Facebook/Twitter updates, stock price updates, news feeds, sports results, etc.



Web browser support for Server-Sent Events

Browser             Supported   Notes
Internet Explorer   No          [3]
Mozilla Firefox     Yes         Starting with Firefox 6.0 [4]
Google Chrome       Yes         [3]
Opera               Yes         Starting with Opera 11 [3]
Safari              Yes         Starting with Safari 5.0 [3]



SERVER SENT EVENTS IN SPRING MVC

Controller for mapping events

import java.util.Random;

import javax.servlet.http.HttpServletResponse;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
@RequestMapping("sseTest.htm")
public class SseController {

 @RequestMapping(method = RequestMethod.GET)
 public @ResponseBody String sendMessage(HttpServletResponse response) {
  Random r = new Random();
  // SSE responses must be served as text/event-stream
  response.setContentType("text/event-stream;charset=UTF-8");
  try {
   Thread.sleep(10000);
  } catch (InterruptedException e) {
   e.printStackTrace();
  }
  // "retry:" tells the browser how many ms to wait before reconnecting;
  // every event block ends with a blank line
  return "retry: 100\n" + "data: Testing 1,2,3 " + r.nextInt() + "\n\n";
 }
}

Controller for mapping jsp page

import org.springframework.stereotype.Controller;
import org.springframework.ui.ModelMap;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;

@Controller
@RequestMapping("viewSSE.htm")
public class ViewSSEController {

 @RequestMapping(method = RequestMethod.GET)
 public String home(ModelMap model) {
  return "viewSSE";
 }
}


In view (viewSSE.jsp)
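The JSP itself didn't make it into this post, so here is a minimal sketch of what viewSSE.jsp might contain, assuming the event controller above is reachable at sseTest.htm:

<div id="result"></div>
<script>
if (typeof(EventSource) !== "undefined") {
    // Open the stream and append each "data:" payload to the page
    var source = new EventSource("sseTest.htm");
    source.onmessage = function(event) {
        document.getElementById("result").innerHTML += event.data + "<br>";
    };
} else {
    document.getElementById("result").innerHTML = "Your browser does not support SSE";
}
</script>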

Wednesday, July 3, 2013

Handle IP and IP number in MySQL





You can use INT UNSIGNED for storing an IP address (like 127.0.0.1) in MySQL.

But HOW ???

The INET_ATON() and INET_NTOA() functions in MySQL will help you....


SELECT INET_ATON('127.0.0.1');

+------------------------+
| INET_ATON('127.0.0.1') |
+------------------------+
|             2130706433 | 
+------------------------+
1 row in set (0.00 sec)


SELECT INET_NTOA('2130706433');

+-------------------------+
| INET_NTOA('2130706433') |
+-------------------------+
| 127.0.0.1               | 
+-------------------------+
1 row in set (0.02 sec)
Store the converted IP number as INT UNSIGNED in MySQL using INET_ATON(), and get the exact IP address back by using INET_NTOA().
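Putting it together, storage and retrieval might look like this (a sketch; the table and column names are made up for illustration):

CREATE TABLE access_log (ip INT UNSIGNED);
INSERT INTO access_log VALUES (INET_ATON('127.0.0.1'));
SELECT INET_NTOA(ip) FROM access_log;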

MySQL persistence connection settings (Change connection Time out values)

MySQL persistence connection settings


1. Log in to the DB server as the 'root' user from the command line and check the values of 'net_read_timeout' and 'wait_timeout'.



The result may be like this :

mysql> SHOW SESSION VARIABLES LIKE 'net_read_timeout';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| net_read_timeout | 30    |
+------------------+-------+
1 row in set (0.00 sec)


mysql> SHOW SESSION VARIABLES LIKE 'wait_timeout';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| wait_timeout     | 30    |
+------------------+-------+
1 row in set (0.00 sec)


2. Change the values as you need

Here I am setting 28800 instead of 30:


mysql> SET SESSION net_read_timeout=28800;
Query OK, 0 rows affected (0.00 sec)

mysql> SET SESSION wait_timeout=28800;
Query OK, 0 rows affected (0.00 sec)

3. Now you can check whether the variable values were updated:

mysql> SHOW SESSION VARIABLES LIKE 'net_read_timeout';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| net_read_timeout | 28800 |
+------------------+-------+
1 row in set (0.00 sec)

mysql> SHOW SESSION VARIABLES LIKE 'wait_timeout';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| wait_timeout     | 28800 |
+------------------+-------+
1 row in set (0.00 sec)

4. Now restart your mysql server

If you are using MySQL on Red Hat Linux (Fedora Core/CentOS), then use the following command:

/etc/init.d/mysqld restart

If you are using MySQL on Debian/Ubuntu Linux, then use the following command:

/etc/init.d/mysql restart
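
Note that SET SESSION only affects the current connection, and the values are lost when the server restarts. To make the change permanent, set both variables in the server's my.cnf before restarting (a sketch, assuming the standard [mysqld] section):

[mysqld]
net_read_timeout=28800
wait_timeout=28800
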
That's it!

Tuesday, July 2, 2013

How to give IP instead of localhost - tomcat ?

How to give IP instead of localhost



The problem is in your server's server.xml file (in the conf directory of your Tomcat installation).

  <Connector port="8080" protocol="HTTP/1.1" 
      address="127.0.0.1"
               connectionTimeout="20000" 
               redirectPort="8443" />

Change it to:

  <Connector port="8080" protocol="HTTP/1.1" 
      address="0.0.0.0"
               connectionTimeout="20000" 
               redirectPort="8443" />



That's it!

Monday, July 1, 2013

Restlet Auth

Example For Restlet Auth 



1. Create a Restlet Server 

package server;

import org.restlet.Application;
import org.restlet.Component;
import org.restlet.data.Protocol;

public class TestServer extends Application {

 public static void main(String[] args) throws Exception {
  Component component = new Component();
  component.getServers().add(Protocol.HTTP, 8182);
  // Attach the resource class created in step 2
  component.getDefaultHost().attach("/trace", RestletResource.class);
  component.start();
 }
}

2. Create a Restlet Resource

package server;

import org.restlet.Request;
import org.restlet.data.MediaType;
import org.restlet.representation.Representation;
import org.restlet.representation.StringRepresentation;
import org.restlet.resource.Get;
import org.restlet.resource.ServerResource;

public class RestletResource extends ServerResource {

 @Get
 public Representation doGet() {
  Request request = getRequest();
  Check check = new Check();
  String result = check.authCheck(request);
  if (result.equals("success")) {
   return new StringRepresentation("Success : ", MediaType.TEXT_PLAIN);
  } else {
   return new StringRepresentation("Fail : ", MediaType.TEXT_PLAIN);
  }
 }
}

3. Create an auth check class

import org.restlet.Request;
import org.restlet.data.ChallengeResponse;

public class Check {

 public String authCheck(Request request) {
  String result;
  ChallengeResponse challengeResponse = request.getChallengeResponse();
  if (challengeResponse == null) {
   throw new RuntimeException("not authenticated");
  }
  String userName = challengeResponse.getIdentifier();
  System.out.println("Identifier : " + userName);
  String password = new String(challengeResponse.getSecret());
  System.out.println("Secret : " + password);
  // Here you can check the username and password. I am expecting 'dev' and 'Dev123'
  if (userName.equals("dev") && password.equals("Dev123")) {
   result = "success";
  } else {
   result = "fail";
  }
  return result;
 }
}

4. Restlet Client

import org.restlet.data.ChallengeScheme;
import org.restlet.resource.ClientResource;
import org.restlet.resource.ResourceException;

public class TestClient {

 public static void main(String[] args) throws Exception {
  ClientResource resource = new ClientResource("http://localhost:8182/trace");
  // Send the request with HTTP BASIC credentials
  resource.setChallengeResponse(ChallengeScheme.HTTP_BASIC, "dev", "Dev123");
  try {
   System.out.println(resource.get(String.class));
  } catch (ResourceException re) {
   re.printStackTrace();
  }
 }
}



Friday, June 28, 2013

How to solve “Unable to load native-hadoop library” in Eclipse





WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Running jobs via the hadoop CLI command worked fine; this only happened when I tried to run jobs directly from Eclipse, in local mode. After a little investigation, I found that the reason behind this is a Java property called java.library.path that did not include the correct path.
Running from the hadoop CLI command, the java.library.path property was properly set to /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop/lib/native (I am using the CDH 4.2.0 distribution of Hadoop). When the job was started from inside Eclipse, java.library.path held its system default value:
[Screenshot: java.library.path showing the system default value]
In order to correctly set this property, you can configure Eclipse to load the Java Virtual Machine with this setting, or (and this is the better way) add the native library under the respective library in the Java Build Path. To do this, first right-click on your project and open the Build Path configuration screen:
[Screenshot: opening the Build Path configuration]
In this screen, find the hadoop-common library, expand the row, and add the native library by pointing it to the correct location:
[Screenshot: setting the native library location on hadoop-common]
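
If you prefer the VM-argument route instead, this is the flag to add under Run Configurations → Arguments → VM arguments (using the CDH path from above; adjust it to your own distribution):

-Djava.library.path=/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop/lib/native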