Connection Resets When Importing from Oracle with Sqoop

I’ve been using Sqoop 1.4.3 to load data from Oracle into HDFS, running on a Linux machine with the Oracle JDBC driver on JDK 1.6.

I was getting intermittent connection resets when trying to import data. After much troubleshooting, I eventually traced the problem to a known issue with the JDBC driver and found a way to work around it, both of which are described in this post.

The problem

I noticed that when the machine I was using to run the Sqoop client was mostly idle, imports ran just fine. However, when others started working on the same machine and it became busier, I would intermittently get the errors below:

[araujo@client ~]$ sqoop import --connect jdbc:oracle:thin:user/pwd@host/orcl -m 1 --query 'select 1 from dual where $CONDITIONS' --target-dir test
13/07/12 09:35:39 INFO manager.SqlManager: Using default fetchSize of 1000
13/07/12 09:35:39 INFO tool.CodeGenTool: Beginning code generation
13/07/12 09:37:53 ERROR manager.SqlManager: Error executing statement: java.sql.SQLRecoverableException: IO Error: Connection reset
	at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:467)
	at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:546)
        ...
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
	at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
	... 24 more

After some troubleshooting and googling, I found that the problem seemed to be related to the issue described in the articles below. In short, the Oracle JDBC thin driver uses java.security.SecureRandom while establishing a connection, and by default the JVM seeds it from the blocking /dev/random device; on a busy machine the kernel’s entropy pool can run dry, the read blocks, and the connection attempt eventually fails with a reset:

https://stackoverflow.com/questions/2327220/oracle-jdbc-intermittent-connection-issue/
https://forums.oracle.com/message/3701989/

Confirming the problem

To confirm that the problem was the one described in the articles, rather than something intrinsic to Sqoop, I wrote a small Java program that simply connects to the database. I was able to reproduce the issue with it:

[araujo@client TestConn]$ time java TestConn
Exception in thread "main" java.sql.SQLRecoverableException: IO Error: Connection reset
...
Caused by: java.net.SocketException: Connection reset
...
	... 8 more

real    1m20.481s
user	0m0.491s
sys	0m0.051s
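For reference, TestConn is only a few lines; a minimal sketch along these lines is enough to reproduce the behaviour (the connection details are the same placeholders as in the Sqoop examples):

import java.sql.Connection;
import java.sql.DriverManager;

public class TestConn {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string -- same form as the Sqoop --connect argument above.
        Connection conn = DriverManager.getConnection("jdbc:oracle:thin:user/pwd@host/orcl");
        System.out.println("Connection successful!");
        conn.close();
    }
}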

The workaround suggested in the articles also worked (the odd-looking /dev/../dev/urandom path is intentional: the JVM treats the literal value file:/dev/urandom specially and still reads from /dev/random, so the extra indirection is needed to defeat that check):

[araujo@client TestConn]$ time java -Djava.security.egd=file:/dev/../dev/urandom TestConn
Connection successful!

real    0m0.419s
user	0m0.498s
sys	0m0.036s

Applying the fix to Sqoop

It took me a while to figure out how to apply the workaround above to Sqoop. Many attempts to specify the parameter on the Sqoop command line, in various forms, didn’t work as expected.

The articles mention that the java.security.egd parameter can be set centrally in the $JAVA_HOME/jre/lib/security/java.security file. Unfortunately, this didn’t work for me. Using strace, I confirmed that Sqoop was actually reading the java.security file, but the setting just didn’t take effect. I couldn’t figure out why, and eventually gave up on that alternative.
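For completeness, the equivalent setting in java.security is the securerandom.source property, which the workaround would change to:

securerandom.source=file:/dev/../dev/urandom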

After a bit more stracing and troubleshooting, though, I finally figured out a way.

Sqoop seems to use the JDBC driver in two different ways:

  • First, it connects to the Oracle database directly, to gather information about the tables (or query) from which the data will be extracted and to generate the MapReduce job that it will run.
  • Second, the MapReduce job generated by Sqoop uses the JDBC driver to connect to the database and perform the actual data import.

I was hitting the problem in the first case above, but I believe it can occur in both, so ideally the workaround should be applied to both cases.

The Sqoop documentation gives us a clear option to address the second case: the following parameter passes Java command-line options to the child JVMs of the MapReduce job:

sqoop import -D mapred.child.java.opts="\-Djava.security.egd=file:/dev/../dev/urandom" ...

Even though I couldn’t fully verify this, since I couldn’t reproduce the problem consistently in the MapReduce tasks, I believe (and hope) it should work well.

The Sqoop direct connection to Oracle

That option, though, didn’t resolve the problem with Sqoop’s direct connection to Oracle. Passing the “-Djava.security.egd=file:/dev/../dev/urandom” option directly to Sqoop didn’t work either.

After digging around a bit, I found that the sqoop command eventually calls ${HADOOP_COMMON_HOME}/bin/hadoop to execute the org.apache.sqoop.Sqoop class. Since the hadoop launcher script is used, it accepts Java command-line options through the HADOOP_OPTS environment variable.
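In the hadoop script of this era, the final exec line looks roughly like the one below, which is why anything placed in HADOOP_OPTS lands on the JVM command line:

exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"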

A quick test confirmed that the case was closed:

[araujo@client STimport]$ export HADOOP_OPTS=-Djava.security.egd=file:/dev/../dev/urandom
[araujo@client STimport]$ sqoop import -D mapred.child.java.opts="\-Djava.security.egd=file:/dev/../dev/urandom" --connect jdbc:oracle:thin:user/pwd@host/orcl -m 1 --query 'select 1 from dual where $CONDITIONS' --target-dir test 
13/07/12 10:08:17 INFO manager.SqlManager: Using default fetchSize of 1000
13/07/12 10:08:17 INFO tool.CodeGenTool: Beginning code generation
13/07/12 10:08:18 INFO manager.OracleManager: Time zone has been set to GMT
13/07/12 10:08:18 INFO manager.SqlManager: Executing SQL statement: select 1 from dual where  (1 = 0) 
13/07/12 10:08:18 INFO manager.SqlManager: Executing SQL statement: select 1 from dual where  (1 = 0) 
13/07/12 10:08:18 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/bin/../lib/hadoop-0.20-mapreduce
13/07/12 10:08:18 INFO orm.CompilationManager: Found hadoop core jar at: /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/bin/../lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note: /tmp/sqoop-araujo/compile/02ed1ccf04debf4769910b93ca67d2ba/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/07/12 10:08:19 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-araujo/compile/02ed1ccf04debf4769910b93ca67d2ba/QueryResult.jar
13/07/12 10:08:19 INFO mapreduce.ImportJobBase: Beginning query import.
13/07/12 10:08:19 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/07/12 10:08:21 INFO mapred.JobClient: Running job: job_201306141710_0075
13/07/12 10:08:22 INFO mapred.JobClient:  map 0% reduce 0%
13/07/12 10:08:31 INFO mapred.JobClient:  map 100% reduce 0%
13/07/12 10:08:33 INFO mapred.JobClient: Job complete: job_201306141710_0075
13/07/12 10:08:33 INFO mapred.JobClient: Counters: 23
13/07/12 10:08:33 INFO mapred.JobClient:   File System Counters
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of bytes read=0
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of bytes written=179438
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of read operations=0
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of write operations=0
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of bytes read=87
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of bytes written=2
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of read operations=1
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of write operations=1
13/07/12 10:08:33 INFO mapred.JobClient:   Job Counters 
13/07/12 10:08:33 INFO mapred.JobClient:     Launched map tasks=1
13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=7182
13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/07/12 10:08:33 INFO mapred.JobClient:   Map-Reduce Framework
13/07/12 10:08:33 INFO mapred.JobClient:     Map input records=1
13/07/12 10:08:33 INFO mapred.JobClient:     Map output records=1
13/07/12 10:08:33 INFO mapred.JobClient:     Input split bytes=87
13/07/12 10:08:33 INFO mapred.JobClient:     Spilled Records=0
13/07/12 10:08:33 INFO mapred.JobClient:     CPU time spent (ms)=940
13/07/12 10:08:33 INFO mapred.JobClient:     Physical memory (bytes) snapshot=236580864
13/07/12 10:08:33 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=34998603776
13/07/12 10:08:33 INFO mapred.JobClient:     Total committed heap usage (bytes)=1013252096
13/07/12 10:08:33 INFO mapreduce.ImportJobBase: Transferred 2 bytes in 14.4198 seconds (0.1387 bytes/sec)
13/07/12 10:08:33 INFO mapreduce.ImportJobBase: Retrieved 1 records.
