Problem with process on Linux using Runtime.exec() and output, input and error streams.
1 posts in topic
Flat View  Flat View
TOPIC ACTIONS:
 

Posted By:   Kevin_Shuk
Posted On:   Monday, October 15, 2001 11:53 AM

I've encountered a problem in using Runtime.exec() and the outputStream and inputStreams with Java on Linux. Apologies in advance for the length of this, but I've done a great deal of analysis and have a lot of detail to present. The summary of the problem is that I am trying to start an external process using Runtime.exec() and then attach to this process's stdin, stdout and stderr using Process.getInputStream , Process.getOutputStream and Process.getErrorStream . I wish to keep the process alive, and to continually be able to shovel input into the stdin stream and to capture the output on stdout and stderr streams. The complication here    More>>

I've encountered a problem in using Runtime.exec() and the
outputStream and inputStreams with Java on Linux. Apologies in advance
for the length of this, but I've done a great deal of analysis and
have a lot of detail to present.



The summary of the problem is that I am trying to start an external
process using Runtime.exec() and then attach to this
process's stdin, stdout and stderr using
Process.getInputStream , Process.getOutputStream and
Process.getErrorStream . I wish to keep the process alive, and
to continually be able to shovel input into the stdin stream and to
capture the output on stdout and stderr streams. The complication here
is that one or more of the pipes seem to be closing prematurely
causing the child process to go away and the master to begin throwing
IOexceptions.



I've included test code which will demonstrate the problem I'm
describing. The test code consists of a single java source file
(ProcessMaster.java) and a short Perl script (ProcessSlave.pl). This
test code is at the very bottom of this file.



The slave process (Perl script) simply does blocking line reads on
stdin. It then checks for some special-case tokens (write an end
marker, flush a stream, or exit), and acts on them if it sees them,
otherwise it writes the line it saw back out on stdout and stderr with
a leading tag to identify the stream.



The master starts the slave process and gets the process's stdin,
stdout and stderr streams. Then, it pumps a multi-line command into
the slave's stdin. Using a pair of threads, we wait for output on both
the stdout and stderr of the slave, and we continue reading output
until either we see an end marker or hit the end of the stream,
thereby exiting the thread.



I will describe the configurations I've used and provide some detail
into what I've discovered on the varying configurations in a
bit. First, let me describe the error I'm seeing.



The noticeable symptom of the problem is seeing an IOException thrown
by the master (beware - line numbers may be skewed a couple from the
test code):



			
java.io.IOException: Bad file descriptor
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:212)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:72)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:130)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:245)
at java.io.BufferedWriter.flush(BufferedWriter.java:233)
at ProcessMaster.runCommand(ProcessMaster.java:115)
at ProcessMaster.main(ProcessMaster.java:217)


I believe this to be an aftereffect - there's nothing to debug here:
the pipe we had to the slave's stdin has gone away, so trying to flush
a command we just wrote to the outputStream's BufferedWriter
causes this exception to be thrown - no surprises here. (Note - using
IBM's java 1.3.0 implementation, the exception thrown is an
Illegal seek rather than Bad File Descriptor . Still,
it's thrown in the same place in the code).



The reason the pipe's broken and I get the exception above is that the
slave process has gone away! But why? To help with that, I've attached
an strace to the slave process while the master is blocking for input
asking for the delay in milliseconds that it should insert in some
places.



The slave strace shows me a couple scenarios, both of which would
result in the slave exiting. The mystery that has me stumped is why
either of these eventualities are happening in the first place.



First case is the slave getting a null on stdin. Presumably this
happens when the pipe between the slave's stdin and the master has
gone away. Here's an strace that shows this:



			
16:04:24.858985 read(0, "Four score...
write [ STDERR, \'{"..., 4096) = 104
16:04:26.272951 fstat64(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
16:04:26.273061 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x401e2000
16:04:26.273142 write(1, "SLAVE STDOUT> Four score...
", 28) = 28
16:04:26.273396 write(2, "SLAVE STDERR> Four score...
", 28) = 28
16:04:26.273592 write(2, "{END#1}
", 8) = 8
16:04:26.273687 write(1, "{END#1}
", 8) = 8
16:04:26.273774 read(0, "It was a dark and stormy night.
"..., 4096) = 122
16:04:27.364158 write(1, "SLAVE STDOUT> It was a dark and "..., 46) = 46
16:04:27.364232 write(2, "SLAVE STDERR> It was a dark and "..., 46) = 46
16:04:27.364338 write(2, "{END#2}
", 8) = 8
16:04:27.364396 write(1, "{END#2}
", 8) = 8
16:04:27.364454 read(0, "", 4096) = 0
16:04:27.365131 rt_sigprocmask(SIG_SETMASK, [QUIT RT_0], NULL, 8) = 0
16:04:27.365397 munmap(0x401e2000, 4096) = 0
16:04:27.365448 _exit(0)


Second case is the slave getting broken pipe(s) on stderr or
stdout. The strace below shows only one attempt to write to a broken
pipe. There may be several before it finally gives up:

			
16:19:46.750919 read(0, "Four score...
write [ STDERR, \'{"..., 4096) = 104
16:19:48.337176 fstat64(0x1, 0xbffff710) = 0
16:19:48.337279 old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40194000
16:19:48.337420 write(1, "SLAVE STDOUT> Four score...
", 28) = 28
16:19:48.337782 write(2, "SLAVE STDERR> Four score...
", 28) = 28
16:19:48.338022 write(2, "{END#1}
", 8) = 8
16:19:48.338160 write(1, "{END#1}
", 8) = 8
16:19:48.338298 read(0, "It was a dark and stormy night.
"..., 4096) = 122
16:19:48.369457 write(1, "SLAVE STDOUT> It was a dark and "..., 46) = 46
16:19:48.369642 write(2, "SLAVE STDERR> It was a dark and "..., 46) = 46
16:19:48.370105 write(2, "{END#2}
", 8) = -1 EPIPE (Broken pipe)
16:19:48.370196 --- SIGPIPE (Broken pipe) ---



Here are the specific hardware/RedHat/kernel/Java vm/JIT combinations
I've used:

			
# Proc. RedHat Kernel Java JIT
1 uni RH6.2 2.2.14-5 Sun 1.2.2 nojit
2 Sun 1.3.1_01 hotspot

3 dual RH6.2 2.2.19-3hpc Sun 1.2.2 nojit
4 Sun 1.3.1_01 hotspot
5 Sun 1.3.1_01 classic(nojit)
6 Sun 1.4.0 B2 hotspot

7 uni RH7.1 2.4.3-12 Sun 1.2.2 nojit
8 Sun 1.3.1_01 hotspot
9 Sun 1.3.1_01 classic(nojit)
10 IBM 1.3.0 jitc
11 Blackdown 1.3.1 hotspot
12 Blackdown 1.3.1 classic(nojit)


I will refer to the configurations in the table above by the number in
the first column to (hopefully) be less confusing.



The target configuration is 4, a dual Pentium processor box running an
SMP kernel, ideally using Sun java 1.3.1 with the hotspot vm.



The most successful configurations were 1, 7 and 9. I have yet to
see these fail with the test case, however, they are not the target
platform (at least hardware-wise).



Configurations 3 and 5 were moderately successful, though they would
either hang outright or throw the IOException detailed above, usually
before 1000 iterations. Seems to hint that using a JIT is problematic,
but it's not clear why. A problem with the JIT, or a race condition
caused by the increased speed?



Configurations 2, 4, 6, 8, 10, 11 and 12 simply did not work. They'd
usually fail by the third or fourth iteration. Interesting to note
here is that configuration 12 used the classic vm (nojit) of
Blackdown's java 1.3.1, yet it would still fail early. Using the
classic vm of Sun's 1.3.1 would get it to act at least as well as Sun
java 1.2.2.



I haven't tried the IBM java 1.3.0 without jitc - haven't yet
determined how to turn off the JIT. Sun's java 1.4.0 beta 2 doesn't
seem to offer a way to run without the hotspot JIT.



An additional data point is that I ran the test code under Windows
2000 under Sun java 1.3.1_01 + hotspot with no errors. This would seem
to suggest that the problem is some native code or at the OS level (or
my code's interaction with either).





The questions I have out of all this are:




  • Why do the pipes seem to be closing?
  • Is the additional speed of the JIT compilers what is exacerbating the
    problem when they're used? Remember, I've seen this problem under
    Sun java 1.2.2, and Sun java 1.3.1 with nojit, only it will be less frequent.
  • Is there a better way I should be doing this type of process control
    given the tools that java provides me?
  • Is there a problem in how I'm dealing with the process and streams, or is
    it a bug in Java or in Linux?
  • Sun java 1.3.0 had a problem noted where it failed on SMP kernels (The
    message I saw was posted to the Blackdown java/Linux board around
    August 2000. Do current versions of java still have a problem with SMP
    kernels?



Below are two source files: They're set up to live in
/tmp/JavaProcessTest, although this can be changed by editing the
proc.startProcess line for non-Windows platforms near the top
of main() in ProcessMaster.pl and recompiling.



Run the class with a simple "java ProcessMaster", adding a classpath
argument if necessary. You'll be prompted for a delay time in
milliseconds. The delay will sleep between select steps. Also, while
the ProcessMaster is blocking for input of the delay time is a great
time to turn an strace onto the pid of the Perl process of the child.





BEGIN ProcessSlave.pl
---------------------------------------------------

			
#!/usr/bin/perl

# Ensure that both STDOUT and STDERR autoflush
my $oldfh = select STDOUT; $| = 1; select $oldfh;
my $oldfh = select STDERR; $| = 1; select $oldfh;

while ($line = ) {
if ($line =~ /^exit/) {
last;
} elsif ($line =~ /^write/) {
$line =~ /#(d+)}/;
$cmdnum = $1;

if ($line =~ /STDERR/) {
print STDERR "{END#" . $cmdnum . "}
";
} elsif ($line =~ /STDOUT/) {
print STDOUT "{END#" . $cmdnum . "}
";
}
} elsif ($line =~ /^flush/) {
if ($line =~ /STDERR/) {
# STDERR already autoflushes
} elsif ($line =~ /STDOUT/) {
# STDOUT already autoflushes
}
} else {
print STDOUT "SLAVE STDOUT> " . $line;
print STDERR "SLAVE STDERR> " . $line;
}
}

---------------------------------------------------
END ProcessSlave.pl







BEGIN ProcessMaster.pl
---------------------------------------------------
			
import java.io.*;
import java.util.*;

/**
* Wrapper class for running specified command and environment in
* a subprocess.
*
*/
public class ProcessMaster {

// ----- package constuctors ---------------------------------------------


/**
* Constructs ProcessMaster.
*
*/
public ProcessMaster() {
}

/**
* Starts the Process
*
*/
public synchronized void startProcess(String processCmdLine, String [] env) throws Exception {
if( env != null)
m_Process = Runtime.getRuntime().exec(processCmdLine,env);
else
m_Process = Runtime.getRuntime().exec(processCmdLine);

// Writes to slave process's STDIN
m_OutWriter = new BufferedWriter( new OutputStreamWriter( m_Process.getOutputStream()));

// Reads from slave process's STDOUT and STDERR
m_InReader = new BufferedReader( new InputStreamReader(m_Process.getInputStream()), 4096);
m_ErrReader = new BufferedReader( new InputStreamReader(m_Process.getErrorStream()), 4096);
}

/**
* Stops the Process
*/
public synchronized void stopProcess(String exitCmdLine) throws Exception {

if(m_Process == null)
throw new IllegalStateException("The Process is not running. Try starting the process before running the command.");
try{
if(!exitCmdLine.endsWith("
"))
exitCmdLine += "
";
m_OutWriter.write(exitCmdLine,0,exitCmdLine.length());
System.err.println("DEBUG: Exiting with command "+ exitCmdLine);
m_OutWriter.flush();
m_InReader.close();
m_OutWriter.close();
m_ErrReader.close();

System.err.println("DEBUG: Waiting for process to terminate...");
m_Process.waitFor();
System.err.println("DEBUG: Process is terminated.");
}
finally{
m_Process.destroy();
System.gc();
}
}

/**
* Runs the command as given by the input parameter string
*/
public synchronized void runCommand(String runCmdLine) throws Exception{

if(m_Process == null)
throw new IllegalStateException("The Process is not running. Try starting the process before running the command.");
// Make sure the command line terminates with a

if(!runCmdLine.endsWith("
"))
runCmdLine += "
";

// Initialize all Buffers
m_OutputBuffer.setLength(0);
m_ErrBuffer.setLength(0);
String inputCommand = "";

m_Counter++; // Increment the counter
m_commandInProgress = true; // set command in progress to false

// Set Timed out Variable to false;
m_isTimedOut = false;

// Write the command
inputCommand += runCmdLine;

// Write the command to generate END Markers and flush
inputCommand += "write [ STDERR, '{{" + cEndMarker + m_Counter +"}\n'];
";
inputCommand += "write [ STDOUT, '{{" + cEndMarker + m_Counter +"}\n'];
";
inputCommand += "flush STDERR;
flush STDOUT;
";

System.err.println("DEBUG: ProcessMaster.runCommand executing command:
" + inputCommand + "
");

try {
if (sleepTime > 0)
Thread.sleep(sleepTime);

m_OutWriter.write(inputCommand,0,inputCommand.length());
System.err.println("DEBUG: Wrote command to Slave stdin, about to flush");

if (sleepTime > 0)
Thread.sleep(sleepTime);

m_OutWriter.flush();
System.err.println("DEBUG: flushed writer to Slave stdin");

if (sleepTime > 0)
Thread.sleep(sleepTime);

// Spawn new thread for reading from the Error Stream
ProcessReader errReader = new ProcessReader(m_Counter,m_ErrReader,m_ErrBuffer);
System.err.println("DEBUG: created errReader");

Thread errThread = new Thread(errReader);
System.err.println("DEBUG: Created errThread");

errThread.start();
System.err.println("DEBUG: started errThread");

if (sleepTime > 0)
Thread.sleep(sleepTime);

// Spawn a new thread for reading from the Input Stream
ProcessReader outReader = new ProcessReader(m_Counter,m_InReader, m_OutputBuffer);
System.err.println("DEBUG: created outReader");
Thread outThread = new Thread(outReader);
System.err.println("DEBUG: Created outThread");
outThread.start();
System.err.println("DEBUG: started outThread");

// Join the threads
threadJoin(errThread,"errorReaderThread");
System.err.println("DEBUG: joined errThread");
threadJoin(outThread,"outReaderThread");
System.err.println("DEBUG: joined outThread");
// This will kill all other lagging threads
m_commandInProgress = false;

} catch(Exception exc) {
System.err.println("Exception:" + exc);
exc.printStackTrace();
System.exit(1);
}
}


/**
* Joins threads.
*/
private void threadJoin(Thread thread, String tag) {
System.err.println("DEBUG: ProcessWrapper.threadJoin waiting for "+tag+" to terminate...");
long start = System.currentTimeMillis();
try {
thread.join(cTimeoutInterval);
} catch (InterruptedException exc) {
System.err.println("Warning: ProcessWrapper.threadJoin "+tag+" got interrupted, how odd...");
} finally {
long elapsed = System.currentTimeMillis() - start;
if (elapsed > cTimeoutInterval / 2) {
int elapsedSec = (int) (elapsed / 1000);
System.err.println("Warning: ProcessWrapper.threadJoin timeout after "+elapsedSec+
" seconds for "+tag+", continuing...");
m_isTimedOut=true;
}
}
}

/**
* Starts testing the Program
*/

public static void main(String [] args) throws Exception {
try{
ProcessMaster proc = new ProcessMaster();

if (File.separatorChar == '\\') {
// assume we're on Windows
proc.startProcess("perl c:/ProcessSlave.pl",null);
} else {
// assume we're on not-Windows
proc.startProcess("/tmp/JavaProcessTest/ProcessSlave.pl",null);
}

BufferedReader reader = new BufferedReader( new InputStreamReader(System.in));

System.out.print("Thread.sleep() delay in ms (0 for none)>");
String wait = reader.readLine().trim();
try {
sleepTime = Integer.parseInt(wait);
} catch(NumberFormatException nfe) {
System.err.println("
Invalid sleep delay. Defaulting to no delay.");
sleepTime = 0;
}

while (true) {
proc.runCommand("Four score and seven years ago...");
System.err.println("OUT:"+proc.getOutput());
System.err.println("ERR:"+proc.getError());
System.err.println("-----------------------------------");

proc.runCommand("It was a dark and stormy night.");
System.err.println("OUT:"+proc.getOutput());
System.err.println("ERR:"+proc.getError());
System.err.println("-----------------------------------");

proc.runCommand("Far out in the uncharted backwaters...");
System.err.println("OUT:"+proc.getOutput());
System.err.println("ERR:"+proc.getError());
System.err.println("-----------------------------------");

proc.runCommand("D'oh!");
System.err.println("OUT:"+proc.getOutput());
System.err.println("ERR:"+proc.getError());
System.err.println("-----------------------------------");

proc.runCommand("Call me Ishmael.");
System.err.println("OUT:"+proc.getOutput());
System.err.println("ERR:"+proc.getError());
System.err.println("-----------------------------------");

proc.runCommand("In the beginning there was the void.");
System.err.println("OUT:"+proc.getOutput());
System.err.println("ERR:"+proc.getError());
System.err.println("-----------------------------------");

proc.runCommand("I've had quite enough of this.");
System.err.println("OUT:"+proc.getOutput());
System.err.println("ERR:"+proc.getError());
System.err.println("-----------------------------------");
}

proc.stopProcess("exit");
System.err.println("OUT:"+proc.getOutput());
System.err.println("ERR:"+proc.getError());
} catch(Exception exc) {
exc.printStackTrace();
}
}

/**
* Returns stdout of process, so far.
*/
public String getOutput() {
synchronized (this) {
return m_OutputBuffer.toString();
}
}

/**
* Returns error.
*/
public String getError() {
synchronized (this) {
return m_ErrBuffer.toString();
}
}
/**
* Returns the timed out status.
*/
public final boolean isTimedOut(){
return m_isTimedOut;
}

//------------------- Inner Class -------------------

/**
* A class that reads the Stderr of the process.
*/
private class ProcessReader implements Runnable {

/**
* Holds the Counter's value.
*/
private int m_CounterCopy=0; // Holds the Counter's value

/**
* Holds the reader's reference.
*/
private BufferedReader m_Reader; // Holds the reader's reference

/**
* Holds the buffer reference.
*/
private StringBuffer m_Buffer; // Holds the buffer reference

/**
* Constructs the ProcessReader.
*/
public ProcessReader(int counter, BufferedReader reader,
StringBuffer buffer) {
this.m_CounterCopy = counter;
this.m_Reader = reader;
this.m_Buffer = buffer;
}

/**
* The Run method of the thread
*/
public synchronized void run(){
String line=""; // Temporary var
int occurenceOfEndMarker=0;
while(m_CounterCopy==m_Counter && m_commandInProgress) {
// Loop 'til command is in progress
try{
line = m_Reader.readLine();
if(line == null) {
// If null terminate the thread
System.err.println("WHOA!: Read null, terminating thread");
break;
}
} catch( IOException exc ) {
System.err.println("Error: IOException at ProcessReader.run()" + exc.getMessage());
break;
}

System.err.println("PIPE LINE:" + line);
occurenceOfEndMarker = line.indexOf("{"+cEndMarker+m_CounterCopy+"}");

if(occurenceOfEndMarker != -1){ // Check for End Marker
/* output lines may not be coming in new line,
* so check if the line contains the end marker
* and remove the end marker from being written to
* the buffer
*/
line = line.substring(0,occurenceOfEndMarker);
System.err.println("Saw EndMarker: {" + cEndMarker + m_CounterCopy + "}");
m_Buffer.append(line);
break;
}
occurenceOfEndMarker = line.indexOf(cCompleteMarker);
if(occurenceOfEndMarker != -1){ // Check for End Marker
System.err.println("Saw cCompleteMarker: " + cCompleteMarker);
m_Buffer.append(line);
break;
}
// And the ordinary line gets written on the buffer.
m_Buffer.append(line);
}

}

}



// ----- package member variables ----------------------------------------
/**
* Process handle for executing process.
*/
private Process m_Process;

/**
* Buffered contents of stdout.
*/
private StringBuffer m_OutputBuffer = new StringBuffer();

/**
* Buffered contents of stderr.
*/
private StringBuffer m_ErrBuffer = new StringBuffer();

/**
* A final variable that stores the seconds for timeout for now 2 hours
*/
private final int cTimeoutInterval=2* 60 * 60*1000;

/**
* A variable that stores the counter value of the currently
* executing command.
*/
private int m_Counter = 0;

/**
* A boolean variable that specifies if a command is currently in progress.
*/
private boolean m_commandInProgress;

/**
* A final variable that stores the BEGIN marker.
*/
private final String cBeginMarker="BEGIN#";

/**
* A final variable that stores the BEGIN marker.
*/
final static String cCompleteMarker="COMPLETE";

/**
* A final variable that stores the END marker.
*/
private final String cEndMarker ="END#";

/**
* A final variable that stores the IO Reader of the Process.
*/
private BufferedReader m_InReader;

/**
* A final variable that stores the IO Writer of the Process.
*/
private BufferedWriter m_OutWriter;

/**
* A final variable that stores IO error reader of the Process.
*/
private BufferedReader m_ErrReader;

/**
* The boolean variable indiacates if a process command had timed-out.
*/
private boolean m_isTimedOut=false;

private static int sleepTime = 0;

} // end class

---------------------------------------------------
END ProcessMaster.pl    <<Less

Re: Problem with process on Linux using Runtime.exec() and output, input and error streams.

Posted By:   Nathan_Meyers  
Posted On:   Monday, October 15, 2001 05:06 PM

I'd recommend submitting your question to the Blackdown mailing list - lots of additional brains there to help on problems like this. Details from the Blackdown Web site.
About | Sitemap | Contact