Java/Linux: "Real-time signal 0"
Posted By:   Ole_Streicher
Posted On:   Friday, April 2, 2004 07:47 AM

Hi,


I have two programs (one Java, one C++) running on the same Linux machine and communicating via Corba. The Java program uses the standard ORB, the C++ program OmniORB.


The communication consists mainly of calls to a few Java methods that take structures as "In" parameters. These calls occur quite often (some 10 times per second) and in parallel from many C++ threads (up to ca. 200).


For some reason the Java program tends to crash after some time (about 10 minutes) with a "Real-time signal 0"; exit code 163. This happens even when the bodies of the Java methods called via Corba are empty.


What could be the reason for this? How can one avoid this behaviour?
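As a side note on the exit code: shells such as bash report 128 + N when a child process is killed by signal N, so 163 would correspond to signal 35, which lies in the Linux real-time range (32 and up). The following is only a small sketch of that arithmetic; the class and method names are made up for illustration, and which signal number the shell labels "Real-time signal 0" depends on the libc and threading library in use.

```java
public class ExitCodeSignal {
    // Hypothetical helper: assumes the shell convention "exit status = 128 + signal number"
    // for processes terminated by a signal. Returns -1 if the status does not fit that pattern.
    static int signalFromExitCode(int exitCode) {
        if (exitCode <= 128 || exitCode > 255) {
            return -1;
        }
        return exitCode - 128;
    }

    public static void main(String[] args) {
        int exitCode = 163; // value reported in the post
        int signal = signalFromExitCode(exitCode);
        System.out.println("exit code " + exitCode + " -> signal " + signal);
        // On Linux, kernel signal numbers from 32 upwards are real-time signals.
        System.out.println("real-time range: " + (signal >= 32));
    }
}
```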


What I tried up to now:


  • other JVM ("client" and "server" VM)
  • other JRE (1.4.1, 1.4.2 b28, 1.4.2_04)
  • java -Xrs
  • synchronize the Corba methods on the Java side
  • other OS version (SuSE Linux 8.2 and 9.0); this changes the OmniORB version (4.0.2, 4.0.3) and the kernel version (2.4.20, 2.4.21).


Nothing helps. With "strace -e sig_action -f java ..." I get a lot of these real-time signals, but no crash. Running within a debugger (gdb) does not really help either; I get a lot of signals, each of which immediately stops the program.


If I run the two programs on different machines, everything works perfectly, so it seems to be a problem with the local communication.


On the OmniORB (C++) side one can enable quite detailed tracing of the whole CORBA traffic. Level 30 shows the following when the crash occurs:


			
omniORB: sendChunk: to giop:tcp:192.168.1.148:37319 88 bytes
omniORB:
4749 4f50 0102 0100 4c00 0000 9e8d 0400 GIOP....L.......
0300 0000 0000 2030 2100 0000 afab cb00 ...... 0!.......
0000 0020 ab34 0393 0000 0001 0000 0000 ... .4..........
0000 0000 0000 0004 0000 000c 0a30 3034 .............004
0d00 0000 7075 744f 7574 4f66 4461 7461 ....putOutOfData
0000 0000 0000 0000 ........
omniORB: inputMessage: from giop:tcp:192.168.1.148:37319 40 bytes
omniORB:
4749 4f50 0102 0001 0000 001c 0004 8eec GIOP............
0000 0000 0000 0001 4e45 4f00 0000 0002 ........NEO.....
000a 0000 0000 0000 ........
omniORB: sendChunk: to giop:tcp:192.168.1.148:37319 88 bytes
omniORB:
4749 4f50 0102 0100 4c00 0000 ee8e 0400 GIOP....L.......
0300 0000 0000 733c 2100 0000 afab cb00 ......s<!.......
0000 0020 ab34 0393 0000 0001 0000 0000 ... .4..........
0000 0000 0000 0004 0000 0004 0a22 6265 ............."be
0d00 0000 7075 744f 7574 4f66 4461 7461 ....putOutOfData
0000 0000 0000 0000 ........
omniORB: throw giopStream::CommFailure from giopStream.cc:828(0,MAYBE,COMM_FAILURE_WaitingForReply)
omniORB: Client connection refcount = 0
omniORB: Client close connection to giop:tcp:192.168.1.148:37319
omniORB: throw COMM_FAILURE from omniObjRef.cc:754 (MAYBE,COMM_FAILURE_WaitingForReply)
omniORB: send codeset service context: (ISO-8859-1,UTF-16)


This looks as if the Java side has already crashed with the real-time signal 0 before the last putOutOfData() is called. The arguments and return values of the last calls do not differ from earlier ones.
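To cross-check the trace, the first 12 bytes of each message can be decoded as a GIOP header: the magic "GIOP", major and minor version, a flags byte whose lowest bit selects little-endian encoding, the message type, and the body size. The sketch below does this for the header bytes copied from the trace above; the class and helper names are hypothetical, and it only looks at the fixed header, not at the request body.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class GiopHeader {
    // GIOP message types in numeric order (0 = Request, 1 = Reply, ...).
    static final String[] TYPES = {
        "Request", "Reply", "CancelRequest", "LocateRequest",
        "LocateReply", "CloseConnection", "MessageError", "Fragment"
    };

    // Decodes the fixed 12-byte GIOP header into a one-line description.
    static String describe(byte[] header) {
        if (header.length < 12) {
            throw new IllegalArgumentException("GIOP header needs 12 bytes");
        }
        String magic = new String(header, 0, 4, StandardCharsets.US_ASCII);
        if (!magic.equals("GIOP")) {
            throw new IllegalArgumentException("not a GIOP message: " + magic);
        }
        int major = header[4];
        int minor = header[5];
        boolean littleEndian = (header[6] & 0x01) != 0; // flags bit 0 = byte order
        int type = header[7] & 0xff;
        int bodySize = ByteBuffer.wrap(header, 8, 4)
                .order(littleEndian ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN)
                .getInt();
        String typeName = type < TYPES.length ? TYPES[type] : "Unknown(" + type + ")";
        return "GIOP " + major + "." + minor + " "
                + (littleEndian ? "little-endian" : "big-endian") + " " + typeName
                + ", body " + bodySize + " bytes, total " + (bodySize + 12) + " bytes";
    }

    // Hypothetical helper: turns a hex dump fragment such as "4749 4f50" into bytes.
    static byte[] hex(String dump) {
        String digits = dump.replaceAll("\\s+", "");
        byte[] out = new byte[digits.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(digits.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // Header bytes copied from the omniORB trace in the post.
        System.out.println(describe(hex("4749 4f50 0102 0100 4c00 0000"))); // sendChunk
        System.out.println(describe(hex("4749 4f50 0102 0001 0000 001c"))); // inputMessage
    }
}
```

Decoded this way, the outgoing messages are little-endian GIOP 1.2 requests of 76 + 12 = 88 bytes and the incoming one is a big-endian reply of 28 + 12 = 40 bytes, which matches the byte counts omniORB prints.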


putOutOfData() is quite simple; its IDL is the following:


			
interface DataTarget {
/* ... */
void putOutOfData();
};


What can cause this "Real-time signal 0" on the Java side? Is there a way to enable Corba tracing with the standard Java ORB?


What is the right way to proceed here? What is the component I could blame for these crashes? Java? One of the ORBs? The Linux glibc? The Linux kernel? The C++ program?


Can a C++ program crash its communication partner by behaving oddly? How can one detect this?


Best regards


Ole
