Welcome to the "trac"-ing site of soap4r!

Ticket #334 (new defect)

Opened 2 years ago

Last modified 2 years ago

sockets stuck in CLOSE_WAIT with soap4r and http-access2?

Reported by: user
Assigned to: nahi
Priority: normal
Milestone: undefined
Component: soap4r
Version: 1.5
Keywords:
Cc:

Description

from soap4r-ml

Hi,

I've found a problem with my RoR website, and I don't know if the bug
is in my code, soap4r, or http-access2.

My site uses soap4r to talk to a CRM application on another server, to
log tickets etc. It worked fine for ages, then the mongrel cluster
members started hanging. I could see (using 'lsof') hundreds of TCP
socket connections to the CRM server stuck in CLOSE_WAIT.

I have now worked out that at some point I must have installed a Gem
which depended on the rubyforge gem, which includes http-access2. When
soap4r starts up, StreamHandler.rb tries to use http-access2, and
falls back to using SOAP::NetHttpClient if it can't. So, by installing
the rubyforge gem, I inadvertently changed the way soap4r worked.
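
The fallback described above follows the common "require it, rescue LoadError" pattern. A minimal sketch of that pattern is shown here; it is not the actual StreamHandler.rb code, and the helper name first_loadable and the symbols it returns are hypothetical, chosen only for illustration:

```ruby
# Hypothetical helper (not part of soap4r): returns the label of the first
# library in the list that can be loaded, or nil if none of them can.
def first_loadable(candidates)
  candidates.each do |feature, label|
    begin
      require feature
      return label
    rescue LoadError
      next # library is not installed, so try the next candidate
    end
  end
  nil
end

chosen = first_loadable([
  ['http-access2', :http_access2], # picked whenever the gem happens to be installed
  ['net/http',     :net_http]      # standard-library fallback
])
puts "HTTP client backend: #{chosen}"
```

With this kind of selection, installing an unrelated gem that ships the preferred library silently changes which backend is chosen, which matches the behaviour described in this report.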

I have for now got my application working reliably again by putting
the following lines in environment.rb:

    # Force SOAP4r to use Net/http instead of http-access2
    require 'soap/netHttpClient'
    SOAP::HTTPStreamHandler::Client = SOAP::NetHttpClient
    SOAP::HTTPStreamHandler::RETRYABLE = false

Now, I have found that I can prevent the CLOSE_WAITs by changing my
code so that it calls ::SOAP::RPC::Driver#reset_stream after I've
finished with any SOAP calls I'm making. So my question to the list
is:

Should I modify all of my code so that it calls reset_stream after any
SOAP call?
 -or-
should soap4r itself be doing something to prevent the problem
occurring?
 -or-
is there a bug in http-access2? Is it failing to notice the CLOSE_WAIT
sockets?

I've tried using the latest http-access2 code from svn thinking it
might be a bug in the version included in the rubyforge gem. That
didn't help.

The SOAP server I'm talking to is hosted on Apache Tomcat on Windows,
the RoR site is running on either Mac OS X (development) or Linux
(staging / production).

Change History

05/09/07 14:21:45 changed by nahi

Interesting. Are you creating a SOAP::RPC::Driver or WSDLDriver for each request to the CRM application, or a driver for each request to the RoR application?

I think the http-access2 client and the server are communicating using HTTP keep-alive. Each driver keeps one HTTPAccess2::Client object, and HTTPAccess2::Client tries to keep a TCP socket open so that it can be reused. The CRM server sends a TCP FIN (close) packet after its timeout, but HTTPAccess2::Client isn't aware of it until the next request comes. Until then, the client side of the connection stays in CLOSE_WAIT.

# soap4r's net/http client does not try to do HTTP keep-alive for now.
# But if someone sends a patch to do HTTP keep-alive with soap4r +
# net/http in the future, I'll incorporate it, and your application
# would break again after upgrading soap4r.

Do you think this is the case? Is it possible that hundreds of SOAP::RPC::Driver objects are still alive (not GC-ed) for some reason in your application?

As a workaround, please invoke SOAP::RPC::Driver#reset_stream after each request. As the primary fix, please consider changing your application to keep only one SOAP::RPC::Driver in the application container.
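
Both suggestions above could be sketched roughly as follows. This is only an illustration under stated assumptions: StubDriver is a stand-in so the example runs without soap4r, and with_stream_reset, CrmClient and log_ticket are hypothetical names. The only method taken from this ticket is reset_stream.

```ruby
# Stand-in for SOAP::RPC::Driver so the sketch runs on its own (assumption).
class StubDriver
  attr_reader :resets

  def initialize
    @resets = 0
  end

  # Hypothetical remote call.
  def log_ticket(text)
    "logged: #{text}"
  end

  # Same name as SOAP::RPC::Driver#reset_stream; here it only counts calls.
  def reset_stream
    @resets += 1
  end
end

# Workaround: always reset the stream after a call, even if the call raises.
def with_stream_reset(driver)
  yield driver
ensure
  driver.reset_stream
end

# Primary fix: keep a single driver per process instead of one per request.
# Plain memoization like this is not thread-safe.
module CrmClient
  def self.driver
    @driver ||= StubDriver.new
  end
end

with_stream_reset(CrmClient.driver) { |d| d.log_ticket('printer on fire') }
```

The ensure clause means the reset also happens when the SOAP call raises, so a failed request does not leave an idle keep-alive socket behind.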

Besides this, I'm starting to think that http-access2 should check at fixed intervals whether a socket has been closed... in a future release.
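
A check of this kind might look like the sketch below. It is not http-access2 code; peer_closed? is a hypothetical helper that uses only Ruby's standard socket library, and it assumes an idle keep-alive connection on which a zero-byte read can only mean that the peer has closed its end.

```ruby
require 'socket'

# Hypothetical helper: true if the peer has closed its end of the connection
# (a FIN was received) and no unread data is pending on the socket.
def peer_closed?(sock)
  # Zero timeout: poll for readability without blocking.
  return false unless IO.select([sock], nil, nil, 0)

  # Peek one byte without consuming it. At end-of-file this returns an empty
  # string (nil on newer Ruby versions), which means the peer has closed.
  data = sock.recv_nonblock(1, Socket::MSG_PEEK)
  data.nil? || data.empty?
rescue IO::WaitReadable
  false # nothing to read after all, so the connection still looks open
rescue Errno::ECONNRESET, Errno::EPIPE
  true  # the connection was reset, so it is unusable either way
end
```

A client that ran such a check on a timer, or just before reusing a cached connection, could close half-closed sockets itself instead of leaving them in CLOSE_WAIT until the next request.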

06/01/07 14:33:05 changed by nahi

from soap4r-ml

Just a quick message to say that we've encountered this problem as
well (since we moved to rails 1.2 - I imagine one of its dependencies
pulled in http-access2).
We've been working round it by not reusing drivers, but changing
SOAP::HTTPStreamHandler is clearly much nicer (in particular we don't
need to worry about remembering to do this in future code, as we would
if we went the SOAP::RPC::Driver#reset_stream way).