Intermittent SSL connection errors

We’ve been using the Ruby xmlrpc library to communicate with InfusionSoft for many years now without a problem. Recently, we started receiving sporadic connection errors when communicating with InfusionSoft. The specific error is OpenSSL::SSL::SSLError: SSL_connect SYSCALL returned=5 errno=0 state=SSLv3 read server session ticket A.

Our code looks something like this:

xmlrpc = XMLRPC::Client.new3({
  'host' => api.host, 'path' => '/api/xmlrpc',
  'port' => api.port, 'use_ssl' => true
})
xmlrpc.call action, secret_key, *args

Ruby xmlrpc uses Net::HTTP to perform the request, and the error originates in the connect method.

Since nothing has changed on our end in terms of how we connect to InfusionSoft (credentials, code, or environment), this seems to be an issue on the InfusionSoft side of things. I did try changing some of the Net::HTTP options as per this StackOverflow question, but that just produced a different error.

The odd thing is that the problem is intermittent. It isn’t happening every day, although it has happened more frequently through February 2018. Most of the time our code works as intended and the requests are successfully made.

Can anyone at InfusionSoft confirm this problem? If so, is there a resolution being worked on?

From some googling, people seem to see that error when the connection times out. Is that possibly what's going on?

Net::HTTP has some timeout handling code and should raise the error as such (I would expect). We handle timeouts gracefully in our own code, so I would lean toward something other than a timeout.

I was able to correlate one of the errors with the request that triggered it and the time between request start and the error being reported was 2 seconds, so not enough time for a timeout to be triggered.

We have our own timeouts on the API Gateway: one on the initial connection, which is short (2 sec), and one for the response back from our backend server (much longer). So maybe there is something similar in Net::HTTP, and you might be seeing a timeout on establishing the initial connection to the API Gateway.
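For reference, Net::HTTP does have a comparable split: open_timeout covers opening the connection and read_timeout covers waiting on each read, and exceeding them raises Net::OpenTimeout or Net::ReadTimeout respectively. Here is a minimal sketch of setting both explicitly. The build_http helper, the host name, and the timeout values are made up for illustration and are not taken from the code in this thread.

```ruby
require 'net/http'

# Hypothetical helper: builds a Net::HTTP object with explicit timeouts.
# Nothing is connected until a request is actually started.
def build_http(host, port = 443, open_timeout: 2, read_timeout: 30)
  http = Net::HTTP.new(host, port)
  http.use_ssl = true
  http.open_timeout = open_timeout  # seconds allowed for opening the connection
  http.read_timeout = read_timeout  # seconds allowed for each read from the server
  http
end

# 'api.example.com' is a placeholder, not the real API host.
http = build_http('api.example.com')
```

If the failure were one of these timeouts, you would expect to see Net::OpenTimeout or Net::ReadTimeout rather than an OpenSSL::SSL::SSLError, which fits the point above that this looks like something other than a plain timeout.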

If this is a timeout, what would be the cause? High load on IFS servers? If so, are you able to see these connections being timed out on your end?

It would most likely be some sort of network issue between your app and our API Gateway (Mashery), if that is indeed what is happening.

So is there a way to get an IFS developer to look into this issue? This issue is popping up not only during read-only operations but also when our users are trying to purchase things, so we’d really like for that stuff to happen when the user requests it.

As an aside, I really dislike the way there is no ticketing system for technical problems. Having a forum for technical support is by far the worst system I’ve seen from any service provider. (not blaming @bradb, just venting)

The connection is failing between you and Mashery (which is AWS). There is nothing we can do on our end to troubleshoot that. I would suggest testing the network connection to Mashery to see what you can find. Maybe a bad hop or something.

There is a ticketing system for API related issues. Get Support - Keap Developer Portal

Thanks for the link. Any time I do a support chat they always tell me to ask in the developer forum which is why I was unaware of the ability to create tickets.

I guess I’ll do some more digging.

Yeah, the normal support doesn’t handle API stuff. The API ticketing system, though, is handled by advanced reps familiar with the API. Also, if you continue to see this, I can possibly open a ticket with Mashery, but it won’t go far without details like trace routes and stuff. So if you can collect that stuff, I can see if Mashery can take a look at it.
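Alongside trace routes, it may help to record how long the TCP connect and the TLS handshake each take from the app server, since the error is raised during the handshake. The following is only a rough sketch: timed and probe_handshake are hypothetical helper names, the host you pass in is whatever your API host is, and the handshake step itself has no timeout here, so it can block.

```ruby
require 'socket'
require 'openssl'

# Runs the block and returns [elapsed_seconds, error_or_nil].
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  [Process.clock_gettime(Process::CLOCK_MONOTONIC) - started, nil]
rescue StandardError => e
  [Process.clock_gettime(Process::CLOCK_MONOTONIC) - started, e]
end

# Times the TCP connect and the TLS handshake separately for one host.
# The default SSLContext is used without extra verification settings,
# so treat this as a timing probe only.
def probe_handshake(host, port = 443, timeout: 5)
  tcp = nil
  tcp_time, tcp_error = timed { tcp = Socket.tcp(host, port, connect_timeout: timeout) }
  return { tcp_seconds: tcp_time, tcp_error: tcp_error } if tcp_error

  ssl = OpenSSL::SSL::SSLSocket.new(tcp, OpenSSL::SSL::SSLContext.new)
  ssl.hostname = host  # send SNI so the gateway can pick the right certificate
  tls_time, tls_error = timed { ssl.connect }
  { tcp_seconds: tcp_time, tls_seconds: tls_time, tls_error: tls_error }
ensure
  tcp&.close
end
```

Calling probe_handshake with your API host on a schedule and logging the returned hash with a timestamp would give you a record of when handshakes fail and how long they took, which could be attached to a ticket.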

Good to know about the API ticketing system.

I’ll dig into this more when I have the time and see if I can reproduce the error reliably. If I can I’ll submit a ticket with more information. Thanks again!

@Steve_Iannopollo - Did you ever find a solution for this? We’re seeing the same issue with our Ruby implementation. If you have any insight that can set me down the right course it would be hugely appreciated.

@Joe_Peduto I just started handling these like timeout errors (rescue and retry a few times) and that seems to have cleared up the issue. Here is what my code looks like while calling IFS using the xmlrpc/client library:

def call(*args)
  result, retries = nil, 0
  begin
    silence_stream(STDERR) do
      result = xmlrpc.call(*args)
    end
  rescue OpenSSL::SSL::SSLError => e
    # According to IFS devs this is a timeout in the SSL handshake between us and them
    # So we should be able to retry this without ill effects
    if e.message =~ /SSL_connect SYSCALL/
      retries += 1
      if retries > 3
        log "*** INFUSION SSL ERROR: Handshake timeout (max retries reached) *** #{e.message}"
        raise e
      else
        log "*** INFUSION SSL ERROR: Handshake timeout (retry #{retries} of 3) *** #{e.message}"
        retry
      end
    else
      raise e
    end
  end
  
  result
end
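
A variation on the same idea, in case it is useful: the retry logic can be pulled into a small wrapper that waits a little longer before each retry. This is a sketch, not what we run in production. with_handshake_retries and its parameters are hypothetical names, and the delays are arbitrary.

```ruby
require 'openssl'

# Hypothetical helper: retries the block when the SSL handshake fails with the
# "SSL_connect SYSCALL" message, sleeping a bit longer before each retry.
# Any other SSL error is re-raised immediately.
def with_handshake_retries(max_retries: 3, base_delay: 0.5)
  attempts = 0
  begin
    yield
  rescue OpenSSL::SSL::SSLError => e
    raise unless e.message.include?('SSL_connect SYSCALL')
    attempts += 1
    raise if attempts > max_retries
    sleep(base_delay * (2**(attempts - 1)))  # 0.5s, 1s, 2s with the defaults
    retry
  end
end

# Usage, assuming an xmlrpc client like the one above:
#   result = with_handshake_retries { xmlrpc.call(*args) }
```

Retrying should be safe for this particular error because it is raised while connecting, before the request has been sent. I would not apply the same blanket retry to errors that can happen after the request goes out, especially for purchases.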

10-4 @Steve_Iannopollo Thanks much. We’ll work towards the same.