Ticket #285 (closed defect: fixed)

Opened 3 years ago

Last modified 1 month ago

active SSL connection loss (SSL3_WRITE_PENDING:bad write retry) (CVE-2008-1531)

Reported by: sean@gigave.com Assigned to: jan
Priority: highest Milestone: 1.5.0
Component: core Version: 1.5.x-svn
Severity: blocker Keywords: patch security
Cc: Blocking: 1461; 1042; 217; yes; "mod_secdownload"
Need Feedback: 0

Description

I'm seeing a gazillion log entries like these:

2005-09-22 17:41:44: (network_openssl.c.102) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2005-09-22 17:41:44: (connections.c.494) connection closed: write failed on fd 11

The page fails to complete writing. When I view the page in non-ssl mode, the page runs to completion.

Attachments

Fix-285-Remove-workaround-for-buggy-Opera-version.patch (1.5 kB) - added by stbuehler on 02/16/2008 02:29:12 PM.
lightty-ssl_shutdown-fix.patch (0.8 kB) - added by marton.illes@balabit.com on 03/22/2008 11:23:46 AM.
06_all_lighttpd-1.4.19-closing_foreign_ssl_connections-dos.diff (1.9 kB) - added by hoffie on 03/26/2008 11:13:10 PM.
alternative, hopefully better patch (against 1.4.19, not svn!)
lighttpd-1.5-ssl-dos.patch (1.2 kB) - added by hoffie on 03/26/2008 11:30:57 PM.
similar patch against svn trunk (1.5)
fix-ssl-again.patch (1.8 kB) - added by stbuehler on 03/27/2008 11:17:40 PM.
against lighty-1.4 svn; hopefully fixed the error handling for ssl-shutdown in a clean way
fix-ssl-again-1.4.19.patch (2.9 kB) - added by hoffie on 03/28/2008 04:07:14 PM.
same patch as fix-ssl-again.patch (by stbuehler) against 1.4.19
committed-patch-1.4.19.patch (2.9 kB) - added by hoffie on 03/28/2008 05:00:57 PM.
backport to 1.4.19 of the patch which actually got committed

Change History

09/23/2005 09:43:25 AM changed by jan

  • status changed from new to assigned.

What? We fixed this in 1.4.1 and fixed the fix in 1.4.2. Even OpenBSD was happy afterwards.

Please verify that you are really using 1.4.2 or higher.

09/24/2005 05:42:09 PM changed by sean@gigave.com

Confirmed, status-config reports 1.4.3.

10/20/2005 05:16:10 PM changed by sean@gigave.com

Any luck with this? What other info could I get that would be helpful?

09/24/2006 01:27:00 PM changed by jan

  • status changed from assigned to closed.
  • resolution set to fixed.

fixed in 1.4.12

11/10/2007 02:21:30 AM changed by anonymous

  • status changed from closed to reopened.
  • version changed from 1.4. to 1.4.13.
  • resolution deleted.
  • blocking changed.
  • pending changed.

Debian server

# uname -a
Linux 2.6.18-3-amd64 #1 SMP Mon Dec 4 17:04:37 CET 2006 x86_64 GNU/Linux

# openssl version
OpenSSL 0.9.8c 05 Sep 2006

# lighttpd -v
lighttpd-1.4.13 (ssl) - a light and fast webserver
Build-Date: Sep 21 2007 15:20:00

Got this kind of error

2007-11-09 15:05:35: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
2007-11-09 15:17:04: (network_openssl.c.133) SSL: 5 -1 104 Connection reset by peer
2007-11-09 15:17:04: (connections.c.588) connection closed: write failed on fd 17
2007-11-09 15:17:16: (network_openssl.c.133) SSL: 5 -1 104 Connection reset by peer
2007-11-09 15:17:16: (connections.c.588) connection closed: write failed on fd 36
2007-11-09 15:17:16: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-11-09 15:17:16: (connections.c.588) connection closed: write failed on fd 15
2007-11-09 15:17:17: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-11-09 15:17:17: (connections.c.588) connection closed: write failed on fd 16

12/07/2007 12:13:56 PM changed by oliver@realtsp.com

also getting this error

2007-12-07 10:32:22: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 10:35:40: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 10:37:43: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 10:53:00: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 11:22:36: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 11:45:12: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 11:59:52: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-07 12:00:32: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 

I got rid of 99.9% of the SSL handshake errors with the IE/SSL/keepalive = 60s fix, but these remained.

FreeBSD 6.1-RELEASE-p10 FreeBSD amd64

root@long# openssl version
OpenSSL 0.9.7e-p1 25 Oct 2004

root@long# lighttpd -v
lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Nov 23 2007 13:51:35

12/07/2007 12:14:16 PM changed by anonymous

  • version changed from 1.4.13 to 1.4.18.

12/17/2007 09:11:26 PM changed by anonymous

2007-12-17 22:09:12: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 22:09:12: (connections.c.603) connection closed: write failed on fd 136

12/18/2007 10:12:14 AM changed by Bruno

  • priority changed from normal to highest.
  • blocking set to yes; "mod_secdownload".
  • pending set to 1.

Hello, we still have this problem in the latest version available in Debian Lenny. It is a serious blocker with a token plug-in ("mod_secdownload") because the links are no longer valid.

2007-12-17 18:46:10: (log.c.75) server started
2007-12-17 18:47:06: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:47:06: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 18:47:06: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:47:06: (connections.c.588) connection closed: write failed on fd 10
2007-12-17 18:52:34: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:52:34: (connections.c.588) connection closed: write failed on fd 9
2007-12-17 18:52:34: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 18:52:34: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 19:00:59: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:00:59: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 19:01:44: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:01:44: (connections.c.588) connection closed: write failed on fd 9
2007-12-17 19:01:44: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:01:44: (connections.c.588) connection closed: write failed on fd 8
2007-12-17 19:01:44: (network_openssl.c.256) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2007-12-17 19:01:44: (connections.c.588) connection closed: write failed on fd 11

This breaks the connection, and the client then gets a 408 error because the link has expired.

Thanks a lot for your help :)

12/18/2007 08:08:54 PM changed by ziemkowski

  • blocking changed from yes; "mod_secdownload" to 1461; 1042; 217; yes; "mod_secdownload".
  • severity changed from major to blocker.

Same for us, except it's also dying on a development server with only the latest Firefox 2.* browsers hitting it. Appears to only happen for us on PHP pages; static files do not appear to be failing.

2007-12-18 19:34:18: (network_openssl.c.154) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2007-12-18 19:34:18: (connections.c.603) connection closed: write failed on fd 9 
lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Oct 20 2007 08:34:49

OpenSSL 0.9.7m 23 Feb 2007

Linux 2.6.18-028stab031 #2 SMP Mon Aug 13 13:45:16 MDT 2007 i686 i686 i386 GNU/Linux

PHP 5.2.4 (cgi-fcgi) (built: Oct 21 2007 05:44:24)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies

This appears to be a common problem... linking related bugs as blocked, although some claim to be fixed only to return again later. Increasing to Blocker as this blocks production viability.

12/21/2007 05:57:36 PM changed by anonymous

I also get SSL errors with lighttpd 1.4.13-etch8 on Debian Etch using the standard configuration:

2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
2007-12-21 18:52:50: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure
2007-12-21 18:52:51: (connections.c.279) SSL: 1 error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
2007-12-21 18:52:51: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure

12/28/2007 02:20:14 PM changed by anonymous

Can confirm this bug on Debian Etch:

lighttpd-1.4.18 - a light and fast webserver
Build-Date: Dec 28 2007 15:01:37

2007-12-28 15:09:56: (connections.c.279) SSL: 1 error:140940E5:SSL routines:SSL3_READ_BYTES:ssl handshake failure

Any solution for this in sight?

01/17/2008 02:03:28 PM changed by anonymous

the same on gentoo lighttpd-1.4.18

02/05/2008 06:09:53 AM changed by toomas.aas@raad.tartu.ee

I am also seeing this.

# uname -srm
FreeBSD 7.0-RC1 amd64

# /usr/local/sbin/lighttpd -v
lighttpd-1.4.18 (ssl) - a light and fast webserver
Build-Date: Nov 23 2007 14:39:40

02/16/2008 02:28:51 PM changed by stbuehler

  • keywords set to patch.
  • summary changed from SSL write bug... to SSL write bug... (SSL3_WRITE_PENDING:bad write retry).
  • pending deleted.
  • milestone set to 1.4.19.

This bug is about "SSL3_WRITE_PENDING:bad write retry", not "handshake failure".
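For readers matching their own log lines against this ticket: the hexadecimal number in each message packs a library, a function and a reason code. The following is a minimal sketch of the unpacking, assuming the bit layout of the 0.9.x-era ERR_GET_LIB / ERR_GET_FUNC / ERR_GET_REASON macros (later major OpenSSL versions may differ). The names decode_ssl_error, ssl_err_fields and check_logged_codes are illustrative and not part of OpenSSL or lighttpd.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of a packed error code (assumed 0.9.x-era layout). */
struct ssl_err_fields {
    int lib;    /* bits 24..31: library number  */
    int func;   /* bits 12..23: function number */
    int reason; /* bits 0..11:  reason number   */
};

/* Split a packed error code into its three fields. */
struct ssl_err_fields decode_ssl_error(unsigned long code) {
    struct ssl_err_fields f;
    f.lib = (int)((code >> 24) & 0xffUL);
    f.func = (int)((code >> 12) & 0xfffUL);
    f.reason = (int)(code & 0xfffUL);
    return f;
}

/* Print one code in a form that is easy to compare with a log line. */
void print_ssl_error(unsigned long code) {
    struct ssl_err_fields f = decode_ssl_error(code);
    printf("%08lX -> lib=%d func=%d reason=%d\n", code, f.lib, f.func, f.reason);
}

/* Sanity check against two codes that appear in the logs of this ticket. */
void check_logged_codes(void) {
    /* 1409F07F: SSL3_WRITE_PENDING, bad write retry */
    assert(decode_ssl_error(0x1409F07FUL).reason == 127);
    /* 140940E5: SSL3_READ_BYTES, ssl handshake failure */
    assert(decode_ssl_error(0x140940E5UL).reason == 229);
}
```

Under that layout the two messages share the library number (20) but differ in function and reason, which is consistent with treating them as separate problems.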

I think the problem is the "evil hack"/workaround for opera in network_openssl.c:
It modifies c->mem, which could have already been used for SSL_write with an SSL_ERROR_WANT_WRITE error, so we could get a new c->mem->ptr which results in the "bad write retry" error.
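The retry rule behind this suspicion can be illustrated with a toy model. This is only a sketch, not OpenSSL code: toy_ssl, toy_write and the result codes are hypothetical names, and the check is a simplification of the rule that a write interrupted by "want write" has to be retried with the same buffer.

```c
#include <assert.h>
#include <stddef.h>

enum toy_result { TOY_OK = 0, TOY_WANT_WRITE = 1, TOY_BAD_WRITE_RETRY = 2 };

/* Toy stand-in for the pending-write state a connection keeps. */
struct toy_ssl {
    const char *pending_buf; /* buffer of the interrupted write, or NULL */
    size_t pending_len;      /* length of the interrupted write */
    int socket_writable;     /* knob: 0 makes the next write stall */
};

enum toy_result toy_write(struct toy_ssl *s, const char *buf, size_t len) {
    if (s->pending_buf != NULL) {
        /* A retry must present the same buffer and at least the same length. */
        if (buf != s->pending_buf || len < s->pending_len) {
            return TOY_BAD_WRITE_RETRY;
        }
    }
    if (!s->socket_writable) {
        /* Remember what was being written so the retry can be checked. */
        s->pending_buf = buf;
        s->pending_len = len;
        return TOY_WANT_WRITE;
    }
    s->pending_buf = NULL;
    s->pending_len = 0;
    return TOY_OK;
}

/* A buffer that moved between the stalled write and its retry is rejected. */
void demo(void) {
    char first[] = "chunk";
    char moved[] = "chunk"; /* same bytes, different address */
    struct toy_ssl s = { NULL, 0, 0 };

    assert(toy_write(&s, first, 5) == TOY_WANT_WRITE);
    s.socket_writable = 1;
    assert(toy_write(&s, moved, 5) == TOY_BAD_WRITE_RETRY);
    assert(toy_write(&s, first, 5) == TOY_OK);
}
```

In the model, reallocating or rewriting the chunk between the stalled write and the retry is enough to trigger the rejection, which is the scenario suspected for c->mem.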

2 possible solutions:

  • Remove the hack
  • Delay every chunk till the next is available or connection is closed

I prefer the first - Opera users just have to update their browsers. (Bug in <= 9.01 / 8.54)

It would be nice, if someone could test the patch/give information on how to reproduce the bug.

02/16/2008 02:29:12 PM changed by stbuehler

  • attachment Fix-285-Remove-workaround-for-buggy-Opera-version.patch added.

02/26/2008 04:29:24 PM changed by stbuehler

  • status changed from reopened to closed.
  • resolution set to fixed.

Fixed in [2084]

03/12/2008 04:05:36 PM changed by mstemp5@iastate.edu

  • version changed from 1.4.18 to 1.4.19.

A fresh lighty 1.4.19 installation has what looks to be the same problem described above, if I'm reading correctly. Error log excerpt:

2008-03-12 09:40:27: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure 
2008-03-12 09:41:42: (network_openssl.c.130) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 09:41:42: (connections.c.614) connection closed: write failed on fd 10 
2008-03-12 09:43:15: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure 
2008-03-12 09:54:44: (network_openssl.c.130) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 09:54:44: (connections.c.614) connection closed: write failed on fd 8 
2008-03-12 09:54:44: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:00:03: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:01:40: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:02:24: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:02:28: (connections.c.279) SSL: 1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry 
2008-03-12 10:05:09: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure 
2008-03-12 10:11:42: (connections.c.279) SSL: 1 error:140780E5:SSL routines:SSL23_READ:ssl handshake failure 

This was installed from source on Red Hat Enterprise 3 (yeah, I know). I've switched it back to run Apache for now, but I do have a non-production server I can do further testing on if that's helpful.

03/22/2008 11:22:49 AM changed by marton.illes@balabit.com

  • status changed from closed to reopened.
  • resolution deleted.

I ran into the same problem as bug 258 with the SSL write errors.

I finally tracked it down to the following situation. Start two parallel downloads using SSL in two different connections. (You can also download through x-sendfile or PHP output. I used wget to make it easier to reproduce, with large files to get long-lasting connections.)

Now terminate one of them, and the other will be closed a bit later. This is very annoying, because large downloads will probably terminate before finishing.

The log will contain something like this:

2008-03-12 09:41:42: (network_openssl.c.130) SSL: 1 -1 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
2008-03-12 09:41:42: (connections.c.614) connection closed: write failed on fd 10

The log refers to the second connection which is closed by lighty.

I started debugging the situation, and it looked like the SSL error is generated in the ssl3_write_pending function, which happens when the repeated SSL_write does not have the same arguments as the previous one, or when another SSL_write is called in between. I checked these, but everything seemed to be fine.

Also tried a fix from openssl, but without any success: http://rt.openssl.org/Ticket/Display.html?id=598

However after careful gdb magic the back-trace showed me that the error function was called from SSL_shutdown and not from SSL_write. The SSL_shutdown was also called from the connection_state_machine function on the CON_STATE_ERROR state. Hmm, strange, according to the logs the error occurred somewhere else...

The SSL_write failed in network_write_chunkqueue_openssl. I realized that in reality the SSL_write was OK: it only returned SSL_ERROR_WANT_WRITE, but the SSL error queue contained another error from an earlier SSL_* call, in our case from SSL_shutdown.

In connection_state_machine:

1663    case CON_STATE_ERROR: /* transient */
1664
1665        /* even if the connection was drop we still have to write it to the access log */
1666        if (con->http_status) {
1667            plugins_call_handle_request_done(srv, con);
1668        }
1669    #ifdef USE_OPENSSL
1670        if (srv_sock->is_ssl) {
1671            int ret;
1672            switch ((ret = SSL_shutdown(con->ssl))) {
1673            case 1:
1674                /* ok */
1675                break;
1676            case 0:
1677                SSL_shutdown(con->ssl);
1678                break;
1679            default:
1680                log_error_write(srv, __FILE__, __LINE__, "sds", "SSL:",
1681                        SSL_get_error(con->ssl, ret),
1682                        ERR_error_string(ERR_get_error(), NULL));
1683                return -1;
1684            }
1685        }
1686    #endif

On line 1677 SSL_shutdown is called again, because the connection is in non-blocking mode, where the first SSL_shutdown can require another call. The problem is that the return value of this second SSL_shutdown is not checked, and in case of an error the error queue is not cleared.

When the SSL_write in network_write_chunkqueue_openssl returned a simple WANT_WRITE error, it picked up the error code left behind by the previous SSL_shutdown call.

To fix the problem, we simply need to check the return value of SSL_shutdown on line 1677 and call ERR_get_error() to remove the error code from the queue.
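The failure mode described above can be reproduced in a toy model. This is a sketch under stated assumptions, not OpenSSL or lighttpd code: every toy_* name is hypothetical, the queue is reduced to a single slot, and the only behavior borrowed from OpenSSL is that the error classification consults a queue shared by all connections of a thread before it reports "want write". The numeric values 1 and 3 mirror SSL_ERROR_SSL and SSL_ERROR_WANT_WRITE.

```c
#include <assert.h>

#define TOY_ERROR_NONE       0
#define TOY_ERROR_SSL        1 /* mirrors SSL_ERROR_SSL */
#define TOY_ERROR_WANT_WRITE 3 /* mirrors SSL_ERROR_WANT_WRITE */

/* One error slot shared by every connection, like a per-thread error queue. */
static unsigned long toy_error_queue = 0;

void toy_clear_error(void) { toy_error_queue = 0; }

/* A shutdown on a dead connection fails and leaves an error behind. */
int toy_shutdown_dead_connection(void) {
    toy_error_queue = 0x1409F07FUL;
    return -1;
}

/* A write on a healthy but busy connection: returns -1, queues nothing. */
int toy_write_would_block(void) { return -1; }

/* Classify a failed call: a non-empty queue wins over "want write". */
int toy_get_error(int ret) {
    if (ret > 0) return TOY_ERROR_NONE;
    if (toy_error_queue != 0) return TOY_ERROR_SSL;
    return TOY_ERROR_WANT_WRITE;
}

/* Connection A's shutdown error is ignored; connection B then writes. */
int write_after_unchecked_shutdown(void) {
    toy_clear_error();
    toy_shutdown_dead_connection();
    return toy_get_error(toy_write_would_block());
}

/* Same sequence, but the failed shutdown drains the queue first. */
int write_after_cleared_shutdown(void) {
    toy_clear_error();
    if (toy_shutdown_dead_connection() <= 0) {
        toy_clear_error();
    }
    return toy_get_error(toy_write_would_block());
}

void demo(void) {
    /* The healthy connection is blamed for the stale error. */
    assert(write_after_unchecked_shutdown() == TOY_ERROR_SSL);
    /* With the queue drained, the write is correctly seen as "retry later". */
    assert(write_after_cleared_shutdown() == TOY_ERROR_WANT_WRITE);
}
```

In the model, the second connection is only treated as broken when the first connection's shutdown error is left in the shared slot, which matches the "SSL: 1 -1" pattern in the logs.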

Another possible place is connections.c:1557, but there is no SSL_shutdown there, just a FIXME to add one at some point, once fdevent shows that the connection is writable. (This part runs more frequently, so it would have caused more trouble...)

Here is a patch for 1.4.19 r2135; it should be straightforward to port to the 1.5 series:

Index: connections.c
===================================================================
--- connections.c (revision 2135)
+++ connections.c (working copy)
@@ -1674,7 +1674,15 @@
                 /* ok */
                 break;
             case 0:
-                SSL_shutdown(con->ssl);
+                /*
+                 * We need to get the error after SSH_shutdown, otherwise it remains
+                 * on the error queue and causes latter false-alerts. Usually around
+                 * SSL_write methods in network_openssl.c which results to shutdown
+                 * of connections.
+                 */
+                if (SSL_shutdown(con->ssl) <= 0) {
+                    ERR_get_error();
+                }
                 break;
             default:
                 log_error_write(srv, __FILE__, __LINE__, "sds", "SSL:",

At least now I learned a lot about lightty and openssl internals. :)

cheers,

Marton

PS: According to Google, it looks like CUPS also has similar problems...

03/22/2008 11:23:46 AM changed by marton.illes@balabit.com

  • attachment lightty-ssl_shutdown-fix.patch added.

03/26/2008 11:52:30 AM changed by stbuehler

  • status changed from reopened to closed.
  • resolution set to fixed.

Good catch!
Fixed in [2136].

I just added some ERR_clear_error() calls before SSL_write and SSL_read to make really sure no old errors are hanging in the queue.

03/26/2008 12:39:17 PM changed by mstemp5@iastate.edu

Excellent news!

It's early since 1.4.19, but if this tests out well, might I suggest that the importance of the fix justifies another release?

Thanks for your good work.

03/26/2008 10:53:22 PM changed by hoffie

  • keywords changed from patch to patch security.
  • status changed from closed to reopened.
  • resolution deleted.
  • milestone changed from 1.4.19 to 1.4.20.

This is actually a DoS problem, I requested a CVE for it.

The fix does not properly work for me. Lighty no longer drops SSL connections, but it tries to properly close the broken connection, leading to lots of SSL error messages and very high CPU consumption. I'll try to post an updated patch in a minute.

03/26/2008 11:13:10 PM changed by hoffie

  • attachment 06_all_lighttpd-1.4.19-closing_foreign_ssl_connections-dos.diff added.

alternative, hopefully better patch (against 1.4.19, not svn!)

03/26/2008 11:24:32 PM changed by hoffie

I attached a patch, which works without problems for me now (no drop of foreign SSL connections, no endless loop, no countless SSL errors).

I'm not completely sure whether it is correct with regard to logging: according to the man page of SSL_shutdown, a bidirectional SSL shutdown (that's what this is all about) is optional. With my patch applied, no logging takes place if the second part of the shutdown fails (the "ok, I'll shut down" from the client). IMO that's fine, but as I said, I'm not sure, and it is not me who has to decide.

Function-wise, the patch should be correct, but further testing is certainly appreciated.

03/26/2008 11:30:57 PM changed by hoffie

  • attachment lighttpd-1.5-ssl-dos.patch added.

similar patch against svn trunk (1.5)

03/26/2008 11:32:23 PM changed by hoffie

Attached a patch for 1.5 as well. The logging "problem" is not present there, as bidirectional SSL shutdown hasn't been implemented yet, it seems (see the FIXME comment in src/connections.c).

03/27/2008 11:17:40 PM changed by stbuehler

  • attachment fix-ssl-again.patch added.

against lighty-1.4 svn; hopefully fixed the error handling for ssl-shutdown in a clean way

03/28/2008 02:55:32 PM changed by hoffie

  • summary changed from SSL write bug... (SSL3_WRITE_PENDING:bad write retry) to active SSL connection loss (SSL3_WRITE_PENDING:bad write retry) (CVE-2008-1531).

CVE-2008-1531 got assigned to this issue. I'll try the patch later.

03/28/2008 04:04:40 PM changed by hoffie

Patch looks fine and appears to work properly. Attaching the same patch against 1.4.19 (distributions might want it).

03/28/2008 04:07:14 PM changed by hoffie

  • attachment fix-ssl-again-1.4.19.patch added.

same patch as fix-ssl-again.patch (by stbuehler) against 1.4.19

03/28/2008 04:35:49 PM changed by stbuehler

  • version changed from 1.4.19 to 1.5.x-svn.
  • milestone changed from 1.4.20 to 1.5.0.

Ok, i hope the ssl error handling is ok in svn now for 1.4.x; i'll leave the bug open for the 1.5.x fix.

03/28/2008 05:00:57 PM changed by hoffie

  • attachment committed-patch-1.4.19.patch added.

backport to 1.4.19 of the patch which actually got committed

03/30/2008 03:19:38 PM changed by stbuehler

  • status changed from reopened to closed.
  • resolution set to fixed.

Ok, summary for now:

CVE-2008-1531 (http://nvd.nist.gov/nvd.cfm?cvename=CVE-2008-1531)

  • lighttpd-1.4.x: Fixed in [2136], [2139], [2141], [2142] (the first two are the real fixes, the other two change the NEWS file to contain the CVE)
  • lighttpd-1.5.x: Fixed in [2140]

The problem was: if a user killed his ssl connection, lighttpd would kill another ssl connection as it didn't clear the ssl error queue.

04/07/2008 11:12:41 AM changed by darix

[2144] fixes a small typo in the patch.

