
error_log Segmentation fault (11)

Discussion in 'Plesk for Linux - 8.x and Older' started by ldejager, Sep 23, 2006.

  1. ldejager

    ldejager Guest

    0
     
    Hi,

    We have a FC3 box running Plesk 7.5.4 with approximately 600 domains hosted on it that have been migrated from Ensim.

    When I installed the box, I recompiled Apache, PHP, etc. with a higher FD_SETSIZE, following the instructions in the swsoft KB. However, I keep getting the error messages below in the error_log. A few clients have reported that their webmail login page gives a "Page cannot be found" error on the first visit but loads after a refresh. A couple of other clients say that visitors arriving from Google get the same error on their sites (I'm not sure why that would happen if php-imap is the culprit, if it is at all).

    I've found a link on the swsoft KB saying:

    The reason is that phpimap module has been built with __FD_SETSIZE=1024.

    and that one would have to recompile a few packages... which was done already when the box was freshly installed.

    Does anyone perhaps know how I can get around these errors? Recompiling php/php-imap with the correct FD_SETSIZE?

    Below is a snip from error_log

    [Sun Sep 24 00:23:54 2006] [notice] child pid 4027 exit signal Segmentation fault (11)
    [Sun Sep 24 00:25:49 2006] [notice] child pid 4024 exit signal Segmentation fault (11)
    [Sun Sep 24 00:25:50 2006] [notice] child pid 5193 exit signal Segmentation fault (11)
    [Sun Sep 24 00:25:52 2006] [notice] child pid 4023 exit signal Segmentation fault (11)
    [Sun Sep 24 00:26:00 2006] [notice] child pid 4030 exit signal Segmentation fault (11)
    [Sun Sep 24 00:30:52 2006] [notice] child pid 5186 exit signal Segmentation fault (11)
    [Sun Sep 24 00:34:16 2006] [notice] child pid 5192 exit signal Segmentation fault (11)

    many thanks,
     
  2. wagnerch

    wagnerch Guest

    0
     
    It makes sense to me that you need to rebuild php-imap, since php and php-imap run in the same process space as the Apache child process, and it sounds like you already rebuilt Apache with a larger FD_SETSIZE.

    What is likely happening is that the file handles Apache already has open stay open when php-imap connects to the imap server, so the new socket gets an fd above 1024, but php-imap has only allocated 1024 slots because that was the FD_SETSIZE it was compiled with.

    Unfortunately, this is a very ugly road you are travelling: you will need to hunt down everything that uses the FD_* macros and the select/pselect calls.

    I would bet what is eating up most of the file descriptors is the open access_log and access_ssl_log files from Apache (you can see via lsof -Pn, if you or anyone else is interested). 600 domains, that is costing you 1200 fd's I bet.
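
    A minimal sketch of the range problem described above, assuming a POSIX system. The helper names fd_fits_in_fd_set and checked_fd_set are hypothetical and are not part of Apache, PHP, or libc. FD_SETSIZE is whatever the headers said when the file was compiled, so a library built with 1024 keeps that value even if the rest of the stack was rebuilt with more.

```c
#include <stdbool.h>
#include <sys/select.h>

/* An fd_set only has room for descriptors numbered 0 .. FD_SETSIZE-1.
   FD_SETSIZE is fixed at the moment this file is compiled. */
bool fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Hypothetical helper: add fd to the set only if it is in range.
   Returns 0 on success, -1 if the descriptor would fall outside the set. */
int checked_fd_set(int fd, fd_set *set)
{
    if (!fd_fits_in_fd_set(fd))
        return -1;
    FD_SET(fd, set);
    return 0;
}
```

    Code that calls FD_SET directly, without a check like this, writes outside the set as soon as it is handed a descriptor numbered FD_SETSIZE or higher.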
     
  3. ldejager

    ldejager Guest

    0
     
    Hi, thanks for the reply...

    I recompiled openssl, curl, apache, libc-client and php when the box was freshly installed.

    lsof -Pn | grep httpd returns 27558
    and a normal lsof -Pn returns 31042
    lsof -Pn | grep log returns 24244

    I've noticed that lsof reports some log files being open 44 times. Any reason why?

    The box was rebuilt with #define __FD_SETSIZE 131072

    When recompiling php, it looks for the FD_SETSIZE in /usr/include/bits/typesizes.h, correct?

    any pointers appreciated very much,

    thanks

    edit-

    I have checked another server with approximately 500 domains on it, and I see the same issue: log files being open 36 - 40 times...
     
  4. ldejager

    ldejager Guest

    0
     
    after a fresh reboot...

    lsof -Pn | grep httpd | wc -l
    13765
    lsof -Pn | wc -l
    18200
     
  5. wagnerch

    wagnerch Guest

    0
     
    They are open for each httpd process running, parent & children. You would notice that you probably have approximately 44 httpd's running.

    As for rebuilding php, did you also rebuild the imap.so shared library in /usr/lib/php4?

    And you did customize the startup script to do ulimit -n 131072?
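
    One way to confirm that the ulimit actually reached the process is to ask for it from inside a program started by the same script. A minimal sketch, assuming a POSIX system; the function names current_nofile_limit and limit_exceeds_fd_setsize are hypothetical:

```c
#include <limits.h>
#include <sys/resource.h>
#include <sys/select.h>

/* Returns the soft limit on open descriptors for this process
   (what "ulimit -n" set), or -1 if getrlimit() fails. */
long current_nofile_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return LONG_MAX;
    return (long)rl.rlim_cur;
}

/* Returns 1 if the process may hold descriptors that do not fit in an
   fd_set as compiled into this file, 0 otherwise. */
int limit_exceeds_fd_setsize(void)
{
    return current_nofile_limit() > FD_SETSIZE;
}
```

    If limit_exceeds_fd_setsize() returns 1 in a build that uses the stock headers, that build can be handed descriptors it cannot safely pass to select().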
     
  6. ldejager

    ldejager Guest

    0
     
    Hi,

    Yes, I did the standard rebuild of the php*src.rpm and (force) installed all new rpms created in /usr/src/redhat/RPMS/i386/.

    I have also checked that the spec file contains the imap configure flag and confirmed that when rebuilding the src.rpm it is included...

    Also did add the ulimit -n 131072 to both /etc/init.d/httpd and /usr/sbin/apachectl..

    Not sure where to go from here, as I am 100% sure that php is compiled with the bigger FD_SETSIZE. Is there perhaps a limit to the FD_SETSIZE that PHP can handle? Should I lower the FD_SETSIZE?

    any pointers appreciated.

    thanks
     
  7. wagnerch

    wagnerch Guest

    0
     
    Perhaps it is the mysql client library? It will be tough to trace it down, unless Apache is dumping core files. Try setting "ulimit -c unlimited" and see if you can get it to puke out core files. Then do:

    $ gdb /usr/sbin/httpd core.<xyz>
    (gdb) bt

    More or less what you need to do is trace down every httpd dependency.
     
  8. ldejager

    ldejager Guest

    0
     
    Hi,

    I've added the ulimit and CoreDumpDirectory to apache and fired it up. Within a few minutes there were about 5 core dumps ready to be examined. Below are the first and the last (the rest all come down to libmysqlclient):

    #0 0xb6d04dac in ?? () from /usr/lib/mysql/libmysqlclient.so.10
    #1 0xb6d06537 in mysql_real_connect () from /usr/lib/mysql/libmysqlclient.so.10
    #2 0xb6d3651e in php_mysql_do_connect (ht=Variable "ht" is not available.
    ) at /usr/src/redhat/BUILD/php-4.3.11/ext/mysql/php_mysql.c:778
    #3 0xb7725047 in execute (op_array=0xbc0d1864) at /usr/src/redhat/BUILD/php-4.3.11/Zend/zend_execute.c:1654
    #4 0xb7251f8e in phpd_encrypt_op_array () from /usr/lib/php4/php_ioncube_loader_lin_4.3.so
    #5 0xbc0d1864 in ?? ()
    #6 0x00000147 in ?? ()
    #7 0x037f0f7f in ?? ()
    #8 0xb76fe32d in _efree (ptr=0xbc0d1644) at /usr/src/redhat/BUILD/php-4.3.11/Zend/zend_alloc.c:227
    #9 0xb7251f8e in phpd_encrypt_op_array () from /usr/lib/php4/php_ioncube_loader_lin_4.3.so
    #10 0xbc0d1644 in ?? ()
    #11 0x00000000 in ?? ()

    =========================

    #0 0xb6d04dac in ?? () from /usr/lib/mysql/libmysqlclient.so.10
    #1 0xb6d06537 in mysql_real_connect () from /usr/lib/mysql/libmysqlclient.so.10
    #2 0xb6d3701b in php_mysql_do_connect (ht=Variable "ht" is not available.
    ) at /usr/src/redhat/BUILD/php-4.3.11/ext/mysql/php_mysql.c:673
    #3 0xb7725047 in execute (op_array=0xbc1382d4) at /usr/src/redhat/BUILD/php-4.3.11/Zend/zend_execute.c:1654
    #4 0xb7251f8e in phpd_encrypt_op_array () from /usr/lib/php4/php_ioncube_loader_lin_4.3.so
    #5 0xbc1382d4 in ?? ()
    #6 0x00000000 in ?? ()

    I've googled to shed some light on the above but had no luck. Does the above mean that libmysqlclient cannot create a new connection?

    thanks and any pointers appreciated,
     
  9. wagnerch

    wagnerch Guest

    0
     
    Yep, it is dumping in mysql_real_connect(). It looks like you need to rebuild your mysql client libraries, etc.
     
  10. Artur

    Artur Guest

    0
     
    A quick question about this, is there a way to find the culprit? I have a similar problem and suspect it's just one or two websites that are abusing resources.
     
  11. wagnerch

    wagnerch Guest

    0
     
    If it is core dumping, it is probably not related to someone abusing resources. Do you know if your FD_SETSIZE was changed without rebuilding everything? Did you take a look at the core dumps (via gdb) to find out what happened?


    Here is a test program to prove my point:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <errno.h>

    #define ALLOC_SIZE (32 * 1024 * 1024)  /* 32 MiB */

    int
    main (void)
    {
        /* Fails once "ulimit -v" caps the address space below 32 MiB. */
        void *ptr = malloc (ALLOC_SIZE);
        if (ptr == NULL)
        {
            fprintf(stderr, "malloc() failed: %s\n", strerror(errno));
            return EXIT_FAILURE;
        }
        fprintf(stderr, "malloc() success.\n");

        /* memset() has no failure return; it just touches every page. */
        memset (ptr, 0x80, ALLOC_SIZE);
        fprintf(stderr, "memset() success.\n");

        free (ptr);
        return EXIT_SUCCESS;
    }


    $ gcc -o limit limit.c
    $ /bin/bash
    $ ./limit
    malloc() success.
    memset() success.
    $ ulimit -v 16384
    $ ./limit
    malloc() failed: Cannot allocate memory
    $ exit


    Here is a file descriptor limit (sockets count against the same limit):

    #include <stdio.h>
    #include <string.h>
    #include <errno.h>

    #define MAX_FILES 4096

    FILE *fp[MAX_FILES];

    int
    main (void)
    {
        int j;

        /* Each fopen() uses one descriptor, just as a socket would. */
        for (j = 0; j < MAX_FILES; j++)
        {
            fp[j] = fopen("/dev/null", "r");
            if (fp[j] == NULL)
            {
                fprintf(stderr, "fopen() failed: %s\n", strerror(errno));
                break;
            }
        }

        printf ("Opened %d sockets.\n", j);
        return 0;
    }

    $ gcc -o limit2 limit2.c
    $ ./limit2
    fopen() failed: Too many open files
    Opened 1021 sockets.

    Remember that the limit is normally 1024, and three are already open (stdin, stdout, and stderr).
     
  12. Artur

    Artur Guest

    0
     
    I did have to raise the "ulimit" inside /etc/init.d/httpd because apache would not start otherwise.
     
  13. Artur

    Artur Guest

    0
     
    I think what happened is that one of our clients has an application that opens too many files. To get around it, I raised the ulimit, and now it's breaking something else. Does this make sense?

    I'm 99% certain this is what is happening.
     
  14. wagnerch

    wagnerch Guest

    0
     
    Yes, it makes sense. Even though you can have more than 1024 files open, the issue comes down to select(2) and the fact that fd_set is a fixed-size bit array with room for FD_SETSIZE descriptors. When you try to put a descriptor numbered FD_SETSIZE or higher into one, the FD_* macros write past the end of the set, so the results can be unpredictable or may end in a core dump.

    One of the things you can do is use "lsof -Pn" and try to identify what filehandles are leaked (what files or sockets are left open). From there you can use grep/xargs/etc to try and locate what script it is in.

    Another option is to attach strace to one of the child httpd processes. strace is quite noisy but very useful, and you can filter it down with "strace -e trace=open,close" to track only open/close calls.

    And yet another option is to increase your httpd recycle rate by reducing MaxRequestsPerChild if you are using a prefork Apache (normally the default). This will free up the leaked filehandles when the child dies.
     