Forum archive 2000-2006

James - Apache Memory Problems


by Arnold Pizer -
Number of replies: 0
Apache Memory Problems topic started 12/3/2006; 9:46:01 PM
last post 12/12/2006; 11:07:42 PM
James - Apache Memory Problems
12/3/2006; 9:46:01 PM (reads: 358, responses: 3)
Greetings all,

Brian Camp and I have set up a WeBWorK server using WeBWorK 2.3 (patched branch). We run Apache 2.0.54 and Perl 5.8.6 on Fedora Core 4.

We have been having issues with Apache and Memory.

Our server has dual processors (dual-core Xeons), 1 GB of RAM, and 3.5 GB of swap.

On two occasions we have checked the server and found apache misbehaving.

On Saturday, an instructor called to say that WW was not working. When we checked, Apache was using all available RAM as well as nearly all available swap. The professor had been using Course Administration to select problems for a homework set. He had selected 'show all' from the drop-down, so he was trying to view all 500 problems in the library. Apparently the server did not respond, so he tried again and again until he contacted us. We restarted Apache, and then I followed the advice in the threads below:

http://webhost.math.rochester.edu/webworkdocs/discuss/msgReader$3168#3189
http://webhost.math.rochester.edu/webworkdocs/discuss/msgReader$4481

I changed our httpd.conf to:

KeepAlive On
MaxKeepAliveRequests 10
KeepAliveTimeout 15

<IfModule prefork.c>
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 70
MaxClients 160
MaxRequestsPerChild 100
</IfModule>

<IfModule worker.c>
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 100
</IfModule>

PerlSetEnv PERL_RLIMIT_AS 100:120
PerlModule Apache2::Resource

Today (Sunday) our server became unusable. After logging in to the server, we restarted Apache again. It was using all available RAM and about half of the available swap.

Looking through the Apache error log, we were able to identify when the problem started, and from that point on the logs are littered with:

[Sun Dec 03 18:39:01 2006] [info] (104)Connection reset by peer: core_output_filter: writing data to the network
[Sun Dec 03 18:39:02 2006] [info] (32)Broken pipe: core_output_filter: writing data to the network
[Sun Dec 03 18:39:02 2006] [info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 16 children, there are 0 idle, and 42 total children

We then checked the WeBWorK logs and found that one user had managed to submit an answer to a problem 405 times in about 2 minutes. We tried to recreate the error by logging in and doing the same problem with the answer they submitted, but we never caused the server to go crazy.

Any idea what might be the problem? Any recommendations on apache configuration?

Thanks for your time,

James Elliott & Brian Camp



Michael Gage - Re: Apache Memory Problems
12/3/2006; 10:34:53 PM (reads: 404, responses: 0)
I would put MaxClients down to around 20. We keep it at 10 on hosted, but that machine has only 500 MB of memory.

 

Timeout 1200
KeepAlive On
#MaxKeepAliveRequests 100
MaxKeepAliveRequests 50
KeepAliveTimeout 10
StartServers 5
MinSpareServers 5
MaxSpareServers 10
#sam# Default is 150, but was previously set to 10. Trying 50 for now to
#sam# see if there'll be any problems.
#mike# there were some problems -- trying 25 clients -- (see Oct 22, 2006)-- some problems timed out
#Mike# still problems -- reducing it to 15
#mike# now to 10
MaxClients 10
MaxRequestsPerChild 300

As you can see, some experimentation is necessary. A WeBWorK child takes about 50 MB of memory, and it appears to work best if you keep the total amount of memory that the children can claim below the physical RAM available.
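The sizing rule above can be turned into a quick back-of-the-envelope check. The following is only a sketch in Python, not part of the original configuration: the function names and the `reserved_mb` allowance are hypothetical, and the 50 MB per-child figure is the rough estimate quoted above, which should be measured on your own server.

```python
def max_clients_for_ram(physical_ram_mb, child_mb=50, reserved_mb=0):
    """Largest MaxClients whose children fit in physical RAM.

    child_mb=50 is the rough per-child figure quoted in the thread.
    reserved_mb is a hypothetical allowance for the OS and other services.
    """
    if child_mb <= 0:
        raise ValueError("child_mb must be positive")
    usable_mb = physical_ram_mb - reserved_mb
    # Always allow at least one child, even on a very small machine.
    return max(1, usable_mb // child_mb)


def worst_case_mb(max_clients, child_mb=50):
    """Memory claimed if every allowed child is running at once."""
    return max_clients * child_mb


print(max_clients_for_ram(1024))  # 1 GB server from the first post
print(max_clients_for_ram(500))   # 500 MB hosted machine
print(worst_case_mb(160))         # MaxClients 160 from the first post
```

With these assumptions the script prints 20, 10, and 8000: roughly 20 children for the 1 GB server, 10 for the 500 MB machine, and a worst case of about 8000 MB for MaxClients 160, which is more than the 1 GB of RAM plus 3.5 GB of swap combined.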

Hope this helps.



Brian Camp - Re: Apache Memory Problems
12/3/2006; 11:54:03 PM (reads: 406, responses: 0)
Mike,

Thanks for the quick response!

I had another Apache question that I think is related to this. Which modules should Apache load? For example, have you tinkered with this on hosted or hosted2 to try to make your server more efficient for WeBWorK?

Here are the modules that we currently are loading:

> LoadModule access_module modules/mod_access.so
> LoadModule auth_module modules/mod_auth.so
> LoadModule auth_anon_module modules/mod_auth_anon.so
> LoadModule auth_dbm_module modules/mod_auth_dbm.so
> LoadModule auth_digest_module modules/mod_auth_digest.so
> LoadModule ldap_module modules/mod_ldap.so
> LoadModule auth_ldap_module modules/mod_auth_ldap.so
> LoadModule include_module modules/mod_include.so
> LoadModule log_config_module modules/mod_log_config.so
> LoadModule logio_module modules/mod_logio.so
> LoadModule env_module modules/mod_env.so
> LoadModule mime_magic_module modules/mod_mime_magic.so
> LoadModule cern_meta_module modules/mod_cern_meta.so
> LoadModule expires_module modules/mod_expires.so
> LoadModule deflate_module modules/mod_deflate.so
> LoadModule headers_module modules/mod_headers.so
> LoadModule usertrack_module modules/mod_usertrack.so
> LoadModule setenvif_module modules/mod_setenvif.so
> LoadModule mime_module modules/mod_mime.so
> LoadModule dav_module modules/mod_dav.so
> LoadModule status_module modules/mod_status.so
> LoadModule autoindex_module modules/mod_autoindex.so
> LoadModule asis_module modules/mod_asis.so
> LoadModule info_module modules/mod_info.so
> LoadModule dav_fs_module modules/mod_dav_fs.so
> LoadModule vhost_alias_module modules/mod_vhost_alias.so
> LoadModule negotiation_module modules/mod_negotiation.so
> LoadModule dir_module modules/mod_dir.so
> LoadModule actions_module modules/mod_actions.so
> LoadModule speling_module modules/mod_speling.so
> LoadModule userdir_module modules/mod_userdir.so
> LoadModule alias_module modules/mod_alias.so
> LoadModule rewrite_module modules/mod_rewrite.so
> LoadModule proxy_module modules/mod_proxy.so
> LoadModule proxy_ftp_module modules/mod_proxy_ftp.so
> LoadModule proxy_http_module modules/mod_proxy_http.so
> LoadModule proxy_connect_module modules/mod_proxy_connect.so
> LoadModule cache_module modules/mod_cache.so
> LoadModule suexec_module modules/mod_suexec.so
> LoadModule disk_cache_module modules/mod_disk_cache.so
> LoadModule file_cache_module modules/mod_file_cache.so
> LoadModule mem_cache_module modules/mod_mem_cache.so
> LoadModule cgi_module modules/mod_cgi.so

Also, we decided to keep temporary copies of the images that WeBWorK generates on the fly. Do you have any suggestions on how long images should be kept? For example, do you run a nightly cron job to delete stale images (i.e. older than x hours or x days)? Is there even a way to do this?
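One possible shape for such a cleanup job, as a minimal sketch only: the Python function below deletes files whose modification time is older than a cutoff. The function name, the `dry_run` flag, and any directory passed to it are hypothetical and not taken from WeBWorK; check which directory your installation actually uses for temporary images before pointing anything like this at it.

```python
import time
from pathlib import Path


def remove_stale_files(directory, max_age_hours, now=None, dry_run=True):
    """Return files under `directory` older than `max_age_hours`.

    With dry_run=True (the default) nothing is deleted, so the list can be
    reviewed first. `now` can be passed in to make the cutoff reproducible.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_hours * 3600
    stale = []
    for path in Path(directory).rglob("*"):
        # Only regular files are considered; directories are left in place.
        if path.is_file() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            stale.append(path)
    return stale
```

A nightly cron entry could call this with `dry_run=False` and a cutoff of a day or two. Starting with the dry run is the safer choice, since a wrong directory argument would otherwise delete unrelated files.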

Initially we had not been storing images generated on the fly so that they would have to be regenerated. I think this may have led to our difficulty with the instructor browsing the problem library.

They had picked something like 518 problems they wanted to view, and images was their default display option. This by itself was not a problem, but they would then click on something else, causing the page to redraw all 518 problems. Is this a bug in the problem library? It would seem that there might be a better way to select libraries, problem sets, and so on without the display needing to refresh every time a button is clicked. On the other hand, I am trying to discourage the instructor from viewing such massive numbers of problems all at the same time :)

Thanks again for all of the help, Brian



William Wheeler - Re: Apache Memory Problems
12/12/2006; 11:07:42 PM (reads: 315, responses: 0)
Dear James and Brian,

Re: >We then checked the webwork logs and found one user had managed to submit an answer to a problem 405 times in about 2 minutes. We tried recreating the error by logging in and doing the same problem, with the answer they submitted but never caused the server to go crazy.

I see this situation several times each semester. This is an inadvertent Denial of Service (DoS) attack caused by a "bug" in the operating system and browser combination on the computer the student is using. What seems to be the case is that the "Submit" button/mouse combination and/or the "Enter" key act as "repeating" keys, like the letter keys on the keyboard. (Hold down a letter key and watch how fast it repeats.) So if the student holds down the mouse button while the mouse pointer is on the Submit button, or holds down the "Enter" key, then the browser will repeatedly submit the form as rapidly as possible. I've seen submission rates approaching 20 submissions per second. I've seen these attacks with both "GET" and "POST" requests.

Each resubmission breaks the computer's network connection to the Apache server. That's the source of the "Connection reset" and "Broken pipe" messages.

These attacks rapidly overwhelm Apache, because it has to assign each request to a new child or an old one that isn't otherwise assigned. (Note: When a connection is broken, it doesn't stop the WeBWorK process that is running; that process will run to completion, at which point the Apache child tries to send the WeBWorK output back to the student's computer; but the child discovers that the "pipe" is broken when it tries to write to the pipe.) Because the rate of submissions is faster than WeBWorK can process the submissions, WeBWorK falls behind and Apache has to create one new child after another. If the attack lasts several minutes, then Apache will overflow both RAM and Swap. The server will appear to be non-responsive, because it is spending almost all of its CPU cycles swapping.
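A rough way to see why this overwhelms the server is to estimate how many children must be busy at once: approximately the submission rate multiplied by the time each request takes. The Python sketch below is illustrative only. The function names are hypothetical, the 2 seconds per request is an assumed figure that does not come from this thread, and the 50 MB per child is the estimate given earlier in the thread.

```python
import math


def busy_children(requests_per_second, seconds_per_request):
    """Children needed to keep up: arrival rate times time per request."""
    return math.ceil(requests_per_second * seconds_per_request)


def memory_mb(children, child_mb=50):
    """Memory those children claim at roughly child_mb each."""
    return children * child_mb


# Assumed: each WeBWorK request takes 2 seconds (not a measured value).
peak = busy_children(20, 2.0)            # 20 submissions per second
logged = busy_children(405 / 120, 2.0)   # 405 submissions in 2 minutes
print(peak, memory_mb(peak))
print(logged, memory_mb(logged))
print(memory_mb(42))                     # the 42 children in the error log
```

Under these assumptions, 20 submissions per second keeps about 40 children busy (around 2000 MB), and the 42 children reported in the error log would claim about 2100 MB, roughly twice the 1 GB of physical RAM. If requests take longer than the assumed 2 seconds, the numbers grow in proportion.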

If left alone for a long time, the server will eventually recover on its own. But that may take hours. So the timely response is to stop and restart Apache. (This may take several minutes.)

These inadvertent DoS attacks were a frequent problem with the old Windows Millennium Edition. I see the problem less frequently now. Last month I saw one attack from a Mac and one attack from a Windows PC.

I've never been able to recreate this phenomenon on my workstations. But I've spoken with students who were sources of attacks. They usually describe the computer's screen as appearing to "shake rapidly". I usually caution the students to stop using the computers that generated the attacks.

Sincerely,

Bill Wheeler, Indiana University, Bloomington
