Does this scale well? I'm running a web-based proxy that generates an
absolute ton of log files: easily 40 GB per week per server, across
around 20 servers. I'd like to be able to store and search up to 7 days
of logs. Currently, I only move logs from an individual server onto a
central server when I get a complaint, then import them into MySQL and
search them. The entire process, even for just one server, takes
forever.
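
To put rough numbers on that, here is a quick back-of-the-envelope
calculation. The 40 GB and 20 servers are my own estimates from above;
the variable names are only for illustration.

```shell
# Estimates taken from the message; everything else is derived from them.
gb_per_server_per_week=40
servers=20
retention_days=7

# Total volume across all servers.
total_gb_per_week=$((gb_per_server_per_week * servers))
# Integer division, so this rounds down.
total_gb_per_day=$((total_gb_per_week / 7))
# Data that must stay searchable for the retention window.
retained_gb=$((total_gb_per_week * retention_days / 7))

echo "per week:  ${total_gb_per_week} GB"
echo "per day:   ${total_gb_per_day} GB"
echo "retained:  ${retained_gb} GB"
```

So whatever I use would need to ingest on the order of 100+ GB a day,
which is a long way from the few hundred MB/day mentioned below for the
free version.
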
On Thu, Apr 16, 2009 at 7:37 PM, W. Andrew Loe III <andrew@andrewloe.com> wrote:
> It's commercial, but Splunk is amazing at this. I think you can process
> a few hundred MB/day on the free version. http://splunk.com/
>
> You set up a light-weight forwarder on every node you are interested
> in, and then it slurps the files up and relays them to a central
> splunk installation. It will queue internally if the master goes away.
> Tons of support for sending different files different directions etc.
> We have it set up in the default Puppet payload so every log on every
> server is always centralized and searchable.
>
> On Wed, Apr 15, 2009 at 8:44 AM, Michael Shadle <mike503@gmail.com> wrote:
>> On Wed, Apr 15, 2009 at 7:06 AM, Dave Cheney <dave@cheney.net> wrote:
>>
>>> What about
>>>
>>> cat *.log | sort -k 4
>>
>> or just
>>
>> cat *whatever.log >today.log
>>
>> I assume the processing script can handle out-of-order requests, but I
>> guess that might be an arrogant assumption. :)
>>
>> I do basically the same thing Igor does, but would love to simplify it
>> by just having Host: header counts for bytes (sent/received/total
>> amount of bytes used, basically) and how many http requests. Logging
>> just enough of that to a file and parsing it each night seems kinda
>> amateur...
>>
>>
>
>
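
On the quoted "cat *.log | sort -k 4" idea: if each per-server file is
already in time order, sort -m can merge them without re-sorting
everything. A minimal sketch, assuming the timestamp is the fourth
whitespace-separated field and that all lines fall within a single day
(a byte-wise comparison of dates like 16/Apr/2009 does not order
correctly across days or months). The sample lines and file names are
made up.

```shell
# Hypothetical sample data: two per-server logs, each already in time order.
tmp=$(mktemp -d)
cat > "$tmp/web1.log" <<'EOF'
10.0.0.1 - - [16/Apr/2009:10:00:01 +0000] GET /a
10.0.0.1 - - [16/Apr/2009:10:00:05 +0000] GET /c
EOF
cat > "$tmp/web2.log" <<'EOF'
10.0.0.2 - - [16/Apr/2009:10:00:03 +0000] GET /b
EOF

# -m merges pre-sorted inputs; -k 4,4 compares only the timestamp field.
# LC_ALL=C forces byte-wise comparison so the result does not depend on locale.
merged=$(LC_ALL=C sort -m -k 4,4 "$tmp"/web*.log)
printf '%s\n' "$merged"

# Clean up the sample files.
rm -rf "$tmp"
```

Whether this holds up at 40 GB per server is a separate question; it
only shows that a merge is cheaper than a full sort when the inputs are
already ordered.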