W. Andrew Loe III
April 16, 2009 11:55PM
I'm by no means a Splunk expert, so you should ask them, but I think it
scales pretty well. You can use multiple masters to receive and
load-balance logs, and you can distribute searches map/reduce
style to use more cores. Search speed seems to be much more CPU-bound
than I/O-bound; the logs are packed pretty efficiently. *Works
for me* with ~15-20 EC2 instances and one central logging server. It
also keeps logs in tiered buckets, so things from 30 days ago are
available but slower to search, whereas yesterday's logs are
'hotter'.

On Thu, Apr 16, 2009 at 8:41 PM, Gabriel Ramuglia <gabe@vtunnel.com> wrote:
> Does this scale well? I'm running a web-based proxy that generates an
> absolute ton of log files. Easily 40 GB/week/server, with around 20
> servers. I'm looking to be able to store and search up to 7 days of
> logs. Currently, I only move logs from an individual server onto a
> central server when I get a complaint, import them into MySQL, and
> search them. The entire process, even for just one server, takes
> forever.
>
> On Thu, Apr 16, 2009 at 7:37 PM, W. Andrew Loe III <andrew@andrewloe.com> wrote:
>> It's commercial, but Splunk is amazing at this. I think you can process
>> a few hundred MB/day on the free version. http://splunk.com/
>>
>> You set up a lightweight forwarder on every node you are interested
>> in, and then it slurps the files up and relays them to a central
>> Splunk installation. It will queue internally if the master goes away.
>> Tons of support for sending different files in different directions, etc.
>> We have it set up in the default Puppet payload so every log on every
>> server is always centralized and searchable.
>>
>> On Wed, Apr 15, 2009 at 8:44 AM, Michael Shadle <mike503@gmail.com> wrote:
>>> On Wed, Apr 15, 2009 at 7:06 AM, Dave Cheney <dave@cheney.net> wrote:
>>>
>>>> What about
>>>>
>>>> cat *.log | sort -k 4
>>>
>>> or just
>>>
>>> cat *whatever.log >today.log
>>>
>>> I assume the processing script can handle out-of-order requests, but I
>>> guess that might be an arrogant assumption. :)
>>>
>>> I do basically the same thing Igor does, but would love to simplify it
>>> by just having per-Host: header counts for bytes (sent/received/total
>>> amount of bytes used, basically) and how many HTTP requests. Logging
>>> just enough of that to a file and parsing it each night seems kinda
>>> amateur...
>>>
>>>
>>
>>
>
>
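
For reference, the `cat *.log | sort -k 4` idea quoted above can be sketched in Python. This is only a rough sketch under some assumptions that are not from the thread: each per-server file is already in time order, lines use the default combined log format with a bracketed timestamp such as `[15/Apr/2009:07:06:00 -0700]`, and month names are in English. The function names are made up for illustration. One reason to parse the timestamp rather than sort the text: `sort -k 4` compares the field as a string, so with day-first dates it orders lines correctly within a month but not across a month boundary.

```python
import contextlib
import heapq
from datetime import datetime


def timestamp(line):
    """Parse the bracketed timestamp of a combined-format log line.

    Assumes something like: 1.2.3.4 - - [15/Apr/2009:07:06:00 -0700] "GET / ..."
    """
    start = line.index("[") + 1
    end = line.index("]", start)
    return datetime.strptime(line[start:end], "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(streams):
    """Lazily merge several time-ordered line streams into one.

    heapq.merge only holds one pending line per stream, so memory use
    stays small even when the input files are large.
    """
    return heapq.merge(*streams, key=timestamp)


def merge_files(paths, out):
    """Merge the log files at `paths` and write the result to `out`."""
    with contextlib.ExitStack() as stack:
        files = [stack.enter_context(open(p)) for p in paths]
        out.writelines(merge_logs(files))
```

Something like `merge_files(glob.glob("*.log"), sys.stdout)` would then stand in for the shell pipeline. If the individual files are not already time-ordered, this does not help and a full sort is still needed.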


