Thread: Database and OS monitoring

Database and OS monitoring

From
Edson Carlos Ericksson Richter
Date:
Dear list,

I've been searching the web for guidelines on OS (Linux) and PostgreSQL
(9.3.5) active monitoring best practices.
Can someone share experiences?
I'm inclined to look at Cacti and Nagios. Any other experiences?
Recommended books?
I don't want to use SaaS for monitoring - I'll have a cloud server hired
specifically for this purpose, outside my main data center infrastructure.

Thanks in advance,

Edson


Re: Database and OS monitoring

From
John R Pierce
Date:
On 12/13/2014 10:55 AM, Edson Carlos Ericksson Richter wrote:
> I've been searching in web for guidelines on OS (Linux) and PostgreSQL
> (9.3.5) active monitoring best practices.
> Can someone share experiences?
> I'm inclined to look at Cacti and Nagios. Any other experiences?
> Recommended books?
> I don't want to use SaaS for monitoring - I'll have a cloud server
> hired specifically for this purpose, outside my main data center
> infrastructure.

Munin is another good choice; it's like a much better implementation of
Cacti. It also comes with quite a few Postgres monitoring graphs
already set up; you just have to enable it to connect to your Postgres
server.



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: Database and OS monitoring

From
Andy Colson
Date:
On 12/13/2014 12:55 PM, Edson Carlos Ericksson Richter wrote:
> Dear list,
>
> I've been searching in web for guidelines on OS (Linux) and PostgreSQL (9.3.5) active monitoring best practices.
> Can someone share experiences?
> I'm inclined to look at Cacti and Nagios. Any other experiences? Recommended books?
> I don't want to use SaaS for monitoring - I'll have a cloud server hired specifically for this purpose, outside my main data center infrastructure.
>
> Thanks in advance,
>
> Edson
>
>

Stats are one thing, but errors are another.  I've found my best monitor is rsyslog and a perl script.

rsyslog.conf contains:

   # omprog has to be loaded once, if it is not already loaded elsewhere
   module(load="omprog")
   local0.* action(type="omprog"
     binary="/usr/local/bin/logMonitor.pl"
     template="RSYSLOG_TraditionalFileFormat")

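The Perl script is sort of like:

#!/usr/bin/perl
use strict;
use warnings;

# omprog hands the script one log message per line on STDIN
while (my $line = <STDIN>)
{
     emailme($line) if ($line =~ /error/);   # emailme() is your own mail routine
}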
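
For readers who prefer Python, the same filter idea might look roughly like the sketch below. It is only an illustration, not Andy's script: the `watch` function, the `notify` callback and the choice to match the upper-case ERROR/FATAL/PANIC tags (which assumes English-language PostgreSQL messages) are all assumptions.

```python
import re
import sys
from typing import Callable, Iterable

# Assumption: severe lines carry an upper-case ERROR, FATAL or PANIC tag.
SEVERE = re.compile(r"\b(ERROR|FATAL|PANIC)\b")


def watch(lines: Iterable[str], notify: Callable[[str], None]) -> int:
    """Call notify() for every severe line and return how many were seen."""
    hits = 0
    for line in lines:
        if SEVERE.search(line):
            notify(line.rstrip("\n"))
            hits += 1
    return hits


def main() -> None:
    # Hypothetical notifier: write to stderr; replace with real mail delivery.
    watch(sys.stdin, lambda msg: print("ALERT:", msg, file=sys.stderr))
```

Calling main() from the program that omprog starts would process messages as rsyslog writes them to standard input.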

-Andy


Re: Database and OS monitoring

From
Vick Khera
Date:

On Sat, Dec 13, 2014 at 1:55 PM, Edson Carlos Ericksson Richter <edsonrichter@hotmail.com> wrote:
> I've been searching in web for guidelines on OS (Linux) and PostgreSQL (9.3.5) active monitoring best practices.

Recent trends are more toward monitoring response latency by first establishing a baseline level of activity and latency, then alerting when those numbers get out of acceptable range.
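
A rough sketch of that baseline-then-alert idea, in Python. The function names and the "more than k standard deviations from the mean" rule are assumptions chosen for illustration; real tools may define "acceptable range" quite differently (percentiles, seasonal baselines, and so on).

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def baseline(samples: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard deviation) of latencies seen in normal operation.

    Needs at least two samples, because stdev() is undefined for fewer.
    """
    return mean(samples), stdev(samples)


def out_of_range(value: float, avg: float, sd: float, k: float = 3.0) -> bool:
    """True when value is more than k standard deviations from the baseline."""
    return abs(value - avg) > k * sd
```

For example, with baseline latencies of 10, 12, 11, 13 and 9 ms, a new sample of 12 ms stays inside the range while 30 ms would trigger an alert.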

There are some open source tools to collect and sort and report this way (see Kibana and Grafana and their underlying data stores). I've not seen alerting tools based on this that are non-commercial, though. Two services I know of are Ruxit and Circonus.

Personally I still use Nagios to tell my staff when things are down or not responding, but often that is too late to proactively fix things.
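
As an illustration of this kind of up/down check, here is a minimal Python sketch in the style of a Nagios plugin, using the conventional exit codes (0 OK, 1 WARNING, 2 CRITICAL). The function names and thresholds are assumptions, and a successful TCP connect only shows that the port accepts connections, not that queries succeed.

```python
import socket
import time
from typing import Optional, Tuple

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def connect_time(host: str, port: int = 5432, timeout: float = 5.0) -> Optional[float]:
    """Seconds needed to open a TCP connection, or None if it failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def classify(seconds: Optional[float], warn: float, crit: float) -> Tuple[int, str]:
    """Map a measurement to (exit code, one-line status message)."""
    if seconds is None:
        return CRITICAL, "CRITICAL - no response"
    if seconds >= crit:
        return CRITICAL, f"CRITICAL - connect took {seconds:.3f}s"
    if seconds >= warn:
        return WARNING, f"WARNING - connect took {seconds:.3f}s"
    return OK, f"OK - connect took {seconds:.3f}s"
```

A plugin script would print the message from classify(connect_time(host), warn, crit) and exit with the returned code, with host and thresholds supplied by the Nagios command definition.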

One thing that'd be really cool is to use the new binary JSON storage in the upcoming Pg release to store the time series data for use with Grafana... but then you'd have a chicken/egg problem with monitoring itself. :)

Re: Database and OS monitoring

From
Tim Smith
Date:
Try http://brendangregg.com/

Lots of great tidbits there from a guy who really knows his performance stuff (ex-Sun, now Netflix)

On Sunday, 14 December 2014, Vick Khera <vivek@khera.org> wrote:

> On Sat, Dec 13, 2014 at 1:55 PM, Edson Carlos Ericksson Richter <edsonrichter@hotmail.com> wrote:
>> I've been searching in web for guidelines on OS (Linux) and PostgreSQL (9.3.5) active monitoring best practices.
>
> Recent trends are more toward monitoring response latency by first establishing a baseline level of activity and latency, then alerting when those numbers get out of acceptable range.
>
> There are some open source tools to collect and sort and report this way (see Kibana and Grafana and their underlying data stores). I've not seen alerting tools based on this that are non-commercial, though. Two services I know of are Ruxit and Circonus.
>
> Personally I still use Nagios to tell my staff when things are down or not responding, but often that is too late to proactively fix things.
>
> One thing that'd be really cool is to use the new binary JSON storage in the upcoming Pg release to store the time series data for use with Grafana... but then you'd have a chicken/egg problem with monitoring itself. :)

Re: Database and OS monitoring

From
Joseph Kregloh
Date:
I use Zabbix a lot. There is a very nice template for Postgres: http://pg-monz.github.io/pg_monz/index-en.html

On Sun, Dec 14, 2014 at 12:13 PM, Tim Smith <randomdev4+postgres@gmail.com> wrote:
> Try http://brendangregg.com/
>
> Lots of great tidbits there from a guy who really knows his performance stuff (ex-Sun, now Netflix)
>
> On Sunday, 14 December 2014, Vick Khera <vivek@khera.org> wrote:
>>
>> On Sat, Dec 13, 2014 at 1:55 PM, Edson Carlos Ericksson Richter <edsonrichter@hotmail.com> wrote:
>>> I've been searching in web for guidelines on OS (Linux) and PostgreSQL (9.3.5) active monitoring best practices.
>>
>> Recent trends are more toward monitoring response latency by first establishing a baseline level of activity and latency, then alerting when those numbers get out of acceptable range.
>>
>> There are some open source tools to collect and sort and report this way (see Kibana and Grafana and their underlying data stores). I've not seen alerting tools based on this that are non-commercial, though. Two services I know of are Ruxit and Circonus.
>>
>> Personally I still use Nagios to tell my staff when things are down or not responding, but often that is too late to proactively fix things.
>>
>> One thing that'd be really cool is to use the new binary JSON storage in the upcoming Pg release to store the time series data for use with Grafana... but then you'd have a chicken/egg problem with monitoring itself. :)