Prometheus: Basic Deployment and Usage

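There are plenty of detailed resources, and even entire books, that introduce Prometheus. I came to it late and only use it occasionally, so this post records the most basic deployment and usage steps for future reference.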

I. Installing Prometheus

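1. Download the binary package that matches your platform.
2. After unpacking, move all files from the package to /usr/local/prometheus so that everything is managed in one place.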
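
Steps 1 and 2 can be sketched as a few shell functions. The version and platform values are placeholders of my own, not something this post depends on; replace them with the release you actually want. The URL follows the file naming used on the Prometheus GitHub releases page.

```shell
# Assumed values: adjust to the release and platform you need.
PROM_VERSION="2.45.0"      # hypothetical example version
PROM_ARCH="linux-amd64"    # hypothetical example platform

# Build the tarball name and the download URL from a version and a platform.
prom_tarball() { echo "prometheus-${1}.${2}.tar.gz"; }
prom_url() {
    echo "https://github.com/prometheus/prometheus/releases/download/v${1}/$(prom_tarball "$1" "$2")"
}

# Download, unpack, and move everything into /usr/local/prometheus.
install_prometheus() {
    local tarball
    tarball="$(prom_tarball "$PROM_VERSION" "$PROM_ARCH")"
    curl -fLO "$(prom_url "$PROM_VERSION" "$PROM_ARCH")" || return 1
    tar -xzf "$tarball" || return 1
    mkdir -p /usr/local/prometheus
    mv "prometheus-${PROM_VERSION}.${PROM_ARCH}"/* /usr/local/prometheus/
}
```

Nothing runs until install_prometheus is called (as root, since it writes to /usr/local).
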
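3. Start the application
Prometheus can be started by hand or supervised by a systemd service; choose whichever suits the environment.
(1) Start through systemctl: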

vi /etc/systemd/system/prometheus.service

[Unit]
Description=Prometheus Server
After=network.target

[Service]
WorkingDirectory=/usr/local/prometheus
Restart=on-failure
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --web.listen-address=:8080 --web.enable-admin-api --storage.tsdb.path=/data/prometheus/data

[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl enable prometheus
systemctl start prometheus

(2) Start manually

# Listen on port 8080 instead of the default 9090 and enable the web admin API

/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --web.listen-address=:8080 --web.enable-admin-api &
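
Whichever start method is used, it helps to confirm that the server is actually serving requests. The sketch below polls the /-/ready endpoint that Prometheus exposes. The function name, the default URL, and the retry count are my own choices for illustration.

```shell
# Poll the readiness endpoint until it answers or the retries run out.
# Arguments: base URL (default http://localhost:8080), number of tries (default 10).
wait_for_prometheus() {
    local url="${1:-http://localhost:8080}" tries="${2:-10}"
    while [ "$tries" -gt 0 ]; do
        # -f turns HTTP errors into a non-zero exit status.
        if curl -fsS "$url/-/ready" >/dev/null 2>&1; then
            echo "ready"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "not ready" >&2
    return 1
}
```

For example, `wait_for_prometheus http://localhost:8080 30` waits up to roughly 30 seconds.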

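The components below are started in much the same way, so they are not explained again; only a configuration file is given for easy copying.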

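4. Other Prometheus features

# Check that the configuration is valid
./promtool check config /usr/local/prometheus/prometheus.yml

# Load a modified configuration
(1) Hot reload (the HTTP endpoint only works when Prometheus was started with --web.enable-lifecycle; sending SIGHUP to the process is the alternative)
curl -X POST http://localhost:8080/-/reload
(2) Restart
systemctl restart prometheus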
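
The check and the hot reload are usually run together, so a broken file never reaches the running server. A minimal sketch, assuming the paths and port used in this post; the function and variable names are hypothetical:

```shell
# Defaults taken from the commands above; override through the environment if needed.
PROMTOOL="${PROMTOOL:-/usr/local/prometheus/promtool}"
PROM_CONFIG="${PROM_CONFIG:-/usr/local/prometheus/prometheus.yml}"
PROM_URL="${PROM_URL:-http://localhost:8080}"

# Validate the config first; only ask the running server to reload if it passes.
safe_reload() {
    if ! "$PROMTOOL" check config "$PROM_CONFIG"; then
        echo "config check failed, not reloading" >&2
        return 1
    fi
    # -f makes curl exit non-zero on an HTTP error, for example when the
    # reload endpoint is not enabled on the server.
    curl -fsS -X POST "$PROM_URL/-/reload"
}
```

Calling `safe_reload` after editing prometheus.yml either reloads the server or leaves it untouched with an error message.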

5. Configuration file template

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['9.30.2.8:9093']
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - /usr/local/prometheus/alterRules/*.rules
  # - "second_rules.yml"

# Scrape configurations: Prometheus itself plus the node and process exporters.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      # Prometheus itself; the port must match --web.listen-address (8080 above).
      - targets: ["9.30.2.8:8080"]

  - job_name: "linux"
    static_configs:
      - targets: ["9.30.2.8:9100","9.30.2.9:9100"]

  - job_name: "process"
    static_configs:
      - targets: ["9.30.2.8:9256","9.30.2.9:9256"]

Only static configuration is used here, because there are just a few machines and the setup is temporary.

Alert rule template

groups:
  - name: mysql
    rules:
      - alert: MemoryIncrease
        expr: delta(namedprocess_namegroup_memory_bytes{groupname="map[:mysql]",memtype="resident"}[3h]) > 1024*1024*1024
        for: 1s
        labels:
          time: 3h
          mem_size: 1GB
        annotations:
          summary: mysql memory abnormal
          description: resident memory of the mysql process group grew by more than 1G within 3 hours

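The rule file can be named freely, but it must be placed where the rule_files entry in the Prometheus configuration expects it, for example /usr/local/prometheus/alterRules/*.rules. Writing rules requires some knowledge of PromQL, which is a subject in its own right and is not covered in depth here. The example rule fires when the resident memory of the process group grows by more than 1G (1024*1024*1024 bytes) over the past 3 hours.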
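
As a quick sanity check on the threshold and on the rule file itself, something like the following can be used. The paths are the ones from this post; the function name is just for illustration.

```shell
# The threshold in the expression, 1024*1024*1024, is 1 GiB expressed in bytes.
threshold_bytes=$((1024 * 1024 * 1024))
echo "$threshold_bytes"    # prints 1073741824

# Validate every rule file matched by the rule_files glob before reloading.
check_rules() {
    /usr/local/prometheus/promtool check rules /usr/local/prometheus/alterRules/*.rules
}
```

Running `check_rules` before a reload catches YAML and expression syntax errors early; it does not tell you whether the label values in the expression match any real series.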

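II. Installing node_exporter

The most commonly used exporter for monitoring the overall state of a system.

1. Find and download node_exporter, then move all files from the unpacked package to /usr/local/node_exporter

2. Configure startup through systemctl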

vi /etc/systemd/system/node-exporter.service

[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
WorkingDirectory=/usr/local/node_exporter
Restart=on-failure
ExecStart=/usr/local/node_exporter/node_exporter

[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl enable node-exporter
systemctl start node-exporter
systemctl status node-exporter
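
Besides systemctl status, a direct look at the metrics endpoint confirms that the exporter is answering on its port (9100 in the scrape config above). A small sketch; the helper name is made up, and node_load1 is used only as an example of a metric that node_exporter normally exposes:

```shell
# Print the first few samples whose name starts with the given prefix.
# Arguments: host:port of the exporter, metric name prefix.
peek_metrics() {
    curl -fsS "http://$1/metrics" | grep "^$2" | head -n 5
}
```

For example, `peek_metrics localhost:9100 node_load1`. The same helper works for any other exporter by changing the port and the prefix.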

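III. Installing process-exporter

An exporter that can be used to monitor the state of processes.

1. Install it the same way as node_exporter above (the unit file below expects the files in /usr/local/process_exporter)

2. Configure startup through systemctl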

vi /etc/systemd/system/process-exporter.service

[Unit]
Description=Prometheus Process Exporter
After=network.target

[Service]
WorkingDirectory=/usr/local/process_exporter
Restart=on-failure
ExecStart=/usr/local/process_exporter/process-exporter -config.path=/usr/local/process_exporter/process-exporter.yaml

[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl enable process-exporter
systemctl start process-exporter
systemctl status process-exporter

Configuration file template

process_names:
  - name: "{{.Matches}}"
    cmdline:
      - 'mysqld'
  - name: "{{.Matches}}"
    cmdline:
      - 'nginx'

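There are several ways to match and name processes; only {{.Matches}} is used here. Because {{.Matches}} renders the map of regex matches, the groupname label takes the map[...] form seen in the alert rule above (map[:mysql]); check the exact string in the exporter's /metrics output before using it in a rule.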

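IV. Alertmanager

Alertmanager is optional. The Prometheus web UI already shows alert information by default, so for temporary use it does not need to be installed.

1. Install it the same way as above

2. Configure startup through systemctl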

vim /etc/systemd/system/alertmanager.service

[Unit]
Description=alertmanager
After=network.target

[Service]
WorkingDirectory=/usr/local/alertmanager
Restart=on-failure
ExecStart=/usr/local/alertmanager/alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml --log.level=debug --log.format=json

[Install]
WantedBy=multi-user.target

Configuration file template:

route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'web.hook'
  #receiver: 'wechat'
receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://127.0.0.1:5001/'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
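
Like promtool for Prometheus, the Alertmanager package ships an amtool binary that can validate this file before the service is started. A minimal sketch, assuming amtool was moved to /usr/local/alertmanager along with the other files; the function name is hypothetical:

```shell
# Assumed location; override through the environment if amtool lives elsewhere.
AMTOOL="${AMTOOL:-/usr/local/alertmanager/amtool}"

# Validate the Alertmanager configuration file used in the unit file above.
check_alertmanager_config() {
    "$AMTOOL" check-config /usr/local/alertmanager/alertmanager.yml
}
```

Run `check_alertmanager_config` after every edit to alertmanager.yml; a non-zero exit status means the file should not be deployed yet.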

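V. Grafana

This part matters most: visual monitoring of the data is mainly done in Grafana.

1. Installation

If a yum repository that provides Grafana is configured, it can be installed from there directly. Otherwise download the rpm package and install it locally:
sudo yum install grafana-7.1.5-1.x86_64.rpm

2. Changing the port

In some environments the default port 3000 is not open, so the default port has to be changed. The example below switches to port 80 (http_port sits in the [server] section of grafana.ini). Because ports below 1024 are privileged, grafana-server also needs the CAP_NET_BIND_SERVICE capability, which is what the setcap command and the systemd override are for.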

setcap 'cap_net_bind_service=+ep' /usr/sbin/grafana-server

vim /etc/grafana/grafana.ini

http_port = 80

systemctl edit grafana-server.service

[Service]
# Give the CAP_NET_BIND_SERVICE capability
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
# A private user cannot have process capabilities on the host's user
# namespace and thus CAP_NET_BIND_SERVICE has no effect.
PrivateUsers=false

3. Start

systemctl start grafana-server

4. Add the data source

Open the Grafana web UI (the default username/password is admin/admin), then add Prometheus as a source under Configuration -> Data Sources.
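
For a scripted setup, the same data source can also be created through Grafana's HTTP API instead of the web UI. This is only a sketch under assumptions: the Grafana URL, the Prometheus address, and the data source name "Prometheus" are my placeholders, and the admin/admin credentials are the defaults mentioned above.

```shell
# Assumed addresses; adjust to your environment (use port 80 if it was changed as above).
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"
PROM_DS_URL="${PROM_DS_URL:-http://9.30.2.8:8080}"

# Build the JSON body for a Prometheus data source pointing at the given URL.
datasource_payload() {
    printf '{"name":"Prometheus","type":"prometheus","access":"proxy","url":"%s"}' "$1"
}

# Create the data source through POST /api/datasources using basic auth.
add_prometheus_datasource() {
    curl -fsS -u admin:admin -H 'Content-Type: application/json' \
        -X POST "$GRAFANA_URL/api/datasources" \
        -d "$(datasource_payload "$PROM_DS_URL")"
}
```

Calling `add_prometheus_datasource` once is enough; change the default password afterwards, since the sketch relies on it.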

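5. Download and install dashboards

Many community-contributed dashboards are available on the Grafana website. Search for a template you like by exporter type, download it, import the JSON file under Dashboards -> Import, and it is ready to view.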


mowblog

A scattered collection of memories
