iSt0ne's Notes

Saltstack: Automated Monitoring

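This section draws on 绿肥's article 《记saltstack和zabbix的一次联姻》 ("A Marriage of SaltStack and Zabbix") and partially modifies its Zabbix monitor-adding script (add_monitors.py). That script is a higher-level wrapper built on the zapi library by @超大杯摩卡星冰乐. Many thanks to both.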

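The automated monitoring workflow is as follows:

  1. Deploy Zabbix server, Zabbix web, and Zabbix api with Saltstack;
  2. After installation, import the Zabbix monitoring templates manually;
  3. Deploy the services and Zabbix agent with Saltstack;
  4. After a service is installed, Saltstack reports its service role to the Salt Master through Salt Mine;
  5. Zabbix api reads each service role and adds the corresponding monitoring to Zabbix server.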

Salt Mine stores data collected from Salt Minions on the Salt Master, where other Salt Minions can query it.

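The rest of this section walks through the whole process, using monitoring of the nginx module as the example. The Zabbix services (Zabbix server, Zabbix web, Zabbix api) are installed and managed under /srv/salt/zabbix and deployed on admin.grid.mall.com. Zabbix agent is also managed under /srv/salt/zabbix, and nginx is managed by the /srv/salt/nginx module.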

After nginx and php are installed, define the corresponding roles:

nginx-role:  
  file.append:  
    - name: /etc/salt/roles  
    - text:  
      - 'nginx'  
    - require:  
      - file: roles  
      - service: nginx  
      - service: salt-minion  
    - watch_in:  
      - module: sync_grains  

php-fpm-role:  # define the php-fpm role  
  file.append:  
    - name: /etc/salt/roles  
    - text:  
      - 'php-fpm'  
    - require:  
      - file: roles  
      - service: php-fpm  
      - service: salt-minion  
    - watch_in:  
      - module: sync_grains 

/srv/salt/nginx/monitor.sls configures Zabbix agent and the monitoring scripts:

include:  
  - zabbix.agent  
  - nginx  

nginx-monitor:  
  pkg.installed:  # install the package the script depends on  
    - name: perl-libwww-perl  

php-fpm-monitor-script:  # manage the monitoring script; create its directory if it does not exist  
  file.managed:  
    - name: /etc/zabbix/ExternalScripts/php-fpm_status.pl  
    - source: salt://nginx/files/etc/zabbix/ExternalScripts/php-fpm_status.pl  
    - user: root  
    - group: root  
    - mode: 755  
    - require:  
      - service: php-fpm  
      - pkg: nginx-monitor  
      - cmd: php-fpm-monitor-script  
  cmd.run:  
    - name: mkdir -p /etc/zabbix/ExternalScripts  
    - unless: test -d /etc/zabbix/ExternalScripts  

php-fpm-monitor-config:  # Zabbix agent user configuration file  
  file.managed:  
    - name: /etc/zabbix/zabbix_agentd.conf.d/php_fpm.conf  
    - source: salt://nginx/files/etc/zabbix/zabbix_agentd.conf.d/php_fpm.conf  
    - require:  
      - file: php-fpm-monitor-script  
      - service: php-fpm  
    - watch_in:  
      - service: zabbix-agent  

nginx-monitor-config:  # Zabbix agent user configuration file  
  file.managed:  
    - name: /etc/zabbix/zabbix_agentd.conf.d/nginx.conf  
    - source: salt://nginx/files/etc/zabbix/zabbix_agentd.conf.d/nginx.conf  
    - template: jinja  
    - require:  
      - service: nginx  
    - watch_in:  
      - service: zabbix-agent  

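Each Salt Minion collects its roles in /etc/salt/roles and generates grains from that file. Salt Mine obtains the role information through the roles grain, and is told to update whenever the roles change.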

roles:  
  file.managed:  
    - name: /etc/salt/roles  

sync_grains:  
  module.wait:  
    - name: saltutil.sync_grains  

mine_update:  
  module.run:  
    - name: mine.update  
    - require:  
      - module: sync_grains  

/srv/pillar/salt/minion.sls defines the Salt Mine functions:

mine_functions:
  test.ping: []
  grains.item: [id, hostgroup, roles, ipv4]
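
To make the Mine data more concrete, here is a minimal sketch of how a consumer might reduce it to a host-to-roles map. The dictionary shape (minion ID mapped to the requested grains) is an assumption about what a Mine query for grains.item returns with the functions above; the host web1.grid.mall.com, the IP addresses, and the helper names roles_by_host and first_non_loopback are hypothetical and do not appear in the original setup.

```python
# Hypothetical sample of Mine data keyed by minion ID (assumed shape).
SAMPLE_MINE_DATA = {
    "web1.grid.mall.com": {
        "id": "web1.grid.mall.com",
        "hostgroup": "web",
        "roles": ["nginx", "php-fpm"],
        "ipv4": ["127.0.0.1", "172.16.100.81"],
    },
    "admin.grid.mall.com": {
        "id": "admin.grid.mall.com",
        "hostgroup": "admin",
        "roles": [],
        "ipv4": ["127.0.0.1", "172.16.100.10"],
    },
}


def roles_by_host(mine_data):
    """Return {minion_id: sorted roles}, skipping hosts that report no roles."""
    result = {}
    for minion_id, grains in mine_data.items():
        roles = grains.get("roles") or []
        if roles:
            result[minion_id] = sorted(set(roles))
    return result


def first_non_loopback(ipv4_list):
    """Return the first address outside 127.0.0.0/8, or None if there is none."""
    for addr in ipv4_list:
        if not addr.startswith("127."):
            return addr
    return None


if __name__ == "__main__":
    print(roles_by_host(SAMPLE_MINE_DATA))
```

Hosts without any role are skipped here on the assumption that only role-bearing hosts need role-specific monitoring; a real setup might still want them for base templates.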

Grains are similar to Puppet's Facter: they collect information about the client. The grains script used here (/srv/salt/_grains/roles.py) generates the roles grain by reading /etc/salt/roles:

import os.path


def roles():
    '''define host roles'''
    roles_file = "/etc/salt/roles"
    roles_list = []
    if os.path.isfile(roles_file):
        # One role per line; strip the newline and skip blank lines.
        with open(roles_file, "r") as roles_fd:
            for each_role in roles_fd:
                each_role = each_role.strip()
                if each_role:
                    roles_list.append(each_role)
    return {'roles': roles_list}


if __name__ == "__main__":
    print(roles())

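Zabbix api is configured through /srv/salt/zabbix/api.sls, which installs zapi, adds the Zabbix api role, manages the Zabbix api configuration file and the monitor-adding script, and updates the monitoring configuration and adds the monitors. It does not import Zabbix templates automatically, so the templates (/srv/salt/zabbix/files/etc/zabbix/api/templates/zbx_export_templates.xml) must be imported manually.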

[Image: zabbix api 1]
[Image: zabbix api 2]

The configuration above reads the pillar file /srv/pillar/zabbix/api.sls:

[Image: zabbix api pillar]

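zabbix-api defines the Zabbix URL, user name, and password, along with the monitor configuration directory and the template directory. zabbix-base-templates defines the base monitoring templates, which are applied to every machine. zabbix-templates defines the mapping between roles and templates.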
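
As a rough illustration of how base templates and the role-to-template mapping could combine into one template list per host, consider the sketch below. The function name templates_for_roles, the dictionary layout, and the two application template names are hypothetical; only 'Template OS Linux' appears in the original configuration.

```python
def templates_for_roles(roles, base_templates, role_templates):
    """Return base templates followed by role templates, without duplicates."""
    ordered = []
    for name in base_templates:
        if name not in ordered:
            ordered.append(name)
    for role in roles:
        # Roles without a mapping contribute no templates.
        for name in role_templates.get(role, []):
            if name not in ordered:
                ordered.append(name)
    return ordered


# Hypothetical pillar-like values; the application template names are placeholders.
BASE = ["Template OS Linux"]
ROLE_MAP = {
    "nginx": ["Template App Nginx"],
    "php-fpm": ["Template App PHP-FPM"],
}

if __name__ == "__main__":
    print(templates_for_roles(["nginx", "php-fpm"], BASE, ROLE_MAP))
```

Keeping the result ordered and free of duplicates makes the generated per-host configuration stable between runs, which helps when a state only reacts to file changes.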

The monitor-adding script (/srv/salt/zabbix/files/etc/zabbix/api/add_monitors.py) is as follows:

#!/usr/bin/env python
#coding=utf8
##########################################################
# Add Monitor To Zabbix
##########################################################
import sys, os.path
import yaml
from zabbix.zapi import *


def _config(config_file):
    '''get config'''
    with open(config_file) as config_fd:
        # safe_load avoids constructing arbitrary objects from the YAML file
        config = yaml.safe_load(config_fd)
    return config


def _get_templates(api_obj, templates_list):
    '''get templates ids'''
    templates_id = {}
    templates_result = api_obj.Template.getobjects({"host": templates_list})
    for each_template in templates_result:
        template_name = each_template['name']
        template_id = each_template['templateid']
        templates_id[template_name] = template_id
    return templates_id


def _get_host_templates(api_obj, hostid):
    '''get the templates already linked to the host'''
    templates_id = []
    templates_result = api_obj.Template.get({'hostids': hostid})
    for each_template in templates_result:
        template_id = each_template['templateid']
        templates_id.append(template_id)
    return templates_id


def _create_hostgroup(api_obj, group_name):
    '''create hostgroup'''
    ##check hostgroup exists
    hostgroup_status = api_obj.Hostgroup.exists({"name": "%s" %(group_name)})
    if hostgroup_status:
        print("Hostgroup(%s) already exists" %(group_name))
        group_id = api_obj.Hostgroup.getobjects({"name": "%s" %(group_name)})[0]["groupid"]
    else:
        hostgroup_status = api_obj.Hostgroup.create({"name": "%s" %(group_name)})
        if hostgroup_status:
            print("Hostgroup(%s) create success" %(group_name))
            group_id = hostgroup_status["groupids"][0]
        else:
            sys.stderr.write("Hostgroup(%s) create failed, please contact administrator\n" %(group_name))
            sys.exit(2)
    return group_id


def _create_host(api_obj, hostname, hostip, group_ids):
    '''create host'''
    ##check host exists
    host_status = api_obj.Host.exists({"name": "%s" %(hostname)})
    if host_status:
        print("Host(%s) already exists" %(hostname))
        hostid = api_obj.Host.getobjects({"name": "%s" %(hostname)})[0]["hostid"]
        ##update host groups
        groupids = [group['groupid'] for group in api_obj.Host.get({"output": ["hostid"], "selectGroups": "extend", "filter": {"host": ["%s" %(hostname)]}})[0]['groups']]
        is_hostgroup_update = 0
        for groupid in group_ids:
            if groupid not in groupids:
                is_hostgroup_update = 1
                groupids.append(groupid)
        if is_hostgroup_update == 1:
            groups = []
            for groupid in groupids:
                groups.append({"groupid": "%s" %(groupid)})
            host_status = api_obj.Host.update({"hostid": "%s" %(hostid), "groups": groups})
            if host_status:
                print("Host(%s) group update success" %(hostname))
            else:
                sys.stderr.write("Host(%s) group update failed, please contact administrator\n" %(hostname))
                sys.exit(3)
    else:
        groups = []
        for groupid in group_ids:
            groups.append({"groupid": "%s" %(groupid)})
        host_status = api_obj.Host.create({"host": "%s" %(hostname), "interfaces": [{"type": 1, "main": 1, "useip": 1, "ip": "%s" %(hostip), "dns": "", "port": "10050"}], "groups": groups})
        if host_status:
            print("Host(%s) create success" %(hostname))
            hostid = host_status["hostids"][0]
        else:
            sys.stderr.write("Host(%s) create failed, please contact administrator\n" %(hostname))
            sys.exit(3)
    return hostid


def _create_host_usermacro(api_obj, hostname, usermacro):
    '''create host usermacro'''
    hostmacroid = None
    for macro in usermacro.keys():
        value = usermacro[macro]
        ##check host exists
        host_status = api_obj.Host.exists({"name": "%s" %(hostname)})
        if host_status:
            hostid = api_obj.Host.getobjects({"name": "%s" %(hostname)})[0]["hostid"]
            ##check usermacro exists
            usermacros = api_obj.Usermacro.get({"output": "extend", "hostids": "%s" %(hostid)})
            is_macro_exists = 0
            if usermacros:
                # "existing" avoids shadowing the usermacro argument being iterated
                for existing in usermacros:
                    if existing["macro"] == macro:
                        is_macro_exists = 1
                        if existing["value"] == str(value):
                            print("Host(%s) usermacro(%s) already exists" %(hostname, macro))
                            hostmacroid = existing["hostmacroid"]
                        else:
                            ##usermacro exists, but value is not the same, update
                            usermacro_status = api_obj.Usermacro.update({"hostmacroid": existing["hostmacroid"], "value": "%s" %(value)})
                            if usermacro_status:
                                print("Host(%s) usermacro(%s) update success" %(hostname, macro))
                                hostmacroid = usermacro_status["hostmacroids"][0]
                            else:
                                sys.stderr.write("Host(%s) usermacro(%s) update failed, please contact administrator\n" %(hostname, macro))
                                sys.exit(3)
                        break
            if is_macro_exists == 0:
                usermacro_status = api_obj.Usermacro.create({"hostid": "%s" %(hostid), "macro": "%s" %(macro), "value": "%s" %(value)})
                if usermacro_status:
                    print("Host(%s) usermacro(%s) create success" %(hostname, macro))
                    hostmacroid = usermacro_status["hostmacroids"][0]
                else:
                    sys.stderr.write("Host(%s) usermacro(%s) create failed, please contact administrator\n" %(hostname, macro))
                    sys.exit(3)
        else:
            sys.stderr.write("Host(%s) does not exist\n" %(hostname))
            sys.exit(3)
    return hostmacroid


def _link_templates(api_obj, hostname, hostid, templates_list, donot_unlink_templates):
    '''link templates'''
    all_templates = []
    clear_templates = []
    ##get templates id
    if donot_unlink_templates is None:
        donot_unlink_templates_id = {}
    else:
        donot_unlink_templates_id = _get_templates(api_obj, donot_unlink_templates)
    templates_id = _get_templates(api_obj, templates_list)
    ##get the host's currently linked templates
    curr_linked_templates = _get_host_templates(api_obj, hostid)
    for each_template in templates_id:
        if templates_id[each_template] in curr_linked_templates:
            print("Host(%s) is already linked %s" %(hostname, each_template))
        else:
            print("Host(%s) will link %s" %(hostname, each_template))
        all_templates.append(templates_id[each_template])
    ##merge templates list
    for each_template in curr_linked_templates:
        if each_template not in all_templates:
            if each_template in donot_unlink_templates_id.values():
                all_templates.append(each_template)
            else:
                clear_templates.append(each_template)
    ##convert to zabbix api style
    templates_list = []
    clear_templates_list = []
    for each_template in all_templates:
        templates_list.append({"templateid": each_template})
    for each_template in clear_templates:
        clear_templates_list.append({"templateid": each_template})
    ##update host to link templates
    update_status = api_obj.Host.update({"hostid": hostid, "templates": templates_list})
    if update_status:
        print("Host(%s) link templates success" %(hostname))
    else:
        print("Host(%s) link templates failed, please contact administrator" %(hostname))
    ##host unlink templates
    if clear_templates_list != []:
        clear_status = api_obj.Host.update({"hostid": hostid, "templates_clear": clear_templates_list})
        if clear_status:
            print("Host(%s) unlink templates success" %(hostname))
        else:
            print("Host(%s) unlink templates failed, please contact administrator" %(hostname))


def _main():
    '''main function'''
    hosts = []
    if len(sys.argv) > 1:
        hosts = sys.argv[1:]
    config_dir = os.path.dirname(sys.argv[0])
    if config_dir:
        config_file = config_dir+"/"+"config.yaml"
    else:
        config_file = "config.yaml"
    ###get config options
    config = _config(config_file)
    Monitor_DIR = config["Monitors_DIR"]
    Zabbix_URL = config["Zabbix_URL"]
    Zabbix_User = config["Zabbix_User"]
    Zabbix_Pass = config["Zabbix_Pass"]
    Zabbix_Donot_Unlink_Template = config["Zabbix_Donot_Unlink_Template"]
    if not hosts:
        # No hosts given on the command line: process every file in Monitors_DIR
        hosts = os.listdir(Monitor_DIR)
    ###Login Zabbix
    zapi = ZabbixAPI(url=Zabbix_URL, user=Zabbix_User, password=Zabbix_Pass)
    zapi.login()
    for each_host in hosts:
        each_config = _config(Monitor_DIR+"/"+each_host)
        ##Get config options
        each_ip = each_config["IP"]
        hostgroups = each_config["Hostgroup"]
        each_templates = each_config["Templates"]
        each_usermacros = each_config["Usermacros"]
        ###Create Hostgroup
        groupids = []
        for each_hostgroup in hostgroups:
            group_id = _create_hostgroup(zapi, each_hostgroup)
            groupids.append(group_id)
        ##Create Host
        hostid = _create_host(zapi, each_host, each_ip, groupids)
        if each_usermacros:
            ##Create Host Usermacros
            for usermacro in each_usermacros:
                if usermacro:
                    usermacrosid = _create_host_usermacro(zapi, each_host, usermacro)
        if each_templates:
            ##Link templates
            _link_templates(zapi, each_host, hostid, each_templates, Zabbix_Donot_Unlink_Template)


if __name__ == "__main__":
    _main()
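
The merge step inside _link_templates is the least obvious part of the script, so here is a minimal sketch of the same decision as a pure function, with no Zabbix API calls. The function name plan_template_links and the template IDs are hypothetical; the logic is meant to mirror the script, where a template that is currently linked but no longer wanted is unlinked unless it is listed under Zabbix_Donot_Unlink_Template.

```python
def plan_template_links(desired_ids, linked_ids, protected_ids):
    """Return (ids to keep linked, ids to unlink and clear).

    desired_ids   -- template IDs the host should have according to its roles
    linked_ids    -- template IDs currently linked to the host
    protected_ids -- manually linked template IDs that must never be unlinked
    """
    keep = list(desired_ids)
    clear = []
    for template_id in linked_ids:
        if template_id in keep:
            continue  # already wanted, nothing to decide
        if template_id in protected_ids:
            keep.append(template_id)  # manually linked, leave it alone
        else:
            clear.append(template_id)  # stale link, remove it
    return keep, clear


if __name__ == "__main__":
    # Hypothetical IDs for illustration only.
    keep, clear = plan_template_links(
        desired_ids=["10001", "10050"],
        linked_ids=["10001", "10047", "10060"],
        protected_ids=["10047"],
    )
    print(keep)
    print(clear)
```

Separating the decision from the API calls makes it possible to check the unlink behaviour without a running Zabbix server.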

Reference: zabbix api

The configuration file read by this script (/srv/salt/zabbix/files/etc/zabbix/api/config.yaml):

Monitors_DIR: {{Monitors_DIR}}  
Templates_DIR: {{Templates_DIR}}  
Zabbix_URL: {{Zabbix_URL}}  
Zabbix_User: {{Zabbix_User}}  
Zabbix_Pass: {{Zabbix_Pass}}  
Zabbix_Donot_Unlink_Template:  # template links are maintained automatically by the script; templates linked to hosts manually must be listed here
  - 'Template OS Linux'
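
Because add_monitors.py indexes the loaded configuration directly, a key that is missing after rendering only surfaces as a KeyError at run time. A small check along the following lines could report the problem earlier. This is a sketch, not part of the original script: the helper name missing_config_keys and the sample values (including the URL) are hypothetical, and the key list is taken from what _main reads.

```python
# Keys that _main() in add_monitors.py reads from config.yaml.
REQUIRED_KEYS = (
    "Monitors_DIR",
    "Zabbix_URL",
    "Zabbix_User",
    "Zabbix_Pass",
    "Zabbix_Donot_Unlink_Template",
)


def missing_config_keys(config):
    """Return the required keys that are absent from the loaded config dict."""
    return [key for key in REQUIRED_KEYS if key not in config]


if __name__ == "__main__":
    # Hypothetical rendered configuration with one key left out on purpose.
    sample = {
        "Monitors_DIR": "/etc/zabbix/api/monitors",
        "Zabbix_URL": "http://zabbix.example.com/zabbix",
        "Zabbix_User": "admin",
        "Zabbix_Donot_Unlink_Template": ["Template OS Linux"],
    }
    print(missing_config_keys(sample))
```

Templates_DIR is left out of the list because the script shown above never reads it.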