OpenStack Heat AutoScaling (Juno)

This is a short note about using autoscaling with Heat and Ceilometer in Juno (a Mirantis build; vanilla Juno may be missing some backports, I have not checked this specifically).

Introduction

Throughout the text I will only quote fragments of the templates; the full templates are given at the end, in a separate section. This is purely for readability: a single wall of YAML is hard to follow.
The task is formulated as follows:

  • There is an abstract application that has to be scaled with load
  • When some metric exceeds a threshold, add N application instances (virtual machines)
  • When there is no load, remove M application instances (virtual machines)
  • M and N are integers (in my example both are equal to 1)

Creating the stack

For simplicity, the stack creation command is kept in a separate script:

D=$(date +%Y%m%d%H%M%S)

heat stack-create ABC-${D} -f asglb.yaml \
-P "key_name=demo;net=net04;subnet=net04__subnet;public_net=net04_ext;app_lb_port=80;timeout=600;min_asg_size=1;max_asg_size=3;launch_asg_size=3"

Parameter descriptions

asg

asg:
    type: OS::Heat::AutoScalingGroup
    properties:
      desired_capacity: { get_param: launch_asg_size }
      max_size: { get_param: max_asg_size }
      min_size: { get_param: min_asg_size }
      cooldown: { get_param: cooldown }
      resource:
        type: app_server_neutron.yaml
        properties:
          mdata: {"metering.stack": {get_param: "OS::stack_id"}}
          image: { get_param: image }
          flavor: { get_param: flavor }
          key_name: { get_param: key_name }
          net: { get_param: net}
          app_sec_group: { get_resource: app_sec_group }
          ssh_sec_group: { get_resource: ssh_sec_group }
          app_port: { get_param: app_port }
          app_pool_id: { get_resource: app_pool }
          ssh_pool_id: { get_resource: ssh_pool }
          timeout: { get_param: timeout }

Things to note:

  • desired_capacity - the size of the group at launch (a way to check the current size from the CLI is sketched right after this list)
  • max_size, min_size - the maximum and minimum sizes; the group is never scaled beyond these limits.
  • cooldown - the time after a resize of the group during which further resize requests are ignored. In other words, if a scale-up request arrives and a second one follows almost immediately (within less than cooldown), the second request is ignored.
  • resource:
    • type: app_server_neutron.yaml - a reference to the template describing the server that is the actual application instance.
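
The effect of these parameters is easy to watch from the command line. A minimal sketch (the stack name is again just an example, and the grep relies on Heat's default naming, where instance names start with the stack name):

# the scaling group resource; its physical_resource_id is the nested stack holding the members
heat resource-show ABC-20151016141348 asg

# the virtual machines currently in the group
nova list | grep ABC-20151016141348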

Alarms

Below are the alarm definitions. Since this is, after all, a test stack, part of the "code" is left commented out. We create three alarms:

  • a test alarm
  • CPU utilization (well described in the documentation)
  • a combination alarm

(in a few places the description field may not match the alarm itself; it is copy-paste, after all)

  test_alarm_high:
    Type: OS::Ceilometer::Alarm
    Properties:
      description: 
      meter_name: test_meter
      statistic: max
      period: 60
      evaluation_periods: 1
      threshold: 2
#      alarm_actions:
#        - {"Fn::GetAtt": [scale_up, alarm_url]}
      matching_metadata: {"metadata.user_metadata.stack": {Ref: "AWS::StackId"}}
      comparison_operator: gt
      repeat_actions: true
  cpu_alarm_high:
    Type: OS::Ceilometer::Alarm
    Properties:
      description: Scale-up if the average CPU > 50% for 1 minute
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 50
#      alarm_actions:
#        - {"Fn::GetAtt": [scale_up, alarm_url]}
      matching_metadata: {"metadata.user_metadata.stack": {Ref: "AWS::StackId"}}
      comparison_operator: gt
      repeat_actions: true
  up_alarm:
    Type: OS::Ceilometer::CombinationAlarm
    Properties:
      alarm_ids:
        - {"Ref" : "test_alarm_high"}
        - {"Ref" : "cpu_alarm_high"}
      alarm_actions:
        - {"Fn::GetAtt": [scale_up, alarm_url]}
      repeat_actions: true
      operator: or
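
Once the stack is up, the alarms can be inspected and the scaling webhook exercised by hand. A minimal sketch, assuming the Juno-era ceilometer client flags; <alarm_id> is a placeholder, and the easiest way to get the webhook URL is to uncomment the scale_up_hook output in the full template below:

# alarms created by the stack and their current state (ok / alarm / insufficient data)
ceilometer alarm-list

# details of a single alarm: threshold, query, actions
ceilometer alarm-show -a <alarm_id>

# manual scale-up: POST to the URL reported by the scale_up_hook output
curl -X POST "<alarm_url from the scale_up_hook output>"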

Sending the data

In the user data we pass a line like the one below (a reminder: this is just a test, so nothing actually measures consumed memory here; only a hard-coded constant is submitted)

        "ceilometer sample-create -r 6c3fd865-a478-4afc-909e-eced382de432  -m test_meter --meter-type gauge  --meter-unit percents --sample-volume 99 --resource-metadata '{\"metering.metadata.user_metadata.stack\": \"", { "Ref": "stack_id" }, "\"}'\n",
        { "Fn::GetAtt": ["handle", "curl_cli" ] }, " --data-binary '{\"status\": \"SUCCESS\"}'\n

On the instance this ends up looking like:

ceilometer sample-create \
-r 6c3fd865-a478-4afc-909e-eced382de432 \
-m test_meter \
--meter-type gauge \
--meter-unit percents \
--sample-volume 99 \
--resource-metadata '{"metering.metadata.user_metadata.stack": "arn:openstack:heat::4b8d0460864740d69624bd2035f30cf6:stacks/AWS-test-only-20151016141348/48230b63-7b6b-42ad-a690-370b6b1aa39b"}'

asglb.yaml

HeatTemplateFormatVersion: 2012-12-12

Description: |
  Template which tests Neutron load balancing requests to members of
  Heat AutoScalingGroup.
  Instances must be running some webserver on a given app_port
  producing HTTP response that is different between servers
  but stable over time for given server.
  Auto-scaling is driven by Ceilometer alarms.
  Both HTTP and SSH access are load-balanced.

Parameters:
  flavor:
    Type: String
    Default: m1.compact
  image:
    Type: String
    Default: fedora-heat-test
  username:
    Type: String
    Default: root
  key_name:
    Type: String
    Default: ericsson
  net:
    Type: String
    Default: net04
  subnet:
    Type: String
    Default: net04__subnet
  public_net:
    Type: String
    Default: net04_ext
  app_port:
    Type: Number
    Default: 1026
  app_lb_port:
    Type: Number
    Default: 80
  timeout:
    Type: Number
    Default: 600
  min_asg_size:
    Type: Number
    Default: 1
  max_asg_size:
    Type: Number
    Default: 3
  launch_asg_size:
    Type: Number
    Default: 2
  cooldown:
    Type: Number
    Default: 60

Resources:
  CfnLBUser:
    Type: AWS::IAM::User
  CfnLBAccessKey:
    Type: AWS::IAM::AccessKey
    Properties:
      "UserName" : {"Ref": "CfnLBUser"}

  app_sec_group:
    Type: OS::Neutron::SecurityGroup
    Properties:
      rules:
      - remote_ip_prefix: 0.0.0.0/0
        protocol: tcp
        port_range_min: { Ref: app_port }
        port_range_max: { Ref: app_port }

  ssh_sec_group:
    Type: OS::Neutron::SecurityGroup
    Properties:
      rules:
      - remote_ip_prefix: 0.0.0.0/0
        protocol: tcp
        port_range_min: 22
        port_range_max: 22

  asg:
    Type: OS::Heat::AutoScalingGroup
    Properties:
      desired_capacity: { Ref: launch_asg_size }
      max_size: { Ref: max_asg_size }
      min_size: { Ref: min_asg_size }
      cooldown: { Ref: cooldown }
      resource:
#        type: https://raw.githubusercontent.com/olguncengiz/hot/master/app_server_neutron.yaml
        type: APP_server_neutron.yaml
        properties:
          mdata: {"metering.stack": {Ref: "AWS::StackId"}}
          image: { Ref: image }
          flavor: { Ref: flavor }
          key_name: { Ref: key_name }
          net: { Ref: net}
          app_sec_group: { Ref: app_sec_group }
          ssh_sec_group: { Ref: ssh_sec_group }
          app_port: { Ref: app_port }
          app_pool_id: { Ref: app_pool }
          ssh_pool_id: { Ref: ssh_pool }
          timeout: { Ref: timeout }
          #mem_alarm_low: { "Ref" : "mem_alarm_low" }
          #mem_alarm_high: { "Ref" : "mem_alarm_high" }
          access_key: { "Ref" : "CfnLBAccessKey" }
          secret_key: { "Fn::GetAtt": ["CfnLBAccessKey","SecretAccessKey"] }
          stack_id: { "Ref" : "AWS::StackId" }
  scale_up:
    Type: OS::Heat::ScalingPolicy
    Properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { Ref: asg }
      scaling_adjustment: 1

  scale_down:
    Type: OS::Heat::ScalingPolicy
    Properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { Ref: asg }
      scaling_adjustment: -1


  test_alarm_high:
    Type: OS::Ceilometer::Alarm
    Properties:
      description: Scale-up if the average CPU > 50% for 1 minute
      meter_name: test_meter
      statistic: max
      period: 60
      evaluation_periods: 1
      threshold: 2
#      alarm_actions:
#        - {"Fn::GetAtt": [scale_up, alarm_url]}
      matching_metadata: {"metadata.user_metadata.stack": {Ref: "AWS::StackId"}}
      comparison_operator: gt
      repeat_actions: true



  cpu_alarm_high:
    Type: OS::Ceilometer::Alarm
    Properties:
      description: Scale-up if the average CPU > 50% for 1 minute
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 50
#      alarm_actions:
#        - {"Fn::GetAtt": [scale_up, alarm_url]}
      matching_metadata: {"metadata.user_metadata.stack": {Ref: "AWS::StackId"}}
      comparison_operator: gt
      repeat_actions: true


  up_alarm:
    Type: OS::Ceilometer::CombinationAlarm
    Properties:
      alarm_ids:
        - {"Ref" : "test_alarm_high"}
        - {"Ref" : "cpu_alarm_high"}
      alarm_actions:
        - {"Fn::GetAtt": [scale_up, alarm_url]}
      repeat_actions: true
      operator: or


  cpu_alarm_low:
    Type: OS::Ceilometer::Alarm
    Properties:
      description: Scale-down if the average CPU < 15% for 1 minute
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 15
      alarm_actions:
        - {"Fn::GetAtt": [scale_down, alarm_url]}
      matching_metadata: {"metadata.user_metadata.stack": {Ref: "AWS::StackId"}}
      comparison_operator: lt
      repeat_actions: true

#  mem_alarm_high:
#    Type: OS::Heat::CWLiteAlarm
#    Properties:
#      AlarmDescription: ""
#      MetricName: MemoryUtilization
#      Namespace: "system/linux"
#      Statistic: Maximum
#      Period: "60"
#      EvaluationPeriods: "1"
#      Threshold: "50"
#      AlarmActions: [ { "Ref": "scale_up" } ]
#      ComparisonOperator: GreaterThanThreshold
#      Dimensions:
#      - Name: AutoScalingGroupName
#        Value: { Ref: asg }
#
#  mem_alarm_low:
#    Type: OS::Heat::CWLiteAlarm
#    Properties:
#      AlarmDescription: ""
#      MetricName: MemoryUtilization
#      Namespace: "system/linux"
#      Statistic: Maximum
#      Period: "60"
#      EvaluationPeriods: "1"
#      Threshold: "10"
#      AlarmActions: [ { "Ref": "scale_down" } ]
#      ComparisonOperator: LessThanThreshold
#      Dimensions:
#      - Name: AutoScalingGroupName
#        Value: { Ref: asg }


  app_health_monitor:
    Type: OS::Neutron::HealthMonitor
    Properties:
      delay: 3
      type: HTTP
      timeout: 3
      max_retries: 3

  app_pool:
    Type: OS::Neutron::Pool
    Properties:
      lb_method: ROUND_ROBIN
      protocol: HTTP
      subnet: { Ref: subnet }
      monitors:
      - { Ref: app_health_monitor }
      vip:
        protocol_port: { Ref: app_lb_port }

  app_floating_ip:
     Type: OS::Neutron::FloatingIP
     Properties:
       floating_network: { Ref: public_net }
       port_id:
         {  "Fn::Select": [ "port_id", { "Fn::GetAtt": [app_pool, vip] } ]  }

  ssh_pool:
    Type: OS::Neutron::Pool
    Properties:
      lb_method: ROUND_ROBIN
      protocol: TCP
      subnet: { Ref: subnet }
      vip:
        protocol_port: 22

  ssh_floating_ip:
     Type: OS::Neutron::FloatingIP
     Properties:
       floating_network: { Ref: public_net }
       port_id:
         {  "Fn::Select": [ "port_id",  {"Fn::GetAtt": [ssh_pool, vip] } ]  }

Outputs:
#  pool_vip:
#    "Value": { "Fn::GetAtt": [ssh_pool, vip] }
  test:
    "Value":  {  "Fn::Select": [ "port_id", {  "Fn::GetAtt": [ssh_pool, vip] } ]  }
  AWSAccessKey:
    Value: { "Ref" : "CfnLBAccessKey" }
  AWSSecretKey:
    Value: { "Fn::GetAtt": ["CfnLBAccessKey","SecretAccessKey"] }
  Stack:
    Value: { "Ref" : "AWS::StackId" }
  Region:
    Value: { "Ref" : "AWS::Region" }
#  AlarmMemHigh:
#    Value: { "Ref" : "mem_alarm_high" }
#  AlarmMemLow:
#    Value: { "Ref" : "mem_alarm_low" }
  WaitNotify:
    Value: { "Fn::GetAtt": ["asg", "WaitNotify"]  }

#  app_lb_url:
#    Description: URL of the loadbalanced app
#    Value:
#      str_replace:
#        template: http://IP_ADDRESS:PORT
#        params:
#          IP_ADDRESS: { "Fn::GetAtt": [ app_floating_ip, floating_ip_address ] }
#          PORT: { Ref: app_lb_port }
#
#  ssh_lb_url:
#    Description: command for the loadbalanced SSH access
#    Value:
#      str_replace:
#        template: ssh -i KEY.pem USER@IP_ADDRESS
#        params:
#          IP_ADDRESS: { "Fn::GetAtt": [ ssh_floating_ip, floating_ip_address ] }
#          KEY: { Ref: key_name }
#          USER: { Ref: username }
#
#  scale_up_hook:
#    Description: POST to this URL for manual scale up
#    Value: {"Fn::GetAtt": [scale_up, alarm_url]}
#
#  scale_down_hook:
#    Description: POST to this URL for manual scale up
#    Value: {"Fn::GetAtt": [scale_down, alarm_url]}

APP_server_neutron.yaml

HeatTemplateFormatVersion: 2012-12-12

Description: |
  App server that is a member of Neutron Pool.

Parameters:
  mdata:
    Type: Json
  image:
    Type: String
  flavor:
    Type: String
  key_name:
    Type: String
  net:
    Type: String
  app_sec_group:
    Type: String
  ssh_sec_group:
    Type: String
  app_pool_id:
    Type: String
  ssh_pool_id:
    Type: String
  app_port:
    Type: Number
    Default: 1026
  timeout:
    Type: Number
#  mem_alarm_low:
#    Type: String
#  mem_alarm_high:
#    Type: String
  secret_key:
    Type: String
  access_key:
    Type: String
  stack_id:
    Type: String

Resources:

  server:
    Type: OS::Nova::Server
    Properties:
      metadata: { Ref: mdata }
      image: { Ref: image }
      flavor: { Ref: flavor }
      key_name: { Ref: key_name }
      networks:
        - network: { Ref: net }
      security_groups:
        - { Ref: app_sec_group }
        - { Ref: ssh_sec_group }
      user_data:
        "Fn::Base64": {
        "Fn::Join" : ["",
        [
        "#!/bin/bash -v\n",
        "\n",
        "#yum -y install heat-cfntools-1.2.6-4.el6.noarch\n",
        "#/usr/bin/cfn-create-aws-symlinks\n",
        "\n",
        "mkdir  -p   \"/etc/cfn/\"\n ",
        "touch /etc/cfn/cfn-credentials\n",
        "echo   \"AWSAccessKeyId=\"",  { "Ref" : "access_key" } , ">>/etc/cfn/cfn-credentials\n",
        "echo   \"AWSSecretKey=\"", {"Ref" : "secret_key" }, ">> /etc/cfn/cfn-credentials\n" ,
        "\n",
        "service crond restart \n",
        "yum -y install cpulimit stress screen \n",
        "cd /tmp ; git clone https://github.com/julman99/eatmemory.git; cd eatmemory; make;  make install \n",
        "pip install python-ceilometerclient \n",
        "echo \"export LC_ALL=C\" >> /root/openrc \n",
        "echo \"export OS_NO_CACHE='true'\" >> /root/openrc \n",
        "echo \"export OS_TENANT_NAME='admin'\" >> /root/openrc \n",
        "echo \"export OS_USERNAME='admin'\" >> /root/openrc \n",
        "echo \"export OS_PASSWORD='admin'\" >> /root/openrc \n",
        "echo \"export OS_AUTH_URL='http://159.8.10.162:5000/v2.0/'\" >> /root/openrc \n",
        "echo \"export OS_AUTH_STRATEGY='keystone'\" >> /root/openrc \n",
        "echo \"export OS_REGION_NAME='RegionOne'\" >> /root/openrc \n",
        "echo \"export CINDER_ENDPOINT_TYPE='publicURL'\" >> /root/openrc \n",
        "echo \"export GLANCE_ENDPOINT_TYPE='publicURL'\" >> /root/openrc \n",
        "echo \"export KEYSTONE_ENDPOINT_TYPE='publicURL'\" >> /root/openrc \n",
        "echo \"export NOVA_ENDPOINT_TYPE='publicURL'\" >> /root/openrc \n",
        "echo \"export NEUTRON_ENDPOINT_TYPE='publicURL'\" >> /root/openrc \n",
        "ceilometer sample-create -r 6c3fd865-a478-4afc-909e-eced382de432  -m test_meter --meter-type gauge  --meter-unit percents --sample-volume 99 --resource-metadata '{\"metering.metadata.user_metadata.stack\": \"", { "Ref": "stack_id" }, "\"}'\n",
        { "Fn::GetAtt": ["handle", "curl_cli" ] }, " --data-binary '{\"status\": \"SUCCESS\"}'\n"
         ] ]
        }


  handle:
    Type: OS::Heat::WaitConditionHandle

  waiter:
    Type: OS::Heat::WaitCondition
    DependsOn: server
    Properties:
      timeout: { Ref: timeout }
      handle: { Ref: handle }

  app_pool_member:
    Type: OS::Neutron::PoolMember
    DependsOn: waiter
    Properties:
      address: { "Fn::GetAtt": [ server, first_address ] }
      pool_id: { Ref: app_pool_id }
      protocol_port: { Ref: app_port }

  ssh_pool_member:
    Type: OS::Neutron::PoolMember
    DependsOn: waiter
    Properties:
      address: { "Fn::GetAtt": [ server, first_address ] }
      pool_id: { Ref: ssh_pool_id }
      protocol_port: 22

Outputs:
  WaitNotify:
    Value: { "Fn::GetAtt": ["handle", "curl_cli" ]  }